Data Import Splitting Feature test report (Orchid) Cornell Perf Env(cptf2)

Overview

The Data Import Task Force (DITF) implemented a feature that splits large input MARC files into smaller ones, resulting in smaller jobs, so that big files can be imported, and imported consistently. Previous tests of this feature were conducted (see report), but this set of tests was performed on the Cornell Perf environment, which has proven challenging for Data Import jobs in the past. We ran the baseline tests and then the same tests with the feature enabled, using the optimal RECORDS_PER_SPLIT_FILE (RPSF) value found in previous experiments.

Summary

  • Differences in processing time before and after DITF's file-splitting feature was deployed:
    • 7518-record file: 2 minutes 39 seconds longer than baseline;
    • 67522-record file: 11 minutes longer;
    • 7518-record file during the CICO test (25 VU): 3 minutes 7 seconds longer.

These are not huge degradations from the baseline tests.
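
As a quick cross-check, the differences above can be recomputed from the durations in the Results table. The sketch below is illustrative only: the helper names (`to_seconds`, `delta`) are made up for this example, and it assumes the reported durations are exact to the second.

```python
# Durations (before, after) from the Results table, as (hours, minutes, seconds).
RESULTS = {
    "Test 1, DI 7518 records": ((0, 28, 11), (0, 30, 50)),
    "Test 2, DI 67522 records": ((1, 4, 0), (1, 15, 0)),
    "Test 3, CICO 25 VU + DI 7518 records": ((0, 35, 12), (0, 38, 19)),
}


def to_seconds(hms):
    """Convert an (hours, minutes, seconds) tuple to seconds."""
    hours, minutes, seconds = hms
    return hours * 3600 + minutes * 60 + seconds


def delta(before, after):
    """Return (extra seconds, percent increase over the baseline)."""
    b, a = to_seconds(before), to_seconds(after)
    return a - b, 100.0 * (a - b) / b


for name, (before, after) in RESULTS.items():
    extra, pct = delta(before, after)
    print(f"{name}: +{extra // 60} min {extra % 60} sec ({pct:.1f}% over baseline)")
```

Under those assumptions, the increases work out to roughly 9%, 17%, and 9% of the baseline duration for Tests 1, 2, and 3 respectively.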

  • Instance CPU utilization was about the same before and after deployment of DITF's file-splitting feature in all 3 tests. Maximal CPU utilization was about 43%, in Test 3 after deployment.
  • Service CPU utilization was about the same before and after deployment of DITF's file-splitting feature in all 3 tests. After deployment, some services used about 5% less CPU than before. The mod-invoice-b service reached about 300 percent CPU utilization (due to health-checking requests).
  • Memory utilization: no memory leaks are suspected for any instances or modules.
  • DB CPU usage reached approximately 96%.
  • No negative impact on CICO processes after DITF's file-splitting feature was deployed.

Recommendations and Jiras

  • In addition to the above information, it is advisable to follow the recommendations provided in the test report Data Import Splitting Feature test report (Orchid) ocp3 + retesting Poppy FSF and RTR;
  • Investigate the problem related to the "mod-invoice-b" service. During the DI tests, the CPU utilization of the service reached approximately 300 percent and did not return to normal even after the tests were completed. Log analysis showed that the service was still responding to health-checking requests. This behavior indicates an issue with the service's performance or resource management, and the root cause should be investigated and fixed. The issue does not affect the results of the performance testing, since this service is not involved in the data import process.


Results


Test | Profile (before deployment of DITF's file-splitting feature) | Duration (before) | Status (before) | Profile (after deployment of DITF's file-splitting feature) | Duration (after) | Status (after)
Test 1. DI file, 7518 records | EBSCO ebooks new and updated | 28 min 11 sec | Completed* | EBSCO ebooks new and updated | 30 min 50 sec | Completed*
Test 2. DI file, 67522 records | EBSCO ebooks new and updated | 1 hour 4 min | Completed** | EBSCO ebooks new and updated | 1 hour 15 min | Completed
Test 3. CICO test (25 users) + DI file, 7518 records | EBSCO ebooks new and updated | 35 min 12 sec | Completed | EBSCO ebooks new and updated | 38 min 19 sec | Completed

*During the test, 2 errors occurred


*Data import of  File_7518 Records

**Data import of  File_67522 Records

All the errors have the same message: "org.folio.processing.exceptions.MatchingException: Found multiple records matching specified conditions". This is most likely a functional issue rather than a performance issue: the message suggests that multiple records meet the specified match conditions, which could indicate a problem with the records in the Data Import files.
The number of errors was small and did not have a significant impact on the overall test results.

Instance CPU Utilization 

Testing before deployment of DITF's file-splitting feature

Test 1.1. Maximal CPU utilization on instance was about 28%

Test 1.2. Maximal CPU utilization on instance was about 26%

Test 1.3. Maximal CPU utilization on instance was about 40%

 

Testing after deployment of DITF's file-splitting feature 

Test 1.1. Maximal CPU utilization on instance was about 27%

Test 1.2. Maximal CPU utilization on instance was about 25%

Test 1.3. Maximal CPU utilization on instance was about 43%

Service CPU Utilization 

After the DI tests started, the mod-invoice-b service reached about 300 percent CPU utilization and did not drop after the tests finished. Log analysis showed that the service was responding to health-checking requests.

Testing before deployment of DITF's file-splitting feature 


Testing after deployment of DITF's file-splitting feature 

Service Memory Utilization

Memory usage was stable over the three tests. No memory leak is suspected for DI modules, except for the mod-inventory-b service.

Testing before deployment of DITF's file-splitting feature 


Testing after deployment of DITF's file-splitting feature 

RDS CPU Utilization

DB CPU usage reached approximately 95%.

RDS Database Connections



Test 3 With CI/CO 25 users and DI 7.5k records on 1 tenant


Response time (average)

Operation | DI without CICO, before splitting feature deployed | CICO + DI, before splitting feature deployed | DI without CICO, after splitting feature deployed | CICO + DI, after splitting feature deployed
Check-In | 0.525s | 1.23s | 0.571s | 1.21s
Check-Out | 1.01s | 1.97s | 1.1s | 1.93s





Additional information "time of file splitting" 

Test 1) During CICO, a 7.5K-record file was uploaded at "2023-10-05 12:15:40.744+00". The first split file started processing at "2023-10-05 12:15:43.03+00".
Response time graph

Service CPU utilization



Test 2) During CICO, a 67522-record file was uploaded at "2023-10-06 12:41:18.392+00". The first split file started processing at "2023-10-06 12:41:57.204+00".

Response time graph


Service CPU utilization
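
The time spent on file splitting can be estimated as the gap between the upload timestamp and the start of the first split file. The following is a minimal sketch using the four timestamps quoted above; the function names are hypothetical, and it assumes the timestamps use a bare "+00" UTC offset as shown.

```python
from datetime import datetime


def parse_timestamp(text):
    """Parse a timestamp such as '2023-10-05 12:15:43.03+00'.

    strptime's %z directive needs hours and minutes, so a bare
    two-digit offset like '+00' is padded to '+0000' first.
    """
    if len(text) >= 3 and text[-3] in "+-" and text[-2:].isdigit():
        text += "00"
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f%z")


def split_delay_seconds(uploaded_at, first_split_started_at):
    """Seconds between file upload and the start of the first split file."""
    started = parse_timestamp(first_split_started_at)
    uploaded = parse_timestamp(uploaded_at)
    return (started - uploaded).total_seconds()


# Timestamps copied from the two runs described above.
RUNS = {
    "Test 1 (7.5K records)": ("2023-10-05 12:15:40.744+00", "2023-10-05 12:15:43.03+00"),
    "Test 2 (67522 records)": ("2023-10-06 12:41:18.392+00", "2023-10-06 12:41:57.204+00"),
}

for name, (uploaded_at, started_at) in RUNS.items():
    print(f"{name}: {split_delay_seconds(uploaded_at, started_at):.3f} s")
```

With these inputs the gap is about 2.3 seconds for the small file and about 38.8 seconds for the large one.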

Appendix

Records count:

  • fs00001034_mod_source_record_storage.marc_records_lb = 17901545
  • fs00001034_mod_source_record_storage.raw_records_lb = 17954316
  • fs00001034_mod_source_record_storage.records_lb = 17959758
  • fs00001034_mod_source_record_storage.marc_indexers =  1001685192
  • fs00001034_mod_inventory_storage.authority = 0
  • fs00001034_mod_inventory_storage.holdings_record = 11401620
  • fs00001034_mod_inventory_storage.instance = 10566424
  • fs00001034_mod_inventory_storage.item = 10013571

PTF environment: CPTF2

  • 11 m6g.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
  • 2 database instances, one reader and one writer

    Name | API Name | Memory GiB | vCPUs | max_connections
    R6G Extra Large | db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731
  • MSK tenant
    • 2 m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true
      log.retention.minutes=480
      default.replication.factor=3


cptf2-pvt, Fri Oct 06 11:47:16 UTC 2023

Module | Task Def. Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft Limit | CPU units | Xmx | MetaspaceSize | MaxMetaspaceSize | R/W split enabled
mod-remote-storage | 12 | mod-remote-storage:2.0.3 | 2 | 4920 | 4472 | 1024 | 3960 | 512 | 512 | TRUE
mod-agreements | 7 | mod-agreements:5.5.2 | 2 | 1592 | 1488 | 128 | 968 | 384 | 512 | FALSE
mod-data-import | 8 | mod-data-import:2.7.2-SNAPSHOT.152 | 1 | 2048 | 1844 | 256 | 1292 | 384 | 512 | FALSE
mod-inventory-storage | 5 | mod-inventory-storage:26.0.0 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 | TRUE
mod-user-import | 5 | mod-user-import:3.7.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-circulation-storage | 5 | mod-circulation-storage:16.0.1 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 | TRUE
mod-calendar | 5 | mod-calendar:2.4.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-source-record-storage | 22 | mod-source-record-storage:5.6.10 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | FALSE
mod-event-config | 5 | mod-event-config:2.5.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-courses | 5 | mod-courses:1.4.7 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-inventory | 7 | mod-inventory:20.0.6 | 2 | 2880 | 2592 | 1024 | 1814 | 384 | 512 | FALSE
mod-pubsub | 5 | mod-pubsub:2.9.1 | 2 | 1536 | 1440 | 1024 | 922 | 384 | 512 | TRUE
mod-circulation | 6 | mod-circulation:23.5.6 | 2 | 2880 | 2592 | 1536 | 768 | 88 | 128 | FALSE
mod-di-converter-storage | 6 | mod-di-converter-storage:2.0.5 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-users | 5 | mod-users:19.1.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | TRUE
mod-patron-blocks | 5 | mod-patron-blocks:1.8.0 | 2 | 1024 | 896 | 1024 | 768 | 88 | 128 | TRUE
mod-source-record-manager | 25 | mod-source-record-manager:3.7.0-SNAPSHOT.258 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | FALSE
nginx-edge | 10 | nginx-edge:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 | FALSE
mod-quick-marc | 5 | mod-quick-marc:3.0.0 | 1 | 2288 | 2176 | 128 | 1664 | 384 | 512 | FALSE
nginx-okapi | 10 | nginx-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 | FALSE
okapi-b | 12 | okapi:5.0.1 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512 | FALSE
mod-patron | 5 | mod-patron:5.5.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-data-export | 5 | mod-data-export:4.7.1 | 1 | 1024 | 896 | 1024 | 768 | 88 | 128 | FALSE
mod-notes | 5 | mod-notes:5.0.1 | 2 | 1024 | 896 | 128 | 952 | 384 | 512 | FALSE
pub-okapi | 10 | pub-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 | FALSE

Methodology/Approach

During the performance testing of DITF's file-splitting feature, tests with the following configurations were run:

1 primary tenant, 3 baseline tests performed in sequence, one after the other
    Test 1) CICO with 25 VU, a 250-sec ramp-up period, and a 3600-sec test duration, plus a DI file with 7518 records (20230307monosLoad.mrc) started about 20 minutes after CICO began, on one tenant only, with profile "EBSCO ebooks new and updated".

    Test 2) Small DI file with 7518 records (20230307monosLoad.mrc) on one tenant only with profile "EBSCO ebooks new and updated".

    Test 3) Large DI file with 67522 records (202301monos_me_processed.mrc).

After DITF's file-splitting feature was deployed and enabled, RECORDS_PER_SPLIT_FILE was set to 1000 and the baseline tests were repeated.
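
To give a sense of how many smaller jobs this setting produces, the sketch below estimates the number of split files for the two test files. It is an approximation, not the module's actual logic: it assumes records are split sequentially, each split file holds at most RECORDS_PER_SPLIT_FILE records, and the last file holds the remainder. The function name `split_plan` is hypothetical.

```python
def split_plan(total_records, records_per_split_file=1000):
    """Return (number of split files, records in the last split file).

    Assumes sequential splitting with the remainder in the last file.
    """
    if total_records <= 0 or records_per_split_file <= 0:
        raise ValueError("record counts must be positive")
    # Integer ceiling division avoids floating-point rounding.
    files = (total_records + records_per_split_file - 1) // records_per_split_file
    last_file = total_records - (files - 1) * records_per_split_file
    return files, last_file


# Record counts of the two files used in the tests.
for records in (7518, 67522):
    files, last_file = split_plan(records)
    print(f"{records} records -> {files} split files, last one with {last_file} records")
```

Under these assumptions, the 7518-record file yields 8 split files and the 67522-record file yields 68.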