[Quesnelia] [ECS] [Data import] Create MARC authority Records
Overview
In the scope of PERF-854: [Quesnelia] [ECS] [Data import] Create MARC authority RecordsClosed [Quesnelia] [ECS] [Data import] Create MARC authority Records
it's needed to run tests to answer questions:
Determine the time it takes to complete Data-import on QCON env
Determine the main modules that are involved in the process (if obvious or if known)
Test specific settings or items or scenarios:
- Check-in and Checkout (CICO) is in progress and there are 5 concurrent users,
- Run DI for 50K-25K-10K-5k-1k on 3 tenants in parallel in the same cluster.
Summary
Test 1. Data import of 1K, 5K, 10K, 25K and 50K and Test 2 Data import of 1K, 5K, 10K, 25K and 50K + 5 Virtual users CICO finished successfully without errors or issues. Test 3. MultitenantDI failed to finis jobs in 12 hours for 50K records on 2nd and 3rd tenants.
General summary. Data Import MARC Authorities work on Quesnelia ECS like on Quesnelia non-ECS and Poppy. DI duration is about the same and resource utilization has the same behavior as in the previous releases except mod-entitylink service that consume 80% of CPU in comparison to 35% in Poppy.
Data-import of 1K, 5K, 10K, 25K and 50K records during test 1 and test 2(without CICO and with CICO) have about the same duration. So no significant impact of CICO(5VU) on DI proccess.
On Poppry release environment was stable to process 5 DI jobs in parallel on 3 tenants, on Quesnelia - ECS duration of completed jobs were 4-7 time longer. 50K DI failed to finished on 2nd and 3rd tenants, it was stopped manually after 12 hours.
Recommendations & Jiras (Optional)
Retest after this issue will be fixed MODDATAIMP-1064: Data import Create MARC authority doesn't finish during multitenant testingClosed . Retesting results were added to the ticket as a commet.
Test Runs
Important! Before each test, data-import MRC files were prepared using python script ensure creation of MARC Authority records.
Test 1. DI Central tenant 1k-5K-10K-25K-50K. Data-import of MARC authority Records on the central tenant in order 1k-5k-10k-25k-50k using the created job profile KG - Create SRS MARC Authority on nonmatches to 010 $a DUPLICATE for Q
Test 2. DI Central tenant 1k-5K-10K-25K-50K + CI/CO 5VU. Run CICO for 5 concurrent users, duration about 1 hour, rump period 50 seconds. Data-import 1K, 5K, 10K, 25K, and 50K started on the central tenant in parallel with the same profile.
Test 3. Multitenant DI 50K-25K-10K-5K-1K. Run Data-import of MARC authority Records in parallel on the central, second, and third tenants in order from 50k, 25k, 10k, 5k, 1k
using job profile KG - Create SRS MARC Authority on nonmatches to 010 $a DUPLICATE for Q
Test Results and Comparison
Reports for results comparison:
Data Import MARC Authorities [Poppy] - Data Import MARC Authorities [Poppy]
Data Import MARC Authorities [Quesnelia NON-ECS] - [Quesnelia] [non-ECS] [Data import] Create MARC authority Records
Test 1. DI Central tenant 1k-5K-10K-25K-50K.
Test with 1k, 10k, 25k and 50k records files DI started on one tenant only(qcp1-00), and comparative results between Poppy and Quesnelia.
# of records | % creates | DI duration | DI duration | DI duration | DI duration | DI duration | DI duration |
---|---|---|---|---|---|---|---|
1,000 | 100 | 24 s | 27 s | 41 sec | 29 sec | 22 sec | 25 sec |
5,000 | 100 | 1 min 21 s | 1 min 15 s | 1min 21s | 1 min 38 sec | 1 min 19 sec | 1 min 23 sec |
10,000 | 100 | 2 min 32 s | 2 min 31 s | 2min 53s | 2 min 53 sec | 2 min 36 sec | 2 min 43 sec |
25000 | 100 | 11 min 14 s | 7 min 7 s | 5 min 42s | 6 min 24 sec | 6 min 19 sec | 6 min 27 sec |
50,000 | 100 | 22 min | 11 min 24 s | 11 min 11s | 13 min 48 sec | 11 min 59 sec | 11 min 45s |
Test 2. DI Central tenant 1k-5K-10K-22K-50K + CI/CO 5VU.
Number of records | DI Duration with CICO Poppy | DI Duration with CICO | DI Duration with CICO | CI Avg time | CI, Avg time without DI | CO time Avg | CO, Avgtime without |
---|---|---|---|---|---|---|---|
1,000 | 35 sec | 20 sec | 21 sec | 0.870 sec |
600 ms
| 1.361 sec |
1100 ms
|
5,000 | 1 min 41 sec | 1 min 19 sec | 1 min 09 sec | 0.878 sec | 1.772 sec | ||
10,000 | 3 min 4 sec | 2 min 35 se | 2 min 17 se | 0.955 sec | 1.905 sec | ||
25,000 | 6 min 32 sec | 6 min 26 sec | 6 min 20 sec | 0.970 sec | 1.920 sec | ||
50,000 | 13 min 48 sec | 12 min 16 sec | 13 min 49 sec | 1.040 sec | 1.907 sec |
Test 3. Multitenant DI 50K-25K-10K-5K-1K.
Test3 | Num of records | Tenant 1 | Tenant 2 | Tenant 3 |
---|---|---|---|---|
1,000 | 2 hr 50 min | 1 hr 44 min | 17 min | |
5,000 | 2 hr 56 min | 1 hr 29 min | 18 min | |
10,000 | 3 hr 07 min | 2 hr 28 min | 28 min | |
25,000 | 6 hr 20 min | 4 hr 57 min | 2 hr 02 min | |
50,000 | 6 hr 25 min | Stopped after 14 hours | Stopped after 14 hours |
Resource utilization
Test 1. DI Central tenant 1k-5K-10K-25K-50K
CPU Utilization.
Memory utilization.
Memory utilization was stable during the tests. No memory leaks
RDS utilization
During 1K DI maximal CPU utilization was about 45%, 5K- 86%, 10K, 25K and 50K - 95%
DB connections
DB connections were changing values from 550 connection to 870
Database load
Top SQL
Open Search metrics
Master node CPU utilization
CPU utilization percentage for the master node reached maximum 35%
Maximum CPU utilization percentage for all data nodes.
Maximum memory usage percentage for all data nodes.
Amazon Managed Streaming
Test 2. DI Central tenant 1k-5K-10K-25K-50K + CI/CO 5VU
CPU Utilization.
Memory utilization.
Memory utilization was stable during the tests. No memory leaks.
RDS utilization
During 1K DI maximal CPU utilization was about 35%, 5K- 76%, 10K, 25K and 50K - 95%
DB connections
Database load
Top SQL
Maximum CPU utilization percentage for all data nodes.
Maximum memory usage percentage for all data nodes.
Amazon Managed Streaming
Test 3. Multitenant DI 50K-22K-10K-5K-1K.
Top 10 CPU consumption services
Memory utilization.
Memory utilization was stable during the tests. No memory leaks
RDS utilization
During the test DB CPU utilization grow up to 100% and in 8 hours came back to its “Before test state“
DB connections
At he begging of the test number of DB connection growed up from 580 to 1000 and during next 8 hours was going wown to it started value.
Database load
Top SQL
Open Search metrics
Master node CPU utilization
CPU utilization percentage for the master node reached maximum 47%
Maximum CPU utilization percentage for all data nodes.
Maximum memory usage percentage for all data nodes.
Amazon Managed Streaming
Appendix
Infrastructure
PTF - environment Quesnelia (QCON)
10 m6i.2xlarge EC2 instances located in US East (N. Virginia)us-east-1 ]
1 instance of db.r6.xlarge database instance: Writer instance
OpenSearch
domain: fse
Number of nodes: 9
Version: OpenSearch_2_7_R20240502
MSK - tenat
4 kafka.m5.2xlarge brokers in 2 zones
Apache Kafka version 2.8.0
EBS storage volume per broker 300 GiB
auto.create.topics.enable=true
log.retention.minutes=480
default.replication.factor=3
Kafka consolidated topics enabled
Quesnelia modules memory and CPU parameters
Module | Task Def. Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft limit | CPU units | Xmx | MetaspaceSize | MaxMetaspaceSize |
---|---|---|---|---|---|---|---|---|---|
qcon-pvt |
|
|
|
|
|
|
|
|
|
pub-okapi | 1 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 | |
pub-edge | 1 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 | |
okapi-b | 1 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512 | |
nginx-okapi | 1 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 | |
nginx-edge | 1 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 | |
mod-users-bl | 1 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 | |
mod-users | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-user-import | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-template-engine | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-tags | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-source-record-storage | 1 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | |
mod-source-record-manager | 1 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | |
mod-settings | 1 | 2 | 1024 | 896 | 200 | 768 | 88 | 128 | |
mod-service-interaction | 1 | 2 | 2048 | 1844 | 256 | 1290 | 384 | 512 | |
mod-serials-management | 1 | 2 | 2480 | 2312 | 128 | 1792 | 384 | 512 | |
mod-sender | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-search | 9 | 2 | 2592 | 2480 | 2048 | 1440 | 512 | 1024 | |
mod-rtac | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-remote-storage | 1 | 2 | 4920 | 4472 | 1024 | 3960 | 512 | 512 | |
mod-quick-marc | 1 | 1 | 2288 | 2176 | 128 | 1664 | 384 | 512 | |
mod-pubsub | 1 | 2 | 1536 | 1440 | 1024 | 922 | 384 | 512 | |
mod-permissions | 2 | 2 | 1684 | 1544 | 512 | 1024 | 384 | 512 | |
mod-patron-blocks | 1 | 2 | 1024 | 896 | 1024 | 768 | 88 | 128 | |
mod-patron | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-password-validator | 1 | 2 | 1440 | 1298 | 128 | 768 | 384 | 512 | |
mod-organizations-storage | 1 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 | |
mod-organizations | 1 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 | |
mod-orders-storage | 1 | 2 | 1024 | 896 | 512 | 700 | 88 | 128 | |
mod-orders | 1 | 2 | 2048 | 1440 | 1024 | 1024 | 384 | 512 | |
mod-oai-pmh | 1 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 | |
mod-notify | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-notes | 1 | 2 | 1024 | 896 | 128 | 952 | 384 | 512 | |
mod-ncip | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-login-saml | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-login | 1 | 2 | 1440 | 1298 | 1024 | 768 | 384 | 512 | |
mod-lists | 1 | 2 | 3000 | 2600 | 128 | 2048 | 384 | 512 | |
mod-licenses | 1 | 2 | 2480 | 2312 | 128 | 1792 | 384 | 512 | |
mod-kb-ebsco-java | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-invoice-storage | 1 | 2 | 1872 | 1536 | 1024 | 1024 | 384 | 512 | |
mod-invoice | 1 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 | |
mod-inventory-update | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-inventory-storage | 2 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 | |
mod-inventory | 3 | 2 | 2880 | 2592 | 1024 | 1814 | 384 | 512 | |
mod-graphql | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-gobi | 1 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 | |
mod-fqm-manager | 1 | 2 | 3000 | 2600 | 128 | 2048 | 384 | 512 | |
mod-finance-storage | 1 | 2 | 1024 | 896 | 1024 | 700 | 88 | 128 | |
mod-finance | 1 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 | |
mod-feesfines | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-event-config | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-eusage-reports | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-erm-usage-harvester | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-erm-usage | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-entities-links | 1 | 2 | 2592 | 2480 | 400 | 1440 | 0 | 1024 | |
mod-email | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-ebsconet | 1 | 2 | 1248 | 1024 | 128 | 700 | 128 | 256 | |
mod-di-converter-storage | 2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-dcb | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-data-import | 1 | 1 | 2048 | 1844 | 256 | 1292 | 384 | 512 | |
mod-data-export-worker | 1 | 2 | 3072 | 2048 | 1024 | 2048 | 384 | 512 | |
mod-data-export-spring | 1 | 1 | 2048 | 1844 | 256 | 1536 | 384 | 512 | |
mod-data-export | 1 | 1 | 2048 | 1524 | 1024 | 0 | 0 | 0 | |
mod-courses | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-copycat | 1 | 2 | 1024 | 512 | 128 | 768 | 88 | 128 | |
mod-consortia | 1 | 2 | 3072 | 2048 | 128 | 2048 | 512 | 1024 | |
mod-configuration | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-circulation-storage | 1 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 | |
mod-circulation-item | 1 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 | |
mod-circulation | 2 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 | |
mod-calendar | 1 |
| 2 | 1024 | 896 | 128 | 768 | 88 | |
mod-bulk-operations | 1 | 2 | 3072 | 2600 | 1024 | 1536 | 384 | 512 | |
mod-batch-print | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-authtoken | 3 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 | |
mod-audit | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
mod-agreements | 1 | 2 | 1592 | 1488 | 128 | 0 | 0 | 0 | |
edge-rtac | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
edge-patron | 1 | 2 | 1024 | 896 | 256 | 768 | 88 | 128 | |
edge-orders | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
edge-oai-pmh | 1 | 2 | 1512 | 1360 | 1024 | 1440 | 384 | 512 | |
edge-ncip | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
edge-fqm | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
edge-dematic | 1 | 1 | 1024 | 896 | 128 | 768 | 88 | 128 | |
edge-dcb | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
edge-courses | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
edge-connexion | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | |
edge-caiasoft | 1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
Methodology/Approach
Before each test files for Marc Aytority creation must created using python script https://github.com/folio-org/perf-testing/tree/master/workflows-scripts/data-import/MARC_authority
Test 1: Manually tested 1k, 10k, 25k and 50k records files DI started on the central tenant only.
Test 2: Manually tested 1k, 10k, 25k and 50k records files DI started on central tenant. In parralel was started Check-in and Checkout (CICO) for 5 concurrent users.
Test 3: Manually tested DI started on 3 tenants concurrently. Order for load file without pause between files: 50k, 25k, 10k, 5k, and 1k for order tenants : Tenant 1-3.
Test results were obtained using SQL querySELECT (completed_date-started_date) as duration, * FROM {tenant}_mod_source_record_manager.job_execution where subordination_type = 'COMPOSITE_PARENT' order by started_date desc limit 10
please add OS metrics and MSK