PTF - Performance testing of moving Parsed Records Flattening From Database To Module Space (Quesnelia) [non-ECS]
Overview
- This document compares Data Import test results for MARC Bibliographic records with create and update jobs on the Quesnelia [non-ECS] release in the qcp1 environment, using two versions of mod-source-record-storage: 5.8.0 and 5.9.0-SNAPSHOT.387 (which moves the JSON parsing function from the DB to the module).
- PERF-924
Summary
- Data Import tests finished successfully on the qcp1 environment using the PTF - Create 2 and PTF - Updates Success - 6 profiles with files of 25k, 100k and 200k records.
- DI duration growth correlates to the number of records imported.
- Comparison between mod-source-record-storage 5.8.0 and mod-source-record-storage 5.9.0-SNAPSHOT.387:
- Data Import durations degraded by 29% to 45% (around 35% on average) in five of the six jobs. The exception is the 25K Update job, which finished 28% faster.
- Server and Database metrics are at the same level for both versions of the mod-source-record-storage module.
- No memory leaks are observed.
Test Runs and Results
This table contains durations for Data Import.
Profile | Tenant | MARC File | Test # | DI Duration mod-source-record-storage 5.8.0 (hh:mm:ss) | Test # | DI Duration mod-source-record-storage 5.9.0-SNAPSHOT.387 (hh:mm:ss) | Results |
---|---|---|---|---|---|---|---|
DI MARC Bib Create (PTF - Create 2) | fs09000000 | 25K.mrc | 1 | 0:11:14 | 7 | 0:14:30 | Completed |
DI MARC Bib Create (PTF - Create 2) | fs09000000 | 100K.mrc | 2 | 0:46:02 | 8 | 1:03:37 | Completed |
DI MARC Bib Create (PTF - Create 2) | fs09000000 | 200K.mrc | 3 | 1:30:05 | 9 | 2:10:52 | Completed |
DI MARC Bib Update (PTF - Updates Success - 6) | fs09000000 | 25K.mrc | 4 | 0:37:11 | 10 | 0:26:37 | Completed |
DI MARC Bib Update (PTF - Updates Success - 6) | fs09000000 | 100K.mrc | 5 | 1:21:57 | 11 | 1:51:49 | Completed |
DI MARC Bib Update (PTF - Updates Success - 6) | fs09000000 | 200K.mrc | 6 | 2:43:16 | 12 | 3:38:51 | Completed |
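To make the relationship between file size and duration easier to check, the durations can be converted to throughput in records per second. The sketch below does this for the Create jobs only. It assumes each file name reflects the exact record count (for example, 25K.mrc holds 25,000 records), which the report does not state explicitly; the helper names are illustrative.

```python
def to_seconds(duration: str) -> int:
    """Convert an h:mm:ss string such as '1:03:37' to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def records_per_second(records: int, duration: str) -> float:
    """Average import throughput for one job."""
    return records / to_seconds(duration)


# Assumption: record counts are inferred from the file names (25K.mrc = 25,000).
# Durations are copied from the Create rows of the table above.
create_runs = {
    "5.8.0": [(25_000, "0:11:14"), (100_000, "0:46:02"), (200_000, "1:30:05")],
    "5.9.0-SNAPSHOT.387": [(25_000, "0:14:30"), (100_000, "1:03:37"), (200_000, "2:10:52")],
}

for version, runs in create_runs.items():
    rates = [round(records_per_second(count, duration), 1) for count, duration in runs]
    print(version, rates)
```

A roughly constant rate across the three file sizes for a given version would be consistent with duration growing in proportion to the number of records.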
Comparison
This table contains durations comparison between mod-source-record-storage 5.8.0 and mod-source-record-storage 5.9.0-SNAPSHOT.387.
Profile | MARC File | DI Duration mod-source-record-storage 5.8.0 (hh:mm:ss) | DI Duration mod-source-record-storage 5.9.0-SNAPSHOT.387 (hh:mm:ss) | Delta (hh:mm:ss / %) |
---|---|---|---|---|
DI MARC Bib Create (PTF - Create 2) | 25K.mrc | 0:11:14 | 0:14:30 | +0:03:16 / +29% |
DI MARC Bib Create (PTF - Create 2) | 100K.mrc | 0:46:02 | 1:03:37 | +0:17:35 / +38% |
DI MARC Bib Create (PTF - Create 2) | 200K.mrc | 1:30:05 | 2:10:52 | +0:40:47 / +45% |
DI MARC Bib Update (PTF - Updates Success - 6) | 25K.mrc | 0:37:11 | 0:26:37 | −0:10:34 / −28% |
DI MARC Bib Update (PTF - Updates Success - 6) | 100K.mrc | 1:21:57 | 1:51:49 | +0:29:52 / +36% |
DI MARC Bib Update (PTF - Updates Success - 6) | 200K.mrc | 2:43:16 | 3:38:51 | +0:55:35 / +34% |
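The delta column can be reproduced from the two duration columns. The following is a minimal sketch, assuming the percentage is computed relative to the 5.8.0 duration and rounded to the nearest whole percent; the function names are illustrative and not part of any test tooling.

```python
from typing import Tuple


def to_seconds(duration: str) -> int:
    """Convert an h:mm:ss string such as '0:46:02' to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def format_hms(total_seconds: int) -> str:
    """Format a signed number of seconds as +h:mm:ss or -h:mm:ss."""
    sign = "-" if total_seconds < 0 else "+"
    hours, remainder = divmod(abs(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{sign}{hours}:{minutes:02d}:{seconds:02d}"


def delta(baseline: str, candidate: str) -> Tuple[str, int]:
    """Return (time difference, percent change) of candidate relative to baseline."""
    base = to_seconds(baseline)
    diff = to_seconds(candidate) - base
    return format_hms(diff), round(100 * diff / base)


# Create 25K: 5.8.0 took 0:11:14, 5.9.0-SNAPSHOT.387 took 0:14:30.
print(delta("0:11:14", "0:14:30"))  # ('+0:03:16', 29)
# Update 25K: the newer version was faster, so the delta is negative.
print(delta("0:37:11", "0:26:37"))  # ('-0:10:34', -28)
```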
Resource utilization for Test Set №1
Service CPU Utilization
Here we can see that mod-data-import spiked to 250% CPU and mod-inventory used 115% CPU.
Service Memory Utilization
Here we can see that all modules show a stable trend except mod-source-record-manager.
DB CPU Utilization
DB CPU was 90%.
CPU (User) usage by broker
Disk usage by broker
DB Connections
Max number of DB connections was 918.
DB load
Top SQL-queries
Resource utilization for Test Set №2
Service CPU Utilization
Here we can see that mod-data-import spiked to 240% CPU and mod-inventory used 111% CPU.
Service Memory Utilization
Here we can see that all modules show a stable trend except mod-source-record-manager.
CPU (User) usage by broker
Disk usage by broker
DB CPU Utilization
DB CPU was 95%.
DB Connections
Max number of DB connections was 942.
DB load
Top SQL-queries
Appendix
Infrastructure
PTF - environment qcp1
- 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 1 database instance, writer
Name | Memory | vCPUs | max_connections |
---|---|---|---|
db.r6g.xlarge | 32 GiB | 4 | 2731 |
- Number of records in DB:
- fs09000000
- instances - 25901331
- items - 27074913
- holdings - 25871735
- Open Search ptf-test
- Data nodes
- Instance type - r6g.2xlarge.search
- Number of nodes - 4
- Version: OpenSearch_2_7_R20240502
- Dedicated master nodes
- Instance type - r6g.large.search
- Number of nodes - 3
- MSK tenant
- 4 m5.2xlarge brokers in 2 zones
- Apache Kafka version 2.8.0
- EBS storage volume per broker 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3
Methodology/Approach
DI test scenarios (DI MARC Bib Create/Update) were started from the UI.
Test set №1:
- Test 1: Manually tested 25k records files DI MARC Bib Create started on one tenant (fs09000000) with version of module mod-source-record-storage:5.8.0.
- Test 2: Manually tested 100k records files DI MARC Bib Create started on one tenant (fs09000000) with version of module mod-source-record-storage:5.8.0.
- Test 3: Manually tested 200k records files DI MARC Bib Create started on one tenant (fs09000000) with version of module mod-source-record-storage:5.8.0.
- Test 4: Manually tested 25k records files DI MARC Bib Update started on one tenant (fs09000000) with version of module mod-source-record-storage:5.8.0.
- Test 5: Manually tested 100k records files DI MARC Bib Update started on one tenant (fs09000000) with version of module mod-source-record-storage:5.8.0.
- Test 6: Manually tested 200k records files DI MARC Bib Update started on one tenant (fs09000000) with version of module mod-source-record-storage:5.8.0.
Test set №2:
- Test 7: Manually tested 25k records files DI MARC Bib Create started on one tenant (fs09000000) with version of module mod-source-record-storage:5.9.0-SNAPSHOT.387.
- Test 8: Manually tested 100k records files DI MARC Bib Create started on one tenant (fs09000000) with version of module mod-source-record-storage:5.9.0-SNAPSHOT.387.
- Test 9: Manually tested 200k records files DI MARC Bib Create started on one tenant (fs09000000) with version of module mod-source-record-storage:5.9.0-SNAPSHOT.387.
- Test 10: Manually tested 25k records files DI MARC Bib Update started on one tenant (fs09000000) with version of module mod-source-record-storage:5.9.0-SNAPSHOT.387.
- Test 11: Manually tested 100k records files DI MARC Bib Update started on one tenant (fs09000000) with version of module mod-source-record-storage:5.9.0-SNAPSHOT.387.
- Test 12: Manually tested 200k records files DI MARC Bib Update started on one tenant (fs09000000) with version of module mod-source-record-storage:5.9.0-SNAPSHOT.387.
The following query was used to get the status and time range of import jobs:
select file_name, job_profile_name, started_date, completed_date, completed_date - started_date as duration, status from fs09000000_mod_source_record_manager.job_execution order by started_date desc limit 2000;
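If the same check is needed for another tenant, the query text can be generated rather than edited by hand. The sketch below is only an illustration: `job_duration_query` is a hypothetical helper, and the schema naming pattern (tenant ID followed by `_mod_source_record_manager`) is inferred from the single query above rather than confirmed for all deployments. It builds the SQL string and does not connect to a database.

```python
import re


def job_duration_query(tenant_id: str, limit: int = 2000) -> str:
    """Build the job-duration query for one tenant (hypothetical helper)."""
    # Schema names cannot be bound as query parameters, so the tenant ID is
    # validated before being interpolated into the SQL text.
    if not re.fullmatch(r"[a-z0-9_]+", tenant_id):
        raise ValueError(f"unexpected tenant id: {tenant_id!r}")
    # Assumption: schema is named <tenant>_mod_source_record_manager.
    schema = f"{tenant_id}_mod_source_record_manager"
    return (
        "select file_name, job_profile_name, started_date, completed_date, "
        "completed_date - started_date as duration, status "
        f"from {schema}.job_execution "
        "order by started_date desc "
        f"limit {int(limit)};"
    )


print(job_duration_query("fs09000000"))
```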