Data Import test report (Orchid)
Overview
This document contains the results of testing Data Import for MARC Bibliographic records in the Orchid release to detect performance trends (PERF-562).
Summary
- Data Import duration in Orchid almost doubled after differences in the database schemas were fixed, mostly by adding trigger functions. For example, 50K MARC Create took 39 min 27 sec in Orchid, compared to 21 min 11 sec in Nolana and 21 min 37 sec in Morning Glory, but 32 min 28 sec in Lotus. We can therefore assume that the trigger was missing from our database for the previous two releases.
- Memory utilization is increasing because the modules had been restarted beforehand (daily cluster shutdown process). No memory leak is suspected in any module, based on the investigation in the scope of ticket PERF-541.
- Average CPU usage did not exceed 130% for any module. CPU usage of mod-data-import spiked up to 270% at the beginning of the Data Import jobs.
- DB CPU usage reached approximately 97%.
- We cannot compare the R/W split-enabled DI test to the first test from 13/06/2023 because different module versions were tested. However, when the runs with the new module versions are compared with each other, enabling R/W split brings no significant improvement in DI duration. Details are in the table below and in the Data Import series of tests to understand a range of duration of each DI test (Orchid). Still, R/W splitting could decrease the database load on the writer instance, and as a result other processes running alongside data import could be performed faster.
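The "almost doubled" figure in the first bullet can be checked with a little arithmetic on the 50K MARC Create durations quoted there. The sketch below is only illustrative: the helper name `to_seconds` is hypothetical, and the durations are copied from this report.

```python
import re

def to_seconds(text: str) -> int:
    """Convert strings such as '39 min 27 sec' or '1 hr 11 min' to seconds."""
    units = {"h": 3600, "m": 60, "s": 1}
    total = 0
    # Each match is a number followed by a unit word (hr/hour, min/m, sec/s).
    for value, unit in re.findall(r"(\d+)\s*([a-zA-Z]+)", text):
        total += int(value) * units[unit[0].lower()]
    return total

# 50K MARC Create durations quoted in this report.
durations = {
    "Orchid (first test)": "39 min 27 sec",
    "Nolana": "21 min 11 s",
    "Morning Glory": "21 min 37s",
    "Lotus": "32 min 28 s",
}

orchid = to_seconds(durations["Orchid (first test)"])
for release, text in durations.items():
    seconds = to_seconds(text)
    print(f"{release}: {seconds} s, Orchid / {release} = {orchid / seconds:.2f}")
```

With these inputs, the first Orchid run comes out at roughly 1.9 times the Nolana duration and roughly 1.2 times the Lotus duration.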
Recommendations and Jiras
Implement a database schema comparison step in each cluster deployment to avoid database misconfiguration.
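One way such a comparison could work is to export the trigger list from each cluster's database and diff it against a reference list. The sketch below assumes the lists come from a query against PostgreSQL's `information_schema.triggers` view. The function name `missing_triggers` and the sample schema, table, and trigger names are hypothetical and do not come from this report.

```python
# Query that could be run on each cluster (standard information_schema view).
TRIGGER_QUERY = """
SELECT event_object_schema, event_object_table, trigger_name
FROM information_schema.triggers
"""

def missing_triggers(reference, deployed):
    """Return triggers present in the reference list but absent from the deployed one."""
    # Sets also remove duplicates (the view has one row per triggering event).
    return sorted(set(reference) - set(deployed))

# Hypothetical rows as TRIGGER_QUERY would return them: (schema, table, trigger).
reference_rows = [
    ("example_schema", "example_table", "set_metadata_trigger"),
    ("example_schema", "example_table", "audit_trigger"),
]
deployed_rows = [
    ("example_schema", "example_table", "set_metadata_trigger"),
]

for schema, table, trigger in missing_triggers(reference_rows, deployed_rows):
    print(f"missing trigger {trigger} on {schema}.{table}")
```

A deployment check could fail whenever the returned list is not empty. The same diff could be applied to functions or indexes if those are exported in a similar way.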
Results
Test | Job profile | Duration, Orchid with new versions of modules, R/W split disabled (07/09/2023) | Duration, Orchid with new versions of modules, R/W split enabled (07/09/2023) | Duration, Orchid with new versions of modules, R/W split enabled (08/09/2023) | Duration, Orchid (first test 13/06/2023) | Duration, Nolana | Duration, Morning Glory | Duration, Lotus |
---|---|---|---|---|---|---|---|---|
5K MARC Create | PTF - Create 2 | 2 min 50 sec | 2 min 23 sec | 2 min 3 sec | 4 min 30 sec | 2 min 8 sec | 2 min 20 sec | 3 min 54 sec |
5K MARC Update | PTF - Updates Success - 1 | 2 min 48 sec | 2 min 45 sec | 4 min 2 sec | 2 min 10 sec | 3 min 4 sec | 4 min 12 sec | |
10K MARC Create | PTF - Create 2 | 4 min 43 sec | 5 min 12 sec | 3 min 58 sec | 9 min 25 sec | 4 min 20 sec | 4 min 33 sec | 6 min 45 sec |
10K MARC Update | PTF - Updates Success - 1 | 5 min 23 sec | 5 min 23 sec | 8 min 10 sec | 4 min 8 sec | 5 min 29 sec | 8 min 4 sec | |
25K MARC Create | PTF - Create 2 | 11 min 52 sec | 11 min 45 sec | 10 min 5 sec | 22 min 16 sec | 10 min 41 sec | 10 min 55 sec | 16 min 8 sec |
25K MARC Update | PTF - Updates Success - 1 | 14 min 12 sec | 14 min 19 sec | 19 min 39 sec | 10 min 40 sec | 13 min 37 sec | 19 min 50 sec | |
50K MARC Create | PTF - Create 2 | 23 min 20 sec | 23 min 36 sec | 20 min 46 sec | 39 min 27 sec | 21 min 11 sec | 21 min 37 sec | 32 min 28 sec |
50K MARC Update | PTF - Updates Success - 1 | 27 min 52 sec | 28 min | 38 min 30 sec Completed or Completed with errors (1 item discarded) * | 20 min 57 sec | 26 min 10 sec | 39 min 5 sec | |
100K MARC Create | PTF - Create 2 | 48 min 46 sec | 49 min 28 sec | 44 min 18 sec | 1 hr 38 min | 42 min 35 sec | 44 min 4 sec | 1 hr 11 min |
100K MARC Update | PTF - Updates Success - 1 | 57 min 41 sec | 55 min | 1 hr 33 min | 41 min 56 sec | 55 min 33 sec | 1 hr 19 min | |
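To compare runs of different sizes, it can help to convert the durations into records per second. The sketch below does this for the MARC Create durations in the first duration column of the table (R/W split disabled, 07/09/2023). The helper name `throughput` is hypothetical.

```python
def throughput(records: int, minutes: int, seconds: int) -> float:
    """Records processed per second for a job of the given duration."""
    return records / (minutes * 60 + seconds)

# MARC Create, Orchid with new module versions, R/W split disabled (07/09/2023).
# Values are (minutes, seconds) copied from the table above.
create_runs = {
    5_000: (2, 50),
    10_000: (4, 43),
    25_000: (11, 52),
    50_000: (23, 20),
    100_000: (48, 46),
}

for records, (minutes, seconds) in create_runs.items():
    rate = throughput(records, minutes, seconds)
    print(f"{records} records: {rate:.1f} records/s")
```

For these inputs the Create throughput stays between roughly 29 and 36 records per second across job sizes, which suggests that duration grows close to linearly with the number of records.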
In the Orchid test with R/W split enabled (07/09/2023), the split was enabled for:
- mod-data-import
- mod-source-record-storage
- mod-source-record-manager
- mod-di-converter-storage
In the Orchid test with R/W split enabled (08/09/2023), the split was enabled for:
- mod-data-import
- mod-source-record-storage
- mod-source-record-manager
- mod-di-converter-storage
- mod-inventory-storage
Two modules affect R/W split the most because of the number of read queries they send to the database:
- mod-inventory-storage (mostly)
- mod-source-record-manager
All other modules send few or no read queries to the database.
* - io.vertx.core.impl.NoStackTraceThrowable: Current retry number 1 exceeded or equal given number 1 for the Item update for jobExecutionId '8fde78a8-2450-44c7-83ac-c98376a90491'
Memory Utilization
Memory utilization is increasing because the modules had been restarted beforehand (daily cluster shutdown process). No memory leak is suspected in any module, based on the investigation in the scope of ticket PERF-541.
MARC BIB CREATE
Service CPU Utilization
MARC BIB CREATE
Average CPU usage did not exceed 130% for any module. CPU usage of mod-data-import spiked up to 270% at the beginning of the Data Import jobs.
RDS CPU Utilization
MARC BIB CREATE
DB CPU usage reached approximately 97%.
RDS Database Connections
MARC BIB CREATE
For the DI Create job, the connection count was 560.
Appendix
Infrastructure
Record counts:
- mod_source_record_storage.marc_records_lb = 22618121
- mod_source_record_storage.raw_records_lb = 22650140
- mod_source_record_storage.records_lb = 22650140
- mod_source_record_storage.marc_indexers = 98256911 (all records)
- mod_source_record_storage.marc_indexers with field_no 010 = 139135
- mod_source_record_storage.marc_indexers with field_no 035 = 4272473
- mod_inventory_storage.authority = 7402975
- mod_inventory_storage.holdings_record = 22027125
- mod_inventory_storage.instance = 20986866
- mod_inventory_storage.item = 22130108
PTF environment ncp5
- 9 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 2 database instances, one reader and one writer
Name | API Name | Memory GiB | vCPUs | max_connections |
---|---|---|---|---|
R6G Extra Large | db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731 |
- MSK ptf-kakfa-3
- 4 m5.2xlarge brokers in 2 zones
- Apache Kafka version 2.8.0
- EBS storage volume per broker 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3
- Kafka topics partitioning: 2 partitions for DI topics
Modules memory and CPU parameters
Modules | Version | Task Definition | Running Tasks | CPU | Memory | MemoryReservation | MaxMetaspaceSize | Xmx |
---|---|---|---|---|---|---|---|---|
mod-inventory-storage | 26.0.0 | 10 | 2 | 1024 | 2208 | 1952 | 512 | 1440 |
mod-inventory | 20.0.4 | 8 | 2 | 1024 | 2880 | 2592 | 512 | 1814 |
mod-source-record-storage | 5.6.5 | 24 | 2 | 2048 | 4096 | 3688 | 512 | 3076 |
mod-quick-marc | 3.0.0 | 5 | 1 | 128 | 2288 | 2176 | 512 | 1664 |
mod-source-record-manager | 3.6.2 | 16 | 2 | 1024 | 4096 | 3688 | 512 | 3076 |
mod-di-converter-storage | 2.0.2 | 5 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-data-import | 2.7.1 | 8 | 1 | 256 | 2048 | 1844 | 512 | 1292 |
okapi | 5.0.1 | 6 | 3 | 1024 | 1684 | 1440 | 512 | 922 |
nginx-okapi | 2022.03.02 | 6 | 2 | 128 | 1024 | 896 | - | - |
pub-okapi | 2022.03.02 | 6 | 2 | 128 | 1024 | 896 | - | 768 |
Methodology/Approach
JMeter scripts were used to test the DI baseline.
- 5-minute pauses between the tests
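The approach above could be automated with a small runner that starts each JMeter script in turn and waits between runs. This is a minimal sketch, not the harness actually used for these tests. It assumes JMeter is started in non-GUI mode with its standard `-n`, `-t`, and `-l` options, and the test plan file names are hypothetical.

```python
import subprocess
import time

PAUSE_SECONDS = 5 * 60  # 5-minute pause between the tests, as described above

def build_command(test_plan: str) -> list[str]:
    """JMeter non-GUI invocation: -n (no GUI), -t (test plan), -l (results log)."""
    return ["jmeter", "-n", "-t", test_plan, "-l", test_plan.replace(".jmx", ".jtl")]

def run_all(test_plans, dry_run=True):
    """Run the plans one after another, pausing between (not after) runs.

    With dry_run=True nothing is executed; the commands are only returned.
    """
    commands = []
    for index, plan in enumerate(test_plans):
        if index > 0 and not dry_run:
            time.sleep(PAUSE_SECONDS)
        command = build_command(plan)
        commands.append(command)
        if not dry_run:
            subprocess.run(command, check=True)
    return commands

# Hypothetical test plan file names, for illustration only.
plans = ["di_create_5k.jmx", "di_update_5k.jmx"]
for command in run_all(plans, dry_run=True):
    print(" ".join(command))
```

Keeping `dry_run=True` by default lets the command list be reviewed before anything is sent to the test environment.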