Data Import test report (Orchid)
Overview
This document contains the results of testing Data Import for MARC Bibliographic records in the Orchid release to detect performance trends. Ticket: PERF-562: [Orchid] [Data Import] Rerun MARC BIB imports (Closed).
Summary
Data Import duration in Orchid almost doubled compared to the previous two releases. The increase is attributed to fixing differences in the database schemas, mostly adding trigger functions. For example, 50K MARC Create took 39 min 27 sec in Orchid, compared to 21 min 11 sec in Nolana and 21 min 37 sec in Morning Glory, whereas in Lotus it took 32 min 28 sec. We therefore assume that the trigger was missing from our database for the previous 2 releases.
Memory utilization increases because the modules were restarted beforehand (daily cluster shutdown process). No memory leak is suspected for any of the modules, based on the investigation in the scope of ticket PERF-541: Investigate potential memory leak for DI modules (Closed).
Average CPU usage did not exceed 130% for any module. CPU usage of mod-data-import spiked up to 270% at the beginning of the Data Import jobs.
DB CPU usage reached approximately 97%.
The DI tests with R/W split enabled cannot be compared to the first test from 13/06/2023 because different module versions were tested. Compared between runs on the new module versions, enabling R/W split brings no significant improvement in DI duration. Details are in the table below and in "Data Import series of tests to understand a range of duration of each DI test (Orchid)". However, R/W splitting could decrease the load on the database writer instance, so other processes running alongside data import could complete faster.
Recommendations and Jiras
Add a database schema comparison step to each cluster deployment to avoid database misconfiguration.
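As an illustration of what such a comparison could look like, the following minimal sketch diffs the triggers found in a reference schema against those found in a deployed one. It is not an existing PTF tool: the `Trigger` type, the `diff_triggers` function, and the schema, table, and trigger names are all hypothetical, and the rows are assumed to come from a query against PostgreSQL's `information_schema.triggers` view.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Trigger(NamedTuple):
    """One trigger, identified by schema, table and trigger name."""
    schema: str
    table: str
    name: str


# Assumed source of the rows on each cluster (PostgreSQL):
#   SELECT event_object_schema, event_object_table, trigger_name
#   FROM information_schema.triggers;


def diff_triggers(
    reference: Iterable[Trigger], deployed: Iterable[Trigger]
) -> Tuple[List[Trigger], List[Trigger]]:
    """Return (missing, unexpected) triggers of `deployed` relative to `reference`."""
    ref, dep = set(reference), set(deployed)
    return sorted(ref - dep), sorted(dep - ref)


if __name__ == "__main__":
    # Hypothetical names, used only to show the shape of the output.
    reference = [
        Trigger("example_schema", "instance", "example_trigger_a"),
        Trigger("example_schema", "item", "example_trigger_b"),
    ]
    deployed = [Trigger("example_schema", "instance", "example_trigger_a")]

    missing, unexpected = diff_triggers(reference, deployed)
    for trigger in missing:
        print("missing in deployed schema:", trigger)
    for trigger in unexpected:
        print("not in reference schema:", trigger)
```

A non-empty `missing` list after a deployment would flag the kind of schema drift described in the Summary before any performance test is run.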
Results
| Test | Profile | Duration, Orchid new module versions, R/W split disabled (07/09/2023) | Duration, Orchid new module versions, R/W split enabled (07/09/2023) | Duration, Orchid new module versions, R/W split enabled (08/09/2023) | Duration, Orchid (first test 13/06/2023) | Duration, Nolana | Duration, Morning Glory | Duration, Lotus |
|---|---|---|---|---|---|---|---|---|
| 5K MARC Create | PTF - Create 2 | 2 min 50 sec | 2 min 23 sec | 2 min 3 sec | 4 min 30 sec | 2 min 8 sec | 2 min 20 sec | 3 min 54 sec |
| 5K MARC Update | PTF - Updates Success - 1 | | 2 min 48 sec | 2 min 45 sec | 4 min 2 sec | 2 min 10 sec | 3 min 4 sec | 4 min 12 sec |
| 10K MARC Create | PTF - Create 2 | 4 min 43 sec | 5 min 12 sec | 3 min 58 sec | 9 min 25 sec | 4 min 20 sec | 4 min 33 sec | 6 min 45 sec |
| 10K MARC Update | PTF - Updates Success - 1 | | 5 min 23 sec | 5 min 23 sec | 8 min 10 sec | 4 min 8 sec | 5 min 29 sec | 8 min 4 sec |
| 25K MARC Create | PTF - Create 2 | 11 min 52 sec | 11 min 45 sec | 10 min 5 sec | 22 min 16 sec | 10 min 41 sec | 10 min 55 sec | 16 min 8 sec |
| 25K MARC Update | PTF - Updates Success - 1 | | 14 min 12 sec | 14 min 19 sec | 19 min 39 sec | 10 min 40 sec | 13 min 37 sec | 19 min 50 sec |
| 50K MARC Create | PTF - Create 2 | 23 min 20 sec | 23 min 36 sec | 20 min 46 sec | 39 min 27 sec | 21 min 11 sec | 21 min 37 sec | 32 min 28 sec |
| 50K MARC Update | PTF - Updates Success - 1 | | 27 min 52 sec | 28 min | 38 min 30 sec; Completed or Completed with errors (1 item discarded) * | 20 min 57 sec | 26 min 10 sec | 39 min 5 sec |
| 100K MARC Create | PTF - Create 2 | 48 min 46 sec | 49 min 28 sec | 44 min 18 sec | 1 hr 38 min | 42 min 35 sec | 44 min 4 sec | 1 hr 11 min |
| 100K MARC Update | PTF - Updates Success - 1 | | 57 min 41 sec | 55 min | 1 hr 33 min | 41 min 56 sec | 55 min 33 sec | 1 hr 19 min |
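To make the release-to-release comparison in the table easier to check, the sketch below converts duration strings to seconds and prints the ratio of the first Orchid run to each other run for the 50K MARC Create row. The helper names `to_seconds` and `ratio` are illustrative and not part of any test tooling; the input values are copied from the table.

```python
import re

# Seconds per unit, keyed by the first letter of the unit as written in the table
# ("hour"/"hr", "min"/"m", "sec"/"s").
_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def to_seconds(text: str) -> int:
    """Convert strings such as '39 min 27 sec', '2m 8 s' or '1 hr 11 min' to seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s*(h|m|s)", text.lower()):
        total += int(value) * _UNIT_SECONDS[unit]
    return total


def ratio(duration: str, baseline: str) -> float:
    """How many times longer `duration` is than `baseline`."""
    return to_seconds(duration) / to_seconds(baseline)


if __name__ == "__main__":
    # 50K MARC Create row of the table above.
    orchid_first = "39 min 27 sec"
    others = {
        "Nolana": "21 min 11 s",
        "Morning Glory": "21 min 37 s",
        "Lotus": "32 min 28 s",
        "Orchid, R/W split disabled (07/09/2023)": "23 min 20 sec",
    }
    for name, duration in others.items():
        print(f"Orchid first test vs {name}: {ratio(orchid_first, duration):.2f}x")
```

The same helper can be applied to any other row; cells that are empty in the table should be skipped rather than treated as zero.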
In the Orchid test of 07/09/2023, R/W split was enabled for:
mod-data-import
mod-source-record-storage
mod-source-record-manager
mod-di-converter-storage
In the Orchid test of 08/09/2023, R/W split was enabled for:
mod-data-import
mod-source-record-storage
mod-source-record-manager
mod-di-converter-storage
mod-inventory-storage
Two modules affect R/W split the most because of the number of read queries they send to the database:
mod-inventory-storage (mostly)
mod-source-record-manager
All other modules send few or no read queries to the database.
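The following toy sketch illustrates why the share of read queries determines how much a module benefits from R/W split: only read statements can be moved off the writer instance. It is not how the FOLIO modules implement the split; the `ReadWriteRouter` class, the `load_by_instance` helper, and the sample statements are assumptions made for illustration, and real routing must also account for transactions and locking reads.

```python
from collections import Counter
from typing import Iterable, Optional


class ReadWriteRouter:
    """Send plain SELECT statements to the reader, everything else to the writer."""

    def __init__(self, writer: str, reader: Optional[str] = None):
        self.writer = writer
        self.reader = reader  # None means R/W split is disabled

    def target_for(self, sql: str) -> str:
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self.reader is not None:
            return self.reader
        return self.writer


def load_by_instance(router: ReadWriteRouter, statements: Iterable[str]) -> Counter:
    """Count how many statements each database instance would receive."""
    return Counter(router.target_for(sql) for sql in statements)


if __name__ == "__main__":
    # Hypothetical workload: two reads and one write.
    workload = [
        "SELECT * FROM instance WHERE id = 'x'",
        "SELECT * FROM item WHERE id = 'y'",
        "UPDATE item SET status = 'z' WHERE id = 'y'",
    ]
    print("split disabled:", load_by_instance(ReadWriteRouter("writer"), workload))
    print("split enabled: ", load_by_instance(ReadWriteRouter("writer", "reader"), workload))
```

A module whose workload is almost entirely writes sends nearly everything to the writer in both cases, which is consistent with enabling the split for it making little difference.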
* - io.vertx.core.impl.NoStackTraceThrowable: Current retry number 1 exceeded or equal given number 1 for the Item update for jobExecutionId '8fde78a8-2450-44c7-83ac-c98376a90491'
From ncp5/mod-inventory/
11:34:53 [] [] [] [] WARN dateItemEventHandler OL error updating Item - ERROR: Cannot update record 955f8ba1-76ac-4931-8236-1c2fb4775379 because it has been changed (optimistic locking): Stored _version is 5, _version of request is 3 (23F09), status code 409. Retry UpdateItemEventHandler handler...
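The log above shows an optimistic-locking conflict (stored `_version` differs from the request's `_version`, status 409) followed by a retry that is capped at 1. The sketch below reproduces that pattern in isolation. It is not the mod-inventory implementation: `InMemoryStore`, `VersionConflict`, `update_with_retry`, and the simulated concurrent writer are hypothetical names introduced for illustration.

```python
class VersionConflict(Exception):
    """Raised when the request carries a stale version (a 409 in the log above)."""


class InMemoryStore:
    """Toy record store with a version check on update."""

    def __init__(self, interfering_writes: int = 0):
        self._records = {}  # record id -> (version, data)
        self._interfering_writes = interfering_writes

    def put(self, record_id, data):
        self._records[record_id] = (1, data)

    def read(self, record_id):
        version, data = self._records[record_id]
        if self._interfering_writes > 0:
            # Simulate another writer updating the record right after our read.
            self._interfering_writes -= 1
            self._records[record_id] = (version + 1, data)
        return version, data

    def update(self, record_id, data, version):
        stored_version, _ = self._records[record_id]
        if version != stored_version:
            raise VersionConflict(
                f"Stored _version is {stored_version}, _version of request is {version}"
            )
        self._records[record_id] = (stored_version + 1, data)


def update_with_retry(store, record_id, data, max_retries=1):
    """Re-read and retry on conflict; return the number of retries that were needed."""
    for attempt in range(max_retries + 1):
        version, _ = store.read(record_id)
        try:
            store.update(record_id, data, version)
            return attempt
        except VersionConflict:
            if attempt == max_retries:
                raise  # retry budget exhausted, the update is given up


if __name__ == "__main__":
    store = InMemoryStore(interfering_writes=1)
    store.put("item-1", {"status": "old"})
    retries = update_with_retry(store, "item-1", {"status": "new"}, max_retries=1)
    print("update succeeded after", retries, "retry(ies)")
```

With a single retry allowed, one concurrent change is absorbed, while two consecutive concurrent changes to the same record exhaust the budget and the update is dropped, which matches the single discarded item reported for the 50K update.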
Memory Utilization
Memory utilization increases because the modules were restarted beforehand (daily cluster shutdown process). No memory leak is suspected for any of the modules, based on the investigation in the scope of ticket PERF-541: Investigate potential memory leak for DI modules (Closed).
MARC BIB CREATE
MARC BIB UPDATE
Service CPU Utilization
MARC BIB CREATE
Average CPU usage did not exceed 130% for any module. CPU usage of mod-data-import spiked up to 270% at the beginning of the Data Import jobs.
MARC BIB UPDATE
RDS CPU Utilization
MARC BIB CREATE
DB CPU usage reached approximately 97%.
MARC BIB UPDATE
CPU Utilization for BIB Update 100k records R/W split enabled
RDS Database Connections
MARC BIB CREATE
For the DI Create job, the maximum connection count was 560.
MARC BIB UPDATE
For the DI Update job, the maximum connection count was 500.
Average connection usage
Maximum connection usage
BIB Update 100k records R/W split enabled
From reader DB instance
From writer DB instance
Appendix
Infrastructure
Records count:
mod_source_record_storage.marc_records_lb = 22618121
mod_source_record_storage.raw_records_lb = 22650140
mod_source_record_storage.records_lb = 22650140
mod_source_record_storage.marc_indexers = 98256911 (all records)
mod_source_record_storage.marc_indexers with field_no 010 = 139135
mod_source_record_storage.marc_indexers with field_no 035 = 4272473
mod_inventory_storage.authority = 7402975
mod_inventory_storage.holdings_record = 22027125
mod_inventory_storage.instance = 20986866
mod_inventory_storage.item = 22130108
PTF environment: ncp5
9 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
2 database instances, one reader, and one writer
MSK ptf-kakfa-3
4 m5.2xlarge brokers in 2 zones
Apache Kafka version 2.8.0
EBS storage volume per broker 300 GiB
auto.create.topics.enable=true
log.retention.minutes=480
default.replication.factor=3
Kafka topics partitioning: 2 partitions for DI topics
Modules memory and CPU parameters
| Modules | Version | Task Definition | Running Tasks | CPU | Memory | MemoryReservation | MaxMetaspaceSize | Xmx |
|---|---|---|---|---|---|---|---|---|
| mod-inventory-storage | 26.0.0 | 10 | 2 | 1024 | 2208 | 1952 | | 1440 |
| mod-inventory | 20.0.4 | 8 | 2 | 1024 | 2880 | 2592 | 512 | 1814 |
| mod-source-record-storage | 5.6.5 | 24 | 2 | 2048 | 4096 | 3688 | 512 | 3076 |
| mod-quick-marc | 3.0.0 | 5 | 1 | 128 | 2288 | 2176 | 512 | 1664 |
| mod-source-record-manager | 3.6.2 | 16 | 2 | | | | 512 | 3076 |
| mod-di-converter-storage | 2.0.2 | 5 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-data-import | 2.7.1 | 8 | 1 | 256 | 2048 | 1844 | 512 | 1292 |
| okapi | 5.0.1 | 6 | 3 | 1024 | 1684 | 1440 | 512 | 922 |
| nginx-okapi | 2022.03.02 | 6 | 2 | 128 | 1024 | 896 | - | - |
| pub-okapi | 2022.03.02 | 6 | 2 | 128 | 1024 | 896 | - | 768 |
Methodology/Approach
JMeter scripts were used to run the DI baseline tests.
There was a 5-minute pause between tests.