Data Import Create MARC holdings records [Nolana]
Note: after testing, the actual import durations were found to be about two times longer than reported. The PTF environment was missing a DB trigger; once the trigger was restored, the imports' durations doubled.
Overview
This document contains the results of testing Data Import Create MARC holdings records in pre-Nolana to detect performance trends.
https://folio-org.atlassian.net/browse/PERF-343
Software versions
mod-data-import:2.6.1
mod-data-import-converter-storage:1.15.1
mod-source-record-storage:5.5.2
mod-source-record-manager:3.5.4
mod-inventory:19.0.1
mod-inventory-storage:25.0.1
mod-search:1.8.0
mod-quick-marc:2.5.0
Infrastructure
10 m6i.2xlarge EC2 instances
2 'db.r6g.xlarge' database instances: one reader and one writer
MSK
Broker type: kafka.m5.2xlarge
Total number of brokers: 4
Number of zones: 2
Brokers per zone: 2
auto.create.topics.enable=true
log.retention.minutes=480
default.replication.factor=3
2 partitions per DI topic
mod-data-import:2.6.1
256 CPU Units
2048/1844 Hard/Soft memory limits (MiB)
mod-data-import-converter-storage:1.15.1
128 CPU Units
1024/896 Hard/Soft memory limits (MiB)
mod-source-record-storage:5.5.2
1024 CPU Units
1536/1440 Hard/Soft memory limits (MiB)
mod-source-record-manager:3.5.4
1024 CPU Units
4096/3688 Hard/Soft memory limits (MiB)
mod-inventory:19.0.1
1024 CPU Units
2880/2592 Hard/Soft memory limits (MiB)
inventory.kafka.DataImportConsumerVerticle.instancesNumber=10
inventory.kafka.MarcBibInstanceHridSetConsumerVerticle.instancesNumber=10
kafka.consumer.max.poll.records=10
mod-inventory-storage:25.0.1
1024 CPU Units
2208/1952 Hard/Soft memory limits (MiB)
mod-search:1.8.0
400 CPU Units
2592/2480 Hard/Soft memory limits (MiB)
mod-quick-marc:2.5.0
128 CPU Units
2288/2176 Hard/Soft memory limits (MiB)
Summary
For the larger files (80k), the 'MARC Holdings' data import in the Nolana release is faster than in the Morning Glory release (by 7 m 43 s).
However, the peak average CPU usage in Nolana (mod-inventory, 73.9%) is higher than in Morning Glory (no higher than 60% for all related modules).
Results
The duration of each data import operation is shown in the following table:
| Test | File | Duration: Morning Glory | Duration: Nolana | Diff (absolute) | Diff (percentage) |
|---|---|---|---|---|---|
| 1 | 1k | 28 s | 32.7 s | + 5 s | + 18% |
| 2 | 5k | 1 m 48 s | 4 m 20.8 s | + 2 m 33 s | + 142% |
| 3 | 10k | 4 m 4 s | 3 m 24.9 s | - 39 s | - 16% |
| 4 | 80k | 29 m 6 s | 21 m 22.6 s | - 7 m 43 s | - 27% |
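The difference columns in the table can be recomputed from the two duration columns. The following is a minimal sketch of that arithmetic, not part of the original test tooling; the helper names `parse_duration` and `duration_diff` are hypothetical and were introduced only for illustration.

```python
import re

def parse_duration(text: str) -> float:
    """Convert a duration such as '4 m 20.8 s' or '28s' to seconds."""
    minutes = re.search(r"(\d+(?:\.\d+)?)\s*m", text)
    seconds = re.search(r"(\d+(?:\.\d+)?)\s*s", text)
    total = 0.0
    if minutes:
        total += 60 * float(minutes.group(1))
    if seconds:
        total += float(seconds.group(1))
    return total

def duration_diff(morning_glory: str, nolana: str) -> tuple[float, float]:
    """Return (absolute diff in seconds, diff in percent of Morning Glory)."""
    before = parse_duration(morning_glory)
    after = parse_duration(nolana)
    delta = after - before
    return delta, 100 * delta / before

# Rows taken from the table above: (file, Morning Glory, Nolana)
rows = [
    ("1k", "28s", "32.7 s"),
    ("5k", "1 m 48s", "4 m 20.8 s"),
    ("10k", "4 m 4s", "3 m 24.9 s"),
    ("80k", "29 m 6 s", "21 m 22.6 s"),
]

for name, mg, nolana in rows:
    delta, percent = duration_diff(mg, nolana)
    print(f"{name}: {delta:+.1f} s ({percent:+.1f}%)")
```

Percentages computed from the unrounded durations can differ from the table by about one percentage point (for example, in the 1k and 5k rows), which suggests the table derives them from the rounded absolute differences.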
Resources usage
Comparing Nolana's numbers against Morning Glory's.
| | Morning Glory | Nolana |
|---|---|---|
| CPU | Here CPU usage is not higher than 60% for all related modules. | The highest average CPU (mod-inventory) value is 73.9% at peak. |
| Memory | Concerning behaviour on: However the last test for 80K didn't show any memory growth for any of the modules, so maybe the growth of mem usage can be explained by the working conditions of these modules. | Increased memory usage was detected for: |
| RDS CPU | RDS CPU usage reached 80% maximum during the test. | ncp3-db-3fv7zu8sfdn5-auroracluster-nysenxyhpwrd |