Data Import MARC Authorities (Nolana)
Testing later showed that the actual import durations were about two times longer than reported here. The PTF environment was missing a DB trigger; once the trigger was restored, import durations doubled.
Overview
This document contains the results of testing Data Import for MARC Authorities in the Nolana release to detect performance trends (PERF-344).
Infrastructure
- 10 m6i.2xlarge EC2 instances
- 2 db.r6.xlarge database instances: one reader and one writer
- MSK
- 4 m5.2xlarge brokers in 2 zones
- auto.create.topics.enable = true
- log.retention.minutes=480
- 2 partitions per DI topic
- default.replication.factor=3
- mod-inventory
- 1024 CPU units, 2592MB mem
- inventory.kafka.DataImportConsumerVerticle.instancesNumber=10
- inventory.kafka.MarcBibInstanceHridSetConsumerVerticle.instancesNumber=10
- kafka.consumer.max.poll.records=10
- mod-inventory-storage
- 1024 CPU units, 1962MB mem
- mod-source-record-storage
- 1024 CPU units, 1440MB mem
- mod-source-record-manager
- 1024 CPU units, 3688MB mem
- mod-data-import
- 256 CPU units, 1844MB mem
- mod-data-import-cs
- 128 CPU units, 896MB mem
Software versions
- mod-data-import v2.6.1
- mod-data-import-converter-storage v1.15.1
- mod-source-record-manager v3.5.4
- mod-source-record-storage v5.5.2
- mod-inventory v19.0.1
- mod-inventory-storage v25.0.1
Results
Summary
The MARC Authorities import tests were run on warmed-up modules (several MARC BIB imports were performed before the test set).
All tests completed successfully without errors or issues. For the 25K and 50K tests, import duration is considerably shorter than in the Morning Glory release; for the 1K, 5K, and 10K tests, it is about the same (within a few seconds).
No memory leaks were found.
- R/W split enabled:
MARC Authority import durations are approximately the same with R/W split enabled and disabled.
RDS CPU usage on the reader node was only 7%, which may explain why the durations are almost the same.
Test | Profile | Duration Nolana | Duration Morning Glory |
---|---|---|---|
1K Default - Create SRS MARC Authority | Default - Create SRS MARC Authority | 27 s | 24 s |
5K Default - Create SRS MARC Authority | Default - Create SRS MARC Authority | 1 min 15 s | 1 min 21 s |
10K Default - Create SRS MARC Authority | Default - Create SRS MARC Authority | 2 min 31 s | 2 min 32 s |
25K Default - Create SRS MARC Authority | Default - Create SRS MARC Authority | 7 min 7 s | 11 min 14 s |
50K Default - Create SRS MARC Authority | Default - Create SRS MARC Authority | 11 min 24 s | 22 min |
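To make the comparison easier to read, the durations in the table above can be converted to seconds and expressed as throughput and as a relative change against Morning Glory. The following is only an illustrative sketch: the helper names (`to_seconds`, `pct_change`, `summarize`) are hypothetical, and the figures are copied from the table as reported, so they do not include the roughly 2x correction described at the top of this page.

```python
import re

# Durations copied from the table above, as reported:
# records -> (Nolana, Morning Glory)
REPORTED = {
    1_000: ("27 s", "24 s"),
    5_000: ("1 min 15 s", "1 min 21 s"),
    10_000: ("2 min 31 s", "2 min 32 s"),
    25_000: ("7 min 7 s", "11 min 14 s"),
    50_000: ("11 min 24 s", "22 min"),
}


def to_seconds(text):
    """Convert a duration such as '7 min 7 s' or '22 min' to seconds."""
    minutes = re.search(r"(\d+)\s*min", text)
    seconds = re.search(r"(\d+)\s*s\b", text)
    total = int(minutes.group(1)) * 60 if minutes else 0
    total += int(seconds.group(1)) if seconds else 0
    return total


def pct_change(new, old):
    """Relative change of `new` against `old`, in percent."""
    return (new - old) / old * 100


def summarize(reported):
    """Return (records, nolana_s, mg_s, nolana_records_per_s, change_pct) rows."""
    rows = []
    for records, (nolana, morning_glory) in reported.items():
        n, m = to_seconds(nolana), to_seconds(morning_glory)
        rows.append((records, n, m, records / n, pct_change(n, m)))
    return rows


if __name__ == "__main__":
    for records, n, m, rate, change in summarize(REPORTED):
        print(f"{records:>6} records: {n:>4} s vs {m:>4} s, "
              f"{rate:5.1f} records/s, {change:+.1f}% vs Morning Glory")
```

With the reported figures, the 50K import is about 48% shorter than in Morning Glory, while the 1K import is about 12% longer (27 s against 24 s).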
Resource Usage
Service CPU usage
Service Memory usage
DB CPU usage
Note: Each spike on this chart corresponds to one DI MARC Authorities test.
Instance level CPU usage
Read Write Split enabled
Test | Profile | Duration Nolana R/W split enabled | Duration Nolana |
---|---|---|---|
1K Default - Create SRS MARC Authority | Default - Create SRS MARC Authority | 25 s | 27 s |
5K Default - Create SRS MARC Authority | Default - Create SRS MARC Authority | 1 min 20 s | 1 min 15 s |
10K Default - Create SRS MARC Authority | Default - Create SRS MARC Authority | 2 min 38 s | 2 min 31 s |
25K Default - Create SRS MARC Authority | Default - Create SRS MARC Authority | 6 min 3 s | 7 min 7 s |
50K Default - Create SRS MARC Authority | Default - Create SRS MARC Authority | 12 min 36 s | 11 min 24 s |
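One way to check how close the two columns above are is to compute the percent difference per test size. The sketch below is illustrative only: the function names are hypothetical, and the durations are copied from the table as reported, entered as (minutes, seconds) pairs.

```python
# Durations copied from the table above as (minutes, seconds):
# records -> (R/W split enabled, R/W split disabled)
DURATIONS = {
    1_000: ((0, 25), (0, 27)),
    5_000: ((1, 20), (1, 15)),
    10_000: ((2, 38), (2, 31)),
    25_000: ((6, 3), (7, 7)),
    50_000: ((12, 36), (11, 24)),
}


def seconds(duration):
    """Convert a (minutes, seconds) pair to seconds."""
    minutes, secs = duration
    return minutes * 60 + secs


def split_vs_baseline(durations):
    """Return {records: percent difference of split-enabled vs disabled}."""
    result = {}
    for records, (split, baseline) in durations.items():
        s, b = seconds(split), seconds(baseline)
        result[records] = (s - b) / b * 100
    return result


if __name__ == "__main__":
    for records, diff in split_vs_baseline(DURATIONS).items():
        print(f"{records:>6} records: {diff:+.1f}% with R/W split")
```

With the reported figures, the difference stays under 15% in either direction and has no consistent sign, which fits the observation that R/W split does not materially change import duration for this workload.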
Resource Usage
Note: Resource usage is roughly the same with and without R/W split (about 30-40% for the most heavily used modules). RDS CPU usage is about 7% with R/W split and was 3-4% without it.
Note: The load on the reader node is only 7%, which explains why the durations are almost the same as without R/W split.
Note: Each spike on this chart corresponds to one DI MARC Authorities test.