Overview
This document contains the results of testing Data Import for MARC Bibliographic records at Quesnelia release [non-ECS].
Ticket: https://folio-org.atlassian.net/browse/PERF-858 on QCON environment.
Summary
All Data Import jobs finished successfully without errors.
The PTF - Updates Success - 2 profile (based on the qcp1 profile PTF - Updates Success - 6) was created for the QCON Quesnelia release on tenant cs00000int_0001.
DI duration growth correlates with the number of records imported.
No memory leak is suspected for DI modules.
DB CPU usage is approximately 95% for all jobs with files of more than 10k records.
...
Compared with Quesnelia (qcp1), Data Import create durations are better for smaller files and about the same for the file with 500k records.
Compared with Quesnelia (qcp1), Data Import update durations are better for smaller files and about 20% slower for files with 100k and 500k records.
Services CPU utilization, Service memory utilization, and DB CPU utilization have the same utilization trend and values as in the Poppy release.
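The correlation between duration and record count can be checked by converting durations into throughput. The sketch below is illustrative only: it uses the Quesnelia (qcon) MARC BIB Create durations reported in the Results table, and the names `QCON_CREATE_SECONDS` and `records_per_second` are hypothetical helpers, not part of any test tooling.

```python
# Quesnelia (qcon) MARC BIB Create durations from the Results table, in seconds.
QCON_CREATE_SECONDS = {
    1_000: 31,                      # 31 sec
    10_000: 4 * 60 + 14,            # 4 min 14 sec
    25_000: 9 * 60 + 41,            # 9 min 41 sec
    50_000: 18 * 60 + 18,           # 18 min 18 sec
    100_000: 38 * 60 + 36,          # 38 min 36 sec
    500_000: 3 * 3600 + 30 * 60,    # 3 hours 30 min
}


def records_per_second(durations):
    """Return {record_count: throughput} for a {record_count: seconds} mapping."""
    return {count: count / seconds for count, seconds in durations.items()}


if __name__ == "__main__":
    for count, rate in records_per_second(QCON_CREATE_SECONDS).items():
        print(f"{count:>7} records: {rate:5.1f} records/sec")
```

If throughput stays in a narrow band across file sizes, duration is growing roughly linearly with the number of records.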
Results
Test # | Data-import test | Profile | Duration Poppy | Duration Quesnelia (qcp1) | Duration Quesnelia (qcon) | Difference, % | Results |
---|---|---|---|---|---|---|---|
1 | 1k MARC BIB Create | PTF - Create 2 | 39 sec | 54 sec | 31 sec | -42% | Completed |
2 | 5k MARC BIB Create | PTF - Create 2 | 2 min 22 sec | 3 min 20 sec | Not tested | | |
3 | 10k MARC BIB Create | PTF - Create 2 | 4 min 29 sec | 6 min | 4 min 14 sec | -29% | Completed |
4 | 25k MARC BIB Create | PTF - Create 2 | 10 min 38 sec | 13 min 41 sec | 9 min 41 sec | -29% | Completed |
5 | 50k MARC BIB Create | PTF - Create 2 | 20 min 26 sec | 21 min 59 sec | 18 min 18 sec | -16% | Completed |
6 | 100k MARC BIB Create | PTF - Create 2 | 2 hours 46 min (cancelled) | 40 min 16 sec | 38 min 36 sec | -4% | Completed |
7 | 500k MARC BIB Create | PTF - Create 2 | Not tested | 3 hours 27 min | 3 hours 30 min | +1.84% | Completed |
8 | 1k MARC BIB Update | PTF - Updates Success - 6 | 34 sec (PTF - Updates Success - 1) | 1 min 59 sec | 44 sec | -63% | Completed |
9 | 2k MARC BIB Update | PTF - Updates Success - 6 | 1 min 09 sec (PTF - Updates Success - 1) | 2 min 43 sec | Not tested | | |
10 | 5k MARC BIB Update | PTF - Updates Success - 6 | 2 min 31 sec (PTF - Updates Success - 1) | 7 min 10 sec | Not tested | | |
11 | 10k MARC BIB Update | PTF - Updates Success - 6 | 5 min 13 sec (PTF - Updates Success - 1) | 10 min 27 sec | 5 min 59 sec | -42% | Completed |
12 | 25k MARC BIB Update | PTF - Updates Success - 6 | 12 min 27 sec (PTF - Updates Success - 1) | 23 min 16 sec | 19 min 52 sec | -14% | Completed |
13 | 50k MARC BIB Update | PTF - Updates Success - 6 | Not tested | 40 min 52 sec | 37 min 53 sec | -7% | Completed |
14 | 100k MARC BIB Update | PTF - Updates Success - 6 | Not tested | 1 hour 2 min | 1 hour 14 min | +19% | Completed |
15 | 500k MARC BIB Update | PTF - Updates Success - 6 | Not tested | 5 hours 31 min | 6 hours 39 min | +21% | Completed |
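The "Difference, %" column can be approximated from the duration strings. The following is a minimal sketch, assuming the column compares Quesnelia (qcon) against Quesnelia (qcp1) as the baseline; that reading is inferred from the reported values, not stated in the report. The helper names `parse_duration` and `percent_difference` are hypothetical.

```python
import re

# Seconds per unit, keyed by the first letter of the unit word (hours, min, sec).
_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_duration(text):
    """Convert strings such as '3 hours 27 min' or '54 sec' to seconds."""
    total = 0
    # Each match is a number followed by a unit word starting with h, m or s.
    for value, unit in re.findall(r"(\d+)\s*(h|m|s)", text.lower()):
        total += int(value) * _UNIT_SECONDS[unit]
    return total


def percent_difference(baseline, current):
    """Relative change of `current` versus `baseline`, in percent."""
    base_seconds = parse_duration(baseline)
    return (parse_duration(current) - base_seconds) / base_seconds * 100


if __name__ == "__main__":
    # 1k MARC BIB Update: qcp1 = 1 min 59 sec, qcon = 44 sec
    print(f"{percent_difference('1 min 59 sec', '44 sec'):+.1f}%")
```

Because the table lists some durations only to the minute, recomputed values may differ from the reported percentages by up to about one percentage point.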
Service CPU Utilization
MARC BIB CREATE
...
1k, 10k, 25k, 50k, 100k, 500k records
Memory Utilization
No memory leak is suspected for DI modules.
...
1k, 10k, 25k, 50k, 100k, 500k records
...
RDS CPU Utilization
MARC BIB CREATE
RDS CPU utilization averaged 90% for DI jobs with more than 10k records, for both Create and Update profiles.
MARC BIB UPDATE
...
RDS Database Connections
MARC BIB CREATE
The number of DB connections averaged 1400.
...
The number of DB connections averaged 1400.
...
Average active sessions (AAS)
MARC BIB CREATE
...
Top SQL
...
MARC BIB UPDATE
Top SQL
...
OpenSearch Service
Cluster status was green during the tests
Master nodes
1. CPU utilization (MasterCPUUtilization)
https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#metricsV2?graph=~(metrics~(~(~'AWS2fES~'MasterCPUUtilization~'DomainName~'fse~'ClientId~'054267740449))~view~'timeSeries~stacked~false~region~'us-east-1~title~'CPU20utilization2028Percent*29~period~60~stat~'Maximum~yAxis~(left~(showUnits~false)))
MARC BIB Create
CPU utilization averaged 20%.
...
Maximum memory utilization averaged 94%.
...
Managed Streaming for Apache Kafka
CPU (User) usage by broker
MARC BIB Create
MARC BIB Update
...
Appendix
Infrastructure
11 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
1 db.r6.xlarge database instance: writer instance
OpenSearch
domain: fse
Number of nodes: 9
Version: OpenSearch_2_7_R20240502
MSK - tenant
4 kafka.m5.2xlarge brokers in 2 zones
Apache Kafka version 2.8.0
EBS storage volume per broker 300 GiB
auto.create.topics.enable=true
log.retention.minutes=480
default.replication.factor=3
Kafka consolidated topics enabled
Methodology
Pregenerated files were used for the DI Create job profile:
1K, 10K, 25K, 50K, 100K and 500K files.
Run DI Create on a single tenant (cs00000int_0001), one file at a time with a delay between jobs, using the PTF - Create 2 profile.
Prepare files for DI Update with the Data export app, using the previously imported items.
Run DI Update on a single tenant (cs00000int_0001), one file at a time with a delay between jobs, using the prepared files and the PTF - Updates Success - 2 profile.
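The "one by one with a delay" pattern used for both Create and Update runs can be sketched as below. This is an illustration only, not the tooling used for these tests: `run_sequentially` and the `start_job` callable are hypothetical, and `start_job` is assumed to block until the Data Import job for the given file has finished. The file names and the delay value in the example are also placeholders.

```python
import time


def run_sequentially(files, start_job, delay_seconds,
                     sleep=time.sleep, clock=time.monotonic):
    """Run start_job(file) for each file in order, pausing between jobs.

    start_job is a hypothetical callable that must block until the job
    completes. Returns a list of (file, elapsed_seconds) tuples.
    """
    results = []
    for index, name in enumerate(files):
        if index > 0:
            # Pause between jobs only, not before the first one.
            sleep(delay_seconds)
        started = clock()
        start_job(name)
        results.append((name, clock() - started))
    return results


if __name__ == "__main__":
    # Placeholder job that only prints the file name.
    durations = run_sequentially(["1k.mrc", "10k.mrc"], print, delay_seconds=1)
    print(durations)
```

Injecting `sleep` and `clock` keeps the loop easy to exercise without real waiting.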
...