Overview
This document contains the results of testing Data Import for MARC Bibliographic records in the Poppy release (PERF-712).
Summary
- DI duration increases in proportion to the number of records imported.
- The increase in memory utilization was due to the scheduled cluster shutdown. No memory leak is suspected for DI modules.
- Average CPU utilization of the modules did not exceed 150% for any Create or Update job. Spikes at the beginning of a job, caused by the mod-data-import module, are expected.
- DB CPU usage is close to 95% for all jobs with files of more than 10k records.
- The Poppy release shows higher average CPU utilization of DI-related services compared with Orchid, especially for mod-inventory.
- Errors occurred for the DI Create job with the 100k file and for the DI Update jobs with the 25k file because of a timeout issue (Opening SQLConnection failed: Timeout).
Recommendations and Jiras
- Investigate the timeout issues. Ticket created: MODINV-924.
- Check memory trends for mod-source-record-storage-b and mod-inventory during additional DI tests run without the nightly cluster shutdown.
- Increase the CPU units allocation for the mod-inventory, mod-di-converter-storage, and mod-quick-marc services.
- Use a larger DB instance type (scale up from db.r6g.xlarge to db.r6g.2xlarge).
Results
Test # | File | Profile | Duration Orchid with R/W split enabled (07/09/2023) | Duration Poppy with R/W split enabled | Difference, % / sec | Results |
---|---|---|---|---|---|---|
1 | 1k MARC BIB Create | PTF - Create 2 | | 39 sec | | Completed |
2 | 2k MARC BIB Create | PTF - Create 2 | | 1 min 01 sec | | Completed |
3 | 5k MARC BIB Create | PTF - Create 2 | 2 min 23 sec | 2 min 22 sec | ↓ 0.88% / 1 sec | Completed |
4 | 10k MARC BIB Create | PTF - Create 2 | 5 min 12 sec | 4 min 29 sec | ↓ 18.86% / 43 sec | Completed |
5 | 25k MARC BIB Create | PTF - Create 2 | 11 min 45 sec | 10 min 38 sec | ↓ 11.38% / 67 sec | Completed |
6 | 50k MARC BIB Create | PTF - Create 2 | 23 min 36 sec | 20 min 26 sec | ↓ 15.18% / 190 sec | Completed |
7 | 100k MARC BIB Create | PTF - Create 2 | 49 min 28 sec | 2 hours 46 min | | Cancelled (stopped by user) * |
8 | 1k MARC BIB Update | PTF - Updates Success - 1 | | 34 sec | | Completed |
9 | 2k MARC BIB Update | PTF - Updates Success - 1 | | 1 min 09 sec | | Completed |
10 | 5k MARC BIB Update | PTF - Updates Success - 1 | 2 min 48 sec | 2 min 31 sec | ↓ 6.66% / 17 sec | Completed |
11 | 10k MARC BIB Update | PTF - Updates Success - 1 | 5 min 23 sec | 5 min 13 sec | ↓ 1.84% / 10 sec | Completed |
12 | 25k MARC BIB Update | PTF - Updates Success - 1 | 14 min 12 sec | 12 min 27 sec | ↓ 14% / 105 sec | Completed with errors * |
13 | 25k MARC BIB Update | PTF - Updates Success - 1 | | 2 min 15 sec | | Completed with errors * |
14 | 25k MARC BIB Update | PTF - Updates Success - 1 | | 12 min | | Cancelled (stopped by user) * |
* For all jobs that completed with errors or were cancelled, the same issue appeared in the UI: io.vertx.core.impl.NoStackTraceThrowable: Timeout.
Test #14 was stopped manually from the UI. Two tests with the 25k MARC BIB Update file were carried out to confirm that the 25k update does not work properly and fails with the same issue.
Memory Utilization
The increase in memory utilization was due to the scheduled cluster shutdown. No memory leak is suspected for DI modules.
MARC BIB CREATE
Tests #1-7
1k, 2k, 5k, 10k, 25k, 50k, 100k records
MARC BIB UPDATE
Tests #8-14
1k, 2k, 5k, 10k, 25k, 25k, 25k records
Service CPU Utilization
MARC BIB CREATE
Tests #1-7
1k, 2k, 5k, 10k, 25k, 50k, 100k records
CPU utilization of all modules returned to their baseline levels after the tests. The highest utilization, 170%, was observed for the mod-quick-marc-b module during the 5k DI Create job.
Averages: mod-inventory-b - 125%, mod-inventory-storage-b - 25%, mod-source-record-storage-b - 60%, mod-source-record-manager-b - 35%, mod-di-converter-storage-b - 80%; mod-data-import spiked to 200% during the 100k job.
MARC BIB UPDATE
Tests #8-14
1k, 2k, 5k, 10k, 25k, 25k, 25k records
Averages: mod-inventory-b - 220%, mod-inventory-storage-b - 25%, mod-source-record-storage-b - 50%, mod-source-record-manager-b - 45%, mod-di-converter-storage-b - 90%; mod-data-import spiked to 96% during the 25k job.
RDS CPU Utilization
MARC BIB CREATE
Average RDS CPU utilization was close to 95% for DI jobs with more than 10k records.
MARC BIB UPDATE
RDS Database Connections
MARC BIB CREATE
The connection count reached 275 for DI Create jobs and 260 for DI Update jobs.
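As a hedged illustration, a per-application breakdown of current connections can also be checked directly on the database through PostgreSQL's pg_stat_activity view; the sketch below is a minimal example, and the connection parameters are placeholders.

```python
# Minimal sketch: count current PostgreSQL connections grouped by application and state.
# Connection parameters are placeholders; assumes psycopg2 is installed.
import psycopg2

conn = psycopg2.connect(
    host="<rds-endpoint>", dbname="<db>", user="<user>", password="<password>"
)
with conn, conn.cursor() as cur:
    cur.execute(
        """
        SELECT application_name, state, count(*)
        FROM pg_stat_activity
        GROUP BY application_name, state
        ORDER BY count(*) DESC
        """
    )
    for app, state, cnt in cur.fetchall():
        print(f"{cnt:5d}  {state or 'n/a':15s}  {app}")
conn.close()
```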
Average active sessions (AAS)
MARC BIB CREATE
Top SQL
MARC BIB UPDATE
Top SQL
INSERT INTO fs09000000_mod_source_record_manager.events_processed
INSERT INTO fs09000000_mod_source_record_manager.journal_records
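The top SQL statements above were captured during the test window. As a hedged illustration, a similar ranking can be produced from the pg_stat_statements extension, if it is enabled; the sketch below uses placeholder connection parameters and the total_exec_time column (named total_time on PostgreSQL versions before 13).

```python
# Minimal sketch: rank statements by cumulative execution time via pg_stat_statements.
# Assumes the extension is enabled; connection parameters are placeholders.
import psycopg2

conn = psycopg2.connect(
    host="<rds-endpoint>", dbname="<db>", user="<user>", password="<password>"
)
with conn, conn.cursor() as cur:
    cur.execute(
        """
        SELECT calls, round(total_exec_time) AS total_ms, left(query, 120) AS query
        FROM pg_stat_statements
        ORDER BY total_exec_time DESC
        LIMIT 10
        """
    )
    for calls, total_ms, query in cur.fetchall():
        print(f"{total_ms:>10} ms  {calls:>8} calls  {query}")
conn.close()
```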
MSK CPU Utilization (Percent)
Utilization did not exceed 20%.
CPU (User) usage by broker
Errors
Additional tests were analysed to investigate the "optimistic locking" issue.
The issue was found only in Create jobs, in the mod-inventory-storage and mod-inventory modules; it occurs in both completed and failed jobs, and its frequency depends on the file used.
The table below shows the number of optimistic locking messages observed per test (a counting sketch follows the table). A hyphen '-' means no test was performed.
Date \ File | 1k | 2k | 5k | 10k | 25k | 50k | 100k (failed) | 250k |
---|---|---|---|---|---|---|---|---|
2023.10.26 (additional with other files) | 9 | - | - | 39 | 148 | 137 | 3393 | - |
2023.10.27 Testing | 1 | No | No | 6 | 9 | 7 | 4 | - |
2023.11.01 (additional with other files) | No | - | - | - | - | - | - | 10 |
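A minimal sketch of how such message counts could be gathered from exported module logs is shown below; the logs/ directory layout and the matched substring are assumptions for illustration only.

```python
# Minimal sketch: count "optimistic locking" messages per exported module log file.
# The logs/ directory and the matched substring are assumptions for illustration.
import glob

for path in sorted(glob.glob("logs/*.log")):
    with open(path, encoding="utf-8", errors="replace") as log:
        hits = sum(1 for line in log if "optimistic locking" in line.lower())
    print(f"{path}: {hits} optimistic locking message(s)")
```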
Appendix
Infrastructure
PTF environment: pcp1
- 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 2 database instances, writer/reader

Name | Memory GiB | vCPUs | max_connections |
---|---|---|---|
db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731 |

- MSK tenant
- 4 m5.2xlarge brokers in 2 zones
- Apache Kafka version 2.8.0
- EBS storage volume per broker: 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3
Module | Task Def. Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft Limit | CPU Units | Xmx | MetaspaceSize | MaxMetaspaceSize | R/W split enabled |
---|---|---|---|---|---|---|---|---|---|---|
pcp1-pvt | | | | | | | | | | |
mod-inventory-storage-b | 10 | mod-inventory-storage:27.0.0 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 | FALSE |
mod-data-import-b | 11 | mod-data-import:3.0.1 | 1 | 2048 | 1844 | 256 | 1292 | 384 | 512 | FALSE |
mod-source-record-storage-b | 10 | mod-source-record-storage:5.7.0 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | FALSE |
mod-inventory-b | 9 | mod-inventory:20.1.0 | 2 | 2880 | 2592 | 1024 | 1814 | 384 | 512 | FALSE |
mod-source-record-manager-b | 9 | mod-source-record-manager:3.7.0 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | FALSE |
mod-di-converter-storage-b | 13 | mod-di-converter-storage:2.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE |
Methodology
Prepare files for the DI Create job:
- 1K, 2K, 5K, 10K, 25K, 50K, and 100K record files (see the file-splitting sketch after this list).
- Run DI Create on a single tenant, one file at a time with a delay between jobs, using the PTF - Create 2 profile.
- Prepare files for DI Update with the Data Export app.
- Run DI Update on a single tenant, one prepared file at a time with a delay between jobs, using the PTF - Updates Success - 1 profile.
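A minimal sketch of how the differently sized Create files could be prepared from one large binary MARC file is shown below; it splits on the ISO 2709 record terminator (0x1D), and the source and output file names are placeholders.

```python
# Minimal sketch: copy the first N records of a large binary MARC file into a new file.
# ISO 2709 records end with the record terminator byte 0x1D; file names are placeholders.

RECORD_TERMINATOR = b"\x1d"

def take_records(src_path: str, dst_path: str, limit: int) -> None:
    """Write the first `limit` MARC records from src_path to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    # split() leaves a trailing empty chunk after the final terminator; drop empty chunks
    records = [r for r in data.split(RECORD_TERMINATOR) if r]
    subset = RECORD_TERMINATOR.join(records[:limit]) + RECORD_TERMINATOR
    with open(dst_path, "wb") as dst:
        dst.write(subset)

if __name__ == "__main__":
    for size in (1_000, 2_000, 5_000, 10_000, 25_000, 50_000, 100_000):
        take_records("source_bibs.mrc", f"di_create_{size}.mrc", size)
```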