Data Import test report (Poppy)
Overview
This document contains the results of testing Data Import for MARC Bibliographic records in the Poppy release. Ticket: PERF-712
Summary
- DI duration increases with the number of records imported.
- The increase in memory utilization was due to the scheduled cluster shutdown. No memory leak is suspected for DI modules.
- Average CPU utilization of modules for all Create and Update jobs did not exceed 150%. Spikes at the start of a job, caused by the mod-data-import module, are expected.
- DB CPU usage is approximately 95% for all jobs with files of more than 10k records.
- The Poppy release has higher average CPU utilization of DI-related services than Orchid, especially mod-inventory.
- Errors occurred for the DI create job with the 100k file and for the DI update jobs with the 25k file because of a timeout issue (Opening SQLConnection failed: Timeout).
- No meaningful difference in Data Import duration was observed between tests with ASYNC_PROCESSOR_MAX_WORKERS_COUNT set to 1 and to 5.
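The first summary point, that duration grows with record count, can be checked with a quick throughput calculation. The sketch below is illustrative only: it uses the Poppy MARC BIB Create durations for tests #3-6 from the Results table, and the names `poppy_create_seconds` and `records_per_second` are hypothetical helpers, not part of any FOLIO tooling.

```python
# Poppy MARC BIB Create durations taken from the Results table (tests #3-6),
# converted to seconds. Keys are the number of records in the imported file.
poppy_create_seconds = {
    5_000: 2 * 60 + 22,    # 2 min 22 sec
    10_000: 4 * 60 + 29,   # 4 min 29 sec
    25_000: 10 * 60 + 38,  # 10 min 38 sec
    50_000: 20 * 60 + 26,  # 20 min 26 sec
}


def records_per_second(records, seconds):
    """Average import throughput for one job (hypothetical helper)."""
    return records / seconds


for records, seconds in poppy_create_seconds.items():
    rate = records_per_second(records, seconds)
    print(f"{records:>6} records: {seconds:>5} s, {rate:.1f} records/s")
```

For these four jobs the throughput stays between roughly 35 and 41 records per second, which is consistent with duration scaling close to linearly with file size for the jobs that completed.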
Recommendations and Jiras
- Investigate timeout issues. Ticket created: MODINV-924
- Check memory trends for mod-source-record-storage-b and mod-inventory during additional DI tests without the nightly cluster shutdown
- Increase CPU units allocation for mod-inventory, mod-di-converter-storage, mod-quick-marc services
- Use higher DB instance type (scale up from db.r6g.xlarge to db.r6g.2xlarge).
Results
Test # | File | Profile | Duration Orchid with R/W split enabled (07/09/2023) | Duration Poppy | Difference, % / sec | Results
---|---|---|---|---|---|---
1 | 1k MARC BIB Create | PTF - Create 2 | | 39 sec | | Completed
2 | 2k MARC BIB Create | PTF - Create 2 | | 1 min 01 sec | | Completed
3 | 5k MARC BIB Create | PTF - Create 2 | 2 min 23 sec | 2 min 22 sec | ↓ 0.88% / 1 sec | Completed
4 | 10k MARC BIB Create | PTF - Create 2 | 5 min 12 sec | 4 min 29 sec | ↓ 18.86% / 43 sec | Completed
5 | 25k MARC BIB Create | PTF - Create 2 | 11 min 45 sec | 10 min 38 sec | ↓ 11.38% / 67 sec | Completed
6 | 50k MARC BIB Create | PTF - Create 2 | 23 min 36 sec | 20 min 26 sec | ↓ 15.18% / 190 sec | Completed
7 | 100k MARC BIB Create | PTF - Create 2 | 49 min 28 sec | 2 hours 46 min | | Cancelled (stopped by user) *
8 | 1k MARC BIB Update | PTF - Updates Success - 1 | | 34 sec | | Completed
9 | 2k MARC BIB Update | PTF - Updates Success - 1 | | 1 min 09 sec | | Completed
10 | 5k MARC BIB Update | PTF - Updates Success - 1 | 2 min 48 sec | 2 min 31 sec | ↓ 6.66% / 17 sec | Completed
11 | 10k MARC BIB Update | PTF - Updates Success - 1 | 5 min 23 sec | 5 min 13 sec | ↓ 1.84% / 10 sec | Completed
12 | 25k MARC BIB Update | PTF - Updates Success - 1 | 14 min 12 sec | 12 min 27 sec | ↓ 14% / 105 sec | Completed with errors *
13 | 25k MARC BIB Update | PTF - Updates Success - 1 | | 2 min 15 sec | | Completed with errors *
14 | 25k MARC BIB Update | PTF - Updates Success - 1 | | 12 min | | Cancelled (stopped by user) *
* For all jobs that completed with errors or were cancelled, the UI showed the same issue: io.vertx.core.impl.NoStackTraceThrowable: Timeout
Test #14 was stopped manually from the UI. Two more tests with the 25k MARC BIB Update file were run to confirm that the 25k update does not work properly and hits the same issue.
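The "Difference, % / sec" column can be recomputed from the two duration columns. The following is a minimal sketch, assuming durations are written as in the table ("5 min 12 sec", "2 hours 46 min"). The helpers `parse_duration` and `difference` are hypothetical names introduced here. The report does not state which baseline its percentages use, so the percentage below (relative to the Orchid duration) is an assumption and may not reproduce the table's percentage values; the difference in seconds does not depend on that choice.

```python
import re

# Seconds per unit, keyed by the unit spellings used in the Results table.
UNIT_SECONDS = {"hour": 3600, "min": 60, "sec": 1}


def parse_duration(text):
    """Convert a duration such as '5 min 12 sec' or '2 hours 46 min' to seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s*(hour|min|sec)", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total


def difference(orchid, poppy):
    """Return (seconds saved, percent saved relative to Orchid).

    The Orchid baseline for the percentage is an assumption of this sketch.
    """
    before = parse_duration(orchid)
    after = parse_duration(poppy)
    saved = before - after
    return saved, 100 * saved / before


# Duration pairs copied from the Results table (tests #4 and #6).
for label, orchid, poppy in [
    ("10k MARC BIB Create", "5 min 12 sec", "4 min 29 sec"),
    ("50k MARC BIB Create", "23 min 36 sec", "20 min 26 sec"),
]:
    saved, percent = difference(orchid, poppy)
    print(f"{label}: {saved} sec faster ({percent:.2f}% of the Orchid duration)")
```

Working in seconds avoids the rounding ambiguity that arises when minutes and seconds are treated as a decimal number.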
Memory Utilization
The increase in memory utilization was due to the scheduled cluster shutdown. No memory leak is suspected for DI modules.
MARC BIB CREATE
Tests #1-7
1k, 2k, 5k, 10k, 25k, 50k, 100k records
MARC BIB UPDATE
Tests #8-14
1k, 2k, 5k, 10k, 25k, 25k, 25k records
Service CPU Utilization
MARC BIB CREATE
Tests #1-7
1k, 2k, 5k, 10k, 25k, 50k, 100k records
CPU utilization for all modules returned to baseline levels after all tests. The highest resource utilization, 170%, was observed for the mod-quick-marc-b module in the 5k DI create job.
Averages: mod-inventory-b - 125%, mod-inventory-storage-b - 25%, mod-source-record-storage-b - 60%, mod-source-record-manager-b - 35%, mod-di-converter-storage-b - 80%. mod-data-import showed a 200% spike for the 100k job.
MARC BIB UPDATE
Tests #8-14
1k, 2k, 5k, 10k, 25k, 25k, 25k records