Data Import in Central and Member Tenants
Overview
This document contains the results of testing Data Import in central and member tenants with ECS and DI file splitting feature enabled in the Poppy release.
Tickets: PERF-753 and PERF-754
Summary
- DI create and update jobs perform better than in the Poppy tests without ECS enabled. See the comparison table for details.
- DI Create job durations in the central and member tenants do not differ much for files from 1k to 50k records. DI with the 100k records file ran faster in the central tenant.
- DI Update jobs run faster in the central tenant; for the 100k records file the job was 5 minutes faster than in the member tenant.
- Service CPU utilization of mod-inventory in the central tenant didn't exceed 130% during DI Update jobs. In the member tenant it was 160%.
- CPU utilization decreased for most modules in both central and member tenants compared with non-ECS results. The only module that shows insignificant growth is mod-inventory-storage (+6%).
- After an additional set of 100k DI create jobs on pcon (to populate the DB with instances on member tenants), a growth trend in CPU utilization was observed for mod-inventory, followed by an Out Of Memory issue after a spike to 361%.
- Memory consumption for mod-inventory was 100% on average for both tenants.
- Memory consumption increased for mod-source-record-storage (20%) and mod-source-record-manager (10%) in both central and member tenants compared with non-ECS results. It decreased for the mod-data-import module (15%).
- RDS utilization reached 98% with DI MARC Bib Create jobs and 94% with DI MARC Bib Update jobs in the central tenant, and 95% with DI MARC Bib Create jobs and 90% with DI MARC Bib Update jobs in the member tenant.
- OpenSearch Service: CPU utilization 97%, memory consumption 99%.
Recommendations and Jiras
- After an additional set of 100k DI create jobs on pcon (to populate the DB with instances on member tenants), a growth trend in CPU utilization was observed for mod-inventory, followed by an Out Of Memory issue after a spike to 361%.
- A Jira ticket, PERF-764, was created and investigated. The results of the heap dump analysis are attached to the ticket as a comment.
- Consider allocating more CPU units to mod-data-import, given that CPU utilization for the mod-data-import module exceeded 100% during DI Update jobs.
Test Runs
Test # | Scenario | Load level | Comment |
---|---|---|---|
1 | DI MARC Bib Create | 1K, 5K, 10K, 25K, 50K, 100K consecutively (with 5 min pause) | Central tenant only |
2 | DI MARC Bib Update | 1K, 5K, 10K, 25K, 50K, 100K consecutively (with 5 min pause) | Central tenant only |
3 | DI MARC Bib Create | 1K, 5K, 10K, 25K, 50K, 100K consecutively (with 5 min pause) | Member tenant only |
4 | DI MARC Bib Update | 1K, 5K, 10K, 25K, 50K, 100K consecutively (with 5 min pause) | Member tenant only |
Test Results
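Profile | MARC File | DI Duration, central tenant (hh:mm:ss) | DI Duration, member tenant (hh:mm:ss) |
---|---|---|---|
DI MARC Bib Create (PTF - Create 2) | 1K.mrc | 00:00:38 | 00:00:33 |
DI MARC Bib Create (PTF - Create 2) | 5K.mrc | 00:02:13 | 00:02:02 |
DI MARC Bib Create (PTF - Create 2) | 10K.mrc | 00:03:54 | 00:03:54 |
DI MARC Bib Create (PTF - Create 2) | 25K.mrc | 00:09:44 | 00:10:03 |
DI MARC Bib Create (PTF - Create 2) | 50K.mrc | 00:18:49 | 00:18:50 |
DI MARC Bib Create (PTF - Create 2) | 100K.mrc | 00:37:46 | 00:39:33 |
DI MARC Bib Update (PTF - Updates Success - 1) | 1K.mrc | 00:00:44 | 00:00:33 |
DI MARC Bib Update (PTF - Updates Success - 1) | 5K.mrc | 00:02:26 | 00:02:39 |
DI MARC Bib Update (PTF - Updates Success - 1) | 10K.mrc | 00:04:57 | 00:05:20 |
DI MARC Bib Update (PTF - Updates Success - 1) | 25K.mrc | 00:12:05 | 00:13:21 |
DI MARC Bib Update (PTF - Updates Success - 1) | 50K.mrc | 00:24:27 | 00:26:43 |
DI MARC Bib Update (PTF - Updates Success - 1) | 100K.mrc | 00:49:15 | 00:54:29 |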
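The gap between tenants for the largest file can be checked directly from the durations above. The following is a minimal sketch, not part of the test tooling; the helper names `parse_hms` and `tenant_gap` are hypothetical and the values are copied from the 100K.mrc rows.

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Convert an hh:mm:ss string from the results table into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def tenant_gap(central: str, member: str) -> timedelta:
    """Return how much longer the member tenant run took than the central one."""
    return parse_hms(member) - parse_hms(central)


# (central, member) durations for the 100K.mrc file, copied from the table
durations_100k = {
    "DI MARC Bib Create": ("00:37:46", "00:39:33"),
    "DI MARC Bib Update": ("00:49:15", "00:54:29"),
}

for profile, (central, member) in durations_100k.items():
    print(profile, tenant_gap(central, member))
```

For the 100k update job this gives a gap of 5 minutes 14 seconds, which matches the "5 minutes faster" statement in the Summary.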
Comparison Table
Profile | MARC File | pcon, central tenant (hh:mm:ss) | pcon, member tenant (hh:mm:ss) | pcp1, central tenant (hh:mm:ss) | Delta, central tenant (pcon/pcp1) |
---|---|---|---|---|---|
DI MARC Bib Create (PTF - Create 2) | 1K.mrc | 00:00:38 | 00:00:33 | 00:00:39 | 00:00:01 |
DI MARC Bib Create (PTF - Create 2) | 5K.mrc | 00:02:13 | 00:02:02 | 00:02:39 | 00:00:26 |
DI MARC Bib Create (PTF - Create 2) | 10K.mrc | 00:03:54 | 00:03:54 | 00:05:00 | 00:01:06 |
DI MARC Bib Create (PTF - Create 2) | 25K.mrc | 00:09:44 | 00:10:03 | 00:11:15 | 00:01:31 |
DI MARC Bib Create (PTF - Create 2) | 50K.mrc | 00:18:49 | 00:18:50 | 00:22:16 | 00:03:27 |
DI MARC Bib Create (PTF - Create 2) | 100K.mrc | 00:37:46 | 00:39:33 | 00:49:58 | 00:12:12 |
DI MARC Bib Update (PTF - Updates Success - 1) | 1K.mrc | 00:00:44 | 00:00:33 | 00:00:34 | 00:00:10 |
DI MARC Bib Update (PTF - Updates Success - 1) | 5K.mrc | 00:02:26 | 00:02:39 | 00:02:28 | 00:00:02 |
DI MARC Bib Update (PTF - Updates Success - 1) | 10K.mrc | 00:04:57 | 00:05:20 | 00:05:31 | 00:00:34 |
DI MARC Bib Update (PTF - Updates Success - 1) | 25K.mrc | 00:12:05 | 00:13:21 | 00:14:50 | 00:02:45 |
DI MARC Bib Update (PTF - Updates Success - 1) | 50K.mrc | 00:24:27 | 00:26:43 | 00:32:53 | 00:08:26 |
DI MARC Bib Update (PTF - Updates Success - 1) | 100K.mrc | 00:49:15 | 00:54:29 | 01:14:39 | 00:25:24 |
* The results of DI without Check-in/Check-out in the Poppy release were taken from the report Data Import with Check-ins Check-outs (Poppy).
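The Delta column can be reproduced, and expressed as a relative saving, with a few lines of arithmetic. This is a sketch under the assumption that pcp1 is the non-ECS baseline and pcon is the ECS environment; `to_seconds` and `ecs_saving` are hypothetical helper names, and the inputs are the central tenant values from the table.

```python
def to_seconds(hms: str) -> int:
    """Convert an hh:mm:ss duration string to whole seconds."""
    hours, minutes, seconds = map(int, hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def ecs_saving(pcon: str, pcp1: str) -> tuple[int, float]:
    """Return (seconds saved, percent saved) for pcon relative to pcp1.

    Assumption: pcp1 is treated as the baseline. A negative result means
    the pcon run was slower than the pcp1 run.
    """
    baseline = to_seconds(pcp1)
    saved = baseline - to_seconds(pcon)
    return saved, round(100 * saved / baseline, 1)


print(ecs_saving("00:37:46", "00:49:58"))  # 100K.mrc, create
print(ecs_saving("00:49:15", "01:14:39"))  # 100K.mrc, update
print(ecs_saving("00:00:44", "00:00:34"))  # 1K.mrc, update
```

The 1K.mrc update case returns a negative saving, which suggests the Delta column reports an absolute difference rather than a signed one.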
Service CPU Utilization
CPU utilization decreased for most modules in both central and member tenants compared with non-ECS test results. The only module that shows insignificant growth is mod-inventory-storage (+6%).
Module | Central tenant, Create Jobs, ECS | Central tenant, Create Jobs, Non-ECS | Central tenant, Update Jobs, ECS | Central tenant, Update Jobs, Non-ECS | Member tenant, Create Jobs, ECS | Member tenant, Create Jobs, Non-ECS | Member tenant, Update Jobs, ECS | Member tenant, Update Jobs, Non-ECS |
---|---|---|---|---|---|---|---|---|
mod-inventory-b | 101% | 125% | 132% | 220% | 112% | 125% | 158% | 220% |
mod-inventory-storage-b | 31% | 25% | 36% | 25% | 36% | 25% | 32% | 25% |
mod-source-record-storage-b | 52% | 60% | 36% | 50% | 48% | 60% | 33% | 50% |
mod-source-record-manager-b | 29% | 35% | 22% | 45% | 29% | 35% | 20% | 45% |
mod-di-converter-storage-b | 68% | 80% | 68% | 90% | 56% | 80% | 52% | 90% |
mod-data-import | 135% | 200% | 254% | 96% (25k file) | 179% | 200% | 161% | 96% (25k file) |
This table shows average CPU utilization in the ECS and non-ECS tests for the 100k records file.
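The per-module change can be derived from the table by subtracting the non-ECS value from the ECS value. The sketch below does this for the central tenant create jobs only; the dictionary and the `cpu_delta` helper are hypothetical names, and the numbers are copied from the table.

```python
# (ECS, non-ECS) average CPU utilization in percent, central tenant, create jobs
cpu_central_create = {
    "mod-inventory-b": (101, 125),
    "mod-inventory-storage-b": (31, 25),
    "mod-source-record-storage-b": (52, 60),
    "mod-source-record-manager-b": (29, 35),
    "mod-di-converter-storage-b": (68, 80),
    "mod-data-import": (135, 200),
}


def cpu_delta(table: dict[str, tuple[int, int]]) -> dict[str, int]:
    """Return ECS minus non-ECS utilization, in percentage points, per module."""
    return {module: ecs - non_ecs for module, (ecs, non_ecs) in table.items()}


deltas = cpu_delta(cpu_central_create)
grew = [module for module, delta in deltas.items() if delta > 0]

for module, delta in deltas.items():
    print(f"{module}: {delta:+d}")
print("Modules with higher CPU under ECS:", grew)
```

For this column pair, only mod-inventory-storage-b comes out positive (+6), in line with the statement above.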
Central tenant
During create jobs the highest CPU utilization was by mod-inventory with the 100k records file: 99%. During update jobs the mod-inventory module utilized 130% with the 100k records file. Spikes were observed in mod-data-import at the very beginning of each job, which was expected. The highest spike, 250%, was in the update job with the 100k records file.
Create jobs:
The highest resource utilization, 101%, was observed for mod-inventory. At the end of the test the mod-quick-marc-b module (98%) began to consume more than the other modules. This behaviour was observed for all create jobs.
Average for mod-inventory-b - 101%, mod-inventory-storage-b - 31%, mod-source-record-storage-b - 52%, mod-source-record-manager-b - 29%, mod-di-converter-storage-b - 68%, mod-data-import - 135% spike for 100k job.
Non-ECS tests results*: Average for mod-inventory-b - 125%, mod-inventory-storage-b - 25%, mod-source-record-storage-b - 60%, mod-source-record-manager-b - 35%, mod-di-converter-storage-b - 80%, mod-data-import - 200% spike for 100k job and mod-data-import - 86% spike for 25k job.
Update jobs:
The highest resource utilization, 134%, was observed for mod-inventory.
Average for mod-inventory-b - 132%, mod-inventory-storage-b - 36%, mod-source-record-storage-b - 36%, mod-source-record-manager-b - 22%, mod-di-converter-storage-b - 68%, mod-data-import - 254% spike for 100k job and mod-data-import - 85% spike for 25k job.
Non-ECS tests results*: Average for mod-inventory-b - 220%, mod-inventory-storage-b - 25%, mod-source-record-storage-b - 50%, mod-source-record-manager-b - 45%, mod-di-converter-storage-b - 90%, mod-data-import - 96% spike for 25k job.
Member tenant
During create jobs the highest CPU utilization was by mod-inventory with the 100k records file: 117%. During update jobs the mod-inventory module utilized 160% with the 100k records file. Spikes were observed in mod-data-import at the very beginning of each job, which was expected.
Create jobs:
The highest resource utilization, 112%, was observed for mod-inventory. At the end of the test the mod-quick-marc-b module (122%) began to consume more than the other modules. This behaviour was observed for all create jobs.
Average for mod-inventory-b - 112%, mod-inventory-storage-b - 36%, mod-source-record-storage-b - 48%, mod-source-record-manager-b - 29%, mod-di-converter-storage-b - 56%, mod-data-import - 179% spike for 100k job and mod-data-import - 70% spike for 25k job.
Non-ECS tests results*: Average for mod-inventory-b - 125%, mod-inventory-storage-b - 25%, mod-source-record-storage-b - 60%, mod-source-record-manager-b - 35%, mod-di-converter-storage-b - 80%, mod-data-import - 200% spike for 100k job.
Update jobs:
The highest resource utilization, 158%, was observed for mod-inventory.
Average for mod-inventory-b - 158%, mod-inventory-storage-b - 32%, mod-source-record-storage-b - 33%, mod-source-record-manager-b - 20%, mod-di-converter-storage-b - 52%, mod-data-import - 161% spike for 100k job and mod-data-import - 73% spike for 25k job.
Non-ECS tests results*: Average for mod-inventory-b - 220%, mod-inventory-storage-b - 25%, mod-source-record-storage-b - 50%, mod-source-record-manager-b - 45%, mod-di-converter-storage-b - 90%, mod-data-import - 96% spike for 25k job.
Memory Utilization
Memory consumption increased for mod-source-record-storage (20%) and mod-source-record-manager (10%) in both central and member tenants compared with non-ECS test results. It decreased for the mod-data-import module (15%).
This table shows average memory consumption in the ECS and non-ECS tests. No non-ECS value is available for mod-di-converter-storage-b, as indicated by "-".
Module | ECS, central tenant, Create Jobs | ECS, central tenant, Update Jobs | ECS, member tenant, Create Jobs | ECS, member tenant, Update Jobs | Non-ECS |
---|---|---|---|---|---|
mod-inventory-b | 96% | 96% | 101% | 101% | 90% |
mod-inventory-storage-b | 18% | 21% | 20% | 23% | 18% |
mod-source-record-storage-b | 64% | 74% | 71% | 71% | 46% |
mod-source-record-manager-b | 49% | 49% | 46% | 43% | 38% |
mod-di-converter-storage-b | 32% | 33% | 33% | 33% | - |
mod-data-import | 38% | 38% | 35% | 37% | 53% |
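One way to summarize the memory table is to average the four ECS values per module and subtract the single non-ECS value. Averaging across the four ECS runs is an assumption made here for illustration, not necessarily how the figures in the text were derived; `memory` and `memory_delta` are hypothetical names.

```python
from statistics import mean

# ((central create, central update, member create, member update), non-ECS)
# Values in percent, copied from the table; None marks the missing "-" cell.
memory = {
    "mod-inventory-b": ((96, 96, 101, 101), 90),
    "mod-inventory-storage-b": ((18, 21, 20, 23), 18),
    "mod-source-record-storage-b": ((64, 74, 71, 71), 46),
    "mod-source-record-manager-b": ((49, 49, 46, 43), 38),
    "mod-di-converter-storage-b": ((32, 33, 33, 33), None),
    "mod-data-import": ((38, 38, 35, 37), 53),
}


def memory_delta(ecs_runs, non_ecs):
    """Return mean ECS consumption minus non-ECS, or None without a baseline."""
    if non_ecs is None:
        return None
    return mean(ecs_runs) - non_ecs


deltas = {module: memory_delta(*values) for module, values in memory.items()}

for module, delta in deltas.items():
    print(module, "n/a" if delta is None else f"{delta:+.1f}")
```

Modules without a non-ECS baseline are reported as "n/a" rather than being compared against zero.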
Central tenant
Memory consumption for the mod-inventory module grew gradually to 96% during create jobs and didn't change during update jobs. No memory leaks were detected.