Data Import Update multi tenant (Poppy)
Overview
- This document contains the results of testing Data Import for MARC Bibliographic records with an update job in the Poppy release. Ticket: PERF-779
Summary
Test set run №2 (updated configuration for mod-permission)
- Three tests of DI with update jobs with 25k records were carried out on the pcp1 cluster.
- DI with a new configuration for mod-permission was successful. DI jobs are processed one by one in test run #3.
- Downgrading mod-permission from 6.4.0 to 6.3.2 did not resolve the NullPointerException ("mappingParameters" is null) while mod-permission kept its original configuration: Mem Hard Limit=1684, Mem Soft limit=1544, Xmx=1024m.
- Changing the mod-permission configuration to Mem Hard Limit=1684→2384, Mem Soft limit=1544→2244, Xmx=1024m→1500m allowed the tests to run successfully and avoided the "mappingParameters" is null error.
- Memory consumption for mod-data-import was 36%, mod-inventory - 94%. No memory leaks are suspected.
- CPU utilization for mod-data-import spiked to 72%, mod-inventory - 171%. It was observed that CPU utilization for mod-inventory has a growing trend.
- DB CPU utilization was 97%.
- DB connections: 500 on average and 681 during spikes. The connection spikes coincide with the mod-permission spikes on the CPU graph.
- Slow queries found - SELECT jsonb FROM fs07000002_mod_permissions.permissions
Test set run №1
- Three tests of DI with update jobs with 25k records were carried out on the pcp1 cluster.
- The first and second tests were successful. The third test, with DI running on three tenants concurrently, failed after the third data import started. Errors are attached.
- Slow queries found for SELECT jsonb FROM fs07000001_mod_permissions.permissions
- The mod-permission spike that led to the problems in test #3 was investigated. The module stopped after a timeout. Additional analysis of the method that consumed most of the resources is attached.
Recommendations and Jiras
- A Jira ticket was created to investigate the "mappingParameters" is null error: PERF-801
Test Results
Test set run#2 (updated configuration for mod-permission)
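Profile | Test # | Tenant | MARC File | DI Duration Poppy (hh:mm:ss) | Results |
---|---|---|---|---|---|
DI MARC Bib Update (PTF - Updates Success - 1) | 1 | fso7000001 | 25K.mrc | 00:17:18 | Completed |
DI MARC Bib Update (PTF - Updates Success - 1) | 2 | fso9000000 | 25K.mrc | 00:23:11 | Completed |
DI MARC Bib Update (PTF - Updates Success - 1) | 2 | fso7000002 | 25K.mrc | 00:24:16 | Completed |
DI MARC Bib Update (PTF - Updates Success - 1) | 3 | fso9000000 | 25K.mrc | 00:26:36 | Completed |
DI MARC Bib Update (PTF - Updates Success - 1) | 3 | fso7000002 | 25K.mrc | 00:38:01 | Completed |
DI MARC Bib Update (PTF - Updates Success - 1) | 3 | fso7000001 | 25K.mrc | 00:49:24 | Completed |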
Test set run#1
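Profile | Test # | Tenant | MARC File | DI Duration Poppy (hh:mm:ss) | Results |
---|---|---|---|---|---|
DI MARC Bib Update (PTF - Updates Success - 1) | 1 | fso9000000 | 25K.mrc | 00:20:23 | Completed |
DI MARC Bib Update (PTF - Updates Success - 1) | 2 | fso9000000 | 25K.mrc | 00:24:48 | Completed |
DI MARC Bib Update (PTF - Updates Success - 1) | 2 | fso7000001 | 25K.mrc | 00:31:53 | Completed |
DI MARC Bib Update (PTF - Updates Success - 1) | 3 | fso9000000 | 25K.mrc | 00:27:10 | Completed |
DI MARC Bib Update (PTF - Updates Success - 1) | 3 | fso7000001 | 25K.mrc | 00:36:09 | with errors* |
DI MARC Bib Update (PTF - Updates Success - 1) | 3 | fso7000002 | 25K.mrc | 00:30:29 | with errors* |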
Comparison
The following table compares previous results of Poppy release with current results.
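Profile | MARC File | Test run # | DI Duration Poppy (previous) (hh:mm:ss) | DI Duration Poppy (hh:mm:ss) | DI Delta (hh:mm:ss) |
---|---|---|---|---|---|
DI MARC Bib Update (PTF - Updates Success - 1) | 25K.mrc | 1 | 00:14:50 | 00:17:18 | + 00:02:28 |
DI MARC Bib Update (PTF - Updates Success - 1) | 25K.mrc | 2 | not tested | 00:24:16 | - |
DI MARC Bib Update (PTF - Updates Success - 1) | 25K.mrc | 3 | not tested | 00:49:24 | - |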
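The delta column can be reproduced from the two duration columns. The following is a minimal Python sketch, assuming durations are formatted as hh:mm:ss. The helper names `parse_hms` and `format_delta` are illustrative only and are not part of any FOLIO or PTF tooling.

```python
from datetime import timedelta


def parse_hms(value: str) -> timedelta:
    """Parse an hh:mm:ss string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def format_delta(previous: str, current: str) -> str:
    """Return current - previous as a signed '+ hh:mm:ss' or '- hh:mm:ss' string."""
    diff = int((parse_hms(current) - parse_hms(previous)).total_seconds())
    sign = "+" if diff >= 0 else "-"
    hours, rest = divmod(abs(diff), 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{sign} {hours:02d}:{minutes:02d}:{seconds:02d}"


# Test run 1 from the comparison table: 00:14:50 (previous) vs 00:17:18 (current).
print(format_delta("00:14:50", "00:17:18"))  # + 00:02:28
```

Rows marked "not tested" have no previous value, so no delta can be computed for them.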
Errors for test set #1
mod-permission heap dump analysis
Resource utilization table test set#1
Resource utilization table test set#2
Service CPU Utilization
Highest values: mod-inventory reached 171% during DI on one tenant, data-import reached 135% in the three-tenant test, and mod-permission spiked to 462% in the three-tenant test.
Service Memory Utilization
Mod-inventory - 93%. Data-import - 36%.
DB CPU Utilization
The highest value was 97%, reached during all tests.
DB Connections
There were 400 DB connections before the tests started. Once the tests were running, all tests showed 500 connections on average and 681 during spikes. The connection spikes coincide with spikes in CPU utilization for mod-permission.
DB load
Top SQL-queries:
All tests
3 tenants test
Appendix
Infrastructure
PTF - environment pcp1
- 10 m6i.2xlarge EC2 instances located in US East (N. Virginia) us-east-1
- 1 database instance, writer
Name | Memory GiB | vCPUs | max_connections |
---|---|---|---|
db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731 |
- MSK tenant
- 4 m5.2xlarge brokers in 2 zones
- Apache Kafka version 2.8.0
- EBS storage volume per broker 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3
- Kafka consolidated topics and file splitting features enabled on a non-ecs-enabled environment.
For test set run №2, the mod-permissions parameters were changed as follows: mod-permissions_version=6.4.0→6.3.2, Mem Hard Limit=1684→2384, Mem Soft limit=1544→2244, Xmx=1024m→1500m.
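To put the memory change in context, the share of the container hard limit taken by the JVM heap can be computed from the values above. This is a rough sketch: `heap_share` is a hypothetical helper, both values are assumed to be in the same unit (MiB), and the calculation ignores non-heap JVM memory such as metaspace and thread stacks.

```python
def heap_share(xmx_mb: int, hard_limit_mb: int) -> float:
    """Return the JVM max heap (Xmx) as a percentage of the container hard limit."""
    return round(100 * xmx_mb / hard_limit_mb, 1)


# mod-permissions values quoted above, before and after the change.
before = heap_share(1024, 1684)
after = heap_share(1500, 2384)
print(f"before: {before}%  after: {after}%")

# Absolute memory left outside the heap in each configuration.
print(f"non-heap headroom: {1684 - 1024} -> {2384 - 1500}")
```

Under these assumptions the heap share stays at roughly 61-63% in both configurations, while the absolute heap size and the non-heap headroom both increase.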
pcp1-pvt, Thu Feb 01 11:17:08 UTC 2024
Module | Task Def. Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft limit | CPU units | Xmx | MetaspaceSize | MaxMetaspaceSize | R/W split enabled |
---|---|---|---|---|---|---|---|---|---|---|
mod-data-import | 20 | mod-data-import:3.0.7 | 1 | 2048 | 1844 | 256 | 1292 | 384 | 512 | FALSE |
mod-inventory-storage | 14 | mod-inventory-storage:27.0.4 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 | FALSE |
mod-source-record-storage | 17 | mod-source-record-storage:5.7.5 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | FALSE |
mod-inventory | 13 | mod-inventory:20.1.7 | 2 | 2880 | 2592 | 1024 | 1814 | 384 | 512 | FALSE |
mod-di-converter-storage | 17 | mod-di-converter-storage:2.1.5 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE |
mod-source-record-manager | 16 | mod-source-record-manager:3.7.8 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | FALSE |
nginx-okapi | 9 | nginx-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 | FALSE |
okapi-b | 11 | okapi:5.1.2 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512 | FALSE |
pub-okapi | 9 | pub-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 | FALSE |
Methodology/Approach
The DI test scenario (DI MARC Bib Update) was started from the UI.
Test set №1
- Test 1: DI of a 25k-record file, started manually on one tenant only.
- Test 2: DI of 25k-record files, started manually on 2 tenants concurrently, step 30%.
- Test 3: DI of 25k-record files, started manually on 3 tenants concurrently, step 30%.
Test set №2
- The mod-permission configuration was changed to Mem Hard Limit=1684→2384, Mem Soft limit=1544→2244, Xmx=1024m→1500m.
- Test 1: DI of a 25k-record file, started manually on one tenant only.
- Test 2: DI of 25k-record files, started manually on 2 tenants concurrently, step 30%.
- Test 3: DI of 25k-record files, started manually on 3 tenants concurrently, step 30%.
Additional links
Link to Jira ticket: https://folio-org.atlassian.net/browse/PERF-801