...
...
...
...
...
...
...
...
...
...
...
...
...
...
The purpose of this document is to present the results of testing Data Import Create MARC holdings records and to detect performance trends in the Quesnelia release, in scope of ticket PERF-855.
...
- Data Import create holdings job durations increased significantly in the Quesnelia release: 4 times longer with the 10k file. The increase for the 80k file could not be measured because the test was stopped after 4 hours with only 46 of 81 split jobs committed.
- Top CPU utilization: mod-inventory-b - 16%, nginx-okapi - 5%, mod-source-record-storage-b - 4%, mod-quick-marc-b - 7%. Such low module-side resource utilization can be explained by very high average DB query latency during INSERT and UPDATE operations that locked the same tuple.
- Top memory consumption: mod-inventory-storage-b - 85%, mod-data-import-b - 52%, mod-source-record-storage-b - 45%, mod-source-record-manager-b - 43%. A growing memory trend was observed in test set #1 for mod-inventory-storage-b (85%).
- DI job duration for the same file size grew from test to test when the same instance HRID was used to create all holdings.
- DI performs faster with files that use 1 unique instance HRID for every 1000 records. With this approach DI duration corresponds to file size, and memory is utilized without a growing trend. Module CPU and RDS CPU utilization increased because fewer DB locks allow a higher processing load.
Recommendations & Jiras
- Investigate the growing memory trend for mod-inventory-storage in test set #1 (using 1 instance HRID to create all Holdings)
- Define the highest number of Holdings associated with one instance HRID that is still realistic
- Consider limiting the request /inventory/items-by-holdings-id with a limit parameter; currently limit=0, i.e. unlimited (MODINVSTOR-1229). See the sketch below.
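As a hedged illustration of this recommendation, the sketch below passes a bounded limit instead of limit=0. Only the endpoint path and the limit parameter come from this report; the Okapi URL, tenant, token, and the CQL query parameter name are assumptions for illustration, not the confirmed API shape.

```python
# Hypothetical sketch: request items by holdings id with a bounded limit
# instead of limit=0 (unlimited). URL, tenant, token, and the query
# parameter name are placeholders/assumptions.
import requests

OKAPI_URL = "https://okapi.example.org"  # assumed environment URL
HEADERS = {
    "X-Okapi-Tenant": "fs09000000",      # assumed tenant id
    "X-Okapi-Token": "<token>",          # assumed auth token
}

resp = requests.get(
    f"{OKAPI_URL}/inventory/items-by-holdings-id",
    params={
        "query": 'holdingsRecordId=="<holdings-uuid>"',  # assumed CQL parameter
        "limit": 200,  # bounded page size instead of limit=0
    },
    headers=HEADERS,
    timeout=30,
)
resp.raise_for_status()
print(len(resp.json().get("items", [])))
```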
Errors
- Error status SNAPSHOT_UPDATE_ERROR for the 32nd split job during 80k file import.
Log message:
ERROR DataImportKafkaHandler org.folio.inventory.dataimport.exceptions.CacheLoadingException: Error loading jobProfileSnapshot by id: 'aee287c2-0d40-4e8d-9879-4c1c54bcd819', status code: 503
Test Runs
Profile used for testing - Default - Create Holdings and SRS MARC Holdings
Set of tests № | Scenario | Test Conditions | Status |
---|---|---|---|
1 | DI Holdings Create (previous* approach): 1 instance HRID for all created holdings | 1K, 5K, 10K, 80K sequentially | 1k, 5k, 10k - Completed; 80k - Failed |
2 | DI Holdings Create (new** approach): 1 instance HRID for every 1000 created holdings | 1K, 5K, 10K, 80K sequentially | Completed |
*previous approach - Data Import of Holdings with an .mrc file where 1 instance HRID is associated with all holdings (1k, 5k, 10k, 80k)
**new approach - Data Import of Holdings with an .mrc file where 1 instance HRID is associated with every 1000 holdings
Test Results
Set 1 - Files used to test DI create Holdings had 1 instance HRID for all created Holdings (previous approach)
Set 2 - Files used to test DI create Holdings had 1 unique instance HRID for every 1000 created Holdings (new approach)

Test | File | Duration: Orchid (previous results) | Duration: Poppy (previous results) | Duration: Quesnelia [ECS] Set #1 | Status and Errors Quesnelia [ECS] Set #1 | Duration: Quesnelia [ECS] Set #2 | Status and Errors Quesnelia [ECS] Set #2 |
---|---|---|---|---|---|---|---|
1 | 1k | 45s | 32s | 1 min 22 sec | Success | 1 min 3 sec | Success |
2 | 5k | 7m 47s | 2m 14s | 8 min | Success | 4 min 16 sec | Success |
3 | 10k | 19m 46s | 4m 35s | 22 min 40 sec | Success | 8 min 59 sec | Success |
4 | 80k | 20m (error*) | 36m 25s | 4 hours 13 min | Stopped by user after 46 of 81 split jobs COMMITTED (56% finished); 1 job with status ERROR and error status SNAPSHOT_UPDATE_ERROR (job number 32, file_name = '1718290065265-80k_holdings_Create_32.mrc') | 52 min 5 sec | Success |

Comparison
Table contains comparison between Quesnelia and Poppy.
Set #1
Previous test report: Data Import Create MARC holdings records [Poppy]
Service CPU Utilization
Set #1: mod-inventory-b - 16%, nginx-okapi - 5%, mod-source-record-storage-b - 4%, mod-quick-marc-b - 7%
Set #2: mod-inventory-b - 33%, nginx-okapi - 23%, mod-source-record-storage-b - 11%, mod-quick-marc-b - 7%
[Service CPU utilization charts: Set #1, Set #2]
Memory Utilization
[Memory utilization charts: Set #1, Set #2]
MSK tenant cluster
Disk usage by broker
[Charts: Set #1, Set #2]
CPU (User) usage by broker
[Charts: Set #1, Set #2]
RDS CPU Utilization
...
- 10 m6g.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 1 database instance, writer
Name | Memory GiB | vCPUs | Engine version |
---|---|---|---|
db.r6g.xlarge | 32 GiB | 4 vCPUs | 16.1 |
- MSK tenant
- 2 m5.2xlarge brokers in 2 zones
- Apache Kafka version 2.8.0
- EBS storage volume per broker 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=2
...
...
- Prepare Data Import files (1k, 5k, 10k, 80k) with the defined number of holdings records associated with an instance HRID (1 instance HRID for all records, or 1 per 1000 records)
- Replace the instance HRID field with an active one from the environment (example: =004 colin00001144043)
- Replace the location field (example: =852 01$bme3CC$hKFN5860.A6$iC732), where me3CC is the code of the tenant location. Go to /settings/tenant-settings/location-locations and take the code of a location with active status
- To replace field 004, extract instance HRIDs of active instances for this tenant using the SQL query below
Get total jobs durations
SQL to get job durations:
```sql
select file_name, total_records_in_file, started_date, completed_date,
       completed_date - started_date as duration, status, error_status
from [tenant]_mod_source_record_manager.job_execution
where subordination_type = 'COMPOSITE_PARENT'
-- where started_date > '2024-06-13 14:47:54' and completed_date < '2024-06-13 19:01:50.832'
order by started_date desc
limit 10;
```
Get instance HRID ids
SQL to get instance HRIDs:
```sql
select jsonb->>'hrid' as instanceHRID
from [tenant]_mod_inventory_storage.instance
where jsonb->>'discoverySuppress' = 'false'
  and jsonb->>'source' = 'MARC'
limit 80;
```
- Put instance HRIDs into the stringsHRID.txt file without double quotes or headers; every row should contain only one HRID
- Use the PY script to replace HRIDs in the .mrc file if needed (a minimal sketch of the replacement logic is shown below the attachment). The script is located in the Git repository: perf-testing\workflows-scripts\data-import\Holdings\Data_preparation_steps
Attachment: PY.zip
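For orientation, here is a minimal sketch of what the replacement logic could look like, assuming the pymarc library. The input/output file names are placeholders, and the 1000-records-per-HRID rotation follows the new approach described above; the actual script in the repository may differ.

```python
# Minimal sketch: replace control field 004 in a MARC holdings file with
# instance HRIDs from stringsHRID.txt, switching to the next HRID every
# 1000 records (the "new approach"). File names are placeholders.
from pymarc import MARCReader, Field

RECORDS_PER_HRID = 1000  # 1 unique instance HRID per 1000 holdings

with open("stringsHRID.txt") as f:
    hrids = [line.strip() for line in f if line.strip()]

with open("holdings_Create.mrc", "rb") as src, \
     open("holdings_Create_out.mrc", "wb") as dst:
    for i, record in enumerate(MARCReader(src)):
        hrid = hrids[(i // RECORDS_PER_HRID) % len(hrids)]
        record.remove_fields("004")
        # Control field 004 holds the related instance HRID
        record.add_ordered_field(Field(tag="004", data=hrid))
        dst.write(record.as_marc())
```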
- Run Data Import jobs sequentially, one by one, from the UI with a 5 min delay (delay time can vary; this interval was found comfortable for collecting results).
...