Overview
The purpose of this document is to report the results of testing Data Import Create MARC holdings records and to detect performance trends in the Quesnelia release, in scope of ticket PERF-855.
Results are compared with the previous test report: Data Import Create MARC holdings records [Poppy]
Summary
- Data import create holdings job durations increased significantly in the Quesnelia release: 4 times longer for the 10k file. The 80k file failed to complete; the run was stopped after 4 hours with only 46 of 81 jobs committed.
- Top CPU utilization: mod-inventory-b - 16%, nginx-okapi - 5%, mod-source-record-storage-b - 4%, mod-quick-marc-b - 7%. Such low module-side resource utilization can be explained by the very high average latency of DB INSERT and UPDATE queries that were contending for a lock on the same tuple (see the lock-wait sketch after this list).
- Top memory consumption: mod-inventory-storage-b - 85%, mod-data-import-b - 52%, mod-source-record-storage-b - 45%, mod-source-record-manager-b - 43%. A growing memory trend was observed for mod-inventory-storage-b (85%) in test set #1.
- DI job duration for the same file size grew from test to test when the same instance HRID was used to create all holdings.
- DI performs faster with files that use 1 unique instance HRID for every 1000 records. With this approach DI duration corresponds to file size and memory shows no growing trend. CPU and RDS utilization increase because there are fewer locks in the DB.
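The tuple-lock explanation above can be checked directly on the RDS instance while a test is running. Below is a minimal sketch, assuming the psycopg2 driver and placeholder connection settings; pg_stat_activity and pg_blocking_pids() are standard PostgreSQL features.

```python
# Sketch: list sessions waiting on locks during a DI run, together with
# the PIDs blocking them. Connection values are placeholders.
import psycopg2  # assumed driver; any PostgreSQL client works

conn = psycopg2.connect(host="<rds-endpoint>", dbname="<db>",
                        user="<user>", password="<password>")
with conn.cursor() as cur:
    cur.execute("""
        SELECT pid,
               pg_blocking_pids(pid) AS blocked_by,
               wait_event_type,
               wait_event,                 -- e.g. 'transactionid', 'tuple'
               now() - query_start AS waiting_for,
               left(query, 100) AS query
        FROM pg_stat_activity
        WHERE wait_event_type = 'Lock'
        ORDER BY waiting_for DESC;
    """)
    for row in cur.fetchall():
        print(row)
conn.close()
```

If most waiters sit behind the same blocking PID on holdings_record INSERT/UPDATE statements, that matches the single-instance-HRID contention described above.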
Recommendations & Jiras
- Investigate the growing memory trend for mod-inventory-storage in test set #1 (using 1 instance HRID to create all holdings)
- Define the highest number of holdings associated with one instance HRID that is still realistic
- Consider constraining the /inventory/items-by-holdings-id request with a limit parameter; currently limit=0 (unlimited) - MODINVSTOR-1229
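For illustration, a hedged sketch of calling the endpoint with an explicit page size instead of the unlimited limit=0. The Okapi headers are standard, but the query-parameter names here are assumptions, not the module's confirmed API:

```python
# Sketch: request items by holdings id with a bounded page size.
# Query-parameter names are assumptions for illustration only.
import requests

OKAPI_URL = "https://<okapi-host>"  # placeholder
HEADERS = {"x-okapi-tenant": "<tenant>", "x-okapi-token": "<token>"}

resp = requests.get(
    f"{OKAPI_URL}/inventory/items-by-holdings-id",
    params={
        "query": 'holdingsRecordId=="<holdings-uuid>"',  # assumed CQL param
        "limit": 200,  # bounded page instead of limit=0 (unlimited)
    },
    headers=HEADERS,
    timeout=30,
)
resp.raise_for_status()
print(len(resp.json().get("items", [])))
```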
Errors
- Error status SNAPSHOT_UPDATE_ERROR was returned for the 32nd split job during the 80k file import.
Log message:
ERROR taImportKafkaHandler org.folio.inventory.dataimport.exceptions.CacheLoadingException: Error loading jobProfileSnapshot by id: 'aee287c2-0d40-4e8d-9879-4c1c54bcd819', status code: 503
Test Runs
Profile used for testing - Default - Create Holdings and SRS MARC Holdings
Set of tests № | Scenario | Test Conditions | Status |
---|---|---|---|
1 | DI Holdings Create (previous* approach) | 1K, 5K, 10K, 80K sequentially | 1k, 5k, 10k - Completed; 80k - Failed |
2 | DI Holdings Create (new** approach) | 1K, 5K, 10K, 80K sequentially | Completed |
*previous approach - data import of holdings with an .mrc file where 1 instance HRID is associated with all holdings (1k, 5k, 10k, 80k)
**new approach - data import of holdings with an .mrc file where 1 instance HRID is associated with every 1000 holdings
Test Results
Set 1 - Files used to test DI create Holdings had 1 instance HRID for all created Holdings (previous approach)
Set 2 - Files used to test DI create Holdings had 1 unique instance HRID for every 1000 created Holdings (new approach)
Test | File | Duration: Orchid (previous results) | Duration: Poppy (previous results) | Duration: Quesnelia [ECS] Set #1 | Status and Errors Quesnelia [ECS] Set #1 | Duration: Quesnelia [ECS] Set #2 | Status and Errors Quesnelia [ECS] Set #2 |
---|---|---|---|---|---|---|---|
1 | 1k | 45s | 32s | 1 min 22 sec | Success | 1 min 3 sec | Success |
2 | 5k | 7m 47s | 2m 14s | 8 min | Success | 4 min 16 sec | Success |
3 | 10k | 19m 46s | 4m 35s | 22 min 40 sec | Success | 8 min 59 sec | Success |
4 | 80k | 20m (error*) | 36m 25s | 4 hours 13 min | Stopped by user after 46 of 81 jobs COMMITTED (56% finished); 1 job in ERROR status with error status SNAPSHOT_UPDATE_ERROR (job number 32, file_name = '1718290065265-80k_holdings_Create_32.mrc') | 52 min 5 sec | Success |
Previous test report: Data Import Create MARC holdings records [Poppy]
Service CPU Utilization
Set #1
mod-inventory-b - 16%, nginx-okapi - 5%, mod-source-record-storage-b - 4%, mod-quick-marc-b - 7%
Set #2
mod-inventory-b - 33%, nginx-okapi - 23%, mod-source-record-storage-b - 11%, mod-quick-marc-b - 7%
Memory Utilization
Set #1
Set #2
MSK tenant cluster
Disk usage by broker
Set #1
Set #2
CPU (User) usage by broker
Set #1
Set #2
RDS CPU Utilization
Set #1
62% for the major part of the tests, which is 20% less than in Poppy. It rose to 73% with the 80k file after 50 minutes of the test run.
Set #2
99% during all tests
DB Connections
Set #1
Number of DB connections - 1430
Set #2
Number of DB connections - 1500
DB Load
Set #1
Set #2
SQL queries
Set #1
Set #2
UPDATE cs00000int_0001_mod_inventory_storage.holdings_record SET jsonb = $1::jsonb WHERE id = '[UUID]'

INSERT INTO cs00000int_0001_mod_inventory_storage.holdings_record (id, jsonb) VALUES ($1, $2) RETURNING jsonb

autovacuum: VACUUM cs00000int_mod_entities_links.authority

autovacuum: VACUUM cs00000int_mod_entities_links.authority_archive

autovacuum: VACUUM pg_toast.pg_toast_40004

INSERT INTO cs00000int_mod_search.consortium_instance (tenant_id, instance_id, json, created_date, updated_date) VALUES ($1, $2, $3::json, $4, $5) ON CONFLICT (tenant_id, instance_id) DO UPDATE SET json = EXCLUDED.json, updated_date = EXCLUDED.updated_date

SELECT jsonb,id FROM cs00000int_0001_mod_inventory_storage.instance_holdings_item_view WHERE id='db87a6b4-d1f5-4e3d-b34b-d4bf06426127' LIMIT 1 OFFSET 0
INSERT INTO cs00000int_0001_mod_inventory_storage.holdings_record (id, jsonb) VALUES ($1, $2) RETURNING jsonb

UPDATE cs00000int_0001_mod_inventory_storage.holdings_record SET jsonb = $1::jsonb WHERE id = '47ee9b78-3d8f-4e8b-b09e-82e9396eb3b3'

with "cte" as (select count(*) from "records_lb"
  where ("records_lb"."snapshot_id" <> cast($1 as uuid)
    and "records_lb"."external_id" = cast($2 as uuid)
    and "records_lb"."record_type" = $3::"record_type"))
select "records_lb"."id", "records_lb"."snapshot_id", "records_lb"."matched_id", "records_lb"."generation", "records_lb"."record_type", "records_lb"."external_id", "records_lb"."state", "records_lb"."leader_record_status", "records_lb"."order", "records_lb"."suppress_discovery", "records_lb"."created_by_user_id", "records_lb"."created_date", "records_lb"."updated_by_user_id", "records_lb"."updated_date", "records_lb"."external_hrid",
  "marc_records_lb"."content" as "parsed_record_content", "raw_records_lb"."content" as "raw_record_content", "error_records_lb"."content" as "error_record_content", "error_records_lb"."description", "count"
from "records_lb"
  left outer join "marc_records_lb" on "records_lb"."id" = "marc_records_lb"."id"
  left outer join "raw_records_lb" on "records_lb"."id" = "raw_records_lb"."id"
  left outer join "error_records_lb" on "records_lb"."id" = "error_records_lb"."id"
  right outer join (select * from "cte") as "alias_80949780" on true
where ("records_lb"."snapshot_id" <> cast($4 as uuid)
  and "records_lb"."external_id" = cast($5 as uuid)
  and "records_lb"."record_type" = $6::"record_type")
offset $7 rows fetch next $8 rows only

INSERT INTO cs00000int_0001_mod_source_record_manager.events_processed (handler_id, event_id) VALUES ($1, $2)

INSERT INTO cs00000int_0001_mod_source_record_manager.journal_records (id, job_execution_id, source_id, source_record_order, entity_type, entity_id, entity_hrid, action_type, action_status, error, action_date, title, instance_id, holdings_id, order_id, permanent_location_id, tenant_id) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17)

autovacuum: VACUUM pg_toast.pg_toast_40004
Infrastructure
PTF - environment qcon
- 10 m6g.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 1 database instance, writer
Name | Memory, GiB | vCPUs | Engine version | Architecture settings |
---|---|---|---|---|
db.r6g.xlarge | 32 | 4 | 16.1 | Non-multitenant architecture |
- MSK tenant
- 2 m5.2xlarge brokers in 2 zones
- Apache Kafka version 2.8.0
- EBS storage volume per broker 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=2
Methodology/Approach
- Prepare data import files of 1k, 5k, 10k, and 80k records with a defined number of holdings records per instance HRID (1 instance HRID for all records, or 1 per 1000 records)
- Replace the instance HRID field with an active one from the environment (example: =004 colin00001144043)
- Replace the location field (example: =852 01$bme3CC$hKFN5860.A6$iC732, where me3CC is the code of the tenant location). Go to /settings/tenant-settings/location-locations and take the code of a location with active status
- To replace field 004, extract the instance HRIDs of active instances for this tenant using the SQL query below
Get total job durations
SQL to get job durations:
select file_name, total_records_in_file, started_date, completed_date, completed_date - started_date as duration, status, error_status
from [tenant]_mod_source_record_manager.job_execution
where subordination_type = 'COMPOSITE_PARENT'
-- where started_date > '2024-06-13 14:47:54' and completed_date < '2024-06-13 19:01:50.832'
order by started_date desc
limit 10
Get instance HRIDs
SQL to get instance HRIDs:
select jsonb->>'hrid' as instanceHRID
from [tenant]_mod_inventory_storage.instance
where jsonb->>'discoverySuppress' = 'false' and jsonb->>'source' = 'MARC'
limit 80
- Put the instance HRIDs into a stringsHRID.txt file without double quotes or headers. Every row should contain only one HRID
- Use the PY script to replace HRIDs in the .mrc file if needed. The script is located in the Git repository at perf-testing\workflows-scripts\data-import\Holdings\Data_preparation_steps (a minimal sketch of this step is shown after this list)
- Run data imports sequentially, one by one, from the UI with a 5-minute delay (the delay time can vary; this interval was found comfortable for collecting results)
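The repository script itself is not reproduced here; the following is a minimal sketch of the 004-replacement step, assuming binary MARC input, one HRID per line in stringsHRID.txt, and the pymarc library (file names are placeholders):

```python
# Sketch: rewrite field 004 in a holdings .mrc file, assigning 1 unique
# instance HRID per 1000 records (the "new approach" from this report).
from pymarc import MARCReader, MARCWriter, Field

with open("stringsHRID.txt") as f:
    hrids = [line.strip() for line in f if line.strip()]

with open("80k_holdings_Create.mrc", "rb") as src, \
     open("80k_holdings_Create_prepared.mrc", "wb") as dst:
    writer = MARCWriter(dst)
    for i, record in enumerate(MARCReader(src)):
        hrid = hrids[i // 1000]  # same HRID for each block of 1000 records
        record.remove_fields('004')
        record.add_ordered_field(Field(tag='004', data=hrid))
        writer.write(record)
    writer.close()
```

For the previous approach (1 instance HRID for all records), the index i // 1000 simply becomes 0 so every record receives the same HRID.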