Table of Contents outline true
Overview
- This document contains the results of testing Data Export (MARC BIB) on the Quesnelia [ECS] release on qcon environment.
Jira Legacy | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
|
Summary
- Data import tests finished successfully, only Test №5 had one failed record for Tenant 2(qcp1-01) when processed 50k files. Duration of DI grew in correspondence with the number of records in files.
- Check-in and Check-out with 5 virtual users was performed during DI Create new MARC authority records jobs for non-matches No issues.
- Data Import in Quesnelia without CICO perform faster than with it.
- Comparing Poppy and Quesnelia releases
- Check-in / Check-out perform better in Quesnelia. Response time improved during Create jobs for long period of work time on 15% in Average.
- DI durations improved - 11%-14% in Average.
- During testing, we noticed spikes in the mod permissions module. To mitigate this issue and prevent system slowdowns, we adjusted the order of loading files, starting with Tenant 3 (qcp1-02), followed by Tenant 2 (qcp1-01), and finally Tenant 1 (qcp1-00).
Test Results
Comparison
Test №1
Test with 1k, 10k, 25k and 50k records files DI started on one tenant only(qcp1-00), and comparative results between Poppy and Quesnelia.
...
% creates
...
File
...
DI duration
Morning Glory
...
DI duration
Nolana
...
DI duration
Orchid
...
DI duration
Poppy
...
Test №2
Test with CICO 5 concurrent users and DI 1K, 5K, 10K, 25K and 50K started on one tenant only.
- Сomparative Baseline Check-In\Check-Out results without Data Import between Poppy and Quesnelia.
...
CICO, Median time without
DI
(Poppy)
...
593 ms
+4.5%
...
- Сomparative Check-In\Check-Out results between Baseline (Quesnelia) and Check-In\Check-Out plus Data Import (Quesnelia.)
...
DI Duration with CICO
(Quesnelia)
...
20 sec
...
12 min 16 sec
...
1.265
...
- Сomparative Data Import and Check-In\Check-Out results between Poppy and Quesnelia.
...
# of records
(Poppy)
...
DI Duration with CICO
(Poppy)
...
CI time 95th pct
(Poppy)
...
CO time Avg
(Poppy)
...
CO time 95th pct
(Poppy)
...
DI Duration with CICO
(Quesnelia)
...
20 sec
-42.8%
...
12 min 16 sec
-11%
...
1.265
-16%
...
Resource utilization for Test #1
...
title | Resource utilization table |
---|
Service CPU Utilization
Here we can see that mod-inventory-b module used 50% CPU and mod-source-record-storage-b 46% CPU
Service Memory Utilization
Here we can see that all modules show a stable trend.
DB CPU Utilization
DB CPU in the average was 90%.
DB Connections
DB connections was 1151.
DB load
Top SQL-queries
...
INSERT INTO fs09000000_mod_source_record_manager.events_processed (handler_id, event_id) VALUES ($1, $2)
...
INSERT INTO fs09000000_mod_source_record_manager.journal_records (id, job_execution_id, source_id, source_record_order, entity_type, entity_id, entity_hrid, action_type, action_status, error, action_date, title, instance_id, holdings_id, order_id, permanent_location_id, tenant_id) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17)
...
insert into "marc_records_lb" ("id", "content") values (cast($1 as uuid), cast($2 as jsonb)) on conflict ("id") do update set "content" = cast($3 as jsonb)
...
WITH input_rows(record_id, authority_id) AS (
VALUES ($1::uuid,$2::uuid)
)
, ins AS (
INSERT INTO fs09000000_mod_inventory.records_authorities(record_id, authority_id)
SELECT * FROM input_rows
ON CONFLICT (record_id) DO UPDATE SET record_id=EXCLUDED.record_id
RETURNING record_id::uuid, authority_id::uuid
)
SELECT record_id, authority_id
FROM ins
UNION ALL
SELECT c.record_id, c.authority_id
FROM input_rows
JOIN fs09000000_mod_inventory.records_authorities c USING (record_id);
...
UPDATE fs09000000_mod_source_record_manager.job_execution_progress SET succeeded_records_count = succeeded_records_count + $2, error_records_count = error_records_count + $3 WHERE job_execution_id = $1 Returning *
Resource utilization for Test #2
...
title | Resource utilization table |
---|
Service CPU Utilization
Here we can see that mod-inventory-b module and nginx-okapi used 50% CPU
Service Memory Utilization
Here we can see that all modules show a stable trend.
DB CPU Utilization
DB CPU was 93%.
DB Connections
DB connections was 1580.
DB load
...
Table of Contents outline true
Overview
- This document contains the results of testing Data Export (MARC BIB) on the Quesnelia [ECS] release on qcon environment.
Jira Legacy | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
|
Summary
- Data Export tests finished successfully on qcon environment using the profiles Default instances export job profile and srs - holdings and items job profile.
- Data Export test were run on College and Central tenants, but results for comparing between environment releases were taken from College tenant.
- Comparing with previous testing results Poppy and Quesnelia releases
- Data Export processed all files including file with 500k records without errors for Quesnelia releases.
- Data Export durations improved - 80% in Average for Quesnelia releases.
- During testing, we noticed spikes in the mod-data-export up to 593% CPU.
- For Test №5 Data Export started on College tenant(cs00000int_0001), Central tenant(cs00000int) and Professional tenant(cs00000int_0002) concurrently using the Default instances export job profile, we observed that the CPU usage of the mod-data-export module was initially at 44% before the test began then it spiked to 109% during the test and remained elevated without returning to the initial state.
Test Results
This table contains durations for Data Export with 2 job profiles.
Profile | CSV File | Tenant College (cs00000int_0001) | Central Tenant (cs00000int) | ||
---|---|---|---|---|---|
Result | Status | Result | Status | ||
DE MARC Bib (Default instances export job profile) | 1k.csv | 0:00:02 | COMPLETED | 0:00:05 | COMPLETED |
100k.csv | 0:02:39 | COMPLETED | 0:04:24 | COMPLETED | |
500k.csv | 0:05:21 | COMPLETED | 0:06:17 | COMPLETED | |
DE MARC Bib (srs - holdings and items) | 1k.csv | 0:00:05 | COMPLETED | 0:00:05 | COMPLETED |
100k.csv | 0:08:15 | COMPLETED | 0:05:58 | COMPLETED | |
500k.csv | 0:09:22 | COMPLETED | 0:08:28 | COMPLETED |
This table contains durations for Test №5 Data Export for 3 tenants concurrently.
Tenant | CSV File | Result | Status |
---|---|---|---|
Tenant College (cs00000int_0001) | 500k.csv | 0:10:24 | COMPLETED |
Tenant Professional (cs00000int_0002) | 500k.csv | 0:06:47 | COMPLETED |
Central Tenant (cs00000int) | 500k.csv | 0:07:56 | COMPLETED |
Comparison
This table contains durations comparison between Poppy and Quesnelia releases.
Profile | CSV File | DE Duration/Status Orchid | DE Duration/Status Poppy 1 set | DE Duration/Status Quesnelia Tenant College (cs00000int_0001) | DE Duration, DELTA Poppy/Quesnelia | |||
Result | Status | Result | Status | Result | Status | hh:mm:ss / percent | ||
DE MARC Bib (Default instances export job profile) | 1k.csv | 00:00:08 | COMPLETED | 0:00:02 | COMPLETED | -00:00:06 | ||
100k.csv | 00:15:36 | COMPLETED | 0:02:39 | COMPLETED | -00:12:57 | |||
500k.csv | 00:57:25 | FAIL | 0:05:21 | COMPLETED | -00:52:04 | |||
DE MARC Bib (srs - holdings and items) | 1k.csv | 00:00:27 | COMPLETED | 00:00:29 | COMPLETED | 0:00:05 | COMPLETED | -00:00:24 |
100k.csv | 00:47:51 | COMPLETED | 00:47:23 | COMPLETED | 0:08:15 | COMPLETED | -00:39:08 | |
500k.csv | 04:00:26 | COMPLETED | 04:11:09 | FAIL | 0:09:22 | COMPLETED | -04:01:47 |
Resource utilization for Test #1 and Test #2
Expand | ||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ||||||||||||||||||||||||||||||||||||||||
|
Service CPU Utilization
Here we can see that mod-data-export used 452% CPU in spike.
Service Memory Utilization
Here we can see that all modules show a stable trend.
DB CPU Utilization
DB CPU spike was 32%.
DB Connections
DB connections was 1470.
DB load
Top SQL-queries
# | TOP 5 SQL statements |
---|---|
1 |
|
2 |
|
3 |
|
4 |
|
5 |
|
Resource utilization for Test #3 and Test #4
Expand | ||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ||||||||||||||||||||||||||||||||||||||||
|
Service CPU Utilization
Here we can see that mod-data-export used 336% CPU in spike.
Service Memory Utilization
Here we can see that all modules show a stable trend.
DB CPU Utilization
DB CPU was 35%.
DB Connections
DB connections was 1377.
DB load
Top SQL-queries
# | TOP 5 SQL statements |
---|---|
1 |
|
2 |
|
3 |
|
4 |
|
5 |
|
Resource utilization for Test #5
Expand | ||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ||||||||||||||||||||||||||||||||||||||||
|
Service CPU Utilization
Here we can see that mod-data-export used 593% CPU in spike.
Service Memory Utilization
We observed that the CPU usage of the mod-data-export module was initially at 44% before the test began. It spiked to 109% during the test and remained elevated without returning to the initial state.
DB CPU Utilization
DB CPU was 50%.
DB Connections
DB connections was 1368.
DB load
Top SQL-queries
...
INSERT INTO fs09000000_mod_source_record_manager.events_processed (handler_id, event_id) VALUES ($1, $2)
...
INSERT INTO fs09000000_mod_source_record_manager.journal_records (id, job_execution_id, source_id, source_record_order, entity_type, entity_id, entity_hrid, action_type, action_status, error, action_date, title, instance_id, holdings_id, order_id, permanent_location_id, tenant_id) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17)
...
UPDATE fs09000000_mod_source_record_manager.job_execution_progress SET succeeded_records_count = succeeded_records_count + $2, error_records_count = error_records_count + $3 WHERE job_execution_id = $1 Returning *
...
insert into "marc_records_lb" ("id", "content") values (cast($1 as uuid), cast($2 as jsonb)) on conflict ("id") do update set "content" = cast($3 as jsonb)
...
WITH input_rows(record_id, authority_id) AS (
VALUES ($1::uuid,$2::uuid)
)
, ins AS (
INSERT INTO fs09000000_mod_inventory.records_authorities(record_id, authority_id)
SELECT * FROM input_rows
ON CONFLICT (record_id) DO UPDATE SET record_id=EXCLUDED.record_id
RETURNING record_id::uuid, authority_id::uuid
)
SELECT record_id, authority_id
FROM ins
UNION ALL
SELECT c.record_id, c.authority_id
FROM input_rows
JOIN fs09000000_mod_inventory.records_authorities c USING (record_id);
Appendix
Infrastructure
PTF - environment Quesnelia (qcp1)
...
1 database instances, writer
...
db.r6g.xlarge
...
- 4 m5.2xlarge brokers in 2 zones
Apache Kafka version 2.8.0
EBS storage volume per broker 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3
...
title | Quesnelia modules memory and CPU parameters |
---|
Additional links and Errors
Test №5 had one failed record for Tenant 2(qcp1-01) when processed 50k files.
- 09:55:16 [526300/metadata-provider] [fs07000001] [] [mod-authtoken] ERROR Api Access for user 'folio' (9eb67301-6f6e-468f-9b1a-6134dc39a684) requires permission: metadata-provider.incomingrecords.get
- 09:55:16 [815600/metadata-provider] [fs07000001] [9eb67301-6f6e-468f-9b1a-6134dc39a684] [mod_source_record_manager] ERROR PostgresClient queryAndAnalyze: ERROR: invalid input syntax for type uuid: "undefined" (22P02) - SELECT * FROM get_record_processing_log('3e63f944-40ea-477c-ac21-79bb24780bc5', 'undefined')
- 09:55:16 [526300/metadata-provider] [fs07000001] [] [mod-authtoken] ERROR FilterApi Permission missing in []
Also we used different order for Tenants when load files, we decided started load files from Tenant 3(qcp1-02) → Tenant 2(qcp1-01) → Tenant 1(qcp1-00) to avoid problem when mod-permissions spiked and system stacked.
CPU Utilization when mod-permissions spiked and system stacked.
CPU Utilization when mod-permissions spiked and system stacked.
Appendix
Infrastructure
PTF - environment Quesnelia (qcon)
11 m6i.2xlarge EC2 instances located in US East (N. Virginia)us-east-1 [Number of ECS instances, instance type, location region]
1 instance of db.r6.xlarge database instance: Writer instance
OpenSearch
domain: fse
Number of nodes: 9
Version: OpenSearch_2_7_R20240502
MSK - tenat
4 kafka.m5.2xlarge brokers in 2 zones
Apache Kafka version 2.8.0
EBS storage volume per broker 300 GiB
auto.create.topics.enable=true
log.retention.minutes=480
default.replication.factor=3
Kafka consolidated topics enabled
Methodology/Approach
Data Export tests scenario using the profiles Default instances export job profile and srs - holdings and items were started from UI on Quesnelia (qcon) ecs environment.
...
- Test 1: Manually tested 1k, 100k and 500k records files Data Export started on one College tenant(cs00000int_0001) only using Default instances export job profile.
- Test 2: Manually tested 1k, 100k and 500k records files Data Export started on one College tenant(cs00000int_0001) only using srs - holdings and items job profile.
- Test 3: Manually tested 1k, 100k and 500k records files Data Export started on central Central tenant(cs00000int) only using Default instances export job profile.
- Test 4: Manually tested 1k, 100k and 500k records files Data Export started on central Central tenant(cs00000int) only using srs - holdings and items job profile.
- Test 5: Manually tested 500k records file Data Export started on College tenant(cs00000int_0001), Central tenant(cs00000int) and Professional tenant(cs00000int_0002) concurrently using Default instances export job profile.
To get status and time range for export jobs the query used:
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
SELECT jsonb->>'status' AS status, to_timestamp((jsonb->>'startedDate')::bigint / 1000) AS startedDate, to_timestamp((jsonb->>'completedDate')::bigint / 1000) AS completedDate, exported_file->>'fileName' AS fileName, jsonb->>'jobProfileName' AS jobProfileName, (jsonb->>'completedDate')::bigint - (jsonb->>'startedDate')::bigint AS duration_ms, to_char( (to_timestamp((jsonb->>'completedDate')::bigint / 1000) - to_timestamp((jsonb->>'startedDate')::bigint / 1000))::interval, 'HH24:MI:SS' ) AS duration_hhmmss FROM cs00000int_0001_mod_data_export.job_executions, jsonb_array_elements(jsonb->'exportedFiles') AS exported_file WHERE -- (jsonb->>'hrId')::int IN (309, 310, 311, 312, 313, 314) -- Central tenant (jsonb->>'hrId')::int IN (266, 267, 268, 269, 270, 271) ORDER BY jsonb->>'startedDate' DESC LIMIT 10; |
...