Overview

This document contains the results of testing Data Import (DI) for MARC Bibliographic records with an update job in the Quesnelia release on the qcp1 environment, a non-ECS environment with the Kafka consolidated topics and file splitting features enabled.

Jira ticket: PERF-832

Summary

Data Import tests finished successfully; only Test №5 had one failed record, for Tenant 2 (qcp1-01) when processing the 50k file. DI duration grew with the number of records in the files.
Check-In and Check-Out with 5 virtual users were performed during the DI "Create new MARC authority records" jobs for non-matches. No issues were observed.
Data Import in Quesnelia performs faster without CICO than with it.
Comparing the Poppy and Quesnelia releases:
- Check-In / Check-Out perform better in Quesnelia. Response times during Create jobs over a long working period improved by 15% on average.
- DI durations improved by 11%-14% on average.
A different tenant order was used when loading files: loading started from Tenant 3 (qcp1-02) → Tenant 2 (qcp1-01) → Tenant 1 (qcp1-00), to avoid the problem where mod-permissions spiked and the system got stuck.

Test Results and Comparison

Test №1

DI with 1k, 5k, 10k, 25k and 50k record files started on one tenant only, with comparative results between Poppy and Quesnelia.

# of records	% creates	File	DI duration Morning Glory	DI duration Nolana	DI duration Orchid	DI duration Poppy	DI duration Quesnelia (delta vs Poppy)
1,000	100	1k_marc_authority.mrc?api=v2	24 sec	27 sec	41 sec	29 sec	22 sec -24%
5,000	100	LC_SUBJ_msplit00000000.mrc?api=v2	1 min 21 sec	1 min 15 sec	1 min 21 sec	1 min 38 sec	1 min 19 sec -19%
10,000	100	msplit00000000.mrc?api=v2	2 min 32 sec	2 min 31 sec	2 min 53 sec	2 min 53 sec	2 min 36 sec -9.8%
22,778 (Poppy test) / 25,000 (Quesnelia test)	100	msplit00000013.mrc?api=v2	11 min 14 sec	7 min 7 sec	5 min 42 sec	6 min 24 sec	6 min 19 sec -1.3%
50,000	100	50000_authorityrecords.mrc?api=v2	22 min	11 min 24 sec	11 min 11 sec	13 min 48 sec	11 min 59 sec -13%
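
The percentage shown next to each Quesnelia duration is the relative change against the Poppy duration for the same file. A minimal sketch of that arithmetic, assuming durations are written the way they appear in the table; the helper names `to_seconds` and `pct_delta` are illustrative and are not part of the test tooling:

```python
import re


def to_seconds(text: str) -> int:
    """Convert a duration such as '13 min 48 sec', '1min 21s' or '22 min' to seconds."""
    minutes = re.search(r"(\d+)\s*min", text)
    seconds = re.search(r"(\d+)\s*s", text)
    total = 0
    if minutes:
        total += int(minutes.group(1)) * 60
    if seconds:
        total += int(seconds.group(1))
    return total


def pct_delta(old_seconds: float, new_seconds: float) -> float:
    """Relative change of new vs old in percent; negative means faster."""
    return (new_seconds - old_seconds) / old_seconds * 100


# 50k file: Poppy 13 min 48 sec vs Quesnelia 11 min 59 sec
poppy = to_seconds("13 min 48 sec")      # 828 seconds
quesnelia = to_seconds("11 min 59 sec")  # 719 seconds
print(f"{pct_delta(poppy, quesnelia):.1f}%")  # prints "-13.2%"
```

The same calculation applies to the Check-In/Check-Out tables, provided both values are first converted to the same unit.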

Test №2

CICO with 5 concurrent users and DI with 1k, 5k, 10k, 25k and 50k record files started on one tenant only.

Comparative baseline Check-In\Check-Out results without Data Import between Poppy and Quesnelia.

	CICO, Median time without DI (Poppy)	CICO, 95% time without DI (Poppy)	CICO, Median time without DI (Quesnelia)	CICO, 95% time without DI (Quesnelia)	CICO, Avg time without DI (Quesnelia)
Check-In	516 ms	567 ms	503 ms -2.5%	593 ms +4.5%	511 ms
Check-Out	910 ms	2094 ms	836 ms -8%	1117 ms -46%	876 ms

Comparative Check-In\Check-Out results between Baseline (Quesnelia) and Check-In\Check-Out plus Data Import (Quesnelia).

# of records (Quesnelia)	DI Duration with CICO (Quesnelia)	CI time Avg, s (Quesnelia)	CI time 95th pct, s (Quesnelia)	CO time Avg, s (Quesnelia)	CO time 95th pct, s (Quesnelia)	Baseline CI Avg delta	Baseline CI 95th pct delta	Baseline CO Avg delta	Baseline CO 95th pct delta
1,000	20 sec	0.560	0.754	1.164	1.313	+9%	+27%	+32%	+17%
5,000	1 min 19 sec	0.701	1.171	1.141	1.790	+37%	+97%	+30%	+60%
10,000	2 min 35 sec	0.723	1.024	1.179	1.494	+41%	+72%	+34%	+34%
25,000	6 min 26 sec	0.722	1.024	1.180	1.494	+41%	+72%	+35%	+34%
50,000	12 min 16 sec	0.777	1.045	1.265	1.550	+52%	+76%	+44%	+39%

...

Comparative Data Import and Check-In\Check-Out results between Poppy and Quesnelia.

# of records (Poppy)	DI Duration with CICO (Poppy)	CI time Avg, s (Poppy)	CI time 95th pct, s (Poppy)	CO time Avg, s (Poppy)	CO time 95th pct, s (Poppy)	# of records (Quesnelia)	DI Duration with CICO (Quesnelia)	CI time Avg, s (Quesnelia)	CI time 95th pct, s (Quesnelia)	CO time Avg, s (Quesnelia)	CO time 95th pct, s (Quesnelia)
1,000	35 sec	0.525	0.576	1.078	1.326	1,000	20 sec -42.8%	0.560 +6%	0.754 +30%	1.164 +8%	1.313 -1%
5,000	1 min 41 sec	0.513	0.612	0.9	1.019	5,000	1 min 19 sec -21.7%	0.701 +36%	1.171 +91%	1.141 +26%	1.790 +75%
10,000	3 min 4 sec	0.581	0.685	1.016	1.321	10,000	2 min 35 sec -15.7%	0.723 +24%	1.024 +49%	1.179 +16%	1.494 +13%
22,778	6 min 32 sec	0.598	1.542	1.244	1.729	25,000	6 min 26 sec -1.5%	0.722 +20%	1.024 -33%	1.180 -5%	1.494 -13%
50,000	13 min 48 sec	0.671	1.953	1.51	2.09	50,000	12 min 16 sec -11%	0.777 +15%	1.045 -46%	1.265 -16%	1.550 -25%

...

PTF - environment Quesnelia (qcp1)

10 db.r6g.xlarge EC2 instances located in US East (N. Virginia) us-east-1
1 database instance, writer

Name	Memory GiB	vCPUs
db.r6g.xlarge	32 GiB	4 vCPUs
MSK ptf-mobius-testing2
- 4 m5.2xlarge brokers in 2 zones
- Apache Kafka version 2.8.0
- EBS storage volume per broker 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3

...

The above files are all stored here - MARC Resources
- The 22k file provided in MARC Resources does not work, so the 50k file was split into a file with 25k records, which was used instead of the 22k file.
At the time of the test run, Grafana was not available. As a result, response times for Check-In/Check-Out were parsed manually from a .jtl file, using the start and finish dates of the data import tests. These results were visualized in JMeter using a Listener (Response Times Over Time).
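
The manual extraction described above can be sketched in a few lines. This is only an illustration, assuming the .jtl file was saved in JMeter's CSV format with a header row and the default `timeStamp` (epoch milliseconds), `elapsed` (milliseconds) and `label` columns. The function name `window_stats` and the nearest-rank percentile method are assumptions, so results may differ slightly from what a JMeter listener reports:

```python
import csv
import math
from datetime import datetime, timezone


def window_stats(jtl_path, label, start, end):
    """Return (average, 95th percentile) of elapsed ms for samples named
    `label` whose timestamp falls within [start, end], or None if no
    sample matches. `start` and `end` should be timezone-aware datetimes."""
    start_ms = start.timestamp() * 1000
    end_ms = end.timestamp() * 1000
    elapsed = []
    with open(jtl_path, newline="") as fh:
        for row in csv.DictReader(fh):
            if row["label"] != label:
                continue
            if start_ms <= int(row["timeStamp"]) <= end_ms:
                elapsed.append(int(row["elapsed"]))
    if not elapsed:
        return None
    elapsed.sort()
    # Nearest-rank 95th percentile (an assumption, not JMeter's exact method)
    rank = max(1, math.ceil(0.95 * len(elapsed)))
    return sum(elapsed) / len(elapsed), elapsed[rank - 1]
```

Calling `window_stats` once per DI job, with that job's start and finish times and the label used for the Check-In or Check-Out sampler in the test plan, yields the average and 95th percentile for the corresponding table row.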

Test set

Test 1: Manually tested DI with 1k, 5k, 10k, 25k and 50k record files started on one tenant only.
Test 2: Manually tested DI with 1k, 5k, 10k, 25k and 50k record files started on one tenant only, plus Check-In and Check-Out (CICO) with 5 concurrent users.
Test 3: Manually tested DI with 1k, 5k, 10k, 25k and 50k record files started on 3 tenants concurrently. Files were loaded without a pause between them in the order 50k, 25k, 10k, 5k, 1k; tenant order: Tenant 3 (qcp1-02), Tenant 2 (qcp1-01), Tenant 1 (qcp1-00).
Test 4: Manually tested DI with 1k, 5k, 10k, 25k and 50k record files started on 3 tenants concurrently. Files were loaded with a pause between them in the order 50k, 25k, 10k, 5k, 1k; tenant order: Tenant 3 (qcp1-02), Tenant 1 (qcp1-00), Tenant 2 (qcp1-01).
Test 5: Manually tested DI with 1k, 5k, 10k, 25k and 50k record files started on 3 tenants concurrently. Files were loaded without a pause between them in the order 1k, 5k, 10k, 25k, 50k; tenant order: Tenant 3 (qcp1-02), Tenant 2 (qcp1-01), Tenant 1 (qcp1-00).

...
