Overview
- This document contains the results of testing Data Export (MARC BIB) on the Quesnelia [ECS] release on the qcon environment.
- PERF-844
Summary
- Data Export tests finished successfully on the qcon environment using the Default instances export job profile and the srs - holdings and items job profile.
- Data Export tests were run on the College and Central tenants; the results used for comparison between releases were taken from the College tenant.
- Comparison with previous test results (Poppy and Quesnelia releases):
  - Data Export processed all files, including the file with 500k records, without errors on the Quesnelia release.
  - Data Export durations improved by 80% on average on the Quesnelia release.
- During testing, we observed CPU spikes of up to 593% in mod-data-export.
- In Test #5, Data Export was started concurrently on the College tenant (cs00000int_0001), the Central tenant (cs00000int) and the Professional tenant (cs00000int_0002) using the Default instances export job profile. CPU usage of the mod-data-export module was 44% before the test began, spiked to 109% during the test, and remained elevated afterwards without returning to its initial level.
Test Results
This table contains Data Export durations for the 2 job profiles.
Profile | CSV File | Tenant College (cs00000int_0001): Result | Tenant College (cs00000int_0001): Status | Central Tenant (cs00000int): Result | Central Tenant (cs00000int): Status |
---|---|---|---|---|---|
DE MARC Bib (Default instances export job profile) | 1k.csv | 0:00:02 | COMPLETED | 0:00:05 | COMPLETED |
DE MARC Bib (Default instances export job profile) | 100k.csv | 0:02:39 | COMPLETED | 0:04:24 | COMPLETED |
DE MARC Bib (Default instances export job profile) | 500k.csv | 0:05:21 | COMPLETED | 0:06:17 | COMPLETED |
DE MARC Bib (srs - holdings and items) | 1k.csv | 0:00:05 | COMPLETED | 0:00:05 | COMPLETED |
DE MARC Bib (srs - holdings and items) | 100k.csv | 0:08:15 | COMPLETED | 0:05:58 | COMPLETED |
DE MARC Bib (srs - holdings and items) | 500k.csv | 0:09:22 | COMPLETED | 0:08:28 | COMPLETED |
This table contains Data Export results for 3 tenants running concurrently.
Tenant | CSV File | Result | Status |
---|---|---|---|
Tenant College (cs00000int_0001) | 500k.csv | | COMPLETED |
Tenant Professional (cs00000int_0002) | 500k.csv | | COMPLETED |
Central Tenant (cs00000int) | 500k.csv | | COMPLETED |
Comparison
This table compares Data Export durations between the Poppy and Quesnelia releases; Orchid results are included where available.
Profile | CSV File | Orchid: Result | Orchid: Status | Poppy (1 set): Result | Poppy (1 set): Status | Quesnelia, Tenant College (cs00000int_0001): Result | Quesnelia, Tenant College (cs00000int_0001): Status | DE Duration, DELTA Poppy/Quesnelia (hh:mm:ss / percent) |
---|---|---|---|---|---|---|---|---|
DE MARC Bib (Default instances export job profile) | 1k.csv | | | 00:00:08 | COMPLETED | 0:00:02 | COMPLETED | -00:00:06 |
DE MARC Bib (Default instances export job profile) | 100k.csv | | | 00:15:36 | COMPLETED | 0:02:39 | COMPLETED | -00:12:57 |
DE MARC Bib (Default instances export job profile) | 500k.csv | | | 00:57:25 | FAIL | 0:05:21 | COMPLETED | -00:52:04 |
DE MARC Bib (srs - holdings and items) | 1k.csv | 00:00:27 | COMPLETED | 00:00:29 | COMPLETED | 0:00:05 | COMPLETED | -00:00:24 |
DE MARC Bib (srs - holdings and items) | 100k.csv | 00:47:51 | COMPLETED | 00:47:23 | COMPLETED | 0:08:15 | COMPLETED | -00:39:08 |
DE MARC Bib (srs - holdings and items) | 500k.csv | 04:00:26 | COMPLETED | 04:11:09 | FAIL | 0:09:22 | COMPLETED | -04:01:47 |
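The DELTA column can be recomputed from the Poppy and Quesnelia durations. The following is a minimal sketch of that arithmetic, not the tooling used for this report: the helper names (`parse_hms`, `format_hms`, `compare`, `mean_improvement`) and the `DURATIONS` mapping are illustrative assumptions, and the durations are copied from the comparison table. The mean is a simple unweighted average of the per-row percentages, which may differ from how the summary figure was derived.

```python
from datetime import timedelta


def parse_hms(text):
    """Parse an 'h:mm:ss' or 'hh:mm:ss' string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def format_hms(delta):
    """Format a (possibly negative) timedelta as [-]hh:mm:ss."""
    total = int(delta.total_seconds())
    sign = "-" if total < 0 else ""
    hours, rest = divmod(abs(total), 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{sign}{hours:02d}:{minutes:02d}:{seconds:02d}"


def compare(previous, current):
    """Return (delta as hh:mm:ss, percent improvement) for two durations."""
    prev, cur = parse_hms(previous), parse_hms(current)
    percent = (prev - cur) / prev * 100  # timedelta / timedelta yields a float
    return format_hms(cur - prev), round(percent, 1)


# (Poppy, Quesnelia) durations copied from the comparison table.
DURATIONS = {
    ("Default instances", "1k.csv"): ("00:00:08", "0:00:02"),
    ("Default instances", "100k.csv"): ("00:15:36", "0:02:39"),
    ("Default instances", "500k.csv"): ("00:57:25", "0:05:21"),
    ("srs - holdings and items", "1k.csv"): ("00:00:29", "0:00:05"),
    ("srs - holdings and items", "100k.csv"): ("00:47:23", "0:08:15"),
    ("srs - holdings and items", "500k.csv"): ("04:11:09", "0:09:22"),
}


def mean_improvement(durations):
    """Unweighted mean of the per-row percent improvements."""
    percents = [compare(prev, cur)[1] for prev, cur in durations.values()]
    return sum(percents) / len(percents)


if __name__ == "__main__":
    for (profile, csv_file), (poppy, quesnelia) in DURATIONS.items():
        delta, percent = compare(poppy, quesnelia)
        print(f"{profile:26} {csv_file:9} {delta} {percent:5.1f}%")
    print(f"mean improvement: {mean_improvement(DURATIONS):.1f}%")
```

For example, `compare("04:11:09", "0:09:22")` gives a delta of `-04:01:47`, which matches the last row of the table.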
Resource utilization for Test #1 and Test #2
Service CPU Utilization
mod-data-export CPU usage spiked to 452%.
Service Memory Utilization
Here we can see that all modules show a stable trend.
DB CPU Utilization
DB CPU spike was 32%.
DB Connections
The number of DB connections was 1470.
DB load
Top SQL-queries
Resource utilization for Test #3 and Test #4
Service CPU Utilization
mod-data-export CPU usage spiked to 336%.
Service Memory Utilization
Here we can see that all modules show a stable trend.
DB CPU Utilization
DB CPU was 35%.
DB Connections
The number of DB connections was 1377.
DB load
Top SQL-queries
Resource utilization for Test #5
Service CPU Utilization
mod-data-export CPU usage spiked to 593%.
Service Memory Utilization
We observed that the CPU usage of the mod-data-export module was initially at 44% before the test began. It spiked to 109% during the test and remained elevated without returning to the initial state.
DB CPU Utilization
DB CPU was 50%.
DB Connections
The number of DB connections was 1368.
DB load
Top SQL-queries
Appendix
Infrastructure
PTF - environment Quesnelia (qcon)
11 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
1 db.r6.xlarge database instance (writer)
OpenSearch
domain: fse
Number of nodes: 9
Version: OpenSearch_2_7_R20240502
MSK - tenant
4 kafka.m5.2xlarge brokers in 2 zones
Apache Kafka version 2.8.0
EBS storage volume per broker 300 GiB
auto.create.topics.enable=true
log.retention.minutes=480
default.replication.factor=3
Kafka consolidated topics enabled
Methodology/Approach
Data Export test scenarios using the Default instances export job profile and the srs - holdings and items job profile were started from the UI on the Quesnelia (qcon) ECS environment.
Test set
- Test 1: Data Export of the 1k, 100k and 500k record files, started manually on the College tenant (cs00000int_0001) only, using the Default instances export job profile.
- Test 2: Data Export of the 1k, 100k and 500k record files, started manually on the College tenant (cs00000int_0001) only, using the srs - holdings and items job profile.
- Test 3: Data Export of the 1k, 100k and 500k record files, started manually on the Central tenant (cs00000int) only, using the Default instances export job profile.
- Test 4: Data Export of the 1k, 100k and 500k record files, started manually on the Central tenant (cs00000int) only, using the srs - holdings and items job profile.
- Test 5: Data Export of the 500k record file, started manually and concurrently on the College tenant (cs00000int_0001), the Central tenant (cs00000int) and the Professional tenant (cs00000int_0002), using the Default instances export job profile.
The following query was used to get the status and time range of the export jobs:
SELECT
  jsonb->>'status' AS status,
  to_timestamp((jsonb->>'startedDate')::bigint / 1000) AS startedDate,
  to_timestamp((jsonb->>'completedDate')::bigint / 1000) AS completedDate,
  exported_file->>'fileName' AS fileName,
  jsonb->>'jobProfileName' AS jobProfileName,
  (jsonb->>'completedDate')::bigint - (jsonb->>'startedDate')::bigint AS duration_ms,
  to_char(
    (to_timestamp((jsonb->>'completedDate')::bigint / 1000)
      - to_timestamp((jsonb->>'startedDate')::bigint / 1000))::interval,
    'HH24:MI:SS'
  ) AS duration_hhmmss
FROM cs00000int_0001_mod_data_export.job_executions,
  jsonb_array_elements(jsonb->'exportedFiles') AS exported_file
WHERE
  -- (jsonb->>'hrId')::int IN (309, 310, 311, 312, 313, 314) -- Central tenant
  (jsonb->>'hrId')::int IN (266, 267, 268, 269, 270, 271)
ORDER BY jsonb->>'startedDate' DESC
LIMIT 10;
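The two duration columns in the query can be sanity-checked outside the database. The sketch below mirrors that arithmetic in Python, assuming `startedDate` and `completedDate` are epoch timestamps in milliseconds (as the division by 1000 in the query suggests) and that the job ran for less than 24 hours. The function name `duration_fields` and the sample timestamps are hypothetical and are not taken from the test data.

```python
def duration_fields(started_ms, completed_ms):
    """Mirror the query's duration_ms and duration_hhmmss columns.

    Assumes epoch-millisecond inputs with completed_ms >= started_ms.
    """
    duration_ms = completed_ms - started_ms
    # Integer division mirrors `::bigint / 1000` in the query, which drops
    # the millisecond part before the two timestamps are subtracted.
    seconds = completed_ms // 1000 - started_ms // 1000
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return {
        "duration_ms": duration_ms,
        "duration_hhmmss": f"{hours:02d}:{minutes:02d}:{secs:02d}",
    }


if __name__ == "__main__":
    # Hypothetical timestamps: a job that ran for 321 seconds.
    started = 1_717_000_000_000
    completed = started + 321_000
    print(duration_fields(started, completed))
```

Because the milliseconds are truncated before subtraction, `duration_hhmmss` can differ from `duration_ms` by up to one second, so `duration_ms` is the better column for fine-grained comparisons.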