PTF - Export deleted MARC authority records (Ramsons) [ECS]
Test status: PASSED
Overview
Regression testing of the export of deleted MARC authority records via API. Measures the performance of exporting all 100K/300K deleted records in pages of 2K records.
ECS environment with PTF data set
Classic PTF configuration with no additional improvements.
The purpose of this testing is to compare the Ramsons release test results with those of the previous Quesnelia release and to check for improvements and possible issues or degradation.
Expected export duration - under a minute.
Jiras/ links:
Quesnelia release ticket: PERF-897: Export deleted MARC authority records via API (Closed)
Quesnelia export delete MARC authority records via API Report
Current ticket: PERF-1028: [Ramsons] - Export deleted MARC authority records via API (Closed)
Related to improvement task: MDEXP-769: Simplify export of deleted authority records (Closed)
Previous report PTF - Export deleted MARC authority records (Quesnelia) [ECS]
Summary
All tests passed successfully.
The SLA (export duration under a minute) was met.
The 100K test is more than 50% faster in Ramsons than in Quesnelia (8 s 652 ms vs 13 s 317 ms, about 1.5 times the speed).
For the 300K tests, duration is about the same as in Quesnelia (29 s 989 ms vs 29 s 109 ms), but the average API response time is better (216 ms vs 288 ms).
No memory leaks or spikes were found.
CPU usage showed a stable trend. The most used module is mod-entities-links.
DB CPU usage is low for the 100K tests (around 6%) and 15-17% for the 300K tests. For the tests with additional load (10 loops of the JMeter script) it is 15-17% for 100K and 25-30% for 300K.
Test Runs/Results
Each test run was followed by a rerun to check performance consistency. In addition, tests 3, 4, 7 and 8 were performed to check system behaviour under additional load: because these tests are “fast“, each of these runs executes 10 loops of the JMeter script to get more information about system behaviour. They are marked "(10 times)" in the table.
Test # | Test Conditions | Duration |
---|---|---|
1 | 100K | 8s 652 ms |
2 | 100K (rerun) | 8s 440 ms |
3 | 100K (10 times ) | 7s 860 ms (avg) |
4 | 100K (10 times ) | 7s 591 ms (avg) |
5 | 300K | 29s 567 ms |
6 | 300K | 29s 989 ms |
7 | 300K (10 times ) | 29s 579 ms (avg) |
8 | 300K (10 times ) | 28s 393 ms (avg) |
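The SLA check against the table above can be sketched as follows. This is a minimal illustration, not part of the test tooling: the helper name `parse_duration` and the `SLA_SECONDS` constant are assumptions introduced here, and the durations are copied from the table.

```python
import re

SLA_SECONDS = 60.0  # "under a minute", as stated in the Overview

def parse_duration(text: str) -> float:
    """Convert a string such as '8s 652 ms' into seconds (hypothetical helper)."""
    match = re.fullmatch(r"\s*(\d+)\s*s\s+(\d+)\s*ms\s*", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    seconds, millis = match.groups()
    return int(seconds) + int(millis) / 1000.0

# Durations copied from the results table (test number -> reported value).
durations = {
    1: "8s 652 ms", 2: "8s 440 ms", 3: "7s 860 ms", 4: "7s 591 ms",
    5: "29s 567 ms", 6: "29s 989 ms", 7: "29s 579 ms", 8: "28s 393 ms",
}

results = {test: parse_duration(value) for test, value in durations.items()}
slowest = max(results.values())
sla_met = all(seconds < SLA_SECONDS for seconds in results.values())
print(f"slowest run: {slowest:.3f} s, SLA met: {sla_met}")
```

Even the slowest run (test 6) uses roughly half of the one-minute budget.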
Comparisons
Test durations were compared to the same tests in the Quesnelia release: PTF - Export deleted MARC authority records (Quesnelia) [ECS]
Overall, Ramsons is more than 50% faster for the 100K test (about 1.5 times the speed) and about the same for the 300K test.
Test | Ramsons: Duration (s/ms) | Ramsons: GET_authority-storage/authorities response time (ms) | Quesnelia: Duration (s/ms) | Quesnelia: GET_authority-storage/authorities response time (ms) |
---|---|---|---|---|
100K | 8s 652 ms | 178 ms | 13s 317 ms | 262 ms |
300K | 29s 989 ms | 216 ms | 29s 109 ms | 288 ms |
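The percentages behind this comparison can be reproduced with a short calculation. The sketch below uses only the values from the table; the function names `percent_reduction` and `speedup` are illustrative. Note that "faster" can be expressed in two ways: as a ratio of durations (speed-up) or as a reduction in duration, and the two give different percentages.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before` (negative if higher)."""
    return (before - after) / before * 100.0

def speedup(before: float, after: float) -> float:
    """How many times faster `after` is compared to `before`."""
    return before / after

# Durations in seconds and response times in milliseconds, from the table.
quesnelia = {"100K": (13.317, 262), "300K": (29.109, 288)}
ramsons = {"100K": (8.652, 178), "300K": (29.989, 216)}

for test in ("100K", "300K"):
    q_duration, q_response = quesnelia[test]
    r_duration, r_response = ramsons[test]
    print(
        f"{test}: duration {percent_reduction(q_duration, r_duration):+.1f}% "
        f"(speed-up x{speedup(q_duration, r_duration):.2f}), "
        f"response time {percent_reduction(q_response, r_response):+.1f}%"
    )
```

For the 100K test this gives a speed-up of about 1.54 (Ramsons is roughly 54% faster), which corresponds to a duration about 35% shorter. For the 300K test the duration is about 3% longer while the average response time is 25% lower.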
Memory Utilization
Memory showed a stable trend without visible signs of memory leaks or anomalies. The most used module is mod-entities-links, which consumed 32% of memory on average across the whole test set. The memory trend for all related modules is shown in the chart below.
Note: because test durations are less than a minute, all tests are included in one chart in order to show the trend. The table below lists the top 3 modules.
Module | Ramsons Avg | Quesnelia Avg |
---|---|---|
mod-entities-links | 32% | 37% |
okapi | 32% | - |
mod-users | 25% | - |
CPU Utilization
In the 100K and 300K tests CPU usage is barely visible because the tests are so short. That is why additional tests were performed (100K x10 and 300K x10: the same tests run 10 times one after another without pauses) to see whether there are anomalies, sudden spikes, etc. CPU usage stabilises at around 3%. CPU usage spiked only on mod-entities-links (as expected), okapi, nginx-okapi and supporting services such as mod-users-bl, mod-login and mod-authtoken.
No unexpected spikes or anomalies observed.
Note: Instance-level CPU usage is not included in this report because CPU usage during these tests is not visible at instance scale.
RDS CPU Utilization
RDS CPU usage is almost invisible during the 100K tests (CPU utilization was around 6%, no different from the idle state of the DB). However, in the 10x tests for 300K deleted authority records CPU utilization visibly reached around 30%.
The database connections chart shows no visible signs of the tests, probably because mod-entities-links was the only module that created DB connections during that time.
Note: No slow queries were found in the logs or in Performance Insights.
Errors
No errors were found during the tests. All tests passed successfully; all response codes and responses were checked and found valid.
Appendix
Infrastructure
PTF environment: rcon
11 m6g.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
1 instance of db.r6.xlarge database, writer. Standard fse configuration
MSK fse-tenant
Metadata mode: KRaft
4 kafka.m7g.xlarge brokers in 2 zones
Apache Kafka version 3.7.x
EBS storage volume per broker 300 GiB
auto.create.topics.enable=true
log.retention.minutes=480
default.replication.factor=2
Open Search ptf-test cluster
r6g.2xlarge.search 4 data nodes
r6g.large.search 3 dedicated master nodes
Ramsons release snapshot.
Dataset size: 12M authority records in mod-entities-links, plus prepared data sets of 100K and 300K deleted authority records
Methodology/Approach
Overview: starting from the Orchid release, authority records are stored in the mod-entities-links schema in the DB. The table [tenant]_mod_entities_links.authority stores all authority records, and [tenant]_mod_entities_links.authority_archive stores deleted authority records.
To prepare the data set (generate a given number of deleted authority records), an additional SQL script was created:
    DO $$
    DECLARE
        num_records_to_move INT := [specify number of records here]; -- Change this to the desired number of records
    BEGIN
        -- Step 1: Insert random records from authority to authority_archive with 'deleted' set to true
        INSERT INTO [tenant]_mod_entities_links.authority_archive (
            id, natural_id, source_file_id, source, heading, heading_type, _version,
            subject_heading_code, sft_headings, saft_headings, identifiers, notes,
            deleted, created_date, updated_date, created_by_user_id, updated_by_user_id
        )
        SELECT
            id, natural_id, source_file_id, source, heading, heading_type, _version,
            subject_heading_code, sft_headings, saft_headings, identifiers, notes,
            true AS deleted, created_date, updated_date, created_by_user_id, updated_by_user_id
        FROM
            [tenant]_mod_entities_links.authority
        WHERE
            id NOT IN (SELECT id FROM [tenant]_mod_entities_links.authority_archive) -- Exclude existing IDs
        ORDER BY
            random() -- Select random records
        LIMIT
            num_records_to_move;
    END $$;
Note: to run this script, copy it into pgAdmin (or another DB management tool), replace [tenant] with the exact tenant id, replace [specify number of records here] with the number of records to be deleted, and execute it. The script “moves“ records from the authority table to authority_archive without deleting them from authority, so it is harmless to the dataset and may be executed multiple times.
The JMeter script calls the [GET] authority-storage/authorities endpoint in a loop with the parameters limit, deleted and offset. On each iteration of the loop the offset parameter changes, providing pagination by 2000 records as requested in the ticket.
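The pagination described above can be sketched as follows. This is an illustration of the offset arithmetic only, not the JMeter script itself: the function name `page_queries` is hypothetical, and the value `deleted=true` is an assumption about how the parameter is passed.

```python
from urllib.parse import urlencode

PAGE_SIZE = 2000  # page size requested in the ticket

def page_queries(total_records: int, page_size: int = PAGE_SIZE):
    """Yield the query string for each page of deleted authority records."""
    for offset in range(0, total_records, page_size):
        # Parameter names come from the report; the value "true" is assumed.
        yield urlencode({"deleted": "true", "limit": page_size, "offset": offset})

queries_100k = list(page_queries(100_000))
queries_300k = list(page_queries(300_000))

print(len(queries_100k), "requests for 100K, first:", queries_100k[0])
print(len(queries_300k), "requests for 300K, last:", queries_300k[-1])
```

With a page size of 2000, one pass over the data set takes 50 requests for 100K records and 150 requests for 300K records.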
Preparation and execution steps:
Run the SQL script to generate deleted authority records. Check the number of records that already exist in authority_archive beforehand.
Put the JMeter script on the load generator. Because the [GET] authority-storage/authorities endpoint is fast, latency is crucial here, so the script can’t be executed from a local machine.
Execute the JMeter script with the needed configuration. Example:
jmeter -n -t RCON_Export_Deleted_Records.jmx -l export_authority300K_1loop.jtl -Jloops=1 -Jcount_of_records_in_db=300000
Analysis:
Collect JMeter results: average response times of [GET] authority-storage/authorities and the duration of the TC_Export Deleted Records transaction controller.
Collect CPU and memory usage stats for the workflow-related modules: mod-entities-links, okapi, nginx-okapi, pub-okapi, mod-users, mod-authtoken, mod-login.
Collect DB stats (CPU, connections).
Because this test is “fast“, additional tests were performed (10 loops of the JMeter script) to get more information about system behaviour.
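Collecting the JMeter numbers could look like the sketch below. It assumes the .jtl file was written in JMeter's CSV format with a header row that includes the `label` and `elapsed` (milliseconds) columns; the function name `average_elapsed_by_label` is hypothetical, and the sample rows are made-up values used only to keep the example self-contained, not measured results.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

def average_elapsed_by_label(jtl_file) -> dict:
    """Return the average 'elapsed' time in ms for each sampler label."""
    samples = defaultdict(list)
    for row in csv.DictReader(jtl_file):
        samples[row["label"]].append(int(row["elapsed"]))
    return {label: mean(values) for label, values in samples.items()}

# Synthetic sample in the assumed CSV layout (illustrative values only).
SAMPLE_JTL = """timeStamp,elapsed,label,responseCode,success
1700000000000,100,GET_authority-storage/authorities,200,true
1700000000200,200,GET_authority-storage/authorities,200,true
1700000000400,1000,TC_Export Deleted Records,200,true
"""

averages = average_elapsed_by_label(io.StringIO(SAMPLE_JTL))
for label, avg_ms in averages.items():
    print(f"{label}: {avg_ms} ms")
```

For a real run, the in-memory sample would be replaced by the result file, for example `open("export_authority300K_1loop.jtl", newline="")`.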
Test Artifacts