PTF - Performance testing of Export All Endpoint (Ramsons) [ECS]
Overview
This document contains the results of testing Data Export All via the endpoint (POST data-export/export-all) with the Default instances, authority, and holdings export job profiles, and with the custom profile Example 1, on the (Ramsons) [ECS] release in the relc environment.
PERF-1029: [Ramsons] ECS - Export All Endpoint Performance Testing (In Progress)
Summary
Data Export tests finished successfully in the relc environment using the Default instances, authority, and holdings export job profiles and Custom Mapping Profiles - Example 1.
During the tests, data-export-all runs with the Default holdings export job profile were getting stuck at 99%. Investigation revealed the cause: a large number of deleted holdings records in the audit table. After excluding deleted holdings from the export, the job finished successfully.
No memory leaks are observed.
The data export duration for the default profiles and the duration for the concurrent test (default profiles plus an export of 10,000 instances with a custom profile) were the same, so only the default-profile results are included in the report. This is because each Data-Export-All job uses one data-export module task; the data-export module has two tasks, so concurrent jobs run on separate tasks and do not affect each other.
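The deleted-holdings explanation above can be sanity-checked directly in the database. A minimal sketch, assuming the audit table follows typical FOLIO naming (`audit_holdings_record` in the tenant's mod_inventory_storage schema; both names are assumptions, not taken from this report):

```sql
-- Sketch: count holdings records in the audit (deleted-records) table
-- for the tenant with the largest number of holdings records.
-- Schema and table names are assumptions based on common FOLIO conventions.
select count(*) as deleted_holdings
from cs00000int_0004_mod_inventory_storage.audit_holdings_record;
```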
Test Runs and Results
This table contains durations for Data Export.
Test # | Job Profile | Count of exported records | Data Export Duration | Results |
---|---|---|---|---|
1 | Default instances export job profile | 20,784,436 | 1:01:30 | COMPLETED |
2 | Default holdings export job profile (with deleted records) | 7,900,986 | 1:10:16 | FAILED (stuck at 99%) |
3 | Default authority export job profile | 12,328,793 | 0:31:16 | COMPLETED |
4 | Custom Mapping Profiles - Example 1 | 20,783,991 | 8:37:15 | COMPLETED_WITH_ERRORS |
5 | Default holdings export job profile (without deleted records) | 7,896,557 | 0:40:00 | COMPLETED |
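For context, the durations above translate into rough throughput figures; e.g. for Test 1 (a sketch computed from the table, not a separately measured metric):

```sql
-- Rough throughput for Test 1: 20,784,436 instances exported in 1:01:30
select round(20784436 / extract(epoch from interval '01:01:30'))
       as records_per_second;
-- roughly 5,633 records/s
```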
Resource utilization for Test №1, №2, №3 and №4
Service CPU Utilization
Here we can see that mod-data-export used up to 230% CPU at its spike.
Service Memory Utilization
Here we can see that data-export module used 74% memory.
Kafka metrics
DB CPU Utilization
DB CPU usage peaked at 90%.
DB Connections
The maximum number of DB connections was 1,410.
DB load
Top SQL-queries
# | TOP 5 SQL statements |
---|---|
1 | SELECT * FROM v_marc_instance_all_non_deleted_non_suppressed WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC offset $3 rows fetch next $4 rows only |
2 | SELECT i.id as id, i.instance_id as instanceId, i.tenant_id as tenantId, i.json ->> ? AS hrid, i.json ->> ? AS callNumberPrefix, i.json ->> ? AS callNumber, i.json ->> ? AS callNumberSuffix, |
3 | SELECT id, jsonb, holdings_record_id FROM cs00000int_0002_mod_data_export.v_item WHERE holdings_record_id in ($1) |
4 | SELECT id, jsonb, holdings_record_id FROM cs00000int_0004_mod_data_export.v_item WHERE holdings_record_id in ($1) |
5 | SELECT * FROM v_folio_instance_all_non_deleted_non_suppressed WHERE id BETWEEN $1 AND $2 ORDER BY id ASC fetch first $3 rows only |
Resource utilization for Test №5
Service CPU Utilization
Here we can see that mod-data-export used 1100% CPU.
Service Memory Utilization
Here we can see that all modules show a stable memory trend.
Kafka metrics
DB CPU Utilization
DB CPU usage peaked at 38%.
DB Connections
The maximum number of DB connections was 2,930.
DB load
Top SQL-queries
# | TOP 5 SQL statements |
---|---|
1 | SELECT * FROM v_folio_holdings_all_non_deleted_non_suppressed WHERE id BETWEEN $1 AND $2 ORDER BY id ASC offset $3 rows fetch next $4 rows only |
2 | select ie1_0.id,ie1_0.jsonb from v_instance ie1_0 where ie1_0.id in ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31,$32, |
3 | SELECT * FROM v_marc_instance_all_non_deleted_non_suppressed WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC offset $3 rows fetch next $4 rows only |
4 | SELECT * FROM v_folio_instance_all_non_deleted_non_suppressed WHERE id BETWEEN $1 AND $2 ORDER BY id ASC fetch first $3 rows only |
5 | SELECT * FROM v_marc_instance_all_non_deleted_non_suppressed WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC fetch first $3 rows only |
Appendix
Infrastructure
PTF -environment relc
4 r7g.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
1 database instance, writer
Name | Memory | vCPUs |
db.r7g.4xlarge | 128 GiB | 16 vCPUs |
Number of records in DB:
cs00000int
instances - 20,784,436
holdings - 7,900,986 (tenant cs00000int_0004, the tenant with the largest number of holdings records)
authorities - 12,328,793
Open Search ptf-loc
Data nodes
Instance type - r7g.xlarge.search
Number of nodes - 4
Version: OpenSearch_2_17_R20250403
Dedicated master nodes
Instance type - m7g.large.search
Number of nodes - 3
MSK tenant
4 kafka.m7g.xlarge brokers in 2 zones
Apache Kafka version 3.7.x
EBS storage volume per broker 300 GiB
auto.create.topics.enable=true
log.retention.minutes=480
default.replication.factor=3
Methodology/Approach
Test set:
Test 1: run Export All instances, started by JMeter via the endpoint (POST data-export/export-all) with the Default instances export job profile. Data Export ran on the central tenant (cs00000int) with one task of the data-export module.
Test 2: run Export All holdings, started by JMeter via the endpoint (POST data-export/export-all) with the Default holdings export job profile. Data Export ran on the tenant with the largest number of holdings records (cs00000int_0004) with one task of the data-export module.
Test 3: run Export All authorities, started by JMeter via the endpoint (POST data-export/export-all) with the Default authority export job profile. Data Export ran on one tenant (cs00000int) with one task of the data-export module.
Test 4: run Export All instances, started by JMeter via the endpoint (POST data-export/export-all) with the custom profile Example 1. Data Export ran on one tenant (cs00000int) with one task of the data-export module.
Test 5: run Export All holdings, started by JMeter via the endpoint (POST data-export/export-all) with the Default holdings export job profile, without deleted records. Data Export ran on the tenant with the largest number of holdings records (cs00000int_0004) with one task of the data-export module.
The following query was used to get the status and time range of the export jobs:
SQL Query
select jsonb -> 'exportedFiles' -> 0 ->> 'fileName' as fileName,
       job_profile_name, exported, started_date, completed_date,
       completed_date - started_date as duration, status
from fs09000000_mod_data_export.job_executions
where started_date > '2024-07-08'
order by started_date desc
limit 1000;
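The same table can also be aggregated to compare runs per job profile; a sketch using the same schema and columns as the query above:

```sql
-- Average duration and largest record count per job profile
-- (same job_executions table as the status query above)
select job_profile_name,
       count(*) as runs,
       avg(completed_date - started_date) as avg_duration,
       max(exported) as max_exported
from fs09000000_mod_data_export.job_executions
where started_date > '2024-07-08'
group by job_profile_name
order by avg_duration desc;
```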