PTF - Performance testing of Export All Endpoint (Ramsons) [non-ECS]
Overview
This document contains the results of testing Data Export All via the endpoint (POST data-export/export-all) with the Default instances, authority, and holdings export job profiles, and with the custom profile Example 1, on the (Ramsons) [non-ECS] release on the rcon environment (fs09000000 tenant).
https://folio-org.atlassian.net/browse/PERF-1146
Summary
Data Export tests finished successfully on the relc environment using the Default instances, authority, and holdings export job profiles and Custom Mapping Profiles - Example 1.
No memory leaks were observed.
The data export duration for the default profiles alone and the duration for the concurrent test (default profiles plus an export of 10,000 instances with a custom profile) were the same, so only the results for the default profiles are included in this report. This is because each Data Export All job uses a single data-export module task; since the data-export module has two tasks, concurrent jobs run on separate tasks and do not slow each other down.
Test Runs and Results
This table contains durations for Data Export.
| Test # | Job Profile | Count of exported records | Data Export Duration | Results |
|---|---|---|---|---|
| 1 | Default instances export job profile | 8,154,287 | 3:23:00 | COMPLETED |
| 2 | Default holdings export job profile | 7,814,915 | 0:50:00 | COMPLETED_WITH_ERRORS |
| 3 | Default authority export job profile | 68,615 | 0:03:00 | COMPLETED |
| 4 | Custom Mapping Profiles - Example 1 | 8,154,287 | 3:26:00 | COMPLETED_WITH_ERRORS |
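For context, approximate per-record throughput can be derived from the table above. This is a back-of-the-envelope calculation based only on the reported counts and durations, not a measured metric:

```python
# Rough throughput per test, derived from the results table (records / duration).
from datetime import timedelta

tests = {
    "Default instances export job profile": (8_154_287, timedelta(hours=3, minutes=23)),
    "Default holdings export job profile": (7_814_915, timedelta(minutes=50)),
    "Default authority export job profile": (68_615, timedelta(minutes=3)),
    "Custom Mapping Profiles - Example 1": (8_154_287, timedelta(hours=3, minutes=26)),
}

for name, (records, duration) in tests.items():
    rate = records / duration.total_seconds()
    print(f"{name}: ~{rate:,.0f} records/s")
```

Note that the holdings export moves far more records per second than the instance exports, which is consistent with its much shorter duration at a similar record count.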
Resource utilization for Tests №1, №2, №3 and №4
Service CPU Utilization
Here we can see that mod-data-export peaked at 30.8% CPU utilization.
Service Memory Utilization
Here we can see that memory utilization of the data-export module reached 108%.
Kafka metrics
DB CPU Utilization
DB CPU utilization reached 73.2%.
DB Connections
The maximum number of DB connections was 1,434.
DB load
Top SQL-queries
| # | TOP 5 SQL statements |
|---|---|
| 1 | SELECT * FROM v_marc_instance_all_non_deleted_non_suppressed WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC offset $3 rows fetch next $4 rows only |
| 2 | select hre1_0.id,hre1_0.instance_id,hre1_0.jsonb from v_holdings_record hre1_0 where hre1_0.instance_id=$1 |
| 3 | SELECT id, jsonb, holdings_record_id FROM fs09000000_mod_data_export.v_item WHERE holdings_record_id in ($1) |
| 4 | select ie1_0.id,ie1_0.jsonb from v_instance ie1_0 where ie1_0.id in ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37,$38,$39,$40,$41,$42,$43,$44,$45,$46,$47,$48,$49,$50,$51,$52,$53,$54,$55,$56,$57,$58,$59,$60,$61,$62,$63,$64,$65,$66,$67,$68,$69,$70,$71,$72,$73,$74,$75,$76,$77,$78,$79,$80,$81,$82,$83,$84,$85,$86,$87,$88,$89,$90,$91,$92,$93,$94,$95,$96,$97,$98,$99,$100,$101,$102,$103,$104,$105,$106,$107,$108 |
| 5 | select iwhe1_0.id,iwhe1_0.hrid from v_instance_hrid iwhe1_0 where iwhe1_0.id in ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37,$38,$39,$40,$41,$42,$43,$44,$45,$46,$47,$48,$49,$50,$51,$52,$53,$54,$55,$56,$57,$58,$59,$60,$61,$62,$63,$64,$65,$66,$67,$68,$69,$70,$71,$72,$73,$74,$75,$76,$77,$78,$79,$80,$81,$82,$83,$84,$85,$86,$87,$88,$89,$90,$91,$92,$93,$94,$95,$96,$97,$98,$99,$100,$101,$102,$103,$104,$105,$1 |
Appendix
Infrastructure
PTF environment: relc
6 × r7g.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
1 database instance (writer):
| Name | Memory | vCPUs |
|---|---|---|
| db.r7g.xlarge | 32 GiB | 4 vCPUs |
Number of records in DB:
fs09000000
instances - 8,154,287
holdings - 7,814,915
authorities - 68,615
Open Search ptf-loc
Data nodes
Instance type - r7g.xlarge.search
Number of nodes - 4
Version: OpenSearch_2_17_R20250403
Dedicated master nodes
Instance type - m7g.large.search
Number of nodes - 3
MSK tenant
4 kafka.m7g.xlarge brokers in 2 zones
Apache Kafka version 3.7.x
EBS storage volume per broker 300 GiB
auto.create.topics.enable=true
log.retention.minutes=480
default.replication.factor=3
Methodology/Approach
Test set:
Test 1: run Export All instances, started by JMeter using the endpoint (POST data-export/export-all) with the Default instances export job profile. Data Export started on rcon (tenant fs09000000) with one task for the data-export module.
Test 2: run Export All holdings, started by JMeter using the endpoint (POST data-export/export-all) with the Default holdings export job profile. Data Export started on rcon (tenant fs09000000) with one task for the data-export module.
Test 3: run Export All authorities, started by JMeter using the endpoint (POST data-export/export-all) with the Default authority export job profile. Data Export started on rcon (tenant fs09000000) with one task for the data-export module.
Test 4: run Export All instances, started by JMeter using the endpoint (POST data-export/export-all) with the custom profile Custom Mapping Profiles - Example 1. Data Export started on rcon (tenant fs09000000) with one task for the data-export module.
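Outside of JMeter, a tester could drive the same endpoint directly. The sketch below only builds the request; the body field names (`jobProfileId`, `idType`), the base URL, and the profile id are assumptions for illustration and are not confirmed by this report (the Okapi tenant/token headers follow the usual FOLIO convention):

```python
import json
from urllib.parse import urljoin

def build_export_all_request(okapi_url, tenant, token, job_profile_id, id_type):
    """Assemble a hypothetical POST data-export/export-all request (no network call)."""
    return {
        "url": urljoin(okapi_url, "/data-export/export-all"),
        "headers": {
            "X-Okapi-Tenant": tenant,
            "X-Okapi-Token": token,
            "Content-Type": "application/json",
        },
        # Body field names are assumed for illustration.
        "body": json.dumps({"jobProfileId": job_profile_id, "idType": id_type}),
    }

req = build_export_all_request(
    "https://rcon-okapi.example.org",  # placeholder host
    "fs09000000",
    "<token>",
    "00000000-0000-0000-0000-000000000000",  # placeholder profile id
    "INSTANCE",
)
print(req["url"])
```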
The following query was used to get the status and time range of export jobs:
SQL Query
```sql
select jsonb->'exportedFiles'->0->>'fileName' as fileName,
       job_profile_name, exported, started_date, completed_date,
       completed_date - started_date as duration, status
from fs09000000_mod_data_export.job_executions
where started_date > '2024-07-8'
order by started_date desc
limit 1000;
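The `duration` column above is simply `completed_date - started_date`. The same arithmetic in Python, with illustrative timestamps chosen to reproduce the Test 1 duration (not actual job timestamps from the environment):

```python
# Recomputing a job duration the way the SQL query does
# (completed_date - started_date); timestamps are illustrative only.
from datetime import datetime

started = datetime.fromisoformat("2024-07-08 10:00:00")
completed = datetime.fromisoformat("2024-07-08 13:23:00")
print(completed - started)  # 3:23:00, matching the reported Test 1 duration
```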