PTF - Performance testing of Export All Endpoint (Ramsons) [ECS]

Overview

  • This document contains the results of testing Data Export All via the endpoint (POST data-export/export-all) with the Default instances, authority, and holdings export job profiles and with the custom profile Example 1 on the (Ramsons) [ECS] release in the relc environment.
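For orientation, the call the tests issue can be sketched as below. The host, tenant header, and request-body field are placeholders and assumptions for this note, not the verified mod-data-export schema (the real runs were driven from JMeter; consult the module's API documentation for the exact request body):

```shell
#!/bin/sh
# Sketch of the export-all request issued by the tests.
# NOTE: the host, the x-okapi-tenant header value, and the body field name
# are placeholders / assumptions -- check the mod-data-export API docs.
OKAPI_URL="https://folio.example.org"   # placeholder host
TENANT="cs00000int"                     # central tenant used in this report
BODY='{"idType": "instance"}'           # assumed field name, not verified

# Build and print the request instead of sending it, so the sketch
# has no side effects.
REQUEST_LINE="POST $OKAPI_URL/data-export/export-all"
echo "$REQUEST_LINE"
echo "x-okapi-tenant: $TENANT"
echo "payload: $BODY"
```

In the real tests, JMeter plays the role of this client and records the job start time; job completion is then read from the job_executions table shown in the Appendix.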

PERF-1029: [Ramsons] ECS - Export All Endpoint Performance Testing (Closed)

Summary

  • Data Export tests finished successfully on the relc environment using the Default instances, authority, and holdings export job profiles and Custom Mapping Profiles - Example 1.

  • During the tests, data-export-all runs with the Default holdings export job profile were getting stuck at 99%. Investigation found the cause: a large number of deleted holdings in the audit table. After excluding deleted holdings from the export, the job finished successfully.

  • No memory leaks were observed.

  • The data export duration for the default profiles and the duration for the concurrent test (the default profiles plus an export of 10,000 instances with a custom profile) are the same, so only the results for the default profiles are included in the report. This is because a Data-Export-All job uses one data-export module task, and the data-export module has two tasks, so concurrent jobs run on separate tasks and do not slow each other down.

Test Runs and Results

This table contains durations for Data Export. 

Test # | Job Profile | Count of exported records | Data Export Duration (hh:mm:ss) | Results
------ | ----------- | ------------------------- | ------------------------------- | -------
1 | Default instances export job profile | 20,784,436 | 1:01:30 | COMPLETED
2 | Default holdings export job profile (with deleted records) | 7,900,986 | 1:10:16 | FAILED (stuck at 99%)
3 | Default authority export job profile | 12,328,793 | 0:31:16 | COMPLETED
4 | Custom Mapping Profiles - Example 1 | 20,783,991 | 8:37:15 | COMPLETED_WITH_ERRORS
5 | Default holdings export job profile (without deleted records) | 7,896,557 | 0:40:00 | COMPLETED





Resource utilization for Test №1, №2, №3 and №4

Service Name | CPU | Service Name | RAM
------------ | --- | ------------ | ---
mod-data-export-b | 243.5% | mod-data-export-b | 74.88%
mod-serials-management-b | 16.80% | mgr-tenant-entitlements-b | 65.40%
mod-tlr-b | 8.14% | mod-agreements-b | 64.91%
mod-quick-marc-b | 7.92% | mgr-applications-b | 63.61%
mod-ebsconet-b | 7.80% | mod-data-export-worker-b | 59.79%
mod-scheduler-b | 6.85% | mgr-tenants-b | 57.36%
mod-dcb-b | 6.52% | mod-scheduler-b | 55.79%
mod-inventory-b | 6.22% | folio-keycloak-b | 55.32%
mod-audit-b | 5.64% | mod-roles-keycloak-b | 54.37%
mod-search-b | 5.56% | mod-invoice-b | 54.07%

Service CPU Utilization

Here we can see that mod-data-export used 230% CPU at the spike.

Service Memory Utilization

Here we can see that the data-export module used 74% of its memory.






Kafka metrics





DB CPU Utilization

DB CPU was 90%.

DB Connections

Max number of DB connections was 1410.

DB load

Top SQL-queries



TOP 5 SQL statements:

1. SELECT * FROM v_marc_instance_all_non_deleted_non_suppressed WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC offset $3 rows fetch next $4 rows only

2. SELECT i.id as id, i.instance_id as instanceId, i.tenant_id as tenantId, i.json ->> ? AS hrid, i.json ->> ? AS callNumberPrefix, i.json ->> ? AS callNumber, i.json ->> ? AS callNumberSuffix, i.json ->> ? AS copyNumber, i.json ->> ? AS permanentLocationId, i.json ->> ? AS discoverySuppress FROM cs00000int_mod_search.holding i WHERE instance_id = $1::uuid LIMIT ? OFFSET ?

3. SELECT id, jsonb, holdings_record_id FROM cs00000int_0002_mod_data_export.v_item WHERE holdings_record_id in ($1)

4. SELECT id, jsonb, holdings_record_id FROM cs00000int_0004_mod_data_export.v_item WHERE holdings_record_id in ($1)

5. SELECT * FROM v_folio_instance_all_non_deleted_non_suppressed WHERE id BETWEEN $1 AND $2 ORDER BY id ASC fetch first $3 rows only

Resource utilization for Test №5

Service Name | CPU | Service Name | RAM
------------ | --- | ------------ | ---
mod-data-export-b | 1100% | mod-inventory-b | 73.98%
mod-serials-management-b | 318% | mod-data-export-b | 72.38%
mod-pubsub-b | 36.70% | mgr-applications-b | 67.41%
mod-inventory-b | 31.61% | mod-data-import-b | 65.46%
mod-data-export-worker-b | 19.94% | mgr-tenant-entitlements-b | 63.50%
folio-keycloak-b | 16.63% | mod-data-export-worker-b | 63.31%
mod-source-record-storage-b | 16.56% | mod-finance-b | 59.47%
kong-b | 15.77% | folio-keycloak-b | 58.00%
mod-consortia-keycloak-b | 14.20% | mod-orders-b | 57.34%
mod-search-b | 12.68% | mod-roles-keycloak-b | 55.70%

Service CPU Utilization

Here we can see that mod-data-export used 1100% CPU.

Service Memory Utilization

Here we can see that all modules show a stable memory trend.



Kafka metrics





DB CPU Utilization

DB CPU was 38%.

DB Connections

Max number of DB connections was 2930.

DB load

Top SQL-queries



TOP 5 SQL statements:

1. SELECT * FROM v_folio_holdings_all_non_deleted_non_suppressed WHERE id BETWEEN $1 AND $2 ORDER BY id ASC offset $3 rows fetch next $4 rows only

2. select ie1_0.id,ie1_0.jsonb from v_instance ie1_0 where ie1_0.id in ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37,$38,$39,$40,$41,$42,$43,$44,$45,$46,$47,$48,$49,$50,$51,$52,$53,$54,$55,$56,$57,$58,$59,$60,$61,$62,$63,$64,$65,$66,$67,$68,$69,$70,$71,$72,$73,$74,$75,$76,$77,$78,$79,$80,$81,$82,$83,$84,$85,$86,$87,$88,$89,$90,$91,$92,$93,$94,$95,$96,$97,$98,$99,$100,$101,$102,$103,$104,$105,$106,$107,$108)

3. SELECT * FROM v_marc_instance_all_non_deleted_non_suppressed WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC offset $3 rows fetch next $4 rows only

4. SELECT * FROM v_folio_instance_all_non_deleted_non_suppressed WHERE id BETWEEN $1 AND $2 ORDER BY id ASC fetch first $3 rows only

5. SELECT * FROM v_marc_instance_all_non_deleted_non_suppressed WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC fetch first $3 rows only

Appendix

Infrastructure

PTF environment: relc

Name | Memory | vCPUs
---- | ------ | -----
db.r7g.4xlarge | 128 GiB | 16

  • Number of records in DB:

    • cs00000int

      • instances - 20,784,436

      • holdings - 7,900,986 (cs00000int_0004, the tenant with the largest number of holdings records)

      • authorities - 12,328,793

  • Open Search ptf-loc

    • Data nodes

      • Instance type - r7g.xlarge.search

      • Number of nodes - 4

      • Version: OpenSearch_2_17_R20250403

    • Dedicated master nodes

      • Instance type - m7g.large.search

      • Number of nodes - 3

  • MSK tenant

    • 4 kafka.m7g.xlarge brokers in 2 zones

    • Apache Kafka version 3.7.x

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true

    • log.retention.minutes=480

    • default.replication.factor=3



Cluster Resources - relc (Jun 18, 2025, 9:45:13 AM)



Methodology/Approach

Test set:

  • Test 1: run Export All instances, started from JMeter via the endpoint (POST data-export/export-all) with the Default instances export job profile. Data Export started on the central tenant (cs00000int) with one task for the data-export module.

  • Test 2: run Export All holdings, started from JMeter via the endpoint (POST data-export/export-all) with the Default holdings export job profile (including deleted records). Data Export started on one tenant (cs00000int_0004, the tenant with the largest number of holdings records) with one task for the data-export module.

  • Test 3: run Export All authority, started from JMeter via the endpoint (POST data-export/export-all) with the Default authority export job profile. Data Export started on one tenant (cs00000int) with one task for the data-export module.

  • Test 4: run Export All instances, started from JMeter via the endpoint (POST data-export/export-all) with the custom profile Example 1. Data Export started on one tenant (cs00000int) with one task for the data-export module.

  • Test 5: run Export All holdings, started from JMeter via the endpoint (POST data-export/export-all) with the Default holdings export job profile, excluding deleted records. Data Export started on one tenant (cs00000int_0004, the tenant with the largest number of holdings records) with one task for the data-export module.
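The five runs above all hit the same endpoint and differ only in the job profile used. That pattern can be sketched as follows (the profile labels below are shorthand for this note, not real FOLIO profile IDs, and the host is a placeholder; the actual runs were driven from JMeter):

```shell
#!/bin/sh
# The five test runs differ only in the job profile passed to export-all.
# Profile labels and the host are placeholders for this sketch.
HOST="https://folio.example.org"
COUNT=0
for PROFILE in \
    "default-instances" \
    "default-holdings-with-deleted" \
    "default-authority" \
    "custom-example-1" \
    "default-holdings-without-deleted"
do
    COUNT=$((COUNT + 1))
    # Print rather than send, to keep the sketch side-effect free.
    echo "Test $COUNT: POST $HOST/data-export/export-all (profile: $PROFILE)"
done
```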





To get the status and time range of the data export jobs, the following query was used:

SQL Query
SELECT jsonb -> 'exportedFiles' -> 0 ->> 'fileName' AS fileName,
       job_profile_name, exported, started_date, completed_date,
       completed_date - started_date AS duration, status
FROM fs09000000_mod_data_export.job_executions
WHERE started_date > '2024-07-08'
ORDER BY started_date DESC
LIMIT 1000;



Magda Zacharska
June 16, 2025

Does this table represent how long the PTF tests lasted, or how long it took for data export to export the reported number of records? Also, is the Results column the result of the PTF test or the result of the data export job?

Oleksandr Hrusha
July 10, 2025

Hello! The durations shown are those of the data export jobs themselves, which answers both of your questions.