PTF - Performance testing of Export All Endpoint (Ramsons) [ECS]

PTF - Performance testing of Export All Endpoint (Ramsons) [ECS]

Overview

  • This document contains results of testing Data Export All by the endpoint (POST data-export/export-all) with Default instances, authority, export holdings job profiles and with the custom profile Example 1 on (Ramsons) [ECS] release on rcp1 environment. 

PERF-1029: [Ramsons] ECS - Export All Endpoint Performance TestingIn Progress

Summary

  • Data Export tests finished successfully on relc environment using the Default instances, authority, holdings export job profile and Custom Mapping Profiles - Example 1.

  • During the tests data-export-all with Default holdings export job profile were stucking at 99%. After some investigation was found the reason:large (or small, but we don't expect that) number of remote holdings in the audit table. After excluding deleted holdings from the export, job was successfully finished

  • No memory leaks are observed.

  • The data export duration for default profiles and the duration for the concurrent test of default profiles plus the export of 10,000 instances with a custom profile are the same, so in the report were included only results for default profiles . This is because Data-Export-All uses one data-export module task, so if we run concurrent tests, each job will be executed on a separate data-export module task, because data-export module has two tasks.

Test Runs and Results

This table contains durations for Data Export. 

Test #

Job Profile

Count of exported records

Data Export Duration
(hh:mm:ss)

Results

Test #

Job Profile

Count of exported records

Data Export Duration
(hh:mm:ss)

Results

1

Default instances export job profile

20784436

1:01:30

COMPLETED

2

Default holdings export job profile(with deleted records)

7900986

1:10:16

FAILED(stucked on 99%)

3

Default authority export job profile

12328793

0:31:16

COMPLETED

4

Custom Mapping Profiles - Example 1

20783991

8:37:15

COMPLETED_WITH_ERRORS

5

Default holdings export job profile(without deleted records)

7,896,557

0:40:00

COMPLETED





Resource utilization for Test №1, №2, №3 and №4

Service Name

CPU

Service Name

RAM

Service Name

CPU

Service Name

RAM

mod-data-export-b

243.5%

mod-data-export-b

74.88%

mod-serials-management-b

16.80%

mgr-tenant-entitlements-b

65.40%

mod-tlr-b

8.14%

mod-agreements-b

64.91%

mod-quick-marc-b

7.92%

mgr-applications-b

63.61%

mod-ebsconet-b

7.80%

mod-data-export-worker-b

59.79%

mod-scheduler-b

6.85%

mgr-tenants-b

57.36%

mod-dcb-b

6.52%

mod-scheduler-b

55.79%

mod-inventory-b

6.22%

folio-keycloak-b

55.32%

mod-audit-b

5.64%

mod-roles-keycloak-b

54.37%

mod-search-b

5.56%

mod-invoice-b

54.07%

Service CPU Utilization

Here we can see that mod-data-export used 230% CPU in spike.

Service Memory Utilization

Here we can see that data-export module used 74% memory.

Service Memory Usage.png





Kafka metrics





DB CPU Utilization

DB CPU was 90%.

DB Connections

Max number of DB connections was 1410

DB load

 

                                                                                                                    

Top SQL-queries



#

TOP 5 SQL statements

#

TOP 5 SQL statements

1

SELECT * FROM v_marc_instance_all_non_deleted_non_suppressed WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC offset $3 rows fetch next $4 rows only

2

SELECT i.id as id, i.instance_id as instanceId, i.tenant_id as tenantId, i.json ->> ? AS hrid, i.json ->> ? AS callNumberPrefix, i.json ->> ? AS callNumber, i.json ->> ? AS callNumberSuffix,
i.json ->> ? AS copyNumber, i.json ->> ? AS permanentLocationId, i.json ->> ? AS discoverySuppress FROM cs00000int_mod_search.holding i WHERE instance_id = $1::uuid LIMIT ? OFFSET ?

3

SELECT id, jsonb, holdings_record_id FROM cs00000int_0002_mod_data_export.v_item WHERE holdings_record_id in ($1)

4

SELECT id, jsonb, holdings_record_id FROM cs00000int_0004_mod_data_export.v_item WHERE holdings_record_id in ($1)

5

SELECT * FROM v_folio_instance_all_non_deleted_non_suppressed WHERE id BETWEEN $1 AND $2 ORDER BY id ASC fetch first $3 rows only

Resource utilization for Test №5

Service Name

CPU

Service Name

RAM

Service Name

CPU

Service Name

RAM

mod-data-export-b

1100%

mod-inventory-b

73.98%

mod-serials-management-b

318%

mod-data-export-b

72.38%

mod-pubsub-b

36.70%

mgr-applications-b

67.41%

mod-inventory-b

31.61%

mod-data-import-b

65.46%

mod-data-export-worker-b

19.94%

mgr-tenant-entitlements-b

63.50%

folio-keycloak-b

16.63%

mod-data-export-worker-b

63.31%

mod-source-record-storage-b

16.56%

mod-finance-b

59.47%

kong-b

15.77%

folio-keycloak-b

58.00%

mod-consortia-keycloak-b

14.20%

mod-orders-b

57.34%

mod-search-b

12.68%

mod-roles-keycloak-b

55.70%

Service CPU Utilization

Here we can see that mod-data-export used 1100% CPU.

Service Memory Utilization

Here we can see that all modules show a stable trend 



Kafka metrics





DB CPU Utilization

DB CPU was 38%.

DB Connections

Max number of DB connections was 2930.

DB load

 

                                                                                                                    

Top SQL-queries



#

TOP 5 SQL statements

#

TOP 5 SQL statements

1

SELECT * FROM v_folio_holdings_all_non_deleted_non_suppressed WHERE id BETWEEN $1 AND $2 ORDER BY id ASC offset $3 rows fetch next $4 rows only

2

select ie1_0.id,ie1_0.jsonb from v_instance ie1_0 where ie1_0.id in ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31,$32,
$33,$34,$35,$36,$37,$38,$39,$40,$41,$42,$43,$44,$45,$46,$47,$48,$49,$50,$51,$52,$53,$54,$55,$56,$57,$58,$59,$60,$61,$62,$63,$64,$65,$66,$67,$68,$69,$70,$71,$72,$73,$74,$75,$76,$77,$78,$79,
$80,$81,$82,$83,$84,$85,$86,$87,$88,$89,$90,$91,$92,$93,$94,$95,$96,$97,$98,$99,$100,$101,$102,$103,$104,$105,$106,$107,$108

3

 SELECT * FROM v_marc_instance_all_non_deleted_non_suppressed WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC offset $3 rows fetch next $4 rows only

4

SELECT * FROM v_folio_instance_all_non_deleted_non_suppressed WHERE id BETWEEN $1 AND $2 ORDER BY id ASC fetch first $3 rows only

5

SELECT * FROM v_marc_instance_all_non_deleted_non_suppressed WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC fetch first $3 rows only

Appendix

Infrastructure

PTF -environment relc

Name

Memory GIB

vCPUs

db.r7g.4xlarge

 128 GiB

16 vCPUs

  • Number of records in DB:

    • cs00000int

      • instances - 20784436

      • holdings - 7900986 (tenant (cs00000int_0004) with the biggest number of holding records)

      • authorities - 12328793

  • Open Search ptf-loc

    • Data nodes

      • Instance type - r7g.xlarge.search

      • Number of nodes - 4

      • Version: OpenSearch_2_17_R20250403

    • Dedicated master nodes

      • Instance type - m7g.large.search

      • Number of nodes - 3

  • MSK tenant

    • 4 kafka.m7g.xlarge brokers in 2 zones

    • Apache Kafka version 3.7.x

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true

    • log.retention.minutes=480

    • default.replication.factor=3



Cluster Resources - relc (Jun 18, 2025, 9:45:13 AM) 2025



Methodology/Approach

Test set:

  • Test 1:  run Export All instances started by JMeter using endpoint(POST data-export/export-all) with profile Default instances export job profileData Export started on central tenant (cs00000int) with one task for data-export module.

  • Test 2:  run Export All holdings started by JMeter using endpoint(POST data-export/export-all) with profile Default holdings export job profileData Export started on one tenant (tenant (cs00000int_0004) with the biggest number of holding records) with one tasks for data-export module.

  • Test 3:  run Export All authority started by JMeter using endpoint(POST data-export/export-all) with profile Default authority export job profileData Export started on one tenant (cs00000int) with one tasks for data-export module.

  • Test 4:  run Export All instances started by JMeter using endpoint(POST data-export/export-all) with the Custom Profile Example 1. Data Export started on one tenant (cs00000int) with one task for data-export module.

  • Test 5:  run Export All holdings started by JMeter using endpoint(POST data-export/export-all) with profile Default holdings export job profileData Export started on one tenant (tenant (cs00000int_0004) with the biggest number of holding records) (without deleted records) with one tasks for data-export module





To get status and time range for import jobs the query used: 

SQL Query
select jsonb->'exportedFiles'->0->>'fileName' as fileName, job_profile_name,exported, started_date,completed_date, completed_date - started_date as duration ,status from fs09000000_mod_data_export.job_executions where started_date > '2024-07-8' order by started_date desc limit 1000;