PTF - Performance testing of Export All Endpoint (Quesnelia) [non-ECS]

Overview

  • This document contains the results of testing Data Export All through the endpoint (POST data-export/export-all) with the default instances, authority, and holdings export job profiles and with the custom profile Example 1, on the Quesnelia [non-ECS] release in the qcp1 environment.

PERF-890

Summary

  • Data Export tests finished successfully on the qcp1 environment using the default instances, authority, and holdings export job profiles and Custom Mapping Profiles - Example 1.
  • During the data-export-all tests with Custom Mapping Profiles - Example 1, we observed errors converting JSON to MARC for instances with 9 or more holdings. An investigation task was created: MDEXP-780.
  • Two very slow queries were observed for Data-Export-All when checking deleted records in the tables mod_inventory_storage.audit_instance and mod_inventory_storage.audit_holdings_record. An investigation task was created: MODINVSTOR-1234.
  • No memory leaks are observed.
  • The export duration for the default profiles is approximately the same whether they run alone or concurrently with an export of 10,000 instances using a custom profile. Data-Export-All runs on a single data-export module task, and the data-export module had two tasks in the concurrent tests, so each concurrent job was executed on a separate task.
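
The reasoning in the last point can be illustrated with a toy model. This is a sketch under a stated assumption, not a description of how mod-data-export or its load balancer actually schedules work: each job occupies exactly one task for its whole duration, and a new job goes to whichever task frees up first. The function name `wall_clock` is hypothetical; the two durations are the Test 1 values from the results table, read as 3:03:59 and 0:11:14.

```python
def wall_clock(durations_s, task_count):
    """Total elapsed seconds when each job runs on the task that frees up first."""
    busy_until = [0] * task_count
    for duration in durations_s:
        # Assumption: a job is never split across tasks.
        idx = busy_until.index(min(busy_until))
        busy_until[idx] += duration
    return max(busy_until)

export_all = 3 * 3600 + 3 * 60 + 59  # Test 1, Export All instances (3:03:59)
custom_10k = 11 * 60 + 14            # Test 1, custom profile export (0:11:14)

# Two tasks: the jobs do not queue behind each other.
print(wall_clock([export_all, custom_10k], task_count=2))  # 11039
# One task: the second job would wait for the first.
print(wall_clock([export_all, custom_10k], task_count=1))  # 11713
```

Under this model the concurrent run takes as long as the longer job alone, which is consistent with the observation above.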

Test Runs and Results

This table contains durations for Data Export.

| Test # | Job Profile | Count of exported records | Data Export Duration (hh:mm:ss) | Results |
|---|---|---|---|---|
| 1 | Default instances export job profile | 19664506 | 3:03:59 | COMPLETED |
|   | Custom Mapping Profiles - Example 1 | 10000 | 0:11:14 | |
| 2 | Default authority export job profile | 6186517 | 3:29:03 | COMPLETED |
|   | Custom Mapping Profiles - Example 1 | 10000 | 0:01:49 | |
| 3 | Default holdings export job profile | 20669235 | 1:46:56 | COMPLETED |
|   | Custom Mapping Profiles - Example 1 | 10000 | 0:01:22 | |
| 4 | Default instances export job profile | 19789508 | 2:34:09 | COMPLETED |
| 5 | Default holdings export job profile | 20794226 | 1:46:16 | COMPLETED |
| 6 | Default authority export job profile | 6186517 | 3:28:51 | COMPLETED |
| 7 | Custom Mapping Profiles - Example 1 | 19789503 | 5:51:08 | COMPLETED_WITH_ERRORS |
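
To compare runs of different sizes, the durations above can be turned into a records-per-second rate. The following is a minimal sketch; the helper names `to_seconds` and `records_per_second` are hypothetical, and the counts and durations are the single-job runs (Tests 4 to 7) as read from the table, with no additional measurements.

```python
def to_seconds(hhmmss: str) -> int:
    """Convert an h:mm:ss string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def records_per_second(records: int, duration: str) -> float:
    """Average export rate over the whole job."""
    return records / to_seconds(duration)

# (label, exported records, duration) copied from the table above.
runs = [
    ("Test 4, default instances", 19789508, "2:34:09"),
    ("Test 5, default holdings", 20794226, "1:46:16"),
    ("Test 6, default authority", 6186517, "3:28:51"),
    ("Test 7, custom Example 1", 19789503, "5:51:08"),
]

for label, records, duration in runs:
    rate = records_per_second(records, duration)
    print(f"{label}: {rate:.0f} records/s")
```

The rate is an average over the job and says nothing about how throughput varied during the run.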



Resource utilization for Test №1, №2 and №3

 Resource utilization table
| Service Name | CPU | Service Name | RAM |
|---|---|---|---|
| mod-data-export-b | 143% | mod-data-export-b | 106% |
| mod-inventory-b | 12% | mod-source-record-manager-b | 53% |
| mod-source-record-manager-b | 1.90% | mod-source-record-storage-b | 51% |
| mod-source-record-storage-b | 1.40% | okapi-b | 47% |
| okapi-b | 0.80% | mod-users-bl-b | 43% |
| mod-authtoken-b | 0.60% | mod-inventory-b | 26% |
| mod-users-bl-b | 0.50% | mod-authtoken-b | 26% |
| mod-inventory-storage-b | 0.30% | mod-inventory-storage-b | 17% |
| nginx-okapi | 0.10% | pub-okapi | 4.70% |
| pub-okapi | 0.00% | nginx-okapi | 4.60% |

Service CPU Utilization

Here we can see that mod-data-export used 150% CPU.

Service Memory Utilization

Here we can see that data-export module used 95% memory.


Kafka metrics



DB CPU Utilization

DB CPU was 90%.

DB Connections

Max number of DB connections was 750.

DB load

                                                                                                                     

Top SQL-queries


| # | TOP 5 SQL statements |
|---|---|
| 1 | SELECT id, content, external_id, record_type, state, leader_record_status, suppress_discovery FROM v_authority_all WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC offset $3 rows fetch next $4 rows only |
| 2 | select ie1_0.id,ie1_0.jsonb from v_instance ie1_0 where ie1_0.id in ($1...) |
| 3 | SELECT * FROM v_marc_instance_all_non_deleted_non_suppressed WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC offset $3 rows fetch next $4 rows only |
| 4 | SELECT * FROM v_folio_instance_all_non_deleted_non_suppressed WHERE id BETWEEN $1 AND $2 ORDER BY id ASC fetch first $3 rows only |
| 5 | SELECT * FROM v_marc_instance_all_non_deleted_non_suppressed WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC fetch first $3 rows only |

Resource utilization for Test №4, №5 and №6

 Resource utilization table
| Service Name | CPU | Service Name | RAM |
|---|---|---|---|
| mod-data-export-b | 177% | mod-data-export-b | 103% |
| mod-inventory-b | 12% | mod-source-record-manager-b | 52% |
| mod-source-record-manager-b | 1.80% | mod-inventory-b | 48% |
| mod-source-record-storage-b | 1.40% | okapi-b | 38% |
| okapi-b | 1.00% | mod-source-record-storage-b | 31% |
| mod-authtoken-b | 0.80% | mod-users-bl-b | 31% |
| mod-users-bl-b | 0.50% | mod-authtoken-b | 26% |
| mod-inventory-storage-b | 0.30% | mod-inventory-storage-b | 17% |
| nginx-okapi | 0.20% | nginx-okapi | 4.70% |
| pub-okapi | 0.10% | pub-okapi | 4.40% |

Service CPU Utilization

Here we can see that mod-data-export used 194% CPU.

Service Memory Utilization

Here we can see that all modules show a stable trend except mod-source-record-manager.


Kafka metrics



DB CPU Utilization

DB CPU was 92%.

DB Connections

Max number of DB connections was 762.

DB load

                                                                                                                     

Top SQL-queries


| # | TOP 5 SQL statements |
|---|---|
| 1 | SELECT id, content, external_id, record_type, state, leader_record_status, suppress_discovery FROM v_authority_all WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC offset $3 rows fetch next $4 rows only |
| 2 | select ie1_0.id,ie1_0.jsonb from v_instance ie1_0 where ie1_0.id in ($1...) |
| 3 | SELECT * FROM v_marc_instance_all_non_deleted_non_suppressed WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC offset $3 rows fetch next $4 rows only |
| 4 | SELECT * FROM v_folio_instance_all_non_deleted_non_suppressed WHERE id BETWEEN $1 AND $2 ORDER BY id ASC fetch first $3 rows only |
| 5 | SELECT * FROM v_marc_instance_all_non_deleted_non_suppressed WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC fetch first $3 rows only |

Resource utilization for Test №7

 Resource utilization table
| Service Name | CPU | Service Name | RAM |
|---|---|---|---|
| mod-data-export-b | 145% | mod-data-export-b | 92% |
| mod-inventory-b | 12.80% | mod-source-record-manager-b | 47% |
| mod-source-record-manager-b | 1.70% | okapi-b | 46% |
| mod-authtoken-b | 1.40% | mod-source-record-storage-b | 39% |
| mod-source-record-storage-b | 1.20% | mod-inventory-b | 36% |
| okapi-b | 1.00% | mod-users-bl-b | 34% |
| mod-users-bl-b | 0.40% | mod-authtoken-b | 25% |
| mod-inventory-storage-b | 0.40% | mod-inventory-storage-b | 15% |
| nginx-okapi | 0.30% | nginx-okapi | 4.50% |
| pub-okapi | 0.10% | pub-okapi | 4.40% |

Service CPU Utilization

Here we can see that mod-data-export used 145% CPU.

Service Memory Utilization

Here we can see that mod-data-export used 93% memory.


Kafka metrics




DB CPU Utilization

DB CPU was 82%.

DB Connections

Max number of DB connections was 730.

DB load

                                                                                                                     

Top SQL-queries


| # | TOP 5 SQL statements |
|---|---|
| 1 | select ie1_0.id,ie1_0.holdings_record_id,ie1_0.jsonb from v_item ie1_0 where ie1_0.holdings_record_id in ($1) |
| 2 | select iwhe1_0.id,iwhe1_0.hrid from v_instance_hrid iwhe1_0 where iwhe1_0.id in ($1,$2...) |
| 3 | SELECT * FROM v_marc_instance_all_non_deleted_non_suppressed WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC offset $3 rows fetch next $4 rows only |
| 4 | select hre1_0.id,hre1_0.instance_id,hre1_0.jsonb from v_holdings_record hre1_0 where hre1_0.instance_id=$1 |
| 5 | SELECT * FROM v_folio_instance_all_non_deleted_non_suppressed WHERE id BETWEEN $1 AND $2 ORDER BY id ASC fetch first $3 rows only |

Appendix

Infrastructure

PTF environment: qcp1

  • 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
  • 1 database instance, writer

    | Name | Memory GiB | vCPUs | max_connections |
    |---|---|---|---|
    | db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731 |
  • Number of records in DB:
    •  fs09000000
      • instances - 27289981
      • items - 28463562
      • holdings - 27535678
      • authorities - 6193573
  • OpenSearch ptf-test
    • Data nodes
      • Instance type - r6g.2xlarge.search
      • Number of nodes - 4
      • Version: OpenSearch_2_7_R20240502
    • Dedicated master nodes
      • Instance type - r6g.large.search
      • Number of nodes - 3
  • MSK tenant
    • 4 m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3


 Quesnelia modules memory and CPU parameters

Cluster Resources - qcp1 (Mon Jul 08 15:51:20 EEST 2024)

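| Module | Task Definition Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft Limit | CPU Units | Xmx | Metaspace Size | Max Metaspace Size |
|---|---|---|---|---|---|---|---|---|---|
| mod-remote-storage | 5 | mod-remote-storage:3.2.0 | 2 | 4920 | 4472 | 1024 | 3960 | 512 | 512 |
| mod-ncip | 5 | mod-ncip:1.14.4 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-finance-storage | 5 | mod-finance-storage:8.6.0 | 2 | 1024 | 896 | 1024 | 700 | 88 | 128 |
| mod-agreements | 5 | mod-agreements:7.0.0 | 2 | 1592 | 1488 | 128 | 0 | 0 | 0 |
| mod-ebsconet | 5 | mod-ebsconet:2.2.0 | 2 | 1248 | 1024 | 128 | 700 | 128 | 256 |
| mod-organizations | 5 | mod-organizations:1.9.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-consortia | 3 | mod-consortia:1.1.0 | 2 | 5136 | 4776 | 1024 | 4416 | 384 | 512 |
| edge-sip2 | 3 | edge-sip2:3.2.0-SNAPSHOT.209 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-serials-management | 5 | mod-serials-management:1.0.0 | 2 | 2480 | 2312 | 128 | 1792 | 384 | 512 |
| mod-settings | 5 | mod-settings:1.0.3 | 2 | 1024 | 896 | 200 | 768 | 88 | 128 |
| mod-data-import | 8 | mod-data-import:3.1.0 | 1 | 2048 | 1844 | 256 | 1292 | 384 | 512 |
| edge-dematic | 5 | edge-dematic:2.2.0 | 1 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-search | 5 | mod-search:3.2.0 | 2 | 2592 | 2480 | 2048 | 1440 | 512 | 1024 |
| mod-inn-reach | 3 | mod-inn-reach:3.2.0-SNAPSHOT.86 | 2 | 3600 | 3240 | 1024 | 2880 | 512 | 1024 |
| mod-tags | 5 | mod-tags:2.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-courses | 5 | edge-courses:1.4.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-authtoken | 6 | mod-authtoken:2.15.1 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
| mod-inventory-update | 5 | mod-inventory-update:3.3.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-notify | 5 | mod-notify:3.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-configuration | 5 | mod-configuration:5.10.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-orders-storage | 5 | mod-orders-storage:13.7.0 | 2 | 1024 | 896 | 512 | 700 | 88 | 128 |
| edge-caiasoft | 5 | edge-caiasoft:2.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-login-saml | 5 | mod-login-saml:2.8.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-erm-usage-harvester | 5 | mod-erm-usage-harvester:4.5.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-licenses | 5 | mod-licenses:6.0.0 | 2 | 2480 | 2312 | 512 | 1792 | 384 | 512 |
| mod-gobi | 5 | mod-gobi:2.8.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-password-validator | 5 | mod-password-validator:3.2.0 | 2 | 1440 | 1298 | 128 | 768 | 384 | 512 |
| mod-bulk-operations | 5 | mod-bulk-operations:2.0.0 | 2 | 3072 | 2600 | 1024 | 1536 | 384 | 512 |
| mod-fqm-manager | 5 | mod-fqm-manager:2.0.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-dcb | 5 | edge-dcb:1.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-graphql | 6 | mod-graphql:1.12.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-finance | 5 | mod-finance:4.9.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-erm-usage | 5 | mod-erm-usage:4.7.0 | 2 | 2800 | 2550 | 1024 | 1800 | 384 | 512 |
| mod-batch-print | 6 | mod-batch-print:1.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-copycat | 5 | mod-copycat:1.6.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-lists | 5 | mod-lists:2.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-entities-links | 6 | mod-entities-links:3.0.0 | 2 | 2592 | 2480 | 400 | 1440 | 0 | 1024 |
| mod-permissions | 10 | mod-permissions:6.5.0 | 2 | 1684 | 1544 | 512 | 1024 | 384 | 512 |
| pub-edge | 3 | pub-edge:2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 |
| mod-orders | 5 | mod-orders:12.8.0 | 2 | 2048 | 1740 | 1024 | 1024 | 384 | 512 |
| edge-patron | 5 | edge-patron:5.1.0 | 2 | 1024 | 896 | 256 | 768 | 88 | 128 |
| edge-ncip | 5 | edge-ncip:1.9.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-inn-reach | 3 | edge-inn-reach:3.1.1-SNAPSHOT.45 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-users-bl | 5 | mod-users-bl:7.7.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
| mod-oa | 4 | mod-oa:2.1.0-SNAPSHOT.62 | 2 | 1024 | 896 | 128 | 768 | 88 | 256 |
| mod-inventory-storage | 5 | mod-inventory-storage:27.1.0 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 |
| mod-invoice | 6 | mod-invoice:5.8.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
| mod-user-import | 5 | mod-user-import:3.8.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-sender | 6 | mod-sender:1.12.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-oai-pmh | 5 | edge-oai-pmh:2.9.0 | 2 | 1512 | 1360 | 1024 | 1440 | 384 | 512 |
| mod-data-export-worker | 5 | mod-data-export-worker:3.2.1 | 2 | 3072 | 2800 | 1024 | 2048 | 384 | 512 |
| mod-rtac | 5 | mod-rtac:3.6.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-circulation-storage | 5 | mod-circulation-storage:17.2.0 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 |
| mod-calendar | 5 | mod-calendar:3.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-source-record-storage | 7 | mod-source-record-storage:5.8.0 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 |
| mod-event-config | 5 | mod-event-config:2.7.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-courses | 5 | mod-courses:1.4.10 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-circulation-item | 5 | mod-circulation-item:1.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-inventory | 6 | mod-inventory:20.2.0 | 2 | 4096 | 3688 | 1024 | 1814 | 384 | 512 |
| mod-email | 5 | mod-email:1.17.0 | 2 | 2800 | 2550 | 512 | 1800 | 384 | 512 |
| mod-pubsub | 5 | mod-pubsub:2.13.0 | 2 | 1536 | 1440 | 1024 | 922 | 384 | 512 |
| mod-circulation | 5 | mod-circulation:24.2.0 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 |
| mod-di-converter-storage | 5 | mod-di-converter-storage:2.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-rtac | 5 | edge-rtac:2.7.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-orders | 5 | edge-orders:3.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-users | 6 | mod-users:19.3.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-template-engine | 5 | mod-template-engine:1.20.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-patron-blocks | 5 | mod-patron-blocks:1.10.0 | 2 | 1024 | 896 | 1024 | 768 | 88 | 128 |
| mod-audit | 5 | mod-audit:2.9.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-fqm | 5 | edge-fqm:2.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-source-record-manager | 6 | mod-source-record-manager:3.9.0-SNAPSHOT.330 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 |
| nginx-edge | 3 | nginx-edge:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 |
| mod-quick-marc | 5 | mod-quick-marc:5.1.0 | 1 | 2288 | 2176 | 128 | 1664 | 384 | 512 |
| nginx-okapi | 3 | nginx-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 |
| okapi-b | 5 | okapi:5.3.0 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512 |
| mod-feesfines | 5 | mod-feesfines:19.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-invoice-storage | 5 | mod-invoice-storage:5.8.0 | 2 | 1872 | 1536 | 1024 | 1024 | 384 | 512 |
| mod-dcb | 6 | mod-dcb:1.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-service-interaction | 5 | mod-service-interaction:4.0.1 | 2 | 2048 | 1844 | 256 | 1290 | 384 | 512 |
| mod-data-export | 17 | mod-data-export:5.0.4 | 1 | 2048 | 1844 | 2048 | 0 | 0 | 0 |
| mod-patron | 5 | mod-patron:6.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-oai-pmh | 5 | mod-oai-pmh:3.13.0 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 |
| edge-connexion | 5 | edge-connexion:1.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-kb-ebsco-java | 5 | mod-kb-ebsco-java:4.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-notes | 5 | mod-notes:5.2.0 | 2 | 1024 | 896 | 128 | 952 | 384 | 512 |
| mod-data-export-spring | 5 | mod-data-export-spring:3.2.0 | 1 | 2048 | 1844 | 256 | 1536 | 384 | 512 |
| mod-organizations-storage | 5 | mod-organizations-storage:4.7.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-login | 5 | mod-login:7.11.0 | 2 | 1440 | 1298 | 1024 | 768 | 384 | 512 |
| pub-okapi | 3 | pub-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 |
| mod-eusage-reports | 5 | mod-eusage-reports:2.1.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |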


Methodology/Approach

Test set:

  • Test 1: A file of 10k instance records was exported manually with the custom profile Example 1 while Export All instances ran concurrently, started by JMeter through the endpoint (POST data-export/export-all) with the Default instances export job profile. Data Export ran on one tenant (fs09000000) with two tasks for the data-export module.
  • Test 2: A file of 10k instance records was exported manually with the custom profile Example 1 while Export All holdings ran concurrently, started by JMeter through the endpoint (POST data-export/export-all) with the Default holdings export job profile. Data Export ran on one tenant (fs09000000) with two tasks for the data-export module.
  • Test 3: A file of 10k instance records was exported manually with the custom profile Example 1 while Export All authority ran concurrently, started by JMeter through the endpoint (POST data-export/export-all) with the Default authority export job profile. Data Export ran on one tenant (fs09000000) with two tasks for the data-export module.

Test set:

  • Test 4: Export All instances, started by JMeter through the endpoint (POST data-export/export-all) with the Default instances export job profile. Data Export ran on one tenant (fs09000000) with one task for the data-export module.
  • Test 5: Export All holdings, started by JMeter through the endpoint (POST data-export/export-all) with the Default holdings export job profile. Data Export ran on one tenant (fs09000000) with one task for the data-export module.
  • Test 6: Export All authority, started by JMeter through the endpoint (POST data-export/export-all) with the Default authority export job profile. Data Export ran on one tenant (fs09000000) with one task for the data-export module.

Test №7:

  • Export All instances, started by JMeter through the endpoint (POST data-export/export-all) with the custom profile Example 1. Data Export ran on one tenant (fs09000000) with one task for the data-export module.
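
The tests were driven by JMeter, but the request itself can be sketched in a few lines. The example below only builds the request and does not send it. Everything apart from the endpoint path and the tenant ID is an assumption: the body fields `idType` and `jobProfileId`, the Okapi headers, the host name, the token, and the job profile UUID are placeholders that should be checked against the mod-data-export API documentation for the deployed version. The function name `build_export_all_request` is hypothetical.

```python
import json
import urllib.request

def build_export_all_request(okapi_url, tenant, token, id_type, job_profile_id):
    """Build (but do not send) a POST data-export/export-all request."""
    # Assumed body shape; verify the field names against the module's API docs.
    body = {"idType": id_type, "jobProfileId": job_profile_id}
    return urllib.request.Request(
        url=f"{okapi_url.rstrip('/')}/data-export/export-all",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Okapi-Tenant": tenant,
            "X-Okapi-Token": token,
        },
    )

request = build_export_all_request(
    okapi_url="https://okapi.example.org",                   # placeholder host
    tenant="fs09000000",                                     # tenant used in the tests
    token="<okapi-token>",                                   # placeholder token
    id_type="instance",                                      # assumed value
    job_profile_id="00000000-0000-0000-0000-000000000000",   # placeholder UUID
)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would submit the job; it is left out here
# so that the sketch has no side effects.
```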


To get the status and time range of the export jobs, the following query was used:

SQL Query
select jsonb->'exportedFiles'->0->>'fileName' as fileName, job_profile_name, exported,
       started_date, completed_date, completed_date - started_date as duration, status
from fs09000000_mod_data_export.job_executions
where started_date > '2024-07-08' order by started_date desc limit 1000;