PTF - Data Export Test Report (Quesnelia) [non-ECS]

Overview

This document contains the results of testing Data Export (MARC BIB) on the Quesnelia release with files of 1k, 100k, and 500k records. Three CSV files were prepared to run Data Export with the Default instances export job profile and the srs - holdings and items job profile.

Ticket: PERF-822


Summary

  • DE jobs perform dramatically better in the Quesnelia release than in Poppy. No token issues were observed. All jobs with files of 1,000, 100k, and 500k records completed successfully.
  • The improvement depends on file size and job profile: durations are 4 to 9 times shorter. An additional test was run with a job profile created by script to check the consistency of the results.
  • Average CPU utilization for mod-data-export depends on file size and job profile. Exporting 100k records: 63% with the Default profile and 92% with the custom job profile. Exporting 500k records: 434% with the Default instances export job profile and 296% with srs - holdings and items.
  • Average memory consumption for mod-data-export was close to 100%, almost the same as in the Poppy release.
  • Average DB CPU utilization: 17% with 100k and 33% with 500k. DB connections: 1360, compared with 200 in Poppy.

Recommendations & Jiras

  • Consider increasing the CPU allocation for the mod-data-export module to smooth out spikes when exporting large files.

Test Results

This table contains durations for jobs with 3 job profiles. 

CSV file     DE duration (Quesnelia)   Status

DE MARC Bib (Default instances export job profile)
1kDE.csv     00:00:02                  COMPLETED
100kDE.csv   00:02:17                  COMPLETED
500kDE.csv   00:05:10                  COMPLETED

DE MARC Bib (srs - holdings and items)
1kDE.csv     00:00:04                  COMPLETED
100kDE.csv   00:05:13                  COMPLETED
500kDE.csv   00:08:58                  COMPLETED

Export for Data Import updates (created by script)
1kDE.csv     00:00:04                  COMPLETED
100kDE.csv   00:05:08                  COMPLETED
500kDE.csv   00:10:41                  COMPLETED
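
The durations in the table above can also be read as average throughput. The following is a minimal sketch, assuming every record in the input file was exported and that the file names map to 1,000, 100,000, and 500,000 records; the helper names to_seconds and records_per_second are illustrative only.

```python
def to_seconds(hms: str) -> int:
    """Convert an hh:mm:ss duration string to whole seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def records_per_second(records: int, duration: str) -> float:
    """Average throughput, assuming all records in the file were exported."""
    return records / to_seconds(duration)


# Quesnelia durations copied from the table above (100k and 500k files).
quesnelia = {
    ("Default instances export job profile", 100_000): "00:02:17",
    ("Default instances export job profile", 500_000): "00:05:10",
    ("srs - holdings and items", 100_000): "00:05:13",
    ("srs - holdings and items", 500_000): "00:08:58",
    ("Export for Data Import updates", 100_000): "00:05:08",
    ("Export for Data Import updates", 500_000): "00:10:41",
}

for (profile, records), duration in quesnelia.items():
    rate = records_per_second(records, duration)
    print(f"{profile}, {records} records: {rate:.0f} records/s")
```

The 1k runs are left out because durations of a few seconds are too short to give a meaningful rate.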

Comparison

This table compares job durations between the Poppy and Quesnelia releases.

CSV file     Poppy 2 set            Quesnelia              Delta Poppy/Quesnelia
             Result     Status      Result     Status      hh:mm:ss

DE MARC Bib (Default instances export job profile)
1kDE.csv     00:00:08   COMPLETED   00:00:02   COMPLETED   00:00:06 - 4 times improvement
100kDE.csv   00:15:36   COMPLETED   00:02:17   COMPLETED   00:13:19 - 7 times improvement
500kDE.csv   00:57:25   FAIL        00:05:10   COMPLETED

DE MARC Bib (srs - holdings and items)
1kDE.csv     00:00:29   COMPLETED   00:00:04   COMPLETED   00:00:25 - 7 times improvement
100kDE.csv   00:47:23   COMPLETED   00:05:13   COMPLETED   00:42:10 - 9 times improvement
500kDE.csv   04:11:09   FAIL        00:08:58   COMPLETED
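
The delta and the "times improvement" figures follow directly from the two durations. A minimal sketch of that arithmetic is shown below; parse_duration and compare are illustrative helper names, and the improvement factor is assumed to be the Poppy duration divided by the Quesnelia duration, rounded to the nearest whole number.

```python
from datetime import timedelta
from typing import Tuple


def parse_duration(hms: str) -> timedelta:
    """Parse an hh:mm:ss string into a timedelta."""
    hours, minutes, seconds = map(int, hms.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def compare(poppy: str, quesnelia: str) -> Tuple[timedelta, float]:
    """Return (time saved, improvement factor) for one Poppy/Quesnelia pair."""
    before = parse_duration(poppy)
    after = parse_duration(quesnelia)
    return before - after, before / after


# Rows where the Poppy job completed; the 500k Poppy jobs failed,
# so no delta is reported for them.
rows = [
    ("Default, 1k", "00:00:08", "00:00:02"),
    ("Default, 100k", "00:15:36", "00:02:17"),
    ("srs - holdings and items, 1k", "00:00:29", "00:00:04"),
    ("srs - holdings and items, 100k", "00:47:23", "00:05:13"),
]

for label, poppy, quesnelia in rows:
    delta, factor = compare(poppy, quesnelia)
    print(f"{label}: saved {delta}, about {round(factor)} times faster")
```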

Service CPU Utilization

 CPU utilization

Default instances export job profile with 500k file

Module                        CPU (%)
mod-data-export-b             434.91
mod-inventory-b               10.79
mod-source-record-manager-b   1.74
mod-source-record-storage-b   1.5
okapi-b                       0.96
mod-users-bl-b                0.63
mod-authtoken-b               0.61
mod-inventory-storage-b       0.39
nginx-okapi                   0.24
pub-okapi                     0.16

srs - holdings and items with 500k file

Module                        CPU (%)
mod-data-export-b             296.31
mod-inventory-b               12.79
mod-source-record-manager-b   1.85
mod-source-record-storage-b   1.49
okapi-b                       1.1
mod-authtoken-b               0.95
mod-users-bl-b                0.69
mod-inventory-storage-b       0.54
nginx-okapi                   0.32
pub-okapi                     0.21

TOP 20 modules

Module                        CPU (%)
mod-data-export-b             296.31
mod-consortia-b               19.03
mod-inventory-b               12.79
mod-dcb-b                     8.17
mod-quick-marc-b              7.99
mod-pubsub-b                  7.05
mod-users-b                   6.85
mod-audit-b                   6.18
mod-kb-ebsco-java-b           5.78
edge-dematic-b                4.5
mod-erm-usage-harvester-b     4.49
mod-organizations-storage-b   4.04
mod-licenses-b                3.51
mod-permissions-b             3.06
mod-data-export-spring-b      2.78
mod-lists-b                   2.77
mod-organizations-b           2.74
mod-user-import-b             2.7
mod-tags-b                    2.57
mod-sender-b                  2.5

Mod-data-export-b

With the Default instances export job profile, mod-data-export-b used 92% CPU when exporting the 100k file and 434% when exporting the 500k file.

With the "Export for Data Import updates" job profile (created by script), it used 33% with the 100k file and 202% with the 500k file.

With the srs - holdings and items job profile, it used 63% with the 100k file and 296% with the 500k file.
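
Percentages above 100% are easier to reason about when expressed as vCPUs. The sketch below rests on two assumptions that this report does not state: that the utilization percentage is measured against the CPU units reserved for the task, and that mod-data-export reserves 1024 CPU units (the value read from the module table). Under the ECS convention of 1024 CPU units per vCPU, the conversion is a simple ratio; vcpus_in_use is an illustrative helper name.

```python
CPU_UNITS_PER_VCPU = 1024  # ECS convention: 1024 CPU units equal one vCPU


def vcpus_in_use(utilization_pct: float, reserved_cpu_units: int) -> float:
    """Translate a utilization percentage into vCPUs.

    Assumes the percentage is relative to the CPU units reserved for the task.
    """
    return utilization_pct / 100 * reserved_cpu_units / CPU_UNITS_PER_VCPU


# Assumed reservation for mod-data-export, read from the module table.
RESERVED_UNITS = 1024

# Average utilization for the 500k file, copied from this report.
observed = {
    "Default instances export job profile": 434.91,
    "srs - holdings and items": 296.31,
}

for profile, pct in observed.items():
    print(f"{profile}: about {vcpus_in_use(pct, RESERVED_UNITS):.1f} vCPU")
```

If these assumptions hold, the 500k exports used roughly three to four times the reserved CPU, which is consistent with the recommendation to increase the CPU allocation for mod-data-export.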

Memory Utilization

 Memory consumption
Module                        Memory (%)
mod-data-export-b             97
mod-inventory-b               55
okapi-b                       42
mod-source-record-manager-b   42
mod-users-bl-b                32
mod-source-record-storage-b   27
mod-authtoken-b               25
mod-inventory-storage-b       14
nginx-okapi                   5
pub-okapi                     4

TOP 20 modules srs - holdings and items with 500k file

Module                        Memory (%)
mod-data-export-b             97.51
mod-data-export-worker-b      89.71
mod-dcb-b                     78.72
mod-consortia-b               78.13
mod-oa-b                      76.73
mod-orders-b                  65.1
mod-copycat-b                 62.06
mod-calendar-b                56.01
mod-agreements-b              55.19
mod-invoice-b                 55.1
mod-permissions-b             55.07
mod-circulation-item-b        53.84
mod-erm-usage-harvester-b     52.36
mod-service-interaction-b     51.65
mod-notes-b                   50.5
mod-orders-storage-b          50.49
mod-users-b                   49.88
mod-tags-b                    47.66
mod-inventory-b               45.58
mod-audit-b                   44.97


Graph of DE-related modules: mod-data-export-b used 97% and mod-inventory-b used 55%. The rest stayed under 50%.

Graph of memory consumption for all modules: many modules exceed 50% consumption.

DB CPU Utilization

DB CPU utilization with the Default instances export job profile was 17% for the 100k file and 25% for the 500k file. With the Export for Data Import updates (created by script) job profile it was 17% for the 100k file and 38% for the 500k file.

With the srs - holdings and items job profile it was 17% for the 100k file and 33% for the 500k file.

DB Connections

DB connections averaged 1260, with peaks of up to 1360.

DB Load

[Graph: DB load, Default instances export job profile]

[Graph: DB load, Export for Data Import updates(test)1 job profile]

[Graph: DB load, srs - holdings and items job profile]

SQL queries

[Graph: SQL queries, Default instances export job profile]

[Graph: SQL queries, Export for Data Import updates(test)1 job profile]

[Graph: SQL queries, srs - holdings and items job profile]

Top-SQL statement: 

Default instances export job profile

autovacuum: VACUUM fs09000000_mod_data_export.job_executions_export_ids

INSERT INTO job_executions_export_ids (job_execution_id, instance_id) VALUES ($1, $2) ON CONFLICT DO NOTHING

srs - holdings and items job profile

autovacuum: VACUUM fs09000000_mod_data_export.job_executions_export_ids

select ie1_0.id,ie1_0.holdings_record_id,ie1_0.jsonb from v_item ie1_0 where ie1_0.holdings_record_id in ($1)

select hre1_0.id,hre1_0.instance_id,hre1_0.jsonb from v_holdings_record hre1_0 where hre1_0.instance_id=$1

INSERT INTO job_executions_export_ids (job_execution_id, instance_id) VALUES ($1, $2) ON CONFLICT DO NOTHING

Errors / Additional information

Note: inconsistent data or duplicated records in the file with instances may lead to job failures. A zip file is generated on the UI side when the number of records exceeds 100,000.
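
Since duplicated or malformed records in the input file may lead to job failures, the file can be checked before it is uploaded. The following is a minimal sketch, assuming the CSV holds one instance UUID per row with no header row; check_instance_ids is a hypothetical helper and not part of mod-data-export.

```python
import csv
import uuid
from collections import Counter


def check_instance_ids(lines):
    """Return (duplicates, malformed) for CSV lines with one instance id per row.

    `lines` is any iterable of strings, for example an open file object.
    """
    seen = Counter()
    malformed = []
    for row in csv.reader(lines):
        if not row or not row[0].strip():
            continue  # skip empty rows
        value = row[0].strip()
        try:
            # Normalise so that differently cased copies count as duplicates.
            seen[str(uuid.UUID(value))] += 1
        except ValueError:
            malformed.append(value)
    duplicates = sorted(v for v, count in seen.items() if count > 1)
    return duplicates, malformed


sample = [
    "11111111-1111-1111-1111-111111111111",
    "22222222-2222-2222-2222-222222222222",
    "11111111-1111-1111-1111-111111111111",
    "not-a-uuid",
]
duplicates, malformed = check_instance_ids(sample)
print("duplicates:", duplicates)
print("malformed:", malformed)
```

For a real file the same function can be called as check_instance_ids(open(path, newline="")). A header row, if present, would be reported as malformed.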

Methodology/Approach

Three files were prepared with the following query, using one LIMIT value per file: SELECT id FROM [tenant_id]_mod_inventory_storage.instance WHERE jsonb->>'source'='MARC' LIMIT 1000|100000|500000;
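
The three variants of the query differ only in the LIMIT value. The sketch below generates them for a given tenant; it assumes that the file names 1kDE.csv, 100kDE.csv, and 500kDE.csv correspond to the limits 1000, 100000, and 500000. instance_id_query and FILE_SIZES are illustrative names, and running the queries and writing the results to the files is left out.

```python
# Assumed pairing of the CSV files used in this report with the LIMIT values.
FILE_SIZES = {
    "1kDE.csv": 1_000,
    "100kDE.csv": 100_000,
    "500kDE.csv": 500_000,
}


def instance_id_query(tenant_id: str, limit: int) -> str:
    """Build the instance id query for one tenant and one file size."""
    # The tenant id becomes part of a schema name, so only accept plain identifiers.
    if not tenant_id.isidentifier():
        raise ValueError(f"unexpected tenant id: {tenant_id!r}")
    return (
        f"SELECT id FROM {tenant_id}_mod_inventory_storage.instance "
        f"WHERE jsonb->>'source' = 'MARC' LIMIT {int(limit)};"
    )


for file_name, limit in FILE_SIZES.items():
    print(file_name, "<-", instance_id_query("fs09000000", limit))
```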

All tests were carried out sequentially with each job profile on the main tenant, fs09000000.

The following queries were used to get the status and time range of the export jobs:

SQL Query
-- srs - holdings and items job profile
SELECT exported AS filesize, completed_date - started_date AS duration, job_profile_name, status, started_date, completed_date
FROM [tenant]_mod_data_export.job_executions
WHERE job_profile_name = 'srs - holdings and items'
ORDER BY completed_date DESC;

-- Default instances export job profile
SELECT exported AS filesize, completed_date - started_date AS duration, job_profile_name, status, started_date, completed_date
FROM [tenant]_mod_data_export.job_executions
WHERE job_profile_name = 'Default instances export job profile'
ORDER BY completed_date DESC;

-- Export for Data Import updates(test)1 job profile (created by script)
SELECT exported AS filesize, completed_date - started_date AS duration, job_profile_name, status, started_date, completed_date
FROM [tenant]_mod_data_export.job_executions
WHERE job_profile_name = 'Export for Data Import updates(test)1'
ORDER BY completed_date DESC;
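
The three queries differ only in the job profile name, so they can be produced from one template. This is a minimal sketch, assuming a Python DB-API driver that uses %s placeholders (psycopg2 is one such driver); job_status_query is an illustrative helper, and the connection handling is omitted.

```python
JOB_STATUS_SQL = """
SELECT exported AS filesize,
       completed_date - started_date AS duration,
       job_profile_name,
       status,
       started_date,
       completed_date
FROM {schema}.job_executions
WHERE job_profile_name = %s
ORDER BY completed_date DESC
"""

JOB_PROFILES = [
    "Default instances export job profile",
    "srs - holdings and items",
    "Export for Data Import updates(test)1",
]


def job_status_query(tenant: str, profile: str):
    """Return (sql, params) for one tenant and one job profile.

    The schema name cannot be passed as a bound parameter, so it is
    validated and formatted into the statement; the profile name is bound.
    """
    if not tenant.isidentifier():
        raise ValueError(f"unexpected tenant id: {tenant!r}")
    sql = JOB_STATUS_SQL.format(schema=f"{tenant}_mod_data_export")
    return sql, (profile,)


for profile in JOB_PROFILES:
    sql, params = job_status_query("fs09000000", profile)
    print(params[0], "->", " ".join(sql.split())[:60], "...")
```

With a driver of that kind, the pair would be passed on as cursor.execute(sql, params).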

Infrastructure

PTF environment: qcp1

  • 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
  • 2 database instances, writer/reader

    Name            Memory (GiB)   vCPUs   max_connections
    db.r6g.xlarge   32             4       2731

    Data set for fs09000000

    • Instances - 25606331
    • Items - 26779913
    • Holdings - 25576735
  • MSK tenant
    • 4 m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0
    • EBS storage volume per broker 300 GiB
    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3

QCP1 modules

 All qcp1 modules
Cluster: qcp1-pvt, Tue Jun 04 07:31:53 UTC 2024

Module                     Task Def.  Module Version                                Task   Mem Hard  Mem Soft  CPU    Xmx   MetaspaceSize  MaxMetaspaceSize
                           Revision                                                 Count  Limit     Limit     units
mod-remote-storage         4          mod-remote-storage:3.2.0                      2      4920      4472      1024   3960  512            512
mod-ncip                   4          mod-ncip:1.14.4                               2      1024      896       128    768   88             128
mod-finance-storage        4          mod-finance-storage:8.6.0                     2      1024      896       1024   700   88             128
mod-agreements             4          mod-agreements:7.0.0                          2      1592      1488      128    0     0              0
mod-ebsconet               4          mod-ebsconet:2.2.0                            2      1248      1024      128    700   128            256
mod-organizations          4          mod-organizations:1.9.0                       2      1024      896       128    700   88             128
mod-consortia              2          mod-consortia:1.1.0                           2      3072      2048      128    2048  512            1024
edge-sip2                  2          edge-sip2:3.2.0-SNAPSHOT.209                  2      1024      896       128    768   88             128
mod-serials-management     4          mod-serials-management:1.0.0                  2      2480      2312      128    1792  384            512
mod-settings               4          mod-settings:1.0.3                            2      1024      896       200    768   88             128
mod-data-import            7          mod-data-import:3.1.0                         1      2048      1844      256    1292  384            512
edge-dematic               4          edge-dematic:2.2.0                            1      1024      896       128    768   88             128
mod-search                 4          mod-search:3.2.0                              2      2592      2480      2048   1440  512            1024
mod-inn-reach              2          mod-inn-reach:3.2.0-SNAPSHOT.86               2      3600      3240      1024   2880  512            1024
mod-tags                   4          mod-tags:2.2.0                                2      1024      896       128    768   88             128
edge-courses               4          edge-courses:1.4.0                            2      1024      896       128    768   88             128
mod-authtoken              5          mod-authtoken:2.15.1                          2      1440      1152      512    922   88             128
mod-inventory-update       4          mod-inventory-update:3.3.0                    2      1024      896       128    768   88             128
mod-notify                 4          mod-notify:3.2.0                              2      1024      896       128    768   88             128
mod-configuration          4          mod-configuration:5.10.0                      2      1024      896       128    768   88             128
mod-orders-storage         4          mod-orders-storage:13.7.0                     2      1024      896       512    700   88             128
edge-caiasoft              4          edge-caiasoft:2.2.0                           2      1024      896       128    768   88             128
mod-login-saml             4          mod-login-saml:2.8.0                          2      1024      896       128    768   88             128
mod-erm-usage-harvester    4          mod-erm-usage-harvester:4.5.0                 2      1024      896       128    768   88             128
mod-licenses               4          mod-licenses:6.0.0                            2      2480      2312      128    1792  384            512
mod-gobi                   4          mod-gobi:2.8.0                                2      1024      896       128    700   88             128
mod-password-validator     4          mod-password-validator:3.2.0                  2      1440      1298      128    768   384            512
mod-bulk-operations        4          mod-bulk-operations:2.0.0                     2      3072      2600      1024   1536  384            512
mod-fqm-manager            4          mod-fqm-manager:2.0.1                         2      3000      2600      128    2048  384            512
edge-dcb                   4          edge-dcb:1.1.0                                2      1024      896       128    768   88             128
mod-graphql                5          mod-graphql:1.12.1                            2      1024      896       128    768   88             128
mod-finance                4          mod-finance:4.9.0                             2      1024      896       128    700   88             128
mod-erm-usage              4          mod-erm-usage:4.7.0                           2      1024      896       128    768   88             128
mod-batch-print            5          mod-batch-print:1.1.0                         2      1024      896       128    768   88             128
mod-copycat                4          mod-copycat:1.6.0                             2      1024      512       128    768   88             128
mod-lists                  4          mod-lists:2.0.0                               2      3000      2600      128    2048  384            512
mod-entities-links         5          mod-entities-links:3.0.0                      2      2592      2480      400    1440  0              1024
mod-permissions            8          mod-permissions:6.5.0                         2      1684      1544      512    1024  384            512
pub-edge                   3          pub-edge:2023.06.14                           2      1024      896       128    768   0              0
mod-orders                 4          mod-orders:12.8.0                             2      2048      1440      1024   1024  384            512
edge-patron                4          edge-patron:5.1.0                             2      1024      896       256    768   88             128
edge-ncip                  4          edge-ncip:1.9.2                               2      1024      896       128    768   88             128
edge-inn-reach             2          edge-inn-reach:3.1.1-SNAPSHOT.45              2      1024      896       128    768   88             128
mod-users-bl               4          mod-users-bl:7.7.0                            2      1440      1152      512    922   88             128
mod-oa                     2          mod-oa:2.1.0-SNAPSHOT.62                      2      1024      896       128    768   88             128
mod-inventory-storage      4          mod-inventory-storage:27.1.0                  2      4096      3690      2048   3076  384            512
mod-invoice                5          mod-invoice:5.8.0                             2      1440      1152      512    922   88             128
mod-user-import            4          mod-user-import:3.8.0                         2      1024      896       128    768   88             128
mod-sender                 5          mod-sender:1.12.0                             2      1024      896       128    768   88             128
edge-oai-pmh               4          edge-oai-pmh:2.9.0                            2      1512      1360      1024   1440  384            512
mod-data-export-worker     4          mod-data-export-worker:3.2.1                  2      3072      2048      1024   2048  384            512
mod-rtac                   4          mod-rtac:3.6.0                                2      1024      896       128    768   88             128
mod-circulation-storage    4          mod-circulation-storage:17.2.0                2      2880      2592      1536   1814  384            512
mod-calendar               4          mod-calendar:3.1.0                            2      1024      896       128    768   88             128
mod-source-record-storage  4          mod-source-record-storage:5.8.0               2      5600      5000      2048   3500  384            512
mod-event-config           4          mod-event-config:2.7.0                        2      1024      896       128    768   88             128
mod-courses                4          mod-courses:1.4.10                            2      1024      896       128    768   88             128
mod-circulation-item       4          mod-circulation-item:1.0.0                    2      1024      896       128    0     0              0
mod-inventory              4          mod-inventory:20.2.0                          2      2880      2592      1024   1814  384            512
mod-email                  4          mod-email:1.17.0                              2      1024      896       128    768   88             128
mod-pubsub                 4          mod-pubsub:2.13.0                             2      1536      1440      1024   922   384            512
mod-circulation            4          mod-circulation:24.2.0                        2      2880      2592      1536   1814  384            512
mod-di-converter-storage   4          mod-di-converter-storage:2.2.0                2      1024      896       128    768   88             128
edge-rtac                  4          edge-rtac:2.7.1                               2      1024      896       128    768   88             128
edge-orders                4          edge-orders:3.0.0                             2      1024      896       128    768   88             128
mod-users                  5          mod-users:19.3.1                              2      1024      896       128    768   88             128
mod-template-engine        4          mod-template-engine:1.20.0                    2      1024      896       128    768   88             128
mod-patron-blocks          4          mod-patron-blocks:1.10.0                      2      1024      896       1024   768   88             128
mod-audit                  4          mod-audit:2.9.0                               2      1024      896       128    768   88             128
edge-fqm                   4          edge-fqm:2.0.0                                2      1024      896       128    768   88             128
mod-source-record-manager  5          mod-source-record-manager:3.9.0-SNAPSHOT.330  2      5600      5000      2048   3500  384            512
nginx-edge                 3          nginx-edge:2023.06.14                         2      1024      896       128    0     0              0
mod-quick-marc             4          mod-quick-marc:5.1.0                          1      2288      2176      128    1664  384            512
nginx-okapi                3          nginx-okapi:2023.06.14                        2      1024      896       128    0     0              0
okapi-b                    4          okapi:5.3.0                                   3      1684      1440      1024   922   384            512
mod-feesfines              4          mod-feesfines:19.1.0                          2      1024      896       128    768   88             128
mod-invoice-storage        4          mod-invoice-storage:5.8.0                     2      1872      1536      1024   1024  384            512
mod-dcb                    5          mod-dcb:1.1.0                                 2      1024      896       128    768   88             128
mod-service-interaction    4          mod-service-interaction:4.0.1                 2      2048      1844      256    1290  384            512
mod-data-export            11         mod-data-export:5.0.4                         1      2048      1524      1024   0     0              0
mod-patron                 4          mod-patron:6.1.0                              2      1024      896       128    768   88             128
mod-oai-pmh                4          mod-oai-pmh:3.13.0                            2      4096      3690      2048   3076  384            512
edge-connexion             4          edge-connexion:1.2.0                          2      1024      896       128    768   88             128
mod-kb-ebsco-java          4          mod-kb-ebsco-java:4.0.0                       2      1024      896       128    768   88             128
mod-notes                  4          mod-notes:5.2.0                               2      1024      896       128    952   384            512
mod-data-export-spring     4          mod-data-export-spring:3.2.0                  1      2048      1844      256    1536  384            512
mod-organizations-storage  4          mod-organizations-storage:4.7.0               2      1024      896       128    700   88             128
mod-login                  4          mod-login:7.11.0                              2      1440      1298      1024   768   384            512
pub-okapi                  3          pub-okapi:2023.06.14                          2      1024      896       128    768   0              0
mod-eusage-reports         4          mod-eusage-reports:2.1.1                      2      1024      896       128    768   88             128

Data Export related modules

Cluster: qcp1-pvt, Tue Jun 04 07:31:53 UTC 2024

Module                     Task Def.  Module Version                                Task   Mem Hard  Mem Soft  CPU    Xmx   MetaspaceSize  MaxMetaspaceSize
                           Revision                                                 Count  Limit     Limit     units
mod-authtoken              5          mod-authtoken:2.15.1                          2      1440      1152      512    922   88             128
mod-users-bl               4          mod-users-bl:7.7.0                            2      1440      1152      512    922   88             128
mod-inventory-storage      4          mod-inventory-storage:27.1.0                  2      4096      3690      2048   3076  384            512
mod-source-record-storage  4          mod-source-record-storage:5.8.0               2      5600      5000      2048   3500  384            512
mod-inventory              4          mod-inventory:20.2.0                          2      2880      2592      1024   1814  384            512
mod-source-record-manager  5          mod-source-record-manager:3.9.0-SNAPSHOT.330  2      5600      5000      2048   3500  384            512
nginx-okapi                3          nginx-okapi:2023.06.14                        2      1024      896       128    0     0              0
okapi-b                    4          okapi:5.3.0                                   3      1684      1440      1024   922   384            512
mod-data-export            11         mod-data-export:5.0.4                         1      2048      1524      1024   0     0              0
pub-okapi                  3          pub-okapi:2023.06.14                          2      1024      896       128    768   0              0