PTF - Data Export Test Report (Sunflower) [ECS]
1 Overview
2 Summary
7 Appendix
7.1 Infrastructure
Overview
This document contains the results of testing Data Export (MARC BIB) on the Sunflower [ECS] release.
https://folio-org.atlassian.net/browse/PERF-1119
Summary
Data Export tests finished successfully on the Sunflower (secon) environment using two job profiles: Default instances export job profile and srs - holdings and items.
Data Export tests were executed on the College tenant only.
Sunflower release results
Data Export:
Default instances export job profile
File with 1k records - 7 seconds
File with 100k records - 2 minutes 55 seconds
File with 500k records - 4 minutes 28 seconds
srs - holdings and items
File with 1k records - 8 seconds
File with 100k records - 6 minutes 52 seconds
File with 500k records - 7 minutes 48 seconds
Comparison of Ramsons (previous results) and Sunflower: Data Export performed better in Ramsons for the 1k and 100k files. For the 500k file, Sunflower was faster with the srs - holdings and items profile and nearly equal with the Default instances export job profile. Performance depends on file size and job profile. Change in duration, Sunflower relative to Ramsons:
Default instances export job profile
File with 100k records: +52.17%
File with 500k records: +0.75%
srs - holdings and items
File with 100k records: +31.62%
File with 500k records: -6.21%
With the 500k records file, mod-data-export was the top CPU consumer: 166% with the Default instances export job profile and 120% with srs - holdings and items.
Concurrent Data Export testing with the srs - holdings and items job profile revealed slowness on the College DCB tenant. One notable difference is that service CPU utilization is significantly higher in the Sunflower environment than in Ramsons. The service memory utilization metric shows that the most memory-intensive services differ between the Sunflower and Ramsons environments.
Test Runs
Profile | Test # | CSV File |
---|---|---|
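DE MARC Bib (Default instances export job profile) | 1 | 1k.csv |
DE MARC Bib (Default instances export job profile) | 2 | 100k.csv |
DE MARC Bib (Default instances export job profile) | 3 | 500k.csv |
DE MARC Bib (srs - holdings and items) | 4 | 1k.csv |
DE MARC Bib (srs - holdings and items) | 5 | 100k.csv |
DE MARC Bib (srs - holdings and items) | 6 | 500k.csv |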
Test Results
This table contains durations for Data Export with 2 job profiles.
Profile | CSV File | Result, College tenant (cs00000int_0001), h:mm:ss | Status |
---|---|---|---|
DE MARC Bib (Default instances export job profile) | 1k.csv | 0:00:07 | COMPLETED |
DE MARC Bib (Default instances export job profile) | 100k.csv | 0:02:55 | COMPLETED |
DE MARC Bib (Default instances export job profile) | 500k.csv | 0:04:28 | COMPLETED |
DE MARC Bib (srs - holdings and items) | 1k.csv | 0:00:08 | COMPLETED |
DE MARC Bib (srs - holdings and items) | 100k.csv | 0:06:52 | COMPLETED |
DE MARC Bib (srs - holdings and items) | 500k.csv | 0:07:48 | COMPLETED |
Comparison
This table compares Data Export durations between the Ramsons and Sunflower releases.
Profile | CSV File | Ramsons duration, College tenant (cs00000int_0001), hh:mm:ss | Sunflower duration, College tenant (cs00000int_0001), hh:mm:ss | DE duration delta, Sunflower vs. Ramsons (percent / time) |
---|---|---|---|---|
DE MARC Bib (Default instances export job profile) | 1k.csv | 00:00:02 | 00:00:07 | 250.00% / 5 sec |
DE MARC Bib (Default instances export job profile) | 100k.csv | 00:01:55 | 00:02:55 | 52.17% / 60 sec |
DE MARC Bib (Default instances export job profile) | 500k.csv | 00:04:26 | 00:04:28 | 0.75% / 2 sec |
DE MARC Bib (srs - holdings and items) | 1k.csv | 00:00:07 | 00:00:08 | 14.28% / 1 sec |
DE MARC Bib (srs - holdings and items) | 100k.csv | 00:05:13 | 00:06:52 | 31.62% / 1 min 39 sec |
DE MARC Bib (srs - holdings and items) | 500k.csv | 00:08:19 | 00:07:48 | -6.21% / -31 sec |
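The delta values in the comparison table can be reproduced with a short calculation. The sketch below assumes the delta is computed as (Sunflower - Ramsons) / Ramsons, which matches the reported figures; the function names `parse_hms` and `duration_delta` are illustrative and not part of any FOLIO tooling.

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Parse a duration written as hh:mm:ss (for example "00:02:55")."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def duration_delta(baseline: str, current: str) -> tuple[float, int]:
    """Return (percent change, change in seconds) of current vs. baseline.

    Assumption: the report's delta is relative to the baseline (Ramsons) run.
    """
    base = parse_hms(baseline).total_seconds()
    curr = parse_hms(current).total_seconds()
    diff = curr - base
    return diff / base * 100, int(diff)


# Default instances export job profile, 100k.csv: Ramsons vs. Sunflower
percent, seconds = duration_delta("00:01:55", "00:02:55")
print(f"{percent:+.2f}% / {seconds:+d} sec")  # +52.17% / +60 sec
```

A positive value means the Sunflower run took longer than the Ramsons run; a negative value means it was faster.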
Resource utilization
Service CPU Utilization
Maximum CPU utilization (mod-data-export) occurred with the 500k file: 166% with the Default instances export job profile and 120% with srs - holdings and items.
Service Memory Utilization
Maximum memory consumption: mgr-applications - 75%, mod-dcb - 75%, mod-inventory - 67%, mod-scheduler - 62%.
DB CPU Utilization
Maximum RDS CPU utilization occurred with the 500k file: 31% with the srs - holdings and items job profile and 25% with the Default instances export job profile.
DB Connections
DB connections: 1171 on average. No spikes across file sizes or job profiles.
DB load
Top SQL-queries
# | TOP SQL statements |
---|---|
1 | select iwhe1_0.id,iwhe1_0.hrid from v_instance_hrid iwhe1_0 where iwhe1_0.id in ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37,$38,$39,$40,$41,$42,$43,$44,$45,$46,$47,$48,$49,$50,$51,$52,$53,$54,$55,$56,$57,$58,$59,$60,$61,$62,$63,$64,$65,$66,$67,$68,$69,$70,$71,$72,$73,$74,$75,$76,$77,$78,$79,$80,$81,$82,$83,$84,$85,$86,$87,$88,$89,$90,$91,$92,$93,$94,$95,$96,$97,$98,$99,$100,$101,$102,$103,$104,$105,$1 |
2 | select mre1_0.id,mre1_0.content,mre1_0.external_id,mre1_0.generation,mre1_0.leader_record_status,mre1_0.record_type,mre1_0.state,mre1_0.suppress_discovery from v_marc_records_lb mre1_0 where mre1_0.external_id in ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37,$38,$39,$40,$41,$42,$43,$44,$45,$46,$47,$48,$49,$50,$51,$52,$53,$54,$55,$56,$57,$58,$59,$60,$61,$62,$63,$64,$65,$66,$67,$68,$69,$70,$71,$72,$73,$74 |
3 | SELECT id, jsonb, holdings_record_id FROM cs00000int_0001_mod_data_export.v_item WHERE holdings_record_id in ($1) |
4 | select hre1_0.id,hre1_0.instance_id,hre1_0.jsonb from v_holdings_record hre1_0 where hre1_0.instance_id=$1 |
5 | INSERT INTO job_executions_export_ids (job_execution_id, instance_id) VALUES ($1, $2) ON CONFLICT DO NOTHING |
6 | COMMIT |
7 | select eie1_0.id,eie1_0.instance_id,eie1_0.job_execution_id from job_executions_export_ids eie1_0 where eie1_0.job_execution_id=$1 and eie1_0.instance_id>=$2 and eie1_0.instance_id<=$3 order by eie1_0.instance_id offset $4 rows fetch first $5 rows only |
Appendix
Infrastructure