PTF - Data Export Test Report (Poppy)
Overview
This document contains the results of testing Data Export (MARC BIB) on the Poppy release with files of 1k, 100k and 500k records. Three CSV files were prepared to run Data Export with the Default instances export job profile and the srs - holdings and items job profile. Because some processes were running in the background during the tests, an additional set of tests with 1k and 100k was run for each job profile. The graphs in this report show only the second set of tests.
Ticket: PERF-748
Summary
- DE job durations for the 1k, 100k and 500k files show no significant change compared with Orchid. No token issues were observed. DE jobs with the 500k file finished with FAIL status for both job profiles.
- The FAIL status for the DE srs - holdings and items job profile with the 500k file is likely related to the large volume of data transferred to the S3 bucket. It will be investigated in a story created by the Firebird team.
- Average CPU utilization for mod-data-export did not exceed 18%, spikes included. During the 100k tests it was 10-12%.
- Average memory consumption for mod-data-export was close to 100%.
- Average DB CPU utilization was 18%, and the average number of DB connections was 200. Spikes of 40% were observed every 15 minutes during the tests.
Recommendations & Jiras
- The FAIL status was previously investigated by the Firebird team as Data Export failures/stops triggered by a large file (> 196K); see MDEXP-658 and MDEXP-587.
- The problem with a long query (414 error) in the 500k test with the srs - holdings and items job profile should be resolved by MDEXP-607.
Test Results
This table contains durations for jobs with 2 job profiles.
Profile | CSV File | Poppy set 1 Result | Poppy set 1 Status | Poppy set 2 Result | Poppy set 2 Status |
DE MARC Bib (Default instances export job profile) | 1kDE.csv | 00:00:08 | COMPLETED | 00:00:23 | COMPLETED |
DE MARC Bib (Default instances export job profile) | 100kDE.csv | 00:15:36 | COMPLETED | 00:15:23 | COMPLETED |
DE MARC Bib (Default instances export job profile) | 500kDE.csv | 00:57:25 | FAIL | | |
DE MARC Bib (srs - holdings and items) | 1kDE.csv | 00:00:29 | COMPLETED | 00:00:38 | COMPLETED |
DE MARC Bib (srs - holdings and items) | 100kDE.csv | 00:47:23 | COMPLETED | 00:52:57 | COMPLETED |
DE MARC Bib (srs - holdings and items) | 500kDE.csv | 04:11:09 | FAIL | | |
Comparison
This table contains durations comparison between Orchid and Poppy releases
Profile | CSV File | Orchid Result | Orchid Status | Poppy set 1 Result | Poppy set 1 Status | Delta Orchid/Poppy set 1 (hh:mm:ss) | Poppy set 2 Result | Poppy set 2 Status | Delta Orchid/Poppy set 2 (hh:mm:ss) |
DE MARC Bib (Default instances export job profile) | 1kDE.csv | | | 00:00:08 | COMPLETED | | 00:00:23 | COMPLETED | |
DE MARC Bib (Default instances export job profile) | 100kDE.csv | | | 00:15:36 | COMPLETED | | 00:15:23 | COMPLETED | |
DE MARC Bib (Default instances export job profile) | 500kDE.csv | | | 00:57:25 | FAIL | | | | |
DE MARC Bib (srs - holdings and items) | 1kDE.csv | 00:00:27 | COMPLETED | 00:00:29 | COMPLETED | + 00:00:02 | 00:00:38 | COMPLETED | + 00:00:11 |
DE MARC Bib (srs - holdings and items) | 100kDE.csv | 00:47:51 | COMPLETED | 00:47:23 | COMPLETED | - 00:00:28 | 00:52:57 | COMPLETED | + 00:05:06 |
DE MARC Bib (srs - holdings and items) | 500kDE.csv | 04:00:26 | COMPLETED | 04:11:09 | FAIL | + 00:10:43 | | | |
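The DELTA columns are the Poppy duration minus the Orchid duration, written as a signed hh:mm:ss value. A minimal sketch of that arithmetic is shown here; the function names `to_seconds` and `format_delta` are illustrative and not part of any FOLIO tooling.

```python
def to_seconds(hhmmss: str) -> int:
    """Convert an hh:mm:ss duration string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def format_delta(baseline: str, current: str) -> str:
    """Return current minus baseline as a signed hh:mm:ss string."""
    diff = to_seconds(current) - to_seconds(baseline)
    sign = "-" if diff < 0 else "+"
    diff = abs(diff)
    return f"{sign} {diff // 3600:02d}:{diff % 3600 // 60:02d}:{diff % 60:02d}"


# 100k file, srs - holdings and items: Orchid vs Poppy set 2
print(format_delta("00:47:51", "00:52:57"))  # + 00:05:06
```

Applied to the srs - holdings and items rows, this reproduces the deltas listed in the table.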
Instance CPU Utilization
Service CPU Utilization
Memory Utilization
The graph shows that memory utilization for mod-data-export did not exceed 102% during the 100k test and grew to 112% at the end of the test. Memory consumption did not grow with the 500k file.
DB CPU Utilization
During the tests, spikes were observed on the DB every 15 minutes. Average CPU utilization was 18%.
DB Connections
The average number of DB connections was 200.
DB Load
SQL queries
Top-SQL statement:
inventory - go to tables loans - check are there any
autovacuum: VACUUM fs09000000_mod_inventory_storage.instance (to prevent wraparound)
WITH deleted_rows AS ( delete from marc_indexers mi where exists( select ? from marc_records_tracking mrt where mrt.is_dirty = ? and mrt.marc_id = mi.marc_id and mrt.version > mi.version ) returning mi.marc_id), deleted_rows2 AS ( delete from marc_indexers mi where exists( select ? from records_lb where records_lb.id = mi.marc_id and records_lb.state = ? ) returning mi.marc_id) INSERT IN
SELECT fs09000000_mod_inventory_storage.count_estimate(?)
with "cte" as (select count(*) from "records_lb" where ("records_lb"."external_id" in (cast($1 as uuid), cast($2 as uuid), cast($3 as uuid), cast($4 as uuid), cast($5 as uuid), cast($6 as uuid), cast($7 as uuid), cast($8 as uuid), cast($9 as uuid), cast($10 as uuid), cast($11 as uuid), cast($12 as uuid), cast($13 as uuid), cast($14 as uuid), cast($15 as uuid), cast($16 as uuid), cast($17 as uuid), cast($18 as uuid), cast($19 as uuid), cast($20 as uuid), cast($21 as uuid), cast($22 as uuid), cast
Errors / Additional information
- During the test with DE MARC Bib (Default instances export job profile), the UI showed this message:
- 2023-12-01T10:05:09.426+00:00 ERROR Export is completed with errors: some records have failed to export: number of failed records: 6
- During the test with DE MARC Bib (srs - holdings and items), the UI showed this message:
- 2023-12-01T17:38:18.129+00:00 ERROR Error while getting items by holding ids Exception while calling...
message: Get invalid response with status: 414
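HTTP status 414 means the request URI was too long, which can happen when many holding ids are packed into a single query string. One common way to avoid this is to split the ids into batches so that each query stays under a length limit. The sketch below illustrates that idea only; it is not the change made in MDEXP-607, and the query shape `holdingsRecordId==(...)` and the `max_chars` limit are assumptions chosen for illustration.

```python
import uuid
from typing import Iterator, List


def build_query(ids: List[str]) -> str:
    """Build an illustrative query string for a list of holding ids (assumed shape)."""
    return "holdingsRecordId==(" + " or ".join(ids) + ")"


def batch_by_query_length(ids: List[str], max_chars: int) -> Iterator[List[str]]:
    """Yield batches of ids whose query text fits within max_chars.

    A single id whose query already exceeds max_chars is still yielded alone.
    """
    batch: List[str] = []
    for record_id in ids:
        candidate = batch + [record_id]
        if batch and len(build_query(candidate)) > max_chars:
            yield batch
            batch = [record_id]
        else:
            batch = candidate
    if batch:
        yield batch


# Five placeholder UUIDs and a hypothetical 100-character limit
holding_ids = [str(uuid.UUID(int=i)) for i in range(5)]
for group in batch_by_query_length(holding_ids, max_chars=100):
    print(len(group), len(build_query(group)))
```

The length check here uses the raw query text; URL encoding makes the actual request longer, so a real limit would need some headroom.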
Methodology/Approach
Three files were prepared with the following query, using a LIMIT of 1000, 100000 or 500000 for the respective file: SELECT id FROM [tenant_id]_mod_inventory_storage.instance where jsonb->>'source'='MARC' LIMIT 1000|100000|500000;
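The file preparation step can be sketched as follows. The file names and limits come from this report; the helper names `build_id_query` and `write_ids_csv`, and the one-UUID-per-line CSV layout, are assumptions made for illustration. Running the query against the database is left out.

```python
import csv
from typing import Iterable

# File names and row limits as used in this report
EXPORT_FILES = {"1kDE.csv": 1000, "100kDE.csv": 100000, "500kDE.csv": 500000}


def build_id_query(tenant_id: str, limit: int) -> str:
    """Build the instance id query for one tenant and row limit."""
    if not tenant_id.isidentifier():
        raise ValueError("tenant_id must be a plain schema prefix")
    if limit <= 0:
        raise ValueError("limit must be positive")
    return (
        f"SELECT id FROM {tenant_id}_mod_inventory_storage.instance "
        f"WHERE jsonb->>'source'='MARC' LIMIT {limit};"
    )


def write_ids_csv(path: str, ids: Iterable[str]) -> int:
    """Write one id per row (assumed layout) and return the number of rows."""
    count = 0
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for record_id in ids:
            writer.writerow([record_id])
            count += 1
    return count


for file_name, limit in EXPORT_FILES.items():
    print(file_name, build_id_query("fs09000000", limit))
```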
All tests were carried out sequentially with each job profile.
The following query was used to get the status and time range of export jobs:
SELECT jsonb->>'status',jsonb->>'startedDate' AS startedDate,jsonb->>'completedDate' AS completedDate
FROM [REPLACE_tenant_id_HERE]_mod_data_export.job_executions
WHERE jsonb->>'jobProfileName'='[REPLACE_WITH_DE_JOB_HERE]'
ORDER BY jsonb->>'startedDate' desc LIMIT 10;
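The durations reported in the results tables can be derived from the startedDate and completedDate values returned by this query. A minimal sketch, assuming both values are ISO 8601 strings with a UTC offset (the format seen in the log lines of this report); the function name `job_duration` and the sample timestamps are made up for illustration.

```python
from datetime import datetime


def job_duration(started: str, completed: str) -> str:
    """Return the elapsed time between two ISO 8601 timestamps as hh:mm:ss."""
    start = datetime.fromisoformat(started)
    end = datetime.fromisoformat(completed)
    total = int((end - start).total_seconds())
    if total < 0:
        raise ValueError("completed must not be earlier than started")
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


# Hypothetical timestamps, not taken from the test runs
print(job_duration("2023-12-01T10:00:00.000+00:00", "2023-12-01T10:15:23.000+00:00"))
```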
Infrastructure
PTF environment: pcp1
- 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 2 database instances, writer/reader
Name | Memory GiB | vCPUs | max_connections |
db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731 |
- MSK tenant
- 4 m5.2xlarge brokers in 2 zones
- Apache Kafka version 2.8.0
- EBS storage volume per broker 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3
The table below lists the modules with their memory and CPU parameters (pcp1-pvt, Wed Nov 22 08:06:06 UTC 2023).
Module | Task Def. Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft limit | CPU units | Xmx | MetaspaceSize | MaxMetaspaceSize | R/W split enabled |
mod-data-export | 11 | 4.8.1 | 1 | 1024 | 896 | 1024 | 768 | 88 | 128 | FALSE |
mod-authtoken | 13 | 2.14.1 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 | FALSE |
mod-users-bl | 9 | 7.6.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 | FALSE |
mod-inventory-storage | 12 | 27.0.3 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 | FALSE |
mod-inventory | 11 | 20.1.3 | 2 | 2880 | 2592 | 1024 | 1814 | 384 | 512 | FALSE |
mod-source-record-storage | 15 | 5.7.3 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | FALSE |
mod-source-record-manager | 14 | 3.7.4 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | FALSE |
nginx-okapi | 9 | 2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 | FALSE |
okapi-b | 11 | 5.1.2 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512 | FALSE |
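When reading the memory figures, it can help to see how much of each container's hard limit is reserved for the Java heap. The sketch below computes Xmx as a percentage of the hard limit for a few rows of the table; it assumes both columns use the same unit, and the helper name `heap_share` is illustrative.

```python
# Values copied from the module table: hard limit, soft limit, Xmx
MODULES = {
    "mod-data-export": {"hard_limit": 1024, "soft_limit": 896, "xmx": 768},
    "mod-inventory-storage": {"hard_limit": 4096, "soft_limit": 3690, "xmx": 3076},
    "mod-source-record-storage": {"hard_limit": 5600, "soft_limit": 5000, "xmx": 3500},
}


def heap_share(xmx: int, hard_limit: int) -> float:
    """Return Xmx as a percentage of the hard memory limit, rounded to 0.1."""
    if hard_limit <= 0:
        raise ValueError("hard_limit must be positive")
    return round(100 * xmx / hard_limit, 1)


for name, params in MODULES.items():
    print(name, heap_share(params["xmx"], params["hard_limit"]))
```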