Overview
This document contains the results of testing Data Export (MARC BIB) on Orchid with baseline Data Export tests.
Ticket: PERF-669
Summary
- Data export job durations show no degradation for any of the DE files (1k, 100k, 500k).
- The maximum CPU utilization of the mod-data-export module was 10%, observed during the DE job with 500k instances.
Test Results
Profile used for testing - "srs - holdings and items"
Test | File | Duration: Orchid |
---|---|---|
1 | 1k | 30s |
2 | 100k | 48m 22s |
3 | 500k | 3h 53m 22s |
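To compare the three runs, the durations above can be converted into an approximate throughput in records per second. The following is a minimal Python sketch; the helper name `to_seconds` is illustrative, and the derived rates are simple arithmetic on the table values rather than figures measured in the original test.

```python
import re

def to_seconds(duration: str) -> int:
    """Convert a duration such as '3h 53m 22s' to seconds."""
    units = {"h": 3600, "m": 60, "s": 1}
    return sum(int(value) * units[unit]
               for value, unit in re.findall(r"(\d+)\s*([hms])", duration))

# Record counts and durations copied from the table above.
results = {1_000: "30s", 100_000: "48m 22s", 500_000: "3h 53m 22s"}

# Derived throughput (records per second) for each file size.
throughput = {records: records / to_seconds(duration)
              for records, duration in results.items()}

for records, rate in throughput.items():
    print(f"{records:>7} records: {rate:.1f} records/s")
```

With these inputs the rate works out to roughly 33 to 36 records per second for all three file sizes, so export time scales close to linearly with the number of instances.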
Instance CPU Utilization
Service CPU Utilization
Memory Utilization
DB CPU Utilization
DB Connections
DB Load
SQL queries
Top-SQL statement:
SELECT fs09000000_mod_inventory_storage.count_estimate(?)
Additional information
In the UI, all jobs have the status COMPLETED, but the 'Failed' column shows '-1' for the 1k and 100k jobs and '-12' for the 500k job.
In the DB, the exported value is higher than the total, which is why the 'Failed' column shows a negative value. For the 500k job:
"status": "COMPLETED",
"progress": {
"total": 500000,
"failed": -12,
"exported": 500012
}
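The negative value is consistent with the failed count being derived as total minus exported. This relation is an assumption inferred from the numbers above, not confirmed against the mod-data-export code. A minimal Python sketch of the arithmetic:

```python
import json

# Progress fragment copied from the job execution record above.
job = json.loads("""
{
  "status": "COMPLETED",
  "progress": {"total": 500000, "failed": -12, "exported": 500012}
}
""")

progress = job["progress"]

# Assumption: failed = total - exported (inferred from the data, not from the module code).
derived_failed = progress["total"] - progress["exported"]

print(derived_failed)                        # -12
print(derived_failed == progress["failed"])  # True
```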
Methodology/Approach
To get baseline numbers for Data Export, 3 files with instance IDs were exported in the main tenant with 1 user.
The following query was used to get the status and time range of the export jobs:
SELECT jsonb->>'status',jsonb->>'startedDate' AS startedDate,jsonb->>'completedDate' AS completedDate
FROM [tenant_id]_mod_data_export.job_executions
WHERE jsonb->>'jobProfileName'='srs - holdings and items'
ORDER BY jsonb->>'startedDate' desc LIMIT 10;
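The startedDate and completedDate values returned by this query can be turned into a job duration. The sketch below is in Python; the two timestamps are hypothetical placeholders (not values from the test run), and the ISO 8601 format with an explicit UTC offset is an assumption about how the dates are stored.

```python
from datetime import datetime

def job_duration_seconds(started: str, completed: str) -> float:
    """Return the elapsed seconds between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(completed) - datetime.fromisoformat(started)
    return delta.total_seconds()

# Hypothetical example values, for illustration only.
started_date = "2023-09-18T10:00:00.000+00:00"
completed_date = "2023-09-18T10:00:30.000+00:00"

duration = job_duration_seconds(started_date, completed_date)
print(f"{duration:.0f}s")
```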
Test preparation:
- 3 files were prepared with the following query, using one LIMIT value per file: SELECT id FROM [tenant_id]_mod_inventory_storage.instance WHERE jsonb->>'source'='MARC' LIMIT 1000|100000|500000;
- All tests were carried out sequentially
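The LIMIT 1000|100000|500000 notation stands for three separate runs of the same query. A minimal Python sketch that expands it into one statement per file is shown here; `[tenant_id]` is kept as the placeholder from the original and has to be substituted before running, and the variable names are illustrative.

```python
# Query template from the test preparation step; [tenant_id] is a placeholder.
QUERY_TEMPLATE = (
    "SELECT id FROM [tenant_id]_mod_inventory_storage.instance "
    "WHERE jsonb->>'source'='MARC' LIMIT {limit};"
)

# One limit per prepared file (1k, 100k, 500k).
LIMITS = [1_000, 100_000, 500_000]

queries = [QUERY_TEMPLATE.format(limit=limit) for limit in LIMITS]

for query in queries:
    print(query)
```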
Infrastructure
PTF environment: ncp5
- 9 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 2 database instances: one reader and one writer
Name | API Name | Memory (GiB) | vCPUs | max_connections |
---|---|---|---|---|
R6G Extra Large | db.r6g.xlarge | 32 | 4 | 2731 |
- Number of connections for mod-source-record-manager and mod-source-record-storage: 30
- MSK ptf-kakfa-3
- 4 m5.2xlarge brokers in 2 zones
Apache Kafka version 2.8.0
EBS storage volume per broker 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3
- Kafka topics partitioning: 2 partitions for DI topics
Modules memory and CPU parameters
ncp5-pvt, Mon Sep 18 10:17:13 UTC 2023
Module | Task Def. Revision | Version | Task Count | Mem Hard Limit | Mem Soft Limit | CPU units | Xmx | MetaspaceSize | MaxMetaspaceSize | R/W split enabled |
---|---|---|---|---|---|---|---|---|---|---|
mod-authtoken | 8 | 2.13.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 | FALSE |
mod-users-bl | 8 | 7.5.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 | FALSE |
mod-inventory-storage | 12 | 26.0.0 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 | FALSE |
mod-source-record-storage | 27 | 5.6.7 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | FALSE |
mod-source-record-manager | 18 | 3.6.4 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | FALSE |
okapi-b | 8 | 5.0.1 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512 | FALSE |
mod-data-export | 6 | 4.7.1 | 1 | 1024 | 896 | 1024 | 768 | 88 | 128 | FALSE |