
PTF - Data Export Test Report (Orchid)


Overview

This document contains the results of testing Data Export (MARC BIB) on Orchid with baseline Data Export tests.

Ticket: PERF-669


Summary

  • Data export job durations show no degradation for any of the DE files (1k, 100k, 500k).
  • The maximum CPU utilization, 10%, was observed for the mod-data-export module during the DE job with 500k instances.

Test Results

Profile used for testing: "srs - holdings and items"

Test | File | Duration: Orchid
1    | 1k   | 30s
2    | 100k | 48m 22s
3    | 500k | 3h 53m 22s
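The durations above can be normalized to a per-record rate to sanity-check the "no degradation" claim; a quick arithmetic sketch (using only the figures transcribed from the table):

```python
# Derive approximate per-record throughput from the reported durations.
durations_s = {
    "1k": 30,                          # 30s
    "100k": 48 * 60 + 22,              # 48m 22s  -> 2902 s
    "500k": 3 * 3600 + 53 * 60 + 22,   # 3h 53m 22s -> 14002 s
}
records = {"1k": 1_000, "100k": 100_000, "500k": 500_000}

for name, secs in durations_s.items():
    rate = records[name] / secs
    print(f"{name}: {rate:.1f} records/s")
```

The rates come out at roughly 33-36 records/s across all three file sizes, i.e. throughput stays flat (even slightly improving) as the file grows, which is consistent with the summary.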

Instance CPU Utilization

Service CPU Utilization

Memory Utilization

DB CPU Utilization

DB Connections

DB Load

SQL queries

Top-SQL statement: 

SELECT fs09000000_mod_inventory_storage.count_estimate(?)

Additional information

In the UI, all jobs have status 'Completed', with the 'Failed' column value equal to '-1' for 1k and 100k and '-12' for 500k.

In the DB we can see that the exported value is higher than the total, which is why the 'Failed' column shows a negative value.

"status": "COMPLETED",
  "progress": {
    "total": 500000,
    "failed": -12,
    "exported": 500012
  }

Methodology/Approach

To get baseline numbers for Data Export, 3 files with instance IDs were used in the main tenant with 1 user.

To get the status and time range of export jobs, the following query was used:

SELECT jsonb->>'status',jsonb->>'startedDate' AS startedDate,jsonb->>'completedDate' AS completedDate
FROM [tenant_id]_mod_data_export.job_executions
WHERE jsonb->>'jobProfileName'='srs - holdings and items'
ORDER BY jsonb->>'startedDate' desc LIMIT 10;
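The startedDate/completedDate pairs returned by this query can be turned into durations; a small sketch, assuming ISO-8601 timestamps with a numeric UTC offset (the sample values below are illustrative, not taken from the report):

```python
from datetime import datetime

# Assumed jsonb date format, e.g. "2023-09-18T10:00:00.000+0000".
FMT = "%Y-%m-%dT%H:%M:%S.%f%z"

def job_duration(started: str, completed: str):
    """Return the elapsed time between two job-execution timestamps."""
    return datetime.strptime(completed, FMT) - datetime.strptime(started, FMT)

# Illustrative values only.
d = job_duration("2023-09-18T10:00:00.000+0000", "2023-09-18T13:53:22.000+0000")
print(d)  # 3:53:22
```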

Test preparation: 

  • 3 files were prepared with query: SELECT id FROM [tenant_id]_mod_inventory_storage.instance where jsonb->>'source'='MARC' LIMIT 1000|100000|500000;
  • All tests were carried out sequentially

Infrastructure

PTF environment: ncp5

  • 9 m6i.2xlarge EC2 instances located in US East (N. Virginia, us-east-1)
  • 2 database instances, one reader and one writer

    Name            | API Name      | Memory | vCPUs   | max_connections
    R6G Extra Large | db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731
  • Number of DB connections for mod-source-record-manager and mod-source-record-storage: 30.
  • MSK ptf-kakfa-3
    • 4 m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3
  • Kafka topics partitioning: 2 partitions for DI topics

Modules memory and CPU parameters

ncp5-pvt
Mon Sep 18 10:17:13 UTC 2023

Module                    | Task Def. Revision | Version | Task Count | Mem Hard Limit | Mem Soft Limit | CPU units | Xmx  | MetaspaceSize | MaxMetaspaceSize | R/W split enabled
mod-authtoken             | 8                  | 2.13.0  | 2          | 1440           | 1152           | 512       | 922  | 88            | 128              | FALSE
mod-users-bl              | 8                  | 7.5.0   | 2          | 1440           | 1152           | 512       | 922  | 88            | 128              | FALSE
mod-inventory-storage     | 12                 | 26.0.0  | 2          | 4096           | 3690           | 2048      | 3076 | 384           | 512              | FALSE
mod-source-record-storage | 27                 | 5.6.7   | 2          | 5600           | 5000           | 2048      | 3500 | 384           | 512              | FALSE
mod-source-record-manager | 18                 | 3.6.4   | 2          | 5600           | 5000           | 2048      | 3500 | 384           | 512              | FALSE
okapi-b                   | 8                  | 5.0.1   | 3          | 1684           | 1440           | 1024      | 922  | 384           | 512              | FALSE
mod-data-export           | 6                  | 4.7.1   | 1          | 1024           | 896            | 1024      | 768  | 88            | 128              | FALSE