PTF - Data Export Test Report (Poppy)

Overview

This document contains the results of testing Data Export (MARC BIB) on the Poppy release with input files of 1k, 100k, and 500k records. Three CSV files were prepared to run Data Export with two job profiles: the Default instances export job profile and the srs - holdings and items job profile. Because some processes were running in the background during the first set of tests, an additional set of tests was run with the 1k and 100k files for each job profile. The graphs in this report show only the second set of tests.

Ticket: https://folio-org.atlassian.net/browse/PERF-748

 

Summary

  • Durations of DE jobs with the 1k, 100k, and 500k files show no significant changes compared with Orchid. No token issues were observed. DE jobs with the 500k file completed with FAIL status in both profiles.

  • The FAIL status for the DE srs - holdings and items job profile with the 500k file is likely related to the volume of data transferred to the S3 bucket being too high. It will be investigated in a story created by the Firebird team.

  • Average CPU utilization for mod-data-export did not exceed 18%, spikes included. During the 100k tests it was 10-12%.

  • Average memory consumption for mod-data-export was close to 100%.

  • Average DB CPU utilization was 18%, with an average of 200 DB connections. During the tests, spikes of up to 40% were observed every 15 minutes.

Recommendations & Jiras

Test Results

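This table contains job durations (hh:mm:ss) and statuses for the two job profiles. A dash (-) marks a cell with no value in the original table.

Profile                                            | CSV File   | Poppy set 1 Result | Poppy set 1 Status | Poppy set 2 Result | Poppy set 2 Status
---------------------------------------------------+------------+--------------------+--------------------+--------------------+-------------------
DE MARC Bib (Default instances export job profile) | 1kDE.csv   | 00:00:08           | COMPLETED          | 00:00:23           | COMPLETED
DE MARC Bib (Default instances export job profile) | 100kDE.csv | 00:15:36           | COMPLETED          | 00:15:23           | COMPLETED
DE MARC Bib (Default instances export job profile) | 500kDE.csv | 00:57:25           | FAIL               | -                  | -
DE MARC Bib (srs - holdings and items)             | 1kDE.csv   | 00:00:29           | COMPLETED          | 00:00:38           | COMPLETED
DE MARC Bib (srs - holdings and items)             | 100kDE.csv | 00:47:23           | COMPLETED          | 00:52:57           | COMPLETED
DE MARC Bib (srs - holdings and items)             | 500kDE.csv | 04:11:09           | FAIL               | -                  | -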

 

 

Comparison

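This table compares durations between the Orchid and Poppy releases. Durations and deltas are in hh:mm:ss. A dash (-) marks a cell with no value in the original table.

Profile                                            | CSV File   | Orchid Result | Orchid Status | Poppy set 1 Result | Poppy set 1 Status | Delta Orchid/Poppy set 1 | Poppy set 2 Result | Poppy set 2 Status | Delta Orchid/Poppy set 2
---------------------------------------------------+------------+---------------+---------------+--------------------+--------------------+--------------------------+--------------------+--------------------+-------------------------
DE MARC Bib (Default instances export job profile) | 1kDE.csv   | -             | -             | 00:00:08           | COMPLETED          | -                        | 00:00:23           | COMPLETED          | -
DE MARC Bib (Default instances export job profile) | 100kDE.csv | -             | -             | 00:15:36           | COMPLETED          | -                        | 00:15:23           | COMPLETED          | -
DE MARC Bib (Default instances export job profile) | 500kDE.csv | -             | -             | 00:57:25           | FAIL               | -                        | -                  | -                  | -
DE MARC Bib (srs - holdings and items)             | 1kDE.csv   | 00:00:27      | COMPLETED     | 00:00:29           | COMPLETED          | + 00:00:02               | 00:00:38           | COMPLETED          | + 00:00:11
DE MARC Bib (srs - holdings and items)             | 100kDE.csv | 00:47:51      | COMPLETED     | 00:47:23           | COMPLETED          | - 00:00:28               | 00:52:57           | COMPLETED          | + 00:05:06
DE MARC Bib (srs - holdings and items)             | 500kDE.csv | 04:00:26      | COMPLETED     | 04:11:09           | FAIL               | + 00:10:43               | -                  | -                  | -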
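The DELTA values in the comparison are the Poppy duration minus the Orchid duration. A minimal Python sketch of that arithmetic is shown here; the helper names `parse_hms` and `format_delta` are illustrative and are not part of any FOLIO tooling.

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Parse an hh:mm:ss duration string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def format_delta(baseline: str, current: str) -> str:
    """Return current - baseline as a signed '+ hh:mm:ss' or '- hh:mm:ss' string."""
    diff = int((parse_hms(current) - parse_hms(baseline)).total_seconds())
    sign = "-" if diff < 0 else "+"
    hours, rest = divmod(abs(diff), 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{sign} {hours:02d}:{minutes:02d}:{seconds:02d}"


# Orchid vs. Poppy set 1, srs - holdings and items profile (durations from this report)
print(format_delta("00:00:27", "00:00:29"))  # + 00:00:02
print(format_delta("00:47:51", "00:47:23"))  # - 00:00:28
print(format_delta("04:00:26", "04:11:09"))  # + 00:10:43
```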

 

 

 

Instance CPU Utilization

Service CPU Utilization

Memory Utilization

This graph shows that mod-data-export memory utilization did not exceed 102% during the test with the 100k file and grew to 112% by the end of the test. Memory consumption did not grow with the 500k file.

DB CPU Utilization

During the tests, spikes were observed on the DB every 15 minutes. Average CPU utilization was 18%.

 

DB Connections

The average number of DB connections was 200.

 

 

DB Load

SQL queries

Top-SQL statement: 

inventory - go to tables loans - check are there any

autovacuum: VACUUM fs09000000_mod_inventory_storage.instance (to prevent wraparound)

WITH deleted_rows AS ( delete from marc_indexers mi where exists( select ? from marc_records_tracking mrt where mrt.is_dirty = ? and mrt.marc_id = mi.marc_id and mrt.version > mi.version ) returning mi.marc_id), deleted_rows2 AS ( delete from marc_indexers mi where exists( select ? from records_lb where records_lb.id = mi.marc_id and records_lb.state = ? ) returning mi.marc_id) INSERT IN

SELECT fs09000000_mod_inventory_storage.count_estimate(?)

with "cte" as (select count(*) from "records_lb" where ("records_lb"."external_id" in (cast($1 as uuid), cast($2 as uuid), cast($3 as uuid), cast($4 as uuid), cast($5 as uuid), cast($6 as uuid), cast($7 as uuid), cast($8 as uuid), cast($9 as uuid), cast($10 as uuid), cast($11 as uuid), cast($12 as uuid), cast($13 as uuid), cast($14 as uuid), cast($15 as uuid), cast($16 as uuid), cast($17 as uuid), cast($18 as uuid), cast($19 as uuid), cast($20 as uuid), cast($21 as uuid), cast($22 as uuid), cast

Errors / Additional information

  • During the test with DE MARC Bib (Default instances export job profile), the UI showed this message:

    • 2023-12-01T10:05:09.426+00:00 ERROR Export is completed with errors: some records have failed to export: number of failed records: 6

  • During the test with DE MARC Bib (srs - holdings and items), the UI showed this message:

    • 2023-12-01T17:38:18.129+00:00 ERROR Error while getting items by holding ids Exception while calling...
      message: Get invalid response with status: 414
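Status 414 in the second message is the HTTP status code for "URI Too Long", which suggests that the request built from the holding ids exceeded the URL length the server accepts. The report does not state how the module builds that request, so the following Python sketch only illustrates one common mitigation: splitting the ids into batches whose encoded query stays under a limit. The CQL field name `holdingsRecordId`, the 2000-character limit, and both function names are assumptions for illustration, not the behavior of mod-data-export.

```python
from typing import Iterator, List
from urllib.parse import quote

# Assumed limit; the real limit enforced by the server is not stated in this report.
MAX_QUERY_LENGTH = 2000


def build_query(ids: List[str]) -> str:
    """Build a URL-encoded CQL query (hypothetical field name) for a group of holding ids."""
    cql = "holdingsRecordId==(" + " or ".join(ids) + ")"
    return quote(cql)


def batch_ids(ids: List[str], max_length: int = MAX_QUERY_LENGTH) -> Iterator[List[str]]:
    """Yield consecutive groups of ids whose encoded query stays within max_length."""
    batch: List[str] = []
    for holding_id in ids:
        candidate = batch + [holding_id]
        if batch and len(build_query(candidate)) > max_length:
            # Adding this id would exceed the limit: emit the batch and start a new one.
            yield batch
            batch = [holding_id]
        else:
            batch = candidate
    if batch:
        yield batch


# Made-up ids, used only to show the batching
sample_ids = [f"00000000-0000-0000-0000-{i:012d}" for i in range(200)]
batches = list(batch_ids(sample_ids))
print(len(batches), "batches; longest query:", max(len(build_query(b)) for b in batches))
```

Each batch would then be sent as its own request, trading one oversized request for several smaller ones.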

Methodology/Approach

Three files were prepared with the following query, using a LIMIT of 1000, 100000, or 500000 respectively:

SELECT id FROM [tenant_id]_mod_inventory_storage.instance where jsonb->>'source'='MARC' LIMIT 1000|100000|500000;
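The three variants of the id query can be generated from the tenant id and the row limit, as sketched below in Python. The function name `build_id_query` is illustrative, and the tenant id `fs09000000` is only assumed from the schema prefix that appears in the SQL statements of this report.

```python
# File names and row limits used in this report
EXPORT_FILES = {"1kDE.csv": 1000, "100kDE.csv": 100000, "500kDE.csv": 500000}


def build_id_query(tenant_id: str, limit: int) -> str:
    """Build the instance-id query for one input file."""
    return (
        f"SELECT id FROM {tenant_id}_mod_inventory_storage.instance "
        f"WHERE jsonb->>'source'='MARC' LIMIT {limit};"
    )


# 'fs09000000' is an assumed tenant id (see the lead-in above)
for file_name, limit in EXPORT_FILES.items():
    print(file_name, "<-", build_id_query("fs09000000", limit))
```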

All tests were carried out sequentially with each job profile. 

To get the status and time range of the export jobs, the following query was used:

SELECT jsonb->>'status',jsonb->>'startedDate' AS startedDate,jsonb->>'completedDate' AS completedDate
FROM [REPLACE_tenant_id_HERE]_mod_data_export.job_executions
WHERE jsonb->>'jobProfileName'='[REPLACE_WITH_DE_JOB_HERE]'
ORDER BY jsonb->>'startedDate' desc LIMIT 10;
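A job duration can be derived from the startedDate and completedDate values returned by this query. The Python sketch below assumes that both values are ISO 8601 strings with a "+00:00" style offset, as in the log timestamps quoted in this report; the rows are made-up placeholders and not measured results.

```python
from datetime import datetime, timedelta


def job_duration(started: str, completed: str) -> timedelta:
    """Return completedDate - startedDate for two ISO 8601 timestamps."""
    return datetime.fromisoformat(completed) - datetime.fromisoformat(started)


# Placeholder rows in the shape returned by the query: (status, startedDate, completedDate)
rows = [
    ("COMPLETED", "2023-01-01T09:00:00.000+00:00", "2023-01-01T09:01:30.500+00:00"),
    ("FAIL", "2023-01-01T10:00:00.000+00:00", "2023-01-01T12:00:00.000+00:00"),
]

for status, started, completed in rows:
    print(status, job_duration(started, completed))
```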

Infrastructure

PTF environment: pcp1

  • 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1

  • 2 database instances, writer/reader

  • MSK tenant

    • 4 m5.2xlarge brokers in 2 zones

    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true

    • log.retention.minutes=480

    • default.replication.factor=3

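This table lists the modules with their memory and CPU parameters.

pcp1-pvt (Wed Nov 22 08:06:06 UTC 2023)

Module                    | Task Def. Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft Limit | CPU units | Xmx  | MetaspaceSize | MaxMetaspaceSize | R/W split enabled
--------------------------+--------------------+----------------+------------+----------------+----------------+-----------+------+---------------+------------------+------------------
mod-data-export           | 11                 | 4.8.1          | 1          | 1024           | 896            | 1024      | 768  | 88            | 128              | FALSE
mod-authtoken             | 13                 | 2.14.1         | 2          | 1440           | 1152           | 512       | 922  | 88            | 128              | FALSE
mod-users-bl              | 9                  | 7.6.0          | 2          | 1440           | 1152           | 512       | 922  | 88            | 128              | FALSE
mod-inventory-storage     | 12                 | 27.0.3         | 2          | 4096           | 3690           | 2048      | 3076 | 384           | 512              | FALSE
mod-inventory             | 11                 | 20.1.3         | 2          | 2880           | 2592           | 1024      | 1814 | 384           | 512              | FALSE
mod-source-record-storage | 15                 | 5.7.3          | 2          | 5600           | 5000           | 2048      | 3500 | 384           | 512              | FALSE
mod-source-record-manager | 14                 | 3.7.4          | 2          | 5600           | 5000           | 2048      | 3500 | 384           | 512              | FALSE
nginx-okapi               | 9                  | 2023.06.14     | 2          | 1024           | 896            | 128       | 0    | 0             | 0                | FALSE
okapi-b                   | 11                 | 5.1.2          | 3          | 1684           | 1440           | 1024      | 922  | 384           | 512              | FALSE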