Copy of PTF - Performance testing of Export All Endpoint (Quesnelia) [non-ECS]

Copy of PTF - Performance testing of Export All Endpoint (Quesnelia) [non-ECS]

Overview

  • This document contains results of testing Data Export All by the endpoint (POST data-export/export-all) with Default instances, authority, export holdings job profiles and with the custom profile Example 1 on the Quesnelia [non-ECS] release on qcp1 environment

https://folio-org.atlassian.net/browse/PERF-890 

Summary

  • Data Export tests finished successfully on qcp1 environment using the Default instances, authority, holdings export job profile and Custom Mapping Profiles - Example 1.

  • During the tests data-export-all with Custom Mapping Profiles - Example 1 we observe Errors converting json to marc for instances with 9 and more holdings, task for investigation was created https://folio-org.atlassian.net/browse/MDEXP-780

  • Two very slow queries when checking deleted records in tables mod_inventory_storage.audit_instance and mod_inventory_storage.audit_holdings_record for Data-Export-All, task for investigation was created https://folio-org.atlassian.net/browse/MODINVSTOR-1234

  • No memory leaks are observed.

  • The data export duration for default profiles and the duration for the concurrent test of default profiles plus the export of 10,000 instances with a custom profile are the same. This is because Data-Export-All uses one data-export module task, so if we run concurrent tests, each job will be executed on a separate data-export module task, because data-export module has two tasks.

Test Runs and Results

This table contains durations for Data Export. 

Test #

Job Profile

Count of exported records

Data Export Duration
(hh:mm:ss)

Results

Test #

Job Profile

Count of exported records

Data Export Duration
(hh:mm:ss)

Results

1

Default instances export job profile

19664506

3:03:59

COMPLETED

Custom Mapping Profiles - Example 1

10000

0:11:14

2

Default authority export job profile

6186517

3:29:03

COMPLETED

Custom Mapping Profiles - Example 1

10000

0:01:49

3

Default holdings export job profile

20669235

1:46:56

COMPLETED

Custom Mapping Profiles - Example 1

10000

0:01:22

4

Default instances export job profile

19789508

2:34:09

COMPLETED

5

Default holdings export job profile

20794226

1:46:16

COMPLETED

6

Default authority export job profile

6186517

3:28:51

COMPLETED

7

Custom Mapping Profiles - Example 1

19789503

5:51:08

COMPLETED_WITH_ERRORS





Resource utilization for Test №1, №2 and №3

Service Name

CPU

Service Name

RAM

Service Name

CPU

Service Name

RAM

mod-data-export-b

143%

mod-data-export-b

106%

mod-inventory-b

12%

mod-source-record-manager-b

53%

mod-source-record-manager-b

1.90%

mod-source-record-storage-b

51%

mod-source-record-storage-b

1.40%

okapi-b

47%

okapi-b

0.80%

mod-users-bl-b

43%

mod-authtoken-b

0.60%

mod-inventory-b

26%

mod-users-bl-b

0.50%

mod-authtoken-b

26%

mod-inventory-storage-b

0.30%

mod-inventory-storage-b

17%

nginx-okapi

0.10%

pub-okapi

4.70%

pub-okapi

0.00%

nginx-okapi

4.60%

Service CPU Utilization

Here we can see that mod-data-export used 150% CPU.

Service Memory Utilization

Here we can see that data-export module used 95% memory.



Kafka metrics





DB CPU Utilization

DB CPU was 90%.

DB Connections

Max number of DB connections was 750

DB load

Top SQL-queries



#

TOP 5 SQL statements

#

TOP 5 SQL statements

1

SELECT id, content, external_id, record_type, state, leader_record_status, suppress_discovery FROM v_authority_all WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC offset $3 rows fetch next $4 rows only

2

select ie1_0.id,ie1_0.jsonb from v_instance ie1_0 where ie1_0.id in ($1...)

3

SELECT * FROM v_marc_instance_all_non_deleted_non_suppressed WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC offset $3 rows fetch next $4 rows only

4

SELECT * FROM v_folio_instance_all_non_deleted_non_suppressed WHERE id BETWEEN $1 AND $2 ORDER BY id ASC fetch first $3 rows only

5

SELECT * FROM v_marc_instance_all_non_deleted_non_suppressed WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC fetch first $3 rows only

Resource utilization for Test №4, №5 and №6

Service Name

CPU

Service Name

RAM

Service Name

CPU

Service Name

RAM

mod-data-export-b

177%

mod-data-export-b

103%

mod-inventory-b

12%

mod-source-record-manager-b

52%

mod-source-record-manager-b

1.80%

mod-inventory-b

48%

mod-source-record-storage-b

1.40%

okapi-b

38%

okapi-b

1.00%

mod-source-record-storage-b

31%

mod-authtoken-b

0.80%

mod-users-bl-b

31%

mod-users-bl-b

0.50%

mod-authtoken-b

26%

mod-inventory-storage-b

0.30%

mod-inventory-storage-b

17%

nginx-okapi

0.20%

nginx-okapi

4.70%

pub-okapi

0.10%

pub-okapi

4.40%

Service CPU Utilization

Here we can see that mod-data-export used 194% CPU.

Service Memory Utilization

Here we can see that all modules show a stable trend except mod-source-record-manager.



Kafka metrics





DB CPU Utilization

DB CPU was 92%.

DB Connections

Max number of DB connections was 762.

DB load

Top SQL-queries



#

TOP 5 SQL statements

#

TOP 5 SQL statements

1

SELECT id, content, external_id, record_type, state, leader_record_status, suppress_discovery FROM v_authority_all WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC offset $3 rows fetch next $4 rows only

2

select ie1_0.id,ie1_0.jsonb from v_instance ie1_0 where ie1_0.id in ($1...)

3

SELECT * FROM v_marc_instance_all_non_deleted_non_suppressed WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC offset $3 rows fetch next $4 rows only

4

SELECT * FROM v_folio_instance_all_non_deleted_non_suppressed WHERE id BETWEEN $1 AND $2 ORDER BY id ASC fetch first $3 rows only

5

SELECT * FROM v_marc_instance_all_non_deleted_non_suppressed WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC fetch first $3 rows only

Resource utilization for Test №7

Service Name

CPU

Service Name

RAM

Service Name

CPU

Service Name

RAM

mod-data-export-b

145%

mod-data-export-b

92%

mod-inventory-b

12.80%

mod-source-record-manager-b

47%

mod-source-record-manager-b

1.70%

okapi-b

46%

mod-authtoken-b

1.40%

mod-source-record-storage-b

39%

mod-source-record-storage-b

1.20%

mod-inventory-b

36%

okapi-b

1.00%

mod-users-bl-b

34%

mod-users-bl-b

0.40%

mod-authtoken-b

25%

mod-inventory-storage-b

0.40%

mod-inventory-storage-b

15%

nginx-okapi

0.30%

nginx-okapi

4.50%

pub-okapi

0.10%

pub-okapi

4.40%

Service CPU Utilization

Here we can see that mod-data-export used 145% CPU.

Service Memory Utilization

Here we can see that mod-data-export used 93% memory .



Kafka metrics







DB CPU Utilization

DB CPU was 82%.

DB Connections

Max number of DB connections was 730.

DB load

Top SQL-queries



#

TOP 5 SQL statements

#

TOP 5 SQL statements

1

select ie1_0.id,ie1_0.holdings_record_id,ie1_0.jsonb from v_item ie1_0 where ie1_0.holdings_record_id in ($1)

2

select iwhe1_0.id,iwhe1_0.hrid from v_instance_hrid iwhe1_0 where iwhe1_0.id in ($1,$2...)

3

SELECT * FROM v_marc_instance_all_non_deleted_non_suppressed WHERE external_id BETWEEN $1 AND $2 ORDER BY id ASC offset $3 rows fetch next $4 rows only

4

select hre1_0.id,hre1_0.instance_id,hre1_0.jsonb from v_holdings_record hre1_0 where hre1_0.instance_id=$1

5

SELECT * FROM v_folio_instance_all_non_deleted_non_suppressed WHERE id BETWEEN $1 AND $2 ORDER BY id ASC fetch first $3 rows only

Appendix

Infrastructure

PTF -environment qcp1

  • 10 m6i.2xlarge EC2 instances located in US East (N. Virginia)us-east-1

  • 1 database  instance, writer

  • Number of records in DB:

    •  fs09000000

      • instances - 27289981

      • items - 28463562

      • holdings - 27535678

      • authorities - 6193573

  • Open Search ptf-test

    • Data nodes

      • Instance type - r6g.2xlarge.search