PTF - Data Export Test Report (Fameflower)

PTF - Data Export Test Report (Fameflower)

Overview

In this workflow, we are checking the performance of exporting MARC Bib records workflow running in the Fameflower release - PERF-29: Exporting MARC Bib recordsClosed

We tested with 1, 2 virtual users for 30 minutes.  A longevity test will also be executed to see if there were memory issues.

Backend:

  • mod-data-export-1.1.1

  • mod-source-record-storage-3.1.4

  • mod-source-record-manager-2.1.4

  • okapi-2.38.0

Frontend:

  • folio_data-export-1.0.2

Environment:

  • 7.2 million UChi SRS records

  • 7.2 million inventory records

  • 55 FOLIO back-end modules deployed in 110 ECS services

  • 3 okapi ECS services

  • 8 m5.large  EC2 instances

  •  db.r5.xlarge AWS RDS instance

  • INFO logging level

High-Level Summary

  1. EXPLAIN ANALYZE statements were found which increases the overall execution time by 100%.

  2. SQL query takes 38 seconds to export 1 instance record and grows linearly with the number of records. Therefore with 5 instance records, it takes over 5+ minutes.

Test Runs

Test

Virtual Users

Duration

OKAPI log level

OKAPI Version

Profiled

Avg Total time to Export 1 instance

Avg Total time to Export 5 instances

Avg Total time to Export 6 instances

1.

1

30 min

INFO

2.38.0

No

1 minute

5 minutes 58 seconds

6 minutes 43 seconds

2.

2

30 min

INFO

2.38.0

No

2 minutes 32 seconds

11 minutes 10 seconds

11 minutes 52 seconds

3.

1

30 min

INFO

2.38.0

Yes

1 minute 20 seconds

6 minutes 14 seconds

6 minutes 44 seconds

4.

2

30 min

INFO

2.38.0

Yes

2 minutes 40 seconds

10 minutes 53 seconds

11 minutes 52 seconds

Results

  1. Overall data export of Bib Marc record's time in seconds

    Above data export jobs are run against 7.2 million SRS records. As we increase the number of inventory instance, export time linearly increases with the number of records. Data export fails after running with 6 instances. Since it is an asynchronous task, the export job does not fail immediately. So for example exporting 6 instances will eventually fail but will still take 6 minutes until it stops.



       2. Slow APIs taking more than 100 ms to return

API

1 user (75th %tile)

2 users (75th %tile)

API

1 user (75th %tile)

2 users (75th %tile)

POST_/data-export/fileDefinitions/{id}/upload - 1 instance

1.02 minutes

1.33475 minutes

API

5 instances

6 instances

API

5 instances

6 instances

GET_/source-storage/records?query=externalIdsHolder.instanceId

Time - 6 minutes 13 seconds



SRS records successfully retrieved

Time - 6 minutes 40 seconds



SRS records failed to retrieve due to Time-out error

JVM Profiling

  • Overall slow methods (between the modules profiled: okapi, mod-data-export, mod-source-record-storage)

http://ec2-3-93-19-104.compute-1.amazonaws.com/grafana/d/U9JtDPLWz/stacktrace?orgId=1&class=org.apache.logging.log4j.spi.AbstractLogger&method=debug&from=1588464851532&to=1591056851532

http://ec2-3-93-19-104.compute-1.amazonaws.com/grafana/d/U9JtDPLWz/stacktrace?orgId=1&class=com.fasterxml.jackson.databind.ObjectMapper&method=writeValueAsString&from=1588464851532&to=1591056851532

Database

The database does not show much CPU usage for 1, 2 users run. At maximum 40%-50% CPU usage for the high case of 2 users exporting 6 inventory instances.



The following WARNING statements of missing indexes were generated  by mod-source-record-storage during a test run:

WARNING: Doing LIKE search without index for records.jsonb->'externalIdsHolder'->>'instanceId', CQL >>> SQL: externalIdsHolder.instanceId == bd2a6f64-7b77-42b6-a100-568b33e836ad >>> lower(f_unaccent(records.jsonb->'externalIdsHolder'->>'instanceId')) LIKE lower(f_unaccent('bd2a6f64-7b77-42b6-a100-568b33e836ad')) WARNING: Doing LIKE search without index for records.jsonb->'externalIdsHolder'->>'instanceId', CQL >>> SQL: externalIdsHolder.instanceId == 0b9dda1d-ba76-4296-aa2b-056d3fb7debe >>> lower(f_unaccent(records.jsonb->'externalIdsHolder'->>'instanceId')) LIKE lower(f_unaccent('0b9dda1d-ba76-4296-aa2b-056d3fb7debe')) WARNING: Doing LIKE search without index for records.jsonb->'externalIdsHolder'->>'instanceId', CQL >>> SQL: externalIdsHolder.instanceId == d31ca25c-756a-4cc9-b85b-6e9a4b5755a6 >>> lower(f_unaccent(records.jsonb->'externalIdsHolder'->>'instanceId')) LIKE lower(f_unaccent('d31ca25c-756a-4cc9-b85b-6e9a4b5755a6')) WARNING: Doing LIKE search without index for records.jsonb->'externalIdsHolder'->>'instanceId', CQL >>> SQL: externalIdsHolder.instanceId == ce8d0d63-7e4f-4630-9f44-350a8cc2984d >>> lower(f_unaccent(records.jsonb->'externalIdsHolder'->>'instanceId')) LIKE lower(f_unaccent('ce8d0d63-7e4f-4630-9f44-350a8cc2984d')) WARNING: Doing LIKE search without index for records.jsonb->'externalIdsHolder'->>'instanceId', CQL >>> SQL: externalIdsHolder.instanceId == a6faacc9-3cdd-44ec-9df7-4c29b5315b98 >>> lower(f_unaccent(records.jsonb->'externalIdsHolder'->>'instanceId')) LIKE lower(f_unaccent('a6faacc9-3cdd-44ec-9df7-4c29b5315b98'))



RDS Postgresql slow queries:

  1. Export 1 inventory instance record:

fs09000000_mod_source_record_storage@folio:[7395]:LOG: statement: SELECT id, jsonb, totalrows FROM get_records('WHERE lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''957a8840-e3ad-4c2b-83c3-23333391095d'')) ', '', 10, 0, 'fs09000000_mod_source_record_storage') fs09000000_mod_source_record_storage@folio:[7395]:LOG: duration: 38243.470 ms fs09000000_mod_source_record_storage@folio:[7395]:LOG: statement: EXPLAIN ANALYZE SELECT id, jsonb, totalrows FROM get_records('WHERE lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''957a8840-e3ad-4c2b-83c3-23333391095d'')) ', '', 10, 0, 'fs09000000_mod_source_record_storage') fs09000000_mod_source_record_storage@folio:[7395]:LOG: duration: 41883.335 ms

Total duration: 38243.470 ms + 41883.335 ms = ~1 minute 20 seconds



      2. Export 5 inventory instance records:

fs09000000_mod_source_record_storage@folio:[31709]:LOG: statement: SELECT id, jsonb, totalrows FROM get_records('WHERE ((((lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''bd2a6f64-7b77-42b6-a100-568b33e836ad''))) OR (lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''0b9dda1d-ba76-4296-aa2b-056d3fb7debe'')))) OR (lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''d31ca25c-756a-4cc9-b85b-6e9a4b5755a6'')))) OR (lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''ce8d0d63-7e4f-4630-9f44-350a8cc2984d'')))) OR (lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''a6faacc9-3cdd-44ec-9df7-4c29b5315b98''))) ', '', 10, 0, 'fs09000000_mod_source_record_storage') fs09000000_mod_source_record_storage@folio:[31709]:LOG: duration: 177984.112 ms fs09000000_mod_source_record_storage@folio:[31709]:LOG: statement: EXPLAIN ANALYZE SELECT id, jsonb, totalrows FROM get_records('WHERE ((((lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''bd2a6f64-7b77-42b6-a100-568b33e836ad''))) OR (lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''0b9dda1d-ba76-4296-aa2b-056d3fb7debe'')))) OR (lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''d31ca25c-756a-4cc9-b85b-6e9a4b5755a6'')))) OR (lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''ce8d0d63-7e4f-4630-9f44-350a8cc2984d'')))) OR (lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''a6faacc9-3cdd-44ec-9df7-4c29b5315b98''))) ', '', 10, 0, 'fs09000000_mod_source_record_storage') fs09000000_mod_source_record_storage@folio:[31709]:LOG: duration: 180791.965 ms

Total duration: 177984.112 ms + 180791.965 ms = ~5 minute 57 seconds

CPU and Memory Utilization

CPU was stable throughout the runs, only a spike here or there, but in a 30 minutes run they were consistent. 

Memory utilization is also stable for 1-2 users run while exporting 6 inventory instances against 7.2 million SRS records. Maximum utilization is around 52% - 55%. 

mod-data-export:



mod-source-record-storage

Recommended improvement

Appendix

For more raw data of the test run please see the attached data-export-test-report.xls.