PTF - Data Export Test Report (Fameflower)

Overview

In this workflow we check the performance of the MARC Bib records export workflow running in the Fameflower release (PERF-29).

We tested with 1 and 2 virtual users for 30 minutes. A longevity test will also be executed to check for memory issues.

Backend:

  • mod-data-export-1.1.1
  • mod-source-record-storage-3.1.4
  • mod-source-record-manager-2.1.4
  • okapi-2.38.0

Frontend:

  • folio_data-export-1.0.2

Environment:

  • 7.2 million UChi SRS records
  • 7.2 million inventory records
  • 55 FOLIO back-end modules deployed in 110 ECS services
  • 3 okapi ECS services
  • 8 m5.large EC2 instances
  • db.r5.xlarge AWS RDS instance
  • INFO logging level

High-Level Summary

  1. EXPLAIN ANALYZE statements were found that double the overall execution time (an increase of 100%).
  2. The SQL query takes 38 seconds to export 1 instance record, and the time grows linearly with the number of records; with 5 instance records the export takes over 5 minutes (see the rough numbers below).
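
To make both points concrete, here is a rough calculation based on the query durations captured in the "RDS PostgreSQL slow queries" section below (all numbers come from this report's own logs):

    1 instance:  38.2 s (SELECT) + 41.9 s (EXPLAIN ANALYZE) ≈  80 s ≈ 1 minute 20 seconds
    5 instances: 178.0 s (SELECT) + 180.8 s (EXPLAIN ANALYZE) ≈ 359 s ≈ 6 minutes

Removing the duplicated EXPLAIN ANALYZE call would therefore roughly halve these totals.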

Test Runs

Test | Virtual users | Duration | OKAPI log level | OKAPI version | Profiled | Avg total time to export 1 instance | Avg total time to export 5 instances | Avg total time to export 6 instances
-----|---------------|----------|-----------------|---------------|----------|-------------------------------------|--------------------------------------|--------------------------------------
1    | 1             | 30 min   | INFO            | 2.38.0        | No       | 1 minute                            | 5 minutes 58 seconds                 | 6 minutes 43 seconds
2    | 2             | 30 min   | INFO            | 2.38.0        | No       | 2 minutes 32 seconds                | 11 minutes 10 seconds                | 11 minutes 52 seconds
3    | 1             | 30 min   | INFO            | 2.38.0        | Yes      | 1 minute 20 seconds                 | 6 minutes 14 seconds                 | 6 minutes 44 seconds
4    | 2             | 30 min   | INFO            | 2.38.0        | Yes      | 2 minutes 40 seconds                | 10 minutes 53 seconds                | 11 minutes 52 seconds

Results

  1. Overall MARC Bib record data export times (in minutes)


    Users   | Average (minutes)                       | 75th %tile (minutes)
            | 1 instance | 5 instances | 6 instances  | 1 instance | 5 instances | 6 instances
    1 user  | 1.30       | 5.96        | 6.72         | 1.40       | 5.97        | 6.72
    2 users | 2.66       | 11.17       | 11.88        | 2.75       | 12.88       | 13.34

    The data export jobs above were run against 7.2 million SRS records. As the number of inventory instances increases, export time grows linearly with the number of records. Data export fails when run with 6 instances; because the export is an asynchronous task, the job does not fail immediately. Exporting 6 instances, for example, eventually fails but still takes about 6 minutes before it stops.


  2. Slow APIs taking more than 100 ms to return

API                                                        | 1 user (75th %tile) | 2 users (75th %tile)
POST_/data-export/fileDefinitions/{id}/upload - 1 instance | 1.02 minutes        | 1.33475 minutes

API                                                             | 5 instances                                                | 6 instances
GET_/source-storage/records?query=externalIdsHolder.instanceId | 6 minutes 13 seconds (SRS records successfully retrieved) | 6 minutes 40 seconds (SRS records failed to retrieve due to a time-out error)

JVM Profiling

  • Overall slow methods (among the modules profiled: okapi, mod-data-export, mod-source-record-storage):

http://ec2-3-93-19-104.compute-1.amazonaws.com/grafana/d/U9JtDPLWz/stacktrace?orgId=1&class=org.apache.logging.log4j.spi.AbstractLogger&method=debug&from=1588464851532&to=1591056851532

http://ec2-3-93-19-104.compute-1.amazonaws.com/grafana/d/U9JtDPLWz/stacktrace?orgId=1&class=com.fasterxml.jackson.databind.ObjectMapper&method=writeValueAsString&from=1588464851532&to=1591056851532

Database

The database does not show much CPU usage for the 1- and 2-user runs: at most 40%-50% CPU usage in the heaviest case of 2 users exporting 6 inventory instances.


The following WARNING statements about missing indexes were generated by mod-source-record-storage during a test run (a candidate index sketch follows the log excerpt):

WARNING: Doing LIKE search without index for records.jsonb->'externalIdsHolder'->>'instanceId', CQL >>> SQL: externalIdsHolder.instanceId == bd2a6f64-7b77-42b6-a100-568b33e836ad >>> lower(f_unaccent(records.jsonb->'externalIdsHolder'->>'instanceId')) LIKE lower(f_unaccent('bd2a6f64-7b77-42b6-a100-568b33e836ad'))
WARNING: Doing LIKE search without index for records.jsonb->'externalIdsHolder'->>'instanceId', CQL >>> SQL: externalIdsHolder.instanceId == 0b9dda1d-ba76-4296-aa2b-056d3fb7debe >>> lower(f_unaccent(records.jsonb->'externalIdsHolder'->>'instanceId')) LIKE lower(f_unaccent('0b9dda1d-ba76-4296-aa2b-056d3fb7debe'))
WARNING: Doing LIKE search without index for records.jsonb->'externalIdsHolder'->>'instanceId', CQL >>> SQL: externalIdsHolder.instanceId == d31ca25c-756a-4cc9-b85b-6e9a4b5755a6 >>> lower(f_unaccent(records.jsonb->'externalIdsHolder'->>'instanceId')) LIKE lower(f_unaccent('d31ca25c-756a-4cc9-b85b-6e9a4b5755a6'))
WARNING: Doing LIKE search without index for records.jsonb->'externalIdsHolder'->>'instanceId', CQL >>> SQL: externalIdsHolder.instanceId == ce8d0d63-7e4f-4630-9f44-350a8cc2984d >>> lower(f_unaccent(records.jsonb->'externalIdsHolder'->>'instanceId')) LIKE lower(f_unaccent('ce8d0d63-7e4f-4630-9f44-350a8cc2984d'))
WARNING: Doing LIKE search without index for records.jsonb->'externalIdsHolder'->>'instanceId', CQL >>> SQL: externalIdsHolder.instanceId == a6faacc9-3cdd-44ec-9df7-4c29b5315b98 >>> lower(f_unaccent(records.jsonb->'externalIdsHolder'->>'instanceId')) LIKE lower(f_unaccent('a6faacc9-3cdd-44ec-9df7-4c29b5315b98'))
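
One possible remedy is an expression index matching the lower(f_unaccent(...)) pattern that RMB generates for this search. The DDL below is only a hedged sketch for illustration: the index name is invented, text_pattern_ops is assumed because the search is issued as a LIKE, and whether the index should instead be declared in the module's schema.json is left to the MODSOURCE issues listed under "Recommended improvements".

-- Hypothetical index; the name and the choice of manual DDL vs. schema.json are assumptions.
-- Assumes the search_path includes the tenant schema so that f_unaccent resolves.
CREATE INDEX IF NOT EXISTS records_externalidsholder_instanceid_idx
  ON fs09000000_mod_source_record_storage.records
  ((lower(f_unaccent(jsonb->'externalIdsHolder'->>'instanceId'))) text_pattern_ops);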


RDS PostgreSQL slow queries:

  1. Export 1 inventory instance record:
fs09000000_mod_source_record_storage@folio:[7395]:LOG:  statement: SELECT id, jsonb, totalrows FROM get_records('WHERE lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''957a8840-e3ad-4c2b-83c3-23333391095d'')) ', '', 10, 0, 'fs09000000_mod_source_record_storage')
fs09000000_mod_source_record_storage@folio:[7395]:LOG:  duration: 38243.470 ms

fs09000000_mod_source_record_storage@folio:[7395]:LOG:  statement: EXPLAIN ANALYZE SELECT id, jsonb, totalrows FROM get_records('WHERE lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''957a8840-e3ad-4c2b-83c3-23333391095d'')) ', '', 10, 0, 'fs09000000_mod_source_record_storage')
fs09000000_mod_source_record_storage@folio:[7395]:LOG:  duration: 41883.335 ms

Total duration: 38243.470 ms + 41883.335 ms = ~1 minute 20 seconds


  2. Export 5 inventory instance records:

fs09000000_mod_source_record_storage@folio:[31709]:LOG:  statement: SELECT id, jsonb, totalrows FROM get_records('WHERE ((((lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''bd2a6f64-7b77-42b6-a100-568b33e836ad''))) OR (lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''0b9dda1d-ba76-4296-aa2b-056d3fb7debe'')))) OR (lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''d31ca25c-756a-4cc9-b85b-6e9a4b5755a6'')))) OR (lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''ce8d0d63-7e4f-4630-9f44-350a8cc2984d'')))) OR (lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''a6faacc9-3cdd-44ec-9df7-4c29b5315b98''))) ', '', 10, 0, 'fs09000000_mod_source_record_storage')
fs09000000_mod_source_record_storage@folio:[31709]:LOG:  duration: 177984.112 ms

fs09000000_mod_source_record_storage@folio:[31709]:LOG:  statement: EXPLAIN ANALYZE SELECT id, jsonb, totalrows FROM get_records('WHERE ((((lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''bd2a6f64-7b77-42b6-a100-568b33e836ad''))) OR (lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''0b9dda1d-ba76-4296-aa2b-056d3fb7debe'')))) OR (lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''d31ca25c-756a-4cc9-b85b-6e9a4b5755a6'')))) OR (lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''ce8d0d63-7e4f-4630-9f44-350a8cc2984d'')))) OR (lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''a6faacc9-3cdd-44ec-9df7-4c29b5315b98''))) ', '', 10, 0, 'fs09000000_mod_source_record_storage')
fs09000000_mod_source_record_storage@folio:[31709]:LOG:  duration: 180791.965 ms

Total duration: 177984.112 ms + 180791.965 ms = ~5 minutes 59 seconds

CPU and Memory Utilization

CPU usage was stable throughout the runs; aside from an occasional spike, it was consistent over each 30-minute run.

Memory utilization was also stable for the 1- and 2-user runs while exporting 6 inventory instances against 7.2 million SRS records. Maximum utilization was around 52%-55%.

mod-data-export:


mod-source-record-storage

Recommended improvements

  • On average, performance can improve by ~50% if the redundant "EXPLAIN ANALYZE" query against the mod_source_record_storage records table is removed (see the sketch after this list).
  • UIDEXP-107
  • MODSOURCE-147
  • MODSOURCE-148
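
As an illustration of the first bullet, the sketch below shows the single statement that would remain for a 1-instance export once the duplicate EXPLAIN ANALYZE round trip is dropped. The SELECT is copied verbatim from the slow-query log above; only the removal of the second statement is assumed here.

-- Only the data query is issued; the EXPLAIN ANALYZE statement that repeated the
-- same work (and roughly doubled the total time) is no longer sent.
SELECT id, jsonb, totalrows
FROM get_records('WHERE lower(f_unaccent(records.jsonb->''externalIdsHolder''->>''instanceId'')) LIKE lower(f_unaccent(''957a8840-e3ad-4c2b-83c3-23333391095d'')) ', '', 10, 0, 'fs09000000_mod_source_record_storage');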

Appendix

For more raw data from the test runs, please see the attached data-export-test-report.xls.