Check-in-check-out Test Report (Fameflower)

Overview

This report examines the performance of the check-in-check-out workflow running in the Fameflower release (PERF-9). The workflow was tested with 1, 5, 8, and 20 virtual users for 30 minutes. A longevity test was also executed to check for memory issues.

Backend:

  • mod-circulation-18.0.9
  • mod-circulation-storage-11.0.8
  • mod-inventory-storage-19.1.2
  • mod-inventory-14.1.3
  • okapi-2.38.0

Frontend:

  • folio_circulation-2.0.0
  • Item Check-in (folio_checkin-2.0.1)
  • Item Check-out (folio_checkout-3.0.2)

Environment:

  • 55 back-end modules deployed in 110 ECS services
  • 3 okapi ECS services
  • 8 m5.large EC2 instances
  • db.r5.xlarge AWS RDS instance
  • INFO logging level

High Level Summary

  1. Overall check in, check out time in seconds

    1. Average check in time is 1.51 seconds for a typical use case of 8 users, 1.65 seconds for 20 users
    2. Average check out time is 1.75 seconds for a typical use case of 8 users, 1.90 seconds for 20 users
  2. Slow APIs taking more than 100ms to run 
    1. POST checkout-by-barcode
    2. POST checkin-by-barcode 
    3. GET circulation/loans
    4. GET inventory/items
  3. mod-circulation-storage logs warnings about missing indexes: 64K lines in a 45-minute run. Raising the logging level to ERROR would suppress them, but at the cost of having less data to work with should there be a need to troubleshoot. Adding the missing indexes could improve performance and would also stop these warnings (CIRCSTORE-215).
  4. JVM profiling shows JSON de/serialization to be among the slowest operations, totaling more CPU time than other calls. Since FOLIO modules retrieve and store JSON objects, efficient JSON serialization and deserialization is essential; see Recommended Improvements.

Test Runs

Test  Virtual Users  Duration  OKAPI log level  OKAPI Version  Profiled
1.    1              30 min    INFO             2.38.0         No
2.    5              30 min    INFO             2.38.0         No
3.    8              30 min    INFO             2.38.0         No
4.    20             20 min    INFO             2.38.0         No
5.    8              45 min    INFO             2.38.0         Yes

Results

  1. Response times


    All values in seconds.

                Average               50th %tile            75th %tile            95th %tile
                Check-in   Check-out  Check-in   Check-out  Check-in   Check-out  Check-in   Check-out
    1 user      1.015      1.234      0.96       1.277      1.071      1.409      1.322      1.653
    5 users     1.236      1.488      1.156      1.393      1.464      1.869      1.704      2.219
    8 users     1.512      1.751      1.403      1.852      1.741      2.031      2.02       2.274
    20 users    1.649      1.898      1.535      1.996      1.896      2.211      2.252      2.539
  2. Slow APIs taking more than 100 ms to return

    API (75th %tile)              1 user   5 users  8 users  20 users
    POST checkout-by-barcode      615 ms   905 ms   906 ms   988 ms
    POST checkin-by-barcode       548 ms   830 ms   1053 ms  1137 ms
    GET circulation/loans         283 ms   346 ms   449 ms   479 ms
    GET inventory/items           217 ms   232 ms   237 ms   281 ms
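
To make the scaling in the response-time table easier to read, the following is a minimal sketch that expresses each average as a percent increase over the 1-user run. The averages are copied from the table above; the function and variable names are illustrative only and are not part of any FOLIO tooling.

```python
# Average response times in seconds, copied from the response-time table above.
AVERAGE_SECONDS = {
    "check-in": {1: 1.015, 5: 1.236, 8: 1.512, 20: 1.649},
    "check-out": {1: 1.234, 5: 1.488, 8: 1.751, 20: 1.898},
}


def percent_increase(baseline: float, value: float) -> float:
    """Return how much larger value is than baseline, in percent."""
    return (value - baseline) / baseline * 100.0


def slowdown_vs_single_user(averages: dict) -> dict:
    """Map each user count to its percent increase over the 1-user average."""
    baseline = averages[1]
    return {
        users: round(percent_increase(baseline, seconds), 1)
        for users, seconds in averages.items()
    }


if __name__ == "__main__":
    for workflow, averages in AVERAGE_SECONDS.items():
        print(workflow, slowdown_vs_single_user(averages))
```

With these inputs, the average check-in time at 20 users is about 62.5% higher than at 1 user, and the average check-out time is about 53.8% higher.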

JVM Profiling

  • Only slow Okapi methods:


When drilling down org.folio.okapi.managers.ModuleManager.getEnabledModules, we get the following tree. To see more click here: http://ec2-3-93-19-104.compute-1.amazonaws.com/grafana/d/U9JtDPLWz/stacktrace?orgId=1&class=org.folio.okapi.managers.ModuleManager&method=getEnabledModules&from=1589940304610&to=1589943246772 

  • Slow mod-circulation methods:

Database

The database does not show much CPU usage for the 1, 5, 8, and 20 user runs. CPU usage peaked at only 25% in the heaviest case of 20 users.


The following WARNING statements of missing indexes were generated during a test run and logged by mod-circulation-storage:

WARNING: Doing SQL query without index for scheduled_notice.jsonb->>'nextRunTime', CQL >>> SQL: nextRunTime < >>> scheduled_notice.jsonb->>'nextRunTime' <'2020-05-20T03:07:08.090Z'
WARNING: Doing FT search without index for request.jsonb->>'pickupServicePointId', CQL >>> SQL: pickupServicePointId = 130d8bff-bdbd-4dc5-a4ac-6d970f4918ff >>> to_tsvector('simple', f_unaccent(request.jsonb->>'pickupServicePointId')) @@ replace((to_tsquery('simple', f_unaccent('''130d8bff-bdbd-4dc5-a4ac-6d970f4918ff''')))::text, '&', '<->')::tsquery
WARNING: Doing FT search without index for request.jsonb->>'requesterId', CQL >>> SQL: requesterId = 001164c0-5466-4822-86f2-dcd2393a7ef7 >>> to_tsvector('simple', f_unaccent(request.jsonb->>'requesterId')) @@ replace((to_tsquery('simple', f_unaccent('''001164c0-5466-4822-86f2-dcd2393a7ef7''')))::text, '&', '<->')::tsquery
WARNING: Doing FT search without index for request.jsonb->>'status', CQL >>> SQL: status = Open >>> to_tsvector('simple', f_unaccent(request.jsonb->>'status')) @@ replace((to_tsquery('simple', f_unaccent('''Open''')))::text, '&', '<->')::tsquery
WARNING: Doing LIKE search without index for jsonb->>'requestId', CQL >>> SQL: requestId == 01819cdb-de38-4562-9835-14071dcaf53d >>> lower(f_unaccent(jsonb->>'requestId')) LIKE lower(f_unaccent('01819cdb-de38-4562-9835-14071dcaf53d'))
WARNING: Doing LIKE search without index for request.jsonb->>'requesterId', CQL >>> SQL: requesterId == 005de89f-bfe1-4bf7-a1e3-f34a707ace21 >>> lower(f_unaccent(request.jsonb->>'requesterId')) LIKE lower(f_unaccent('005de89f-bfe1-4bf7-a1e3-f34a707ace21'))
WARNING: Doing LIKE search without index for patron_action_session.jsonb->>'actionType', CQL >>> SQL: actionType == Check-out >>> lower(f_unaccent(patron_action_session.jsonb->>'actionType')) LIKE lower(f_unaccent('Check-out'))
WARNING: Doing LIKE search without index for scheduled_notice.jsonb->>'triggeringEvent', CQL >>> SQL: triggeringEvent == "Due date" >>> lower(f_unaccent(scheduled_notice.jsonb->>'triggeringEvent')) LIKE lower(f_unaccent('Due date'))
WARNING: Doing LIKE search without index for scheduled_notice.jsonb->'noticeConfig'->>'sendInRealTime', CQL >>> SQL: noticeConfig.sendInRealTime == false >>> lower(f_unaccent(scheduled_notice.jsonb->'noticeConfig'->>'sendInRealTime')) LIKE lower(f_unaccent('false'))
WARNING: Doing SQL query without index for scheduled_notice.jsonb->>'nextRunTime', CQL >>> SQL: nextRunTime < 2020-05-20T00:00:00.000Z >>> scheduled_notice.jsonb->>'nextRunTime' <'2020-05-20T00:00:00.000Z'
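
To see which fields generate the most missing-index warnings, a log can be tallied by search type and field. The sketch below assumes each warning contains the text "WARNING: Doing ... without index for ..., CQL" as in the excerpt above; the regular expression and function name are illustrative, and real log lines may carry extra prefixes such as timestamps, which is why the pattern is searched for rather than anchored at the start of the line.

```python
import re
from collections import Counter

# Matches the warning text shown in the excerpt above, e.g.
# "WARNING: Doing FT search without index for request.jsonb->>'status', CQL ..."
WARNING_PATTERN = re.compile(
    r"WARNING: Doing (?P<kind>.+?) without index for (?P<field>.+?), CQL"
)

# Three lines copied from the excerpt above, used as sample input.
SAMPLE_LINES = [
    "WARNING: Doing SQL query without index for scheduled_notice.jsonb->>'nextRunTime', CQL >>> SQL: nextRunTime < >>> scheduled_notice.jsonb->>'nextRunTime' <'2020-05-20T03:07:08.090Z'",
    "WARNING: Doing SQL query without index for scheduled_notice.jsonb->>'nextRunTime', CQL >>> SQL: nextRunTime < 2020-05-20T00:00:00.000Z >>> scheduled_notice.jsonb->>'nextRunTime' <'2020-05-20T00:00:00.000Z'",
    "WARNING: Doing LIKE search without index for patron_action_session.jsonb->>'actionType', CQL >>> SQL: actionType == Check-out >>> lower(f_unaccent(patron_action_session.jsonb->>'actionType')) LIKE lower(f_unaccent('Check-out'))",
]


def tally_missing_indexes(lines):
    """Count missing-index warnings per (search kind, JSONB field expression)."""
    counts = Counter()
    for line in lines:
        match = WARNING_PATTERN.search(line)
        if match:
            counts[(match.group("kind"), match.group("field"))] += 1
    return counts


if __name__ == "__main__":
    # Print the most frequent (kind, field) pairs first.
    for (kind, field), count in tally_missing_indexes(SAMPLE_LINES).most_common():
        print(f"{count:6d}  {kind:12s}  {field}")
```

The same function accepts an open file object, so a full mod-circulation-storage log can be passed in line by line. Fields with the highest counts are the likeliest candidates to index first.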

CPU Utilization


                       1 user                   5 users                  8 users                  20 users
                       Average %  Range %       Average %  Range %       Average %  Range %       Average %  Range %
Okapi                  1.86       0.32-6.85     5.95       1.16-16.55    7.92       1.41-16.07    17.52      2.36-29.85
mod-inventory          0.40       0.28-3.01     0.76       0.32-8.36     0.62       0.8-2.17      1          0.25-3
mod-inventory-storage  0.94       0.328-4.09    2.88       0.36-9.11     3.33       1.84-19.72    7.96       1.88-14.67

Memory

Memory usage was stable throughout the runs. There were occasional spikes, but usage stayed consistent over each 30-minute run.


                        1 user   5 users  8 users  20 users
                        Average  Average  Average  Average
Okapi                   50%      50%      46%      46%
mod-circulation         70%      62%      70%      71%
mod-circulation-storage 30%      30%      31%      31%
mod-inventory           38%      38%      38%      38%
mod-inventory-storage   41%      41%      41%      41%

Logging

With the INFO log level, a 45-minute run produced more than 66K lines in the mod-circulation-storage logs, 64K of which were the following warnings. This is consistent with the JVM profiling, which shows logging among the three slowest methods.

WARNING: Doing SQL query without index for scheduled_notice.jsonb->>'nextRunTime', CQL >>> SQL: nextRunTime < >>> scheduled_notice.jsonb->>'nextRunTime' <'2020-05-20T03:07:08.090Z'
WARNING: Doing FT search without index for request.jsonb->>'pickupServicePointId', CQL >>> SQL: pickupServicePointId = 130d8bff-bdbd-4dc5-a4ac-6d970f4918ff >>> to_tsvector('simple', f_unaccent(request.jsonb->>'pickupServicePointId')) @@ replace((to_tsquery('simple', f_unaccent('''130d8bff-bdbd-4dc5-a4ac-6d970f4918ff''')))::text, '&', '<->')::tsquery
WARNING: Doing FT search without index for request.jsonb->>'requesterId', CQL >>> SQL: requesterId = 001164c0-5466-4822-86f2-dcd2393a7ef7 >>> to_tsvector('simple', f_unaccent(request.jsonb->>'requesterId')) @@ replace((to_tsquery('simple', f_unaccent('''001164c0-5466-4822-86f2-dcd2393a7ef7''')))::text, '&', '<->')::tsquery
WARNING: Doing FT search without index for request.jsonb->>'status', CQL >>> SQL: status = Open >>> to_tsvector('simple', f_unaccent(request.jsonb->>'status')) @@ replace((to_tsquery('simple', f_unaccent('''Open''')))::text, '&', '<->')::tsquery
WARNING: Doing LIKE search without index for jsonb->>'requestId', CQL >>> SQL: requestId == 01819cdb-de38-4562-9835-14071dcaf53d >>> lower(f_unaccent(jsonb->>'requestId')) LIKE lower(f_unaccent('01819cdb-de38-4562-9835-14071dcaf53d'))
WARNING: Doing LIKE search without index for request.jsonb->>'requesterId', CQL >>> SQL: requesterId == 005de89f-bfe1-4bf7-a1e3-f34a707ace21 >>> lower(f_unaccent(request.jsonb->>'requesterId')) LIKE lower(f_unaccent('005de89f-bfe1-4bf7-a1e3-f34a707ace21'))
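
As a rough illustration of the logging volume, the figures quoted above can be turned into a share of total log lines and a per-second rate. This is a sketch only: the counts are the rounded values from this section (the total is stated as "more than 66K", so the share is an upper estimate), and the function names are illustrative.

```python
TOTAL_LINES = 66_000    # "more than 66K" in this report, so treat as a lower bound
WARNING_LINES = 64_000  # approximate number of missing-index warnings
RUN_MINUTES = 45        # duration of the profiled run


def warning_share_percent(warning_lines: int, total_lines: int) -> float:
    """Return the percentage of log lines that are warnings."""
    return warning_lines / total_lines * 100.0


def lines_per_second(line_count: int, run_minutes: float) -> float:
    """Return the average number of lines written per second of the run."""
    return line_count / (run_minutes * 60.0)


if __name__ == "__main__":
    share = warning_share_percent(WARNING_LINES, TOTAL_LINES)
    rate = lines_per_second(WARNING_LINES, RUN_MINUTES)
    print(f"warnings: {share:.1f}% of lines, {rate:.1f} per second")
```

With these rounded inputs, the warnings make up roughly 97% of the log lines and are written at about 24 lines per second on average.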

Recommended Improvements 

  • In mod-circulation and okapi, consider using a more efficient JSON package or calling the existing Jackson serialization methods in a different way, since JVM profiling shows JSON de/serialization among the slowest operations.
  • In mod-circulation, consider using a more efficient date-time package instead of Joda-Time, because Joda-Time calls are among the slowest operations.
  • If CIRCSTORE-215 is not fixed, consider logging at ERROR level to reduce the excess logging by mod-circulation-storage.
  • Create follow-up stories to study the four APIs that still take over 100 ms to return and see where performance could improve.

Appendix

For more raw data of the test runs please see the attached check-in-check-out-FF-UChicago.xlsx.