Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Table of Contents
outlinetrue

Overview

This is a report for a series of Check-in-check-out test runs against the Honeysuckle release.

Jira Legacy
serverSystem Jira
serverId01505d01-b853-3c2e-90f1-ee9b165564fc
keyPERF-135

Backend:

  • mod-circulation-19.2.7
  • mod-circulation-storage-12.1.4
  • mod-inventory-16.1.3
  • mod-inventory-storage-19.4.4
  • mod-authtoken-2.6.0
  • mod-pubsub-1.3.3
  • okapi-4.3.3 (also with 4.2.2)

Frontend:

  • folio_circulation-4.0.1
  • Item Check-in (folio_checkin-4.0.1)
  • Item Check-out (folio_checkout-5.0.1)

Environment:

  • 61 back-end modules deployed in 110 ECS services
  • 3 okapi ECS services
  • 8 m5.large  EC2 instances
  • 2 db.r5.xlarge AWS RDS instance
  • INFO logging level

High Level Summary

  • Check-out: Honeysuckle is slower by 9%-28% than Goldenrod
  • Check-in: 4%-22% slower than Goldenrod
  • APIs turned slower in Honeysuckle: GET /automated-patron-blocks/{id} (150% slower) and  GET /circulation/loans (60%). These are covered by MODPATBLK-70 and CIRC-1014, respectively
  • Okapi v4.3.3 seem to be using 2x-3x CPU cycles than in v1.3.2 (Goldenrod).  Potential issue found with the logging methods OKAPI-964 
  • mod-pubsub has a memory leak that would drag down performance under high loads (see section on longevity test): MODPUBSUB-136
  • Caching Okapi tokens in Okapi reduced mod-authtoken's CPU usage by over 90%
  • Database's memory usage improved dramatically from Goldenrod's - little memory consumptions observed.

Test Runs

Test

Virtual Users

Duration

OKAPI log level

1.

130 minsINFO

2.

530 minsINFO

3.

830 minsINFO

4.

2030 minsINFO

5.

2024 HoursINFO

Results

Response Times


Average (seconds)50th %tile (seconds)75th %tile (seconds)95th %tile  (seconds)

Check-inCheck-outCheck-inCheck-outCheck-inCheck-outCheck-inCheck-out
1 user0.9671.9890.8891.8320.9842.2011.2542.815
5 users1.0532.1710.9811.9691.1142.2531.5283.370
8 users1.1932.2441.0762.0221.3392.3721.8953.544
20 users2.3913.9011.6393.0732.2634.124.8118.784


The following table shows the slow 75th percentile APIs taking more than 100 ms to return, and also comparing them against Goldenrod's.  Other than the 1-user test, starting with the 5-users test, all listed APIs are slower with GET automated-patron-blocks leading the way at 150% slower, while GET circulation/loans regressed up to 60%

API

1 user  GR (75th %tile)

1 user  HS (75th %tile)

5 users GR (75th %tile)

5 users HS (75th %tile)

8 users GR (75th %tile)8 users HS (75th %tile)

20 users GR (75th %tile)

20 users HS (75th %tile)
GET circulation/loans0.3450.3490.3650.4060.0750.1220.6540.784
GET inventory/items0.2080.1860.2080.2220.2250.2440.3120.375

POST checkin-by-barcode 

0.6820.5930.6310.6640.8150.8741.2961.467
POST checkout-by-barcode0.7500.7170.6880.7840.7330.8771.2051.469
GET automated-patron-blocks0.0690.1630.0850.1800.0790.1970.1180.296



Average50th Percentile75th percentile95th percentile 
Check-in GRCheck-in HSCheck-out GRCheck-out HSCheck-in GRCheck-in HSCheck-out GRCheck-out HSCheck-in GRCheck-in HSCheck-out GRCheck-out HSCheck-in GRCheck-in HSCheck-out GRCheck-out HS
1 user1.060.9671.9941.9890.9770.8891.821.8321.1110.9842.0042.0211.3231.2542.8112.815
5 users1.0111.0531.9242.1710.9750.9811.8141.9691.0671.1141.9772.2531.2511.5282.6253.37
8 users1.1421.1932.0442.2441.0611.0761.8992.0221.2741.3392.0932.3721.5691.8953.1073.544
20 users1.7022.3913.023.9011.491.6392.6523.0731.9362.2633.2734.122.9534.8115.3528.784

This table shows that the average response time of the checkout transaction is 9% to 28% slower than in Goldenrod, and check-in in Honeysuckle likewise is about 4-22% slower (data aggregated from 5 to 20 users test runs).

Longevity test

A 24-hour longevity test was performed with 20 concurrent users and running with Okapi 4.2.2. Key observations:

Image Added

  • Performance degraded after 2 hours.
    • Average response time in the first two hours: 
      • Check-in:  2.28 sec
      • Check-out: 4.76 sec
    • Average response time in the first 12 hours:
      • Check-in: 2.709 sec
      • Check-out: 7.898 sec
    • Average response time in the last 12 hours:
      • Check-in:  4.605 sec
      • Check-out: 17.297 sec
    • Together the average response time for the entire test run:
      • Check-in: 3.414 sec
      • Check-out: 11.419 sec
  • Throughput degrades over time, unsurprisingly
    • Image Added


  • Modules Services CPU utilizations (The black lines indicate the start and finish time of the test run)
  • Image Added
    • With the load of 20 users, Okapi started out working hardest, but its CPU utilization drops down over time. This is because mod-pubsub's is leaking HTTP Client and taking up more resources therefore slowing down, causing Okapi to not need to route messages as much. Clearly mod-pubsub's CPU utilization went up. The other modules follow Okapi's pattern.
  • Services Modules Memory utilizations 
  • Image Added
    • No modules exhibited memory leaks except for mod-pubsub
  • Although there were two instances of mod-pubsub running on two different ec2 instances, mod-pubsub's traffic seemed to have been stickied to one instance. Here are graphs showing mod-pubsub's on one instance using up memory and CPU resources, and on another instance not showing much activities:
    • Image Added
    • mod-pubsub and Okapi on another node - Okapi's CPU utilization dwindles while mod-pubsub does not seem to be busy at all
    • Image Added

CPUs and Memories

Okapi was profiled because of the apparent 3x CPU utilization compared to the Goldenrod runs.

Goldenrod test runs:

Image Added

Honeysuckle test runs:

Image Added

Clearly Okapi is using more CPU cycles in Honeysuckle than in Goldenrod, 1 user 


Other relevant modules' CPU utilizations in Goldenrod

Image Added


Same modules in Honeysuckle:

Image Added

A few things to note:

  1. mod-authtoken uses much less CPU in Honeysuckle, over 90% reduction across all tests! This is because of the token caching functionality that was added to Okapi 4.x
  2. mod-circulation's CPU utilization in Honeysuckle averages over 20% lower than in Goldenrod.
  3. mod-circulation's CPU utilization in Honeysuckle is about 10-30% higher than in Goldenrod
  4. mod-inventory's CPU utilization in Honeysuckle averages 30% more than in Goldenrod
  5. mod-inventory-storage's CPU utilization in Honeysuckle averages 20% more than in Goldenrod 
  6. mod-pubsub's CPU utilization in Honeysuckle is about 15% less than in Goldenrod
  7. mod-patron-blocks CPU utilization in Honeysuckle is at least 30% less than in Goldenrod

JVM Profiling

Because Okapi's CPU utilization in Honeysuckle seemed to have averaged 2x to 3x higher than in Goldenrod, it was profiled to get more insights of what happened inside it.

The slowest methods in Honeysuckle are once again the Logger and Jackson serialization methods

Image Added

Compared to Goldenrod:

Image Added

Note that the AbstractLogger.Info method in Okapi 4.3.3 total CPU time is about 3x higher than in Goldenrod.  This is confirmed by Okapi 4.3.3's metrics showing ProxyContext.logRequest and ProxyContext.logResponse methods' response times degrade over time. These two methods need to be investigated.

Image Added

Image Added


Database

The database CPU utilizations are about the same between the Honeysuckle and Goldenrod

Honeysuckle's 

Image Added


Goldenrod's

Image Added


Honeysuckle's database memory utilization is much much better than Goldenrod's.  For the most part Honeysuckle's consecutive test runs did not produce signs of aggressive memory leaks that were seen in Goldenrod. 

Image Added


Goldenrod's memory profile shows quick claims of memory over 30 minutes tests runs. 


Image Added

Missing Indexes

Honeysuckle tests revealed the following missing indexes:

mod-circulation missing indexes

Code Block
WARNING: Doing LIKE search without index for jsonb->>'requestId', CQL >>> SQL: requestId == 920e1d64-c221-48a0-a44d-ff50f3ad6cd6 >>> lower(f_unaccent(jsonb->>'requestId')) LIKE lower(f_unaccent('920e1d64-c221-48a0-a44d-ff50f3ad6cd6'))
WARNING: Doing FT search without index for request.jsonb->>'requesterId', CQL >>> SQL: requesterId = ae4c1cf3-0738-4465-8112-e75089e5b5c6 >>> get_tsvector(f_unaccent(request.jsonb->>'requesterId')) @@ tsquery_phrase(f_unaccent('ae4c1cf3-0738-4465-8112-e75089e5b5c6'))
WARNING: Doing FT search without index for request.jsonb->>'pickupServicePointId', CQL >>> SQL: pickupServicePointId = 7068e104-aa14-4f30-a8bf-71f71cc15e07 >>> get_tsvector(f_unaccent(request.jsonb->>'pickupServicePointId')) @@ tsquery_phrase(f_unaccent('7068e104-aa14-4f30-a8bf-71f71cc15e07'))
WARNING: Doing LIKE search without index for patron_action_session.jsonb->>'patronId', CQL >>> SQL: patronId == d7cabcb2-7431-43ea-a2cc-0dfe5bee17c6 >>> lower(f_unaccent(patron_action_session.jsonb->>'patronId')) LIKE lower(f_unaccent('d7cabcb2-7431-43ea-a2cc-0dfe5bee17c6'))
WARNING: Doing LIKE search without index for patron_action_session.jsonb->>'actionType', CQL >>> SQL: actionType == Check-out >>> lower(f_unaccent(patron_action_session.jsonb->>'actionType')) LIKE lower(f_unaccent('Check-out'))

The following warnings are captured when the background tasks are running

Code Block
WARNING: Doing LIKE search without index for jsonb->'noticeConfig'>>'timing', CQL >>> SQL: noticeConfig.timing == After >>> lower(f_unaccent(jsonb>'noticeConfig'->>'timing')) LIKE lower(f_unaccent('After'))
WARNING: Doing LIKE search without index for jsonb->>'loanId', CQL >>> SQL: loanId == 671233fd-5c15-4f9e-8fab-f86330c389bd >>> lower(f_unaccent(jsonb->>'loanId')) LIKE lower(f_unaccent('671233fd-5c15-4f9e-8fab-f86330c389bd'))
WARNING: Doing LIKE search without index for jsonb->>'triggeringEvent', CQL >>> SQL: triggeringEvent == "Due date" >>> lower(f_unaccent(jsonb->>'triggeringEvent')) LIKE lower(f_unaccent('Due date'))

mod-feesfines missing indexes:

Code Block
WARNING: Doing LIKE search without index for accounts.jsonb->>'userId', CQL >>> SQL: userId == e96618a9-04ee-4fea-aa60-306a8f4dd89b >>> lower(f_unaccent(accounts.jsonb->>'userId')) LIKE lower(f_unaccent('e96618a9-04ee-4fea-aa60-306a8f4dd89b'))
WARNING: Doing LIKE search without index for accounts.jsonb->'status'>>'name', CQL >>> SQL: status.name <> Closed >>> lower(f_unaccent(accounts.jsonb>'status'->>'name')) NOT LIKE lower(f_unaccent('Closed'))
WARNING: Doing LIKE search without index for manualblocks.jsonb->>'userId', CQL >>> SQL: userId == a79b533d-8f29-4be1-9415-5f5cd936623b >>> lower(f_unaccent(manualblocks.jsonb->>'userId')) LIKE lower(f_unaccent('a79b533d-8f29-4be1-9415-5f5cd936623b'))

Appendix

CIRC-1014

https://issues.folio.org/browse/MODPATBLK-70

https://issues.folio.org/browse/OKAPI-964

https://issues.folio.org/browse/OKAPI-965


IN-REVIEW

Table of Contents
outlinetrue

Overview


Backend:

  • mod-circulation-19.2.7
  • mod-circulation-storage-12.1.4
  • mod-inventory-16.1.3
  • mod-inventory-storage-19.4.4
  • mod-authtoken-2.6.0
  • mod-pubsub-1.3.3
  • okapi-4.3.3 (also with 4.2.2)

...

  • 61 back-end modules deployed in 110 ECS services
  • 3 okapi ECS services
  • 8 m5.large  EC2 instances
  • 2 db.r5.xlarge AWS RDS instance
  • INFO logging level

High Level Summary

  • Check-out: Honeysuckle is slower by 9%-28% than Goldenrod
  • Check-in: 4%-22% slower than Goldenrod
  • APIs turned slower in Honeysuckle: GET /automated-patron-blocks/{id} (150% slower) and  GET /circulation/loans (60%). These are covered by MODPATBLK-70 and CIRC-1014, respectively
  • Okapi v4.3.3 seem to be using 2x-3x CPU cycles than in v1.3.2 (Goldenrod).  Potential issue found with the logging methods OKAPI-964 
  • mod-pubsub has a memory leak that would drag down performance under high loads (see section on longevity test): MODPUBSUB-136
  • Caching Okapi tokens in Okapi reduced mod-authtoken's CPU usage by over 90%
  • Database's memory usage improved dramatically from Goldenrod's - little memory consumptions observed.

Test Runs

Test

Virtual Users

Duration

OKAPI log level

1.

130 minsINFO

2.

530 minsINFO

3.

830 minsINFO

4.

2030 minsINFO

5.

2024 HoursINFO

Results

Response Times


Average (seconds)50th %tile (seconds)75th %tile (seconds)95th %tile  (seconds)

Check-inCheck-outCheck-inCheck-outCheck-inCheck-outCheck-inCheck-out
1 user0.9671.9890.8891.8320.9842.2011.2542.815
5 users1.0532.1710.9811.9691.1142.2531.5283.370
8 users1.1932.2441.0762.0221.3392.3721.8953.544
20 users2.3913.9011.6393.0732.2634.124.8118.784

...

  • Services Modules Memory utilizations 
    • No modules exhibited memory leaks except for mod-pubsub
  • Although there were two instances of mod-pubsub running on two different ec2 instances, mod-pubsub's traffic seemed to have been stickied to one instance. Here are graphs showing mod-pubsub's on one instance using up memory and CPU resources, and on another instance not showing much activities:
    • mod-pubsub and Okapi on another node - Okapi's CPU utilization dwindles while mod-pubsub does not seem to be busy at all

CPUs and Memories

Okapi was profiled because of the apparent 3x CPU utilization compared to the Goldenrod runs.

...

  1. mod-authtoken uses much less CPU in Honeysuckle, over 90% reduction across all tests! This is because of the token caching functionality that was added to Okapi 4.x
  2. mod-circulation's CPU utilization in Honeysuckle averages over 20% lower than in Goldenrod.
  3. mod-circulation's CPU utilization in Honeysuckle is about 10-30% higher than in Goldenrod
  4. mod-inventory's CPU utilization in Honeysuckle averages 30% more than in Goldenrod
  5. mod-inventory-storage's CPU utilization in Honeysuckle averages 20% more than in Goldenrod 
  6. mod-pubsub's CPU utilization in Honeysuckle is about 15% less than in Goldenrod
  7. mod-patron-blocks CPU utilization in Honeysuckle is at least 30% less than in Goldenrod

JVM Profiling

Because Okapi's CPU utilization in Honeysuckle seemed to have averaged 2x to 3x higher than in Goldenrod, it was profiled to get more insights of what happened inside it.

...

Note that the AbstractLogger.Info method in Okapi 4.3.3 total CPU time is about 3x higher than in Goldenrod.  This is confirmed by Okapi 4.3.3's metrics showing ProxyContext.logRequest and ProxyContext.logResponse methods' response times degrade over time. These two methods need to be investigated.


Database

The database CPU utilizations are about the same between the Honeysuckle and Goldenrod

...

Goldenrod's memory profile shows quick claims of memory over 30 minutes tests runs. 


Missing Indexes

Honeysuckle tests revealed the following missing indexes:

...

Code Block
WARNING: Doing LIKE search without index for accounts.jsonb->>'userId', CQL >>> SQL: userId == e96618a9-04ee-4fea-aa60-306a8f4dd89b >>> lower(f_unaccent(accounts.jsonb->>'userId')) LIKE lower(f_unaccent('e96618a9-04ee-4fea-aa60-306a8f4dd89b'))
WARNING: Doing LIKE search without index for accounts.jsonb->'status'>>'name', CQL >>> SQL: status.name <> Closed >>> lower(f_unaccent(accounts.jsonb>'status'->>'name')) NOT LIKE lower(f_unaccent('Closed'))
WARNING: Doing LIKE search without index for manualblocks.jsonb->>'userId', CQL >>> SQL: userId == a79b533d-8f29-4be1-9415-5f5cd936623b >>> lower(f_unaccent(manualblocks.jsonb->>'userId')) LIKE lower(f_unaccent('a79b533d-8f29-4be1-9415-5f5cd936623b'))

Appendix

CIRC-1014

https://issues.folio.org/browse/MODPATBLK-70

...