Check-in-check-out Test Report (Lotus)

Overview

This is a report for a series of Check-in-check-out test runs against the Lotus release. 

Back End:

  • mod-circulation-23.0.1
  • mod-circulation-storage-14.0.0
  • mod-inventory-18.1.4
  • mod-inventory-storage-23.0.2
  • mod-authtoken-2.9.1
  • mod-pubsub-2.5.0
  • mod-patron-blocks-1.5.0
  • mod-feesfines-17.1.0
  • okapi-4.13.0

Front End:

  • Item Check-in (folio_checkin-7.0.0)
  • Item Check-out (folio_checkout-8.0.0)

Infrastructure:

  • 75 back-end modules deployed in 159 ECS tasks
  • 3 okapi ECS services
  • 9 m5.xlarge EC2 instances
  • 2 db.r6g.xlarge AWS RDS instances
  • INFO logging level

High Level Summary

  • Lotus performs much better than Kiwi: check-in response times are in the mid-600ms range and check-outs are 900ms - 1000ms, with little variation between 1 and 20 users.
  • Database performance is better and uses less CPU compared to Kiwi.
  • The only remaining worst-performing APIs are POST /checkin-by-barcode and POST /checkout-by-barcode. They have improved since Kiwi, now averaging in the mid-300ms range even in the 20-user scenario.
    • GET /circulation/loans takes 185ms in the 20 users case.

Test Runs

Test | Virtual Users                                        | Duration
1.   | 1 (with 1 mod-inventory and 1 mod-circulation task)  | 30 mins
2.   | 5 (with 1 mod-inventory and 1 mod-circulation task)  | 30 mins
3.   | 8 (with 1 mod-inventory and 1 mod-circulation task)  | 30 mins
4.   | 20 (with 1 mod-inventory and 1 mod-circulation task) | 30 mins
5.   | 1 (repeat with 2 inventory and circ tasks)           | 30 mins
6.   | 5 (repeat with 2 inventory and circ tasks)           | 30 mins
7.   | 8 (repeat with 2 inventory and circ tasks)           | 30 mins
8.   | 20 (repeat with 2 inventory and circ tasks)          | 30 mins
9.   | 20 users longevity                                   | 24 hours

Results

Response Times (Average of all tests listed above, in seconds)


         | Average (s)          | 50th %tile (s)       | 75th %tile (s)       | 95th %tile (s)
         | Check-in | Check-out | Check-in | Check-out | Check-in | Check-out | Check-in | Check-out
1 user   | 0.73     | 1.188     | 0.682    | 1.084     | 0.774    | 1.241     | 1.000    | 1.812
5 users  | 0.62     | 0.976     | 0.582    | 0.915     | 0.662    | 1.02      | 0.844    | 1.44
8 users  | 0.615    | 0.971     | 0.581    | 0.904     | 0.654    | 1.006     | 0.836    | 1.428
20 users | 0.661    | 1.052     | 0.62     | 0.967     | 0.702    | 1.107     | 0.918    | 1.593

Response times are consistently good: sub-second for check-ins and a tad over 1 second for check-outs, even at the 75th percentile. The variations between 1 and 20 users continue to get smaller compared to previous releases.

Comparisons to Last Release

The following tables compare Kiwi test results against the Lotus General release (deployed at Lotus Bugfest).

Response Time Comparison

In the tables below, the Delta columns express the differences between the Kiwi and Lotus releases as percentages.
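For reference, each Delta value can be reproduced as (Kiwi − Lotus) / Kiwi × 100. A minimal sketch of this calculation (the helper name is illustrative), using the 1-user average check-in figures from the table below:

def delta_pct(kiwi_seconds, lotus_seconds):
    # Percentage improvement of Lotus over Kiwi (illustrative helper).
    return (kiwi_seconds - lotus_seconds) / kiwi_seconds * 100

# 1-user average check-in: Kiwi 0.838 s vs Lotus 0.73 s
print(round(delta_pct(0.838, 0.73), 2))  # -> 12.89, matching the Delta column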

In general there are no performance regressions; instead, there are improvements in response times across the board. Check-in times improved by at least 15% and check-out times consistently by around 30%. The 20-user response times are also very similar to the 8-user ones, especially for check-outs, which means the system is very stable from 1 to 20 users.

Note: LT = Lotus build, KW = Kiwi build


Average (seconds):

         | Check-in KW | Check-in LT | Delta  | Check-out KW | Check-out LT | Delta
1 user   | 0.838       | 0.73        | 12.89% | 1.582        | 1.188        | 24.91%
5 users  | 0.73        | 0.62        | 15.07% | 1.376        | 0.976        | 29.07%
8 users  | 0.758       | 0.615       | 18.87% | 1.392        | 0.971        | 30.24%
20 users | 0.899       | 0.661       | 26.47% | 1.506        | 1.052        | 30.15%

50th percentile (seconds):

         | Check-in KW | Check-in LT | Delta  | Check-out KW | Check-out LT | Delta
1 user   | 0.767       | 0.682       | 11.08% | 1.464        | 1.084        | 25.96%
5 users  | 0.676       | 0.582       | 13.91% | 1.272        | 0.915        | 28.07%
8 users  | 0.674       | 0.581       | 13.80% | 1.228        | 0.904        | 26.38%
20 users | 0.731       | 0.62        | 15.18% | 1.303        | 0.967        | 25.79%


75th percentile (seconds):

         | Check-in KW | Check-in LT | Delta  | Check-out KW | Check-out LT | Delta
1 user   | 0.915       | 0.774       | 15.41% | 1.716        | 1.24         | 27.74%
5 users  | 0.769       | 0.662       | 13.91% | 1.448        | 1.02         | 29.56%
8 users  | 0.839       | 0.654       | 22.05% | 1.44         | 1.006        | 30.14%
20 users | 0.931       | 0.702       | 24.60% | 1.535        | 1.107        | 27.88%

95th percentile (seconds):

         | Check-in KW | Check-in LT | Delta  | Check-out KW | Check-out LT | Delta
1 user   | 1.214       | 1.000       | 17.63% | 2.486        | 1.812        | 27.11%
5 users  | 1.052       | 0.844       | 19.77% | 1.448        | 1.440        | 0.55%
8 users  | 1.202       | 0.836       | 30.45% | 2.343        | 1.428        | 39.05%
20 users | 1.531       | 0.918       | 40.04% | 2.504        | 1.593        | 36.38%

"Worst" API Comparisons

The APIs identified in the table below are the ones that took over 100ms to execute in Lotus and Kiwi at the 75th percentile. In Lotus these APIs are generally faster, especially with 5 or more concurrent users. Impressively, the response times of POST checkout-by-barcode and POST checkin-by-barcode are all under 400ms. GET inventory/items response times are now well under 100ms, compared to being on the border in the Kiwi release. A sketch of these requests follows the table below.

Response times in milliseconds (75th percentile).

API                      | 1 user KW | 1 user LT | 5 users KW | 5 users LT | 8 users KW | 8 users LT | 20 users KW | 20 users LT
POST checkout-by-barcode | 550       | 435       | 436        | 330        | 428        | 349        | 461         | 384
POST checkin-by-barcode  | 444       | 386       | 368        | 298        | 438        | 315        | 504         | 348
GET circulation/loans    | 305       | 222       | 262        | 161        | 267        | 172        | 303         | 196
GET inventory/items      | 124       | 97        | 96         | 71         | 92         | 67         | 99          | 75
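For context on what these transactions exercise, below is a minimal, hedged sketch of the two business-logic calls driven through Okapi. The Okapi URL, tenant, token, service point ID, and barcodes are placeholders, and the payload fields shown are assumptions that should be verified against the mod-circulation API documentation.

import requests

OKAPI = "https://okapi.example.org"  # placeholder Okapi gateway URL
HEADERS = {
    "x-okapi-tenant": "diku",        # placeholder tenant
    "x-okapi-token": "<token>",      # placeholder token issued by mod-authtoken
    "Content-Type": "application/json",
}

# Check out an item to a patron (assumed check-out-by-barcode payload)
checkout = requests.post(
    f"{OKAPI}/circulation/check-out-by-barcode",
    headers=HEADERS,
    json={
        "itemBarcode": "1234567890",
        "userBarcode": "0987654321",
        "servicePointId": "00000000-0000-0000-0000-000000000000",
    },
)
print("check-out:", checkout.status_code, checkout.elapsed.total_seconds(), "s")

# Check the same item back in (assumed check-in-by-barcode payload)
checkin = requests.post(
    f"{OKAPI}/circulation/check-in-by-barcode",
    headers=HEADERS,
    json={
        "itemBarcode": "1234567890",
        "servicePointId": "00000000-0000-0000-0000-000000000000",
        "checkInDate": "2022-04-04T04:09:56.031Z",
    },
)
print("check-in:", checkin.status_code, checkin.elapsed.total_seconds(), "s")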

Longevity Test

The longevity test shows that Check Out response times increased as time went on, but not as aggressively as in Kiwi.


          | Check In | Check Out
1st Hour  | 0.572s   | 0.995s
13th Hour | 0.526s   | 1.248s
19th Hour | 0.541s   | 1.368s

In the response time graph below, the Checkout Controller time (which aggregates all check-out API response times) increased over the 24-hour window, from just under 1s to 1.4s. The vertical bands seen are not due to response times spiking but to the number of users dropping to 0 at times; it is unclear whether this behavior is due to tooling. Toward the last hour the check-out times spiked; this is due to the testing tool erratically spinning up another Perfmeter that runs the same test.


The DB CPU utilization percentage also increased over time, from 16% to 23% by the end of the test. There are large spikes every 20 minutes; these are due to background tasks that run periodically, and as the number of loans grew, so did the resources needed to process them.

The number of connections also rose over time, from 344 to 383. It is unclear whether these additional DB connections are what caused the DB to use more CPU resources as the test progressed.
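As a rough way to follow this during a run, the connection count can be sampled directly from PostgreSQL's pg_stat_activity view, broken down by state. The connection settings below are placeholders, not the test environment's actual values.

import psycopg2

# Placeholder connection settings; point at the RDS endpoint used by the test cluster.
conn = psycopg2.connect(host="rds.example.org", dbname="folio", user="monitor", password="<password>")
with conn.cursor() as cur:
    # Count connections grouped by state (active, idle, idle in transaction, ...)
    cur.execute("SELECT coalesce(state, 'unknown'), count(*) FROM pg_stat_activity GROUP BY 1 ORDER BY 2 DESC")
    for state, count in cur.fetchall():
        print(state, count)
conn.close()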


The database's memory dipped a bit, but there were no symptoms of memory leaks. The memory level bounced back up right after the test finished.

Looking at the DB logs, one query stood out as being slow:

SELECT jsonb,id FROM fs09000000_mod_circulation_storage.loan WHERE ((((lower(f_unaccent(loan.jsonb->'status'->>'name')) LIKE lower(f_unaccent('Open'))) AND (CASE WHEN length(lower('2022-04-04T04:09:56.031Z')) <= 600 THEN left(lower(loan.jsonb->>'dueDate'),600) < lower('2022-04-04T04:09:56.031Z') ELSE left(lower(loan.jsonb->>'dueDate'),600) < left(lower('2022-04-04T04:09:56.031Z'),600) AND lower(loan.jsonb->>'dueDate') <= lower('2022-04-04T04:09:56.031Z') END)) AND (CASE WHEN length(lower(f_unaccent('Claimed returned'))) <= 600 THEN left(lower(f_unaccent(loan.jsonb->>'itemStatus')),600) NOT LIKE lower(f_unaccent('Claimed returned')) ELSE left(lower(f_unaccent(loan.jsonb->>'itemStatus')),600) NOT LIKE left(lower(f_unaccent('Claimed returned')),600) OR lower(f_unaccent(loan.jsonb->>'itemStatus')) NOT LIKE lower(f_unaccent('Claimed returned')) END)) AND (CASE WHEN length(lower(f_unaccent('Aged to lost'))) <= 600 THEN left(lower(f_unaccent(loan.jsonb->>'itemStatus')),600) NOT LIKE lower(f_unaccent('Aged to lost')) ELSE left(lower(f_unaccent(loan.jsonb->>'itemStatus')),600) NOT LIKE left(lower(f_unaccent('Aged to lost')),600) OR lower(f_unaccent(loan.jsonb->>'itemStatus')) NOT LIKE lower(f_unaccent('Aged to lost')) END)) AND (CASE WHEN length(lower(f_unaccent('Declared lost'))) <= 600 THEN left(lower(f_unaccent(loan.jsonb->>'itemStatus')),600) NOT LIKE lower(f_unaccent('Declared lost')) ELSE left(lower(f_unaccent(loan.jsonb->>'itemStatus')),600) NOT LIKE left(lower(f_unaccent('Declared lost')),600) OR lower(f_unaccent(loan.jsonb->>'itemStatus')) NOT LIKE lower(f_unaccent('Declared lost')) END) ORDER BY left(lower(f_unaccent(loan.jsonb->>'dueDate')),600), lower(f_unaccent(loan.jsonb->>'dueDate')) LIMIT 500 OFFSET 14500

This query took a little more than 1000ms to run.
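A hedged sketch of how the plan for this statement could be captured with EXPLAIN (ANALYZE, BUFFERS) follows; the connection settings are placeholders and slow_query stands in for the full logged statement above.

import psycopg2

slow_query = "SELECT jsonb,id FROM fs09000000_mod_circulation_storage.loan WHERE ..."  # full logged statement above

# Placeholder connection settings for a monitoring/reporting role.
conn = psycopg2.connect(host="rds.example.org", dbname="folio", user="monitor", password="<password>")
with conn.cursor() as cur:
    cur.execute("EXPLAIN (ANALYZE, BUFFERS) " + slow_query)
    for (plan_line,) in cur.fetchall():
        print(plan_line)
conn.close()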

Modules CPU Utilization During Longevity Test

Here is a view of the CPU utilization. A couple of observations:

  • mod-users and mod-patron-blocks seem to take up CPU resources in an identical cyclical way. 
  • Like Kiwi, Okapi uses about 450-470% CPU on average.
  • Other modules used less than 100% CPU on average.
  • mod-circulation spikes periodically due to processing background tasks.
  • mod-search and mod-inventory-storage seem to have the same behavior, spiking toward the end of the test.

Here is the Service CPU Utilization graph without Okapi for better clarity of other modules' behaviors.



There do not appear to be any memory leak issues in Lotus. Only mod-inventory-storage shows spikes, but they were not prolonged and the processes did not crash.

The timeline highlighted below encompasses all 4 test runs (1, 5, 8, 20 users).

Modules CPUs and Memory Utilization

Overall, the relevant services consume CPU resources at nominal levels.



Module (20-user tests)  | Avg  | Max
mod-users               | 35%  | 35%
mod-pubsub              | 70%  | 71%
okapi                   | 456% | 456%
mod-circulation         | 48%  | 52%
mod-circulation-storage | 36%  | 38%
mod-inventory           | 45%  | 50%
mod-inventory-storage   | 113% | 156%
mod-patron-blocks       | 17%  | 19%
mod-feesfines           | 21%  | 21%

Lotus Okapi's CPU percentage for 20 users is a little higher than in Kiwi. The other relevant modules actually consume less CPU.


Services' memory usage seems stable during the test runs, although these short test runs do not reveal much. mod-inventory-storage appears to accumulate memory as it goes along, but during these short tests this behavior did not have any effect on the results.


Module                  | Avg  | Max
mod-users               | 68%  | 68%
mod-pubsub              | 55%  | 55%
okapi                   | 62%  | 64%
mod-circulation         | 62%  | 62%
mod-circulation-storage | 72%  | 72%
mod-inventory           | 31%  | 31%
mod-inventory-storage   | 118% | 119%
mod-patron-blocks       | 76%  | 76%
mod-feesfines           | 75%  | 75%

Database

The freeable memory metric shows how much memory is consumed and how much remains during a test run. In the graph below, memory usage is good and stable throughout the test runs, trending slightly down during each run and bouncing back almost to the starting point after each test. This does not indicate a memory leak, as the 24-hour longevity test did not reveal any leaks (see above).

Users Tested | MB consumed
1            | 222
5            | 372
8            | 374
20           | 535


Database average CPU utilization also improved and remained steady compared to Kiwi GA.

Users Tested | Kiwi %                | Lotus %
1            | 9                     | 7
5            | 13                    | 10
8            | 14 (with a 78% spike) | 12
20           | 24                    | 18


Average database connection counts are lower than those in Kiwi.

Users Tested | Kiwi | Lotus
1            | 478  | 350
5            | 580  | 375
8            | 580  | 375
20           | 608  | 400



Miscellaneous