Check-in-check-out Test Report (Kiwi)
















Overview

This is a report for a series of Check-in-check-out test runs against the Kiwi release. 

Back End:

  • mod-circulation-22.1.0
  • mod-circulation-storage-13.1.0
  • mod-inventory-18.0.0
  • mod-inventory-storage-22.0.0
  • mod-authtoken-2.9.0
  • mod-pubsub-2.4.0
  • mod-patron-blocks-1.4.0
  • mod-feesfines-17.0.0
  • okapi-4.9.0

Front End:

  • folio_circulation-6.0.0
  • Item Check-in (folio_checkin-6.0.0)
  • Item Check-out (folio_checkout-7.0.0)

Infrastructure:

  • 71 back-end modules deployed in 141 ECS tasks
  • 3 okapi ECS services
  • 6 m5.xlarge  EC2 instances
  • 2 db.r6g.xlarge AWS RDS instance
  • INFO logging level

High Level Summary

  • Kiwi performs much better than Juniper GA, response times for check-ins are <900ms, checkout <1500ms, with not much variation between 1 and 20 users
  • Database performance is better and uses much less CPU compared to Juniper GA
  • Worst-performing APIs are still POST /checkin-by-barcode and POST /checkout-by-barcode.  The response time are still about 500ms. GET /circulation/loans also takes more than 200ms. GET /inventory/item (by barcode) takes less than 100ms now.
  • Longevity test shows response times worsen over time, probably due to the growing DB CPU utilization.  CIRCSTORE-304 - Getting issue details... STATUS Potentially could address this situation

Test Runs

Test

Virtual Users

Duration

1.

130 mins

2.

530 mins

3.

830 mins

4.

2030 mins
5.1 (repeat)30 mins

Results

Response Times (Average of all tests listed above, in seconds)


Average (seconds)50th %tile (seconds)75th %tile (seconds)95th %tile  (seconds)

Check-inCheck-outCheck-inCheck-outCheck-inCheck-outCheck-inCheck-out
1 user0.8381.5820.7671.4640.9151.7160.9151.716
5 users0.731.3760.6761.2720.7691.4481.0522.061
8 users0.7581.3920.6741.2280.8391.441.2022.343
20 users0.8991.5060.7311.3030.9311.5351.5312.504

Response times are consistently good, sub-second for check-ins and around 1.5 seconds for check-outs even in the 75th percentile. Not much variations between 1 and 20 users. 

Comparisons to Last Release

The following tables compare Juniper HF3 test results against Kiwi General release (deployed at Bugfest).

Response Time Comparison

In general there is no regression in performance.  The response times between Kiwi and Juniper are very close to each other for 1-8 users load unless they were in the 95th percentile group or the 20 users load where Kiwi clearly out-perform Juniper.  In the tables below, the Delta columns express the differences between Juniper and Kiwi releases in percentage. Any percentage +/-5% is not statistically is within the margin of error.  It is also noteworthy that Kiwi seems to invoke the GET /automated-patron-blocks 3 times instead of once UICHKOUT-755 - Getting issue details... STATUS .  This call averages 25ms under all loads, so if 2 of these 3 calls were not needed (why would the UI calls it thrice?) then the Kiwi's Checkout average response times could improve by another 50ms. 

Note: JP = Juniper build, KW = Kiwi build


Average50th percentile 
Check-in JPCheck-in KWDeltaCheck-out JPCheck-out KWDeltaCheck-in JPCheck-in KWDeltaCheck-out JPCheck-out KWDelta
1 user0.9440.83811.23%1.5791.582-0.19%0.8350.7678.14%1.4111.464-3.76%
5 users0.8110.739.99%1.3591.376-1.25%0.7500.6769.87%1.231.272-3.41%
8 users0.8890.75814.74%1.4251.3922.32%0.7850.67414.14%1.2621.2282.69%
20 users1.3860.89935.14%2.211.50631.86%1.1720.73137.63%1.8871.30330.95%


75th percentile95th percentile 
Check-in JPCheck-in KWDeltaCheck-out JPCheck-out KWDeltaCheck-in JPCheck-in KWDeltaCheck-out JPCheck-out KWDelta
1 user0.9870.9157.29%1.6041.716-6.98%1.5331.21420.81%2.632.4865.48%
5 users0.8670.76911.30%1.4381.448-0.70%1.1791.05210.77%2.1111.44831.41%
8 users0.9580.83912.42%1.4811.442.77%1.4341.20216.18%2.3432.3430.00%
20 users1.5890.93141.41%2.51.53538.60%2.7551.53144.43%4.3692.50442.69%

If compared Kiwi response times against the Juniper results documented in Check-in-check-out Test Report (Juniper), then Kiwi's times clearly are better than Juniper's. This is because the numbers in those tests were from the Juniper General release, when the GET /automated-patron-blocks were not optimized. HF3's modules are closer to Kiwi and therefore performs a lot better than Juniper GA. 

"Worst" API Comparisons

The APIs identified in the table below were the ones that took over 100ms to execute in Juniper.  In Kiwi these APIs perform much better, especially under greater load (8 and 20 users) where, for example, POST /checkout-by-barcode and POST /check-in-by-barcode's time are almost 50% better. GET /inventory/items also improved greatly due to implementation of the unique barcode index in the mod_inventory_storage.item table. 

Average Response Time in milliseconds. 

API

1 user  JP (75th %tile) 

1 user  KW (75th %tile)

5 users JP (75th %tile)

5 users KW (75th %tile)

8 users JP (75th %tile)8 users KW (75th %tile)

20 users JP (75th %tile)

20 users KW (75th %tile)
POST checkout-by-barcode526550476436479428861461
POST checkin-by-barcode 457444402368489438852504
GET circulation/loans246305245262272267516303
GET inventory/items171124166961709225099

Longevity Test

Longevity test shows that the response time increased as time went on. 


Check InCheck Out
1st Hour0.821s1.476s
16th Hour1.129s3.343s
25th Hour0.999s3.758s

In the response time graph below the Checkout Controller time, which gathers all chekc-out API response times), increased over a 16-hours window, from <2s to over 5s.


The DB CPU utilization percentage also increased over time

It's unclear what (DB connections) caused the DB to use more CPU resources as the test progressed, but looking at the logs there is one query that stood out as being slow:

SELECT jsonb,id FROM fs09000000_mod_circulation_storage.loan 
WHERE (lower(fs09000000_mod_circulation_storage.f_unaccent(loan.jsonb->>'userId')) LIKE lower(fs09000000_mod_circulation_storage.f_unaccent('<user UUID>'))) 
AND (lower(fs09000000_mod_circulation_storage.f_unaccent(loan.jsonb->'status'->>'name')) NOT LIKE lower(fs09000000_mod_circulation_storage.f_unaccent('Closed')))

This query at first took a little more than 100ms to run, but toward the end of the test it got to over 500ms on average and spiked to over a second. 

Doing an EXPLAIN ANALYZE on the query shows that it was not using indexes in the query plan despite having the indexes to look up loans by userId and loan status name. 

Gather (cost=1000.00..60093.80 rows=780 width=966) (actual time=192.948..200.814 rows=1 loops=1)
 Workers Planned: 2
 Workers Launched: 2
 -> Parallel Seq Scan on loan (cost=0.00..59015.80 rows=325 width=966) (actual time=145.247..191.854 rows=0 loops=3)
 Filter: ((lower(fs09000000_mod_circulation_storage.f_unaccent((jsonb ->> 'userId'::text))) ~~ '664e4002-9eb2-4c9e-addd-04f308a8062c'::text) AND (lower(fs09000000_mod_circulation_storage.f_unaccent(((jsonb -> 'status'::text) ->> 'name'::text))) !~~ 'closed'::text))
 Rows Removed by Filter: 53759
Planning Time: 0.212 ms
Execution Time: 200.834 ms


Due to the left() function (more details in  CIRCSTORE-304 - Getting issue details... STATUS ), the indexes were not applied. After removing the left() function, the indexes were applied and when running another longevity test, this time the DB CPU utilization was constant and under 20%, as seen in the diagram below. 


Before and after fixing the left() function, there were spikes at half an hour interval, but these were due to background jobs processing the loans (age-to-lost, patron notices, etc..). As more loans got created during the test, the spikes got higher. 


Here is a view of the CPU utilization. A couple of observations:

  • mod-users seems to take up CPU resources in a cyclical way. 
  • Okapi uses about 450% CPU on average.

Here is the Service CPU Utilization graph without Okapi for better clarity of other modules' behaviors.

(Note that at around 8:00 (on the graph) there are extended spikes for all the modules, Okapi, and the DB. This is due to tooling. The tool for some reasons added another 20 concurrent users to the test. The results for this period had been discarded.)


There does not appear to be any memory leaks issues in Kiwi

Only mod-inventory-storage seems to have the spikes but but they were not prolonged and the processes did not crash. 



Modules CPUs and Memory Utilization

The relevant services overall seem to occupy CPU resources nominally. 



20-users testsAvgMax
mod-users41%47%
mod-pubsub50%63%
okapi400%439%
mod-circulation50%55%
mod-circulation-storage33%43%
mod-inventory45%50%
mod-inventory-storage98%162%
mod-patron-blocks13%17%
mod-feesfines23%26%

In the 8-users tests, mod-inventory spiked to almost 400% while averaged around 175%. This is an anomaly that can be disregarded. We have seen mod-inventory spiked while being idle or during a test run of any workflow before, mostly due to DI events_cache topic processing. In other tests this is not a problem, so it does not seem to be a bad trend here. 

Kiwi Okapi's CPU % for 20 users is rather high compared to the results documented in Juniper test run, in which it was averaging about 325%, spiking to around 380%. However, in the 5 and 8 users tests, Juniper Okapi was using more CPU. 


Services' memory seem to be stable during the test runs, although these short test runs do not veal much. mod-inventory-storage seems to be hogging memory as it goes along, but during these short tests this behavior did not have any effect on the results. 


AvgMax
mod-users60%60%
mod-pubsub49%52%
okapi98%103%
mod-circulation54%58%
mod-circulation-storage47%53%
mod-inventory26%31%
mod-inventory-storage68%, 97%, 115%115%
mod-patron-blocks87%90%
mod-feesfines87%88%

Database

The freeable memory metric shows how much memory is being consumed and remaining during a test run.  In the graph below memory usage is pretty good and stable throughout the test runs compared to Juniper GA that quickly drained memory due to client connections not closing. 

Users TestedMB consumed
1800
51200
81300
201300


Database aveage CPU Utilization also improves and steady as well, compared to Juniper GA. 

Users TestedKiwi %Juniper GA %
1915
51350
814 (with a 78% spike)78
2024100


Database connections averages:

Users TestedKiwi
1478
5580
8580
20608



Miscellaneous

  • Raw test data: