Check-in-check-out Test Report (Lotus)
Overview
This is a report for a series of Check-in-check-out test runs against the Lotus release.
Back End:
- mod-circulation-23.0.1
- mod-circulation-storage-14.0.0
- mod-inventory-18.1.4
- mod-inventory-storage-23.0.2
- mod-authtoken-2.9.1
- mod-pubsub-2.5.0
- mod-patron-blocks-1.5.0
- mod-feesfines-17.1.0
- okapi-4.13.0
Front End:
- Item Check-in (folio_checkin-7.0.0)
- Item Check-out (folio_checkout-8.0.0)
Infrastructure:
- 75 back-end modules deployed in 159 ECS tasks
- 3 okapi ECS services
- 9 m5.xlarge EC2 instances
- 2 db.r6g.xlarge AWS RDS instances
- INFO logging level
High Level Summary
- Lotus performs much better than Kiwi: response times for check-ins are in the mid-600ms range and check-outs 900ms-1000ms, with little variation between 1 and 20 users.
- Database performance is better and uses less CPU compared to Kiwi.
- The worst-performing APIs are still only POST /checkin-by-barcode and POST /checkout-by-barcode. They have improved since Kiwi, now averaging in the mid-300ms range even in the 20-user scenario.
- GET /circulation/loans takes 185ms in the 20 users case.
Test Runs
Test | Virtual Users | Duration |
---|---|---|
1. | 1 (with 1 mod-inventory and 1 mod-circulation task) | 30 mins |
2. | 5 (with 1 mod-inventory and 1 mod-circulation task) | 30 mins |
3. | 8 (with 1 mod-inventory and 1 mod-circulation task) | 30 mins |
4. | 20 (with 1 mod-inventory and 1 mod-circulation task) | 30 mins |
5. | 1 (repeat with 2 inventory and circ tasks) | 30 mins |
6. | 5 (repeat with 2 inventory and circ tasks) | 30 mins |
7. | 8 (repeat with 2 inventory and circ tasks) | 30 mins |
8. | 20 (repeat with 2 inventory and circ tasks) | 30 mins |
9. | 20 users longevity | 24 hours |
Results
Response Times (Average of all tests listed above, in seconds)
 | Average (seconds) | | 50th %tile (seconds) | | 75th %tile (seconds) | | 95th %tile (seconds) | |
 | Check-in | Check-out | Check-in | Check-out | Check-in | Check-out | Check-in | Check-out |
---|---|---|---|---|---|---|---|---|
1 user | 0.73 | 1.188 | 0.682 | 1.084 | 0.774 | 1.24 | 1 | 1.812 |
5 users | 0.62 | 0.976 | 0.582 | 0.915 | 0.662 | 1.02 | 0.844 | 1.44 |
8 users | 0.615 | 0.971 | 0.581 | 0.904 | 0.654 | 1.006 | 0.836 | 1.428 |
20 users | 0.661 | 1.052 | 0.62 | 0.967 | 0.702 | 1.107 | 0.918 | 1.593 |
Response times are consistently good: sub-second for check-ins and a little over 1 second for check-outs, even at the 75th percentile. The variation between 1 and 20 users continues to shrink compared to previous releases.
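The average and percentile summaries in these tables can be reproduced from raw per-request timings; a minimal sketch (the sample values are made up for illustration, not the actual test data):

```python
from statistics import mean, quantiles

def summarize(samples_s):
    """Summarize response-time samples (in seconds) the way the tables
    above do: average plus 50th/75th/95th percentiles."""
    cuts = quantiles(samples_s, n=100, method="inclusive")  # 99 cut points
    return {
        "avg": round(mean(samples_s), 3),
        "p50": round(cuts[49], 3),
        "p75": round(cuts[74], 3),
        "p95": round(cuts[94], 3),
    }

# Made-up check-in samples for illustration only -- not the actual test data.
checkins = [0.58, 0.60, 0.62, 0.64, 0.66, 0.70, 0.75, 0.84, 0.90, 1.10]
print(summarize(checkins))
```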
Comparisons to Last Release
The following tables compare Kiwi test results against Lotus General release (deployed at Lotus Bugfest).
Response Time Comparison
In the tables below, the Delta columns express the differences between Kiwi and Lotus releases in percentage.
In general there are no performance regressions; instead there are improvements in response times across the board. Check-in times improved by at least 15% and check-out times consistently by around 30%. The 20-user response times are also very similar to the 8-user ones, especially for check-outs, which means the system is very stable from 1 to 20 users.
Note: LT = Lotus build, KW = Kiwi build
Average | 50th percentile | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Check-in KW | Check-in LT | Delta | Check-out KW | Check-out LT | Delta | Check-in KW | Check-in LT | Delta | Check-out KW | Check-out LT | Delta | |
1 user | 0.838 | 0.73 | 12.89% | 1.582 | 1.188 | 24.91% | 0.767 | 0.682 | 11.08% | 1.464 | 1.084 | 25.96% |
5 users | 0.73 | 0.62 | 15.07% | 1.376 | 0.976 | 29.07% | 0.676 | 0.582 | 13.91% | 1.272 | 0.915 | 28.07% |
8 users | 0.758 | 0.615 | 18.87% | 1.392 | 0.971 | 30.24% | 0.674 | 0.581 | 13.80% | 1.228 | 0.904 | 26.38% |
20 users | 0.899 | 0.661 | 26.47% | 1.506 | 1.052 | 30.15% | 0.731 | 0.62 | 15.18% | 1.303 | 0.967 | 25.79% |
75th percentile | 95th percentile | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Check-in KW | Check-in LT | Delta | Check-out KW | Check-out LT | Delta | Check-in KW | Check-in LT | Delta | Check-out KW | Check-out LT | Delta |
1 user | 0.915 | 0.774 | 15.41% | 1.716 | 1.24 | 27.74% | 1.214 | 1.000 | 17.63% | 2.486 | 1.812 | 27.11% |
5 users | 0.769 | 0.662 | 13.91% | 1.448 | 1.02 | 29.56% | 1.052 | 0.844 | 19.77% | 1.448 | 1.440 | 0.55% |
8 users | 0.839 | 0.654 | 22.05% | 1.44 | 1.006 | 30.14% | 1.202 | 0.836 | 30.45% | 2.343 | 1.428 | 39.05% |
20 users | 0.931 | 0.702 | 24.60% | 1.535 | 1.107 | 27.88% | 1.531 | 0.918 | 40.04% | 2.504 | 1.593 | 36.38% |
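For reference, the Delta values in the comparison tables are the relative improvement from Kiwi to Lotus, which can be computed with a small helper like this sketch:

```python
def delta_pct(kiwi, lotus):
    """Delta column value: relative improvement from Kiwi to Lotus,
    (Kiwi - Lotus) / Kiwi * 100, rounded to two decimals."""
    return round((kiwi - lotus) / kiwi * 100, 2)

print(delta_pct(0.838, 0.730))  # 1-user check-in average -> 12.89
print(delta_pct(1.582, 1.188))  # 1-user check-out average -> 24.91
```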
"Worst" API Comparisons
The APIs identified in the table below are those that took over 100ms to execute at the 75th percentile in Lotus and Kiwi. In Lotus these APIs are generally faster, especially with 5 or more concurrent users. Impressively, the response times of POST checkout-by-barcode and POST checkin-by-barcode are all under 400ms! GET inventory/items response times are now well under 100ms, compared to being on the border in the Kiwi release.
Response times in milliseconds (75th percentile).
API | 1 user KW (75th %tile) | 1 user LT (75th %tile) | 5 users KW (75th %tile) | 5 users LT (75th %tile) | 8 users KW (75th %tile) | 8 users LT (75th %tile) | 20 users KW (75th %tile) | 20 users LT (75th %tile) |
---|---|---|---|---|---|---|---|---|
POST checkout-by-barcode | 550 | 435 | 436 | 330 | 428 | 349 | 461 | 384 |
POST checkin-by-barcode | 444 | 386 | 368 | 298 | 438 | 315 | 504 | 348 |
GET circulation/loans | 305 | 222 | 262 | 161 | 267 | 172 | 303 | 196 |
GET inventory/items | 124 | 97 | 96 | 71 | 92 | 67 | 99 | 75 |
Longevity Test
The longevity test shows that check-out response time increased as time went on, but not as aggressively as in Kiwi.
Check In | Check Out | |
---|---|---|
1st Hour | 0.572s | 0.995s |
13th Hour | 0.526s | 1.248s |
19th Hour | 0.541s | 1.368s |
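The slowdown over the longevity run can be quantified as a relative increase; a small sketch using the check-out figures from the table above:

```python
def growth_pct(first, later):
    """Relative increase of a response time over the longevity run."""
    return round((later - first) / first * 100, 2)

# Check-out averages (seconds) from the table above: 1st hour vs 19th hour.
print(growth_pct(0.995, 1.368))  # -> 37.49
```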
In the response-time graph below, the Checkout Controller time (which aggregates all check-out API response times) increased over the 24-hour window, from just under 1s to 1.4s. The vertical bands are not due to response times spiking but to the number of users dropping to 0 at times; it is unclear whether this behavior is caused by the tooling. Toward the last hour the check-out times spiked because the testing tool erratically spun up another Perfmeter running the same test.
The DB CPU utilization percentage also increased over time, from 16% to 23% by the end of the test. There are huge spikes every 20 minutes, caused by background tasks that run periodically; as the number of loans grew, so did the physical resources needed to process them all.
The number of DB connections also rose over time, from 344 to 383. It is unclear whether these additional connections are what caused the DB to use more CPU resources as the test progressed.
The database's memory dipped a bit, but showed no symptoms of a memory leak; the memory level bounced back up right after the test finished.
Looking at the DB logs, one query stood out as being slow:

    SELECT jsonb,id FROM fs09000000_mod_circulation_storage.loan
    WHERE ((((lower(f_unaccent(loan.jsonb->'status'->>'name')) LIKE lower(f_unaccent('Open')))
      AND (CASE WHEN length(lower('2022-04-04T04:09:56.031Z')) <= 600
           THEN left(lower(loan.jsonb->>'dueDate'),600) < lower('2022-04-04T04:09:56.031Z')
           ELSE left(lower(loan.jsonb->>'dueDate'),600) < left(lower('2022-04-04T04:09:56.031Z'),600)
                AND lower(loan.jsonb->>'dueDate') <= lower('2022-04-04T04:09:56.031Z')
           END))
      AND (CASE WHEN length(lower(f_unaccent('Claimed returned'))) <= 600
           THEN left(lower(f_unaccent(loan.jsonb->>'itemStatus')),600) NOT LIKE lower(f_unaccent('Claimed returned'))
           ELSE left(lower(f_unaccent(loan.jsonb->>'itemStatus')),600) NOT LIKE left(lower(f_unaccent('Claimed returned')),600)
                OR lower(f_unaccent(loan.jsonb->>'itemStatus')) NOT LIKE lower(f_unaccent('Claimed returned'))
           END))
      AND (CASE WHEN length(lower(f_unaccent('Aged to lost'))) <= 600
           THEN left(lower(f_unaccent(loan.jsonb->>'itemStatus')),600) NOT LIKE lower(f_unaccent('Aged to lost'))
           ELSE left(lower(f_unaccent(loan.jsonb->>'itemStatus')),600) NOT LIKE left(lower(f_unaccent('Aged to lost')),600)
                OR lower(f_unaccent(loan.jsonb->>'itemStatus')) NOT LIKE lower(f_unaccent('Aged to lost'))
           END))
      AND (CASE WHEN length(lower(f_unaccent('Declared lost'))) <= 600
           THEN left(lower(f_unaccent(loan.jsonb->>'itemStatus')),600) NOT LIKE lower(f_unaccent('Declared lost'))
           ELSE left(lower(f_unaccent(loan.jsonb->>'itemStatus')),600) NOT LIKE left(lower(f_unaccent('Declared lost')),600)
                OR lower(f_unaccent(loan.jsonb->>'itemStatus')) NOT LIKE lower(f_unaccent('Declared lost'))
           END)
    ORDER BY left(lower(f_unaccent(loan.jsonb->>'dueDate')),600), lower(f_unaccent(loan.jsonb->>'dueDate'))
    LIMIT 500 OFFSET 14500
This query took a little more than 1000ms to run.
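Slow statements like this can be surfaced from the Postgres logs when `log_min_duration_statement` is enabled; a minimal, hypothetical sketch for filtering them out of a log file:

```python
import re

# Matches Postgres log lines such as:
#   LOG:  duration: 1042.318 ms  statement: SELECT jsonb,id FROM ...
DURATION_RE = re.compile(r"duration: (\d+(?:\.\d+)?) ms\s+statement: (.*)")

def slow_statements(log_lines, threshold_ms=1000.0):
    """Yield (duration_ms, statement) pairs at or above the threshold."""
    for line in log_lines:
        m = DURATION_RE.search(line)
        if m and float(m.group(1)) >= threshold_ms:
            yield float(m.group(1)), m.group(2)

sample = [
    "LOG:  duration: 12.5 ms  statement: SELECT 1",
    "LOG:  duration: 1042.318 ms  statement: SELECT jsonb,id FROM loan ...",
]
for duration, stmt in slow_statements(sample):
    print(duration, stmt[:40])
```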
Modules CPU Utilization During Longevity Test
Here is a view of the CPU utilization. A couple of observations:
- mod-users and mod-patron-blocks seem to take up CPU resources in an identical cyclical way.
- Like Kiwi, Okapi uses about 450-470% CPU on average.
- Other modules used less than 100% CPU on average.
- mod-circulation spikes periodically due to processing background tasks.
- mod-search and mod-inventory-storage seem to behave the same way, spiking toward the end of the test.
Here is the Service CPU Utilization graph without Okapi for better clarity of other modules' behaviors.
There do not appear to be any memory leak issues in Lotus. Only mod-inventory-storage shows spikes, but they were not prolonged and the processes did not crash.
The timeline highlighted below encompasses all 4 test runs (1, 5, 8, 20 users).
Modules CPUs and Memory Utilization
The relevant services overall seem to occupy CPU resources nominally.
20-users tests | Avg | Max |
---|---|---|
mod-users | 35% | 35% |
mod-pubsub | 70% | 71% |
okapi | 456% | 456% |
mod-circulation | 48% | 52% |
mod-circulation-storage | 36% | 38% |
mod-inventory | 45% | 50% |
mod-inventory-storage | 113% | 156% |
mod-patron-blocks | 17% | 19% |
mod-feesfines | 21% | 21% |
Lotus Okapi's CPU % for 20 users is a little higher than in Kiwi; the other relevant modules actually consume less CPU.
Services' memory seems stable during the test runs, although these short runs do not reveal much. mod-inventory-storage seems to hog more memory as it goes along, but during these short tests this behavior did not affect the results.
 | Avg | Max |
---|---|---|
mod-users | 68% | 68% |
mod-pubsub | 55% | 55% |
okapi | 62% | 64% |
mod-circulation | 62% | 62% |
mod-circulation-storage | 72% | 72% |
mod-inventory | 31% | 31% |
mod-inventory-storage | 118% | 119% |
mod-patron-blocks | 76% | 76% |
mod-feesfines | 75% | 75% |
Database
The freeable memory metric shows how much memory is consumed and how much remains during a test run. In the graph below, memory usage is stable throughout the test runs, trending slightly down during each run and bouncing back almost to the starting point after each test. This does not indicate a memory leak, as the 24-hour longevity test did not reveal any leaks (see above).
Users Tested | MB consumed |
---|---|
1 | 222 |
5 | 372 |
8 | 374 |
20 | 535 |
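The MB-consumed figures above correspond to the drop in the RDS FreeableMemory metric from the start of a run to its trough; a sketch with illustrative (not actual) samples:

```python
def mb_consumed(freeable_mb):
    """MB consumed during a run: starting freeable memory minus the trough."""
    return freeable_mb[0] - min(freeable_mb)

# Illustrative FreeableMemory samples (MB) over one run -- not real data.
run = [12000, 11900, 11820, 11778, 11850, 11990]
print(mb_consumed(run))  # -> 222
```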
Database average CPU utilization also improved and was steady, compared to Kiwi GA.
Users Tested | Kiwi % | Lotus % |
---|---|---|
1 | 9 | 7 |
5 | 13 | 10 |
8 | 14 (with a 78% spike) | 12 |
20 | 24 | 18 |
Database connection averages are lower than those in Kiwi.
Users Tested | Kiwi | Lotus |
---|---|---|
1 | 478 | 350 |
5 | 580 | 375 |
8 | 580 | 375 |
20 | 608 | 400 |
Miscellaneous