IN-PROGRESS


...

This report covers a series of check-in/check-out test runs against the Kiwi release.

Back End:

  • mod-circulation-22.1.0
  • mod-circulation-storage-13.1.0
  • mod-inventory-18.0.0
  • mod-inventory-storage-22.0.0
  • mod-authtoken-2.9.0
  • mod-pubsub-2.4.0
  • mod-patron-blocks-1.4.0
  • mod-feesfines-17.0.0
  • okapi-4.9.0

Front End:

  • folio_circulation-6.0.0
  • Item Check-in (folio_checkin-6.0.0)
  • Item Check-out (folio_checkout-7.0.0)

Infrastructure:

  • 71 back-end modules deployed in 141 ECS tasks
  • 3 okapi ECS services
  • 6 m5.xlarge  EC2 instances
  • 2 db.r6g.xlarge AWS RDS instances
  • INFO logging level

...

  • Kiwi performs much better than Juniper GA: response times are <900ms for check-ins and <1500ms for check-outs, with little variation between 1 and 20 users.
  • The database performs better and uses much less CPU than in Juniper GA.
  • The worst-performing APIs are still POST /checkin-by-barcode and POST /checkout-by-barcode, whose response times are still about 500ms. GET /circulation/loans also takes more than 200ms. GET /inventory/items (by barcode) now takes less than 100ms.
  • The longevity test shows response times worsening over time, probably due to growing DB CPU utilization. CIRCSTORE-304 could potentially address this.

Test Runs

| Test | Virtual Users | Duration |
|------|---------------|----------|
| 1. | 1 | 30 mins |
| 2. | 5 | 30 mins |
| 3. | 8 | 30 mins |
| 4. | 20 | 30 mins |
| 5. | 1 (repeat) | 30 mins |

...

In general there is no regression in performance. The response times of Kiwi and Juniper are very close to each other for loads of 1-8 users, except in the 95th percentile group and under the 20-user load, where Kiwi clearly outperforms Juniper. In the tables below, the Delta columns express the difference between the Juniper and Kiwi releases as a percentage. Any difference within +/-5% is within the margin of error. It is also noteworthy that Kiwi seems to invoke GET /automated-patron-blocks 3 times instead of once (UICHKOUT-755). This call averages 25ms under all loads, so if 2 of these 3 calls are not needed (why would the UI call it three times?), Kiwi's average checkout response time could improve by another 50ms.
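The potential saving estimated above is simple arithmetic, sketched here. The figures (three calls observed, one needed, about 25ms per call) come from this report; the function name is illustrative, and the estimate assumes each redundant call adds its full duration to the checkout time, which the report does not confirm.

```python
def redundant_call_saving_ms(calls_made: int, calls_needed: int, avg_call_ms: float) -> float:
    """Estimated time saved by dropping redundant calls.

    Assumption: every redundant call contributes its full average
    duration to the workflow (the calls are not overlapped).
    """
    redundant = max(calls_made - calls_needed, 0)
    return redundant * avg_call_ms


# GET /automated-patron-blocks: 3 calls observed, 1 needed, about 25 ms each.
saving = redundant_call_saving_ms(calls_made=3, calls_needed=1, avg_call_ms=25)
print(f"Estimated checkout saving: {saving:.0f} ms")
```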

Note: JP = Juniper build, KW = Kiwi build


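Average

| Load | Check-in JP | Check-in KW | Delta | Check-out JP | Check-out KW | Delta |
|------|-------------|-------------|-------|--------------|--------------|-------|
| 1 user | 0.944 | 0.838 | 11.23% | 1.579 | 1.582 | -0.19% |
| 5 users | 0.811 | 0.73 | 9.99% | 1.359 | 1.376 | -1.25% |
| 8 users | 0.889 | 0.758 | 14.74% | 1.425 | 1.392 | 2.32% |
| 20 users | 1.386 | 0.899 | 35.14% | 2.21 | 1.506 | 31.86% |

50th percentile

| Load | Check-in JP | Check-in KW | Delta | Check-out JP | Check-out KW | Delta |
|------|-------------|-------------|-------|--------------|--------------|-------|
| 1 user | 0.835 | 0.767 | 8.14% | 1.411 | 1.464 | -3.76% |
| 5 users | 0.750 | 0.676 | 9.87% | 1.23 | 1.272 | -3.41% |
| 8 users | 0.785 | 0.674 | 14.14% | 1.262 | 1.228 | 2.69% |
| 20 users | 1.172 | 0.731 | 37.63% | 1.887 | 1.303 | 30.95% |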
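The Delta values in the table are consistent with the relative difference (JP - KW) / JP expressed as a percentage, so a positive delta means Kiwi is faster. The sketch below reproduces a few of the average-response-time entries and applies the +/-5% margin of error; the function names are illustrative and are not taken from the test tooling.

```python
def delta_pct(juniper: float, kiwi: float) -> float:
    """Relative improvement of Kiwi over Juniper, in percent."""
    return (juniper - kiwi) / juniper * 100


def is_significant(delta: float, margin: float = 5.0) -> bool:
    """True when the delta falls outside the +/-5% margin of error."""
    return abs(delta) > margin


# (load, workflow, Juniper average, Kiwi average), values from the table above
samples = [
    ("1 user", "check-in", 0.944, 0.838),
    ("1 user", "check-out", 1.579, 1.582),
    ("20 users", "check-in", 1.386, 0.899),
    ("20 users", "check-out", 2.21, 1.506),
]
for load, workflow, jp, kw in samples:
    d = delta_pct(jp, kw)
    note = "significant" if is_significant(d) else "within margin of error"
    print(f"{load:9} {workflow:10} {d:6.2f}%  ({note})")
```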

...

Average Response Time in milliseconds. Note: JP = Juniper build, KW = Kiwi build

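| API | 1 user JP (75th %tile) | 1 user KW (75th %tile) | 5 users JP (75th %tile) | 5 users KW (75th %tile) | 8 users JP (75th %tile) | 8 users KW (75th %tile) | 20 users JP (75th %tile) | 20 users KW (75th %tile) |
|-----|------|------|------|------|------|------|------|------|
| POST checkout-by-barcode | 526 | 550 | 476 | 436 | 479 | 428 | 861 | 461 |
| POST checkin-by-barcode | 457 | 444 | 402 | 368 | 489 | 438 | 852 | 504 |
| GET circulation/loans | 246 | 305 | 245 | 262 | 272 | 267 | 516 | 303 |
| GET inventory/items | 171 | 124 | 166 | 96 | 170 | 92 | 250 | 99 |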

Longevity

...

Test

Longevity test shows that the response time increased as time went on. 


| | Check In | Check Out |
|---|----------|-----------|
| 1st Hour | 0.821s | 1.476s |
| 16th Hour | 1.129s | 3.343s |
| 25th Hour | 0.999s | 3.758s |
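To put the longevity figures in proportion, the sketch below computes the relative increase in response time against the first hour. The measurements are the ones in the table above; the helper name is illustrative.

```python
def pct_increase(baseline: float, later: float) -> float:
    """Percentage increase of `later` relative to `baseline`."""
    return (later - baseline) / baseline * 100


# Response times in seconds: (check-in, check-out) per sampled hour.
longevity = {
    "1st hour": (0.821, 1.476),
    "16th hour": (1.129, 3.343),
    "25th hour": (0.999, 3.758),
}
base_in, base_out = longevity["1st hour"]
for hour, (check_in, check_out) in longevity.items():
    print(
        f"{hour:9}  check-in {pct_increase(base_in, check_in):+6.1f}%"
        f"  check-out {pct_increase(base_out, check_out):+6.1f}%"
    )
```

Check-out degrades far more than check-in: by the 25th hour it takes roughly two and a half times as long as in the first hour.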

In the response time graph below, the Checkout Controller time (which gathers all check-out API response times) increased over a 16-hour window, from <2s to over 5s.

Image Added


The DB CPU utilization percentage also increased over time

...

Code Block
Gather  (cost=1000.00..60093.80 rows=780 width=966) (actual time=192.948..200.814 rows=1 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Parallel Seq Scan on loan  (cost=0.00..59015.80 rows=325 width=966) (actual time=145.247..191.854 rows=0 loops=3)
        Filter: ((lower(fs09000000_mod_circulation_storage.f_unaccent((jsonb ->> 'userId'::text))) ~~ '664e4002-9eb2-4c9e-addd-04f308a8062c'::text) AND (lower(fs09000000_mod_circulation_storage.f_unaccent(((jsonb -> 'status'::text) ->> 'name'::text))) !~~ 'closed'::text))
        Rows Removed by Filter: 53759
Planning Time: 0.212 ms
Execution Time: 200.834 ms
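Plans like the one above can be screened automatically for sequential scans that discard most of the rows they read, a common sign that an expected index was not used. The sketch below is a hypothetical helper (not part of the FOLIO tooling) that pulls the relevant figures out of EXPLAIN ANALYZE text output with regular expressions; it assumes the standard PostgreSQL text format shown above.

```python
import re

# Plan text from the report; the long Filter line is omitted for brevity.
PLAN = """\
Gather (cost=1000.00..60093.80 rows=780 width=966) (actual time=192.948..200.814 rows=1 loops=1)
 Workers Planned: 2
 Workers Launched: 2
 -> Parallel Seq Scan on loan (cost=0.00..59015.80 rows=325 width=966) (actual time=145.247..191.854 rows=0 loops=3)
 Rows Removed by Filter: 53759
Planning Time: 0.212 ms
Execution Time: 200.834 ms
"""


def summarize_plan(plan: str) -> dict:
    """Extract seq-scanned tables, filtered-out rows and execution time."""
    seq_scans = re.findall(r"Seq Scan on (\w+)", plan)
    removed = [int(n) for n in re.findall(r"Rows Removed by Filter: (\d+)", plan)]
    exec_time = re.search(r"Execution Time: ([\d.]+) ms", plan)
    return {
        "seq_scan_tables": seq_scans,
        "rows_removed_by_filter": sum(removed),
        "execution_ms": float(exec_time.group(1)) if exec_time else None,
    }


summary = summarize_plan(PLAN)
print(summary)
```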


Due to the left() function (more details in CIRCSTORE-304), the indexes were not applied. After the left() function was removed, the indexes were applied, and in another longevity test the DB CPU utilization stayed constant and under 20%, as seen in the diagram below.


Image Added

Both before and after fixing the left() function, there were spikes at half-hour intervals, but these were due to background jobs processing the loans (age-to-lost, patron notices, etc.). As more loans were created during the test, the spikes got higher.


Here is a view of the CPU utilization. A couple of observations:

...

Here is the Service CPU Utilization graph without Okapi for better clarity of other modules' behaviors.

(Note that at around 8:00 on the graph there are extended spikes for all the modules, Okapi, and the DB. This is due to tooling: for some reason the tool added another 20 concurrent users to the test. The results for this period were discarded.)


There do not appear to be any memory leak issues in Kiwi.

Image Added

Only mod-inventory-storage seems to have spikes, but they were not prolonged and the processes did not crash.


Image Added


Modules CPUs and Memory Utilization

...

In the 8-user tests, mod-inventory spiked to almost 400% while averaging around 175%. This is an anomaly that can be disregarded. We have seen mod-inventory spike before, both while idle and during test runs of any workflow, mostly due to DI events_cache topic processing. In other tests this is not a problem, so it does not seem to be a bad trend here.

...

Miscellaneous

  • Raw test data: CICO-kiwi.xlsx