GET /inventory/items (Check-in/out) API Report (Goldenrod)
- PERF-79
Overview
This testing effort measures the performance of GET /inventory/items and GET /item-storage/items in the Goldenrod release with 1, 5, 8, and 20 virtual users over 30-minute runs.
Backend:
- mod-inventory-storage-19.3.1
- mod-inventory-16.0.1
- okapi-3.1.2
- mod-authtoken-2.5.1
- mod-permissions-5.11.2
Environment:
- 61 back-end modules deployed in 110 ECS services
- 3 okapi ECS services
- 8 m5.large EC2 instances
- 2 db.r5.xlarge AWS RDS instances (1 reader, 1 writer)
- INFO Okapi logging level
High Level Summary
GET /inventory/items?query=barcode==${itemBarcode} averages roughly 200 ms with 1 user and up to 830 ms with 8 users, and is much slower with 20 users (a minimal sketch of this request follows the list below)
- For every call to GET /inventory/items, a SELECT count_estimate() query is issued first, and that count_estimate() call takes more time than the actual item query.
- GET item by barcode is approximately 42 ms faster when fetched directly from mod-inventory-storage (item-storage) instead of going through the mod-inventory business-logic module.
- mod-authtoken calls account for at least 20% of the overall response time.
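For reference, the request under test can be reproduced with a few lines of Python. This is only a minimal sketch, not the actual load-test script; the Okapi host and token are placeholders, and the tenant id is taken from the schema name in the slow-query example further down.

import requests

OKAPI_URL = "https://okapi.example.org"  # placeholder Okapi host, not the test environment
HEADERS = {
    "X-Okapi-Tenant": "fs09000000",      # tenant id as seen in the tenant schema name below
    "X-Okapi-Token": "<JWT obtained from /authn/login>",
    "Accept": "application/json",
}

def get_item_by_barcode(barcode: str) -> dict:
    # Same CQL barcode query the test runs exercise.
    resp = requests.get(
        f"{OKAPI_URL}/inventory/items",
        params={"query": f"barcode=={barcode}"},
        headers=HEADERS,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

print(get_item_by_barcode("28705124"))   # sample barcode from the count_estimate example below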
Test Runs and Results
GET /inventory/items?query=barcode==${itemBarcode} API
Test | Virtual Users | Duration | OKAPI log level | Profiled | Ramp up (secs) | Average (ms) | 75th %tile (ms) |
---|---|---|---|---|---|---|---|
1. | 1 | 30 min | INFO | No | 1 | 199 | 217 |
2. | 5 | 30 min | INFO | No | 20 | 431 | 431 |
3. | 8 | 30 min | INFO | No | 30 | 830 | 780 |
4. | 20 | 30 min | INFO | No | 40 | 4080 | 5147 |
Request to storage vs business module
Going directly to the storage module yields a modest improvement at low concurrency; at 8 or more users the gap widens sharply, reaching over three seconds at 20 users.
Running this test on Goldenrod gives the following comparison:
Virtual Users | Duration | GET /inventory/items?query=barcode==${itemBarcode} Average (ms) | GET /item-storage/items?query=barcode==${itemBarcode} Average (ms) | Delta (ms) |
---|---|---|---|---|
1 | 30 min | 199 | 111 | 88 |
5 | 30 min | 431 | 412 | 19 |
8 | 30 min | 830 | 531 | 299 |
20 | 30 min | 4080 | 639 | 3441 |
Giraffe Graphs
Giraffe was used to graph the API call sequences produced by invoking the GET /inventory/items and GET /item-storage/items APIs.
Instead of requesting /inventory/items (which calls mod-inventory --> holdings-storage --> instance-storage --> locations), going directly to the storage module via /item-storage/items (which calls only mod-inventory-storage --> item-storage) does yield some performance improvement (~42 ms faster), but the validation and checks performed by the business module are bypassed, so it is not a feasible approach.
GET /inventory/items?query=barcode==${itemBarcode} takes ~151 ms
GET /item-storage/items?query=barcode==${itemBarcode} takes ~109 ms
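A rough way to reproduce this single-request comparison is sketched below. It is not the harness used for the reported numbers; the host, tenant, and token are placeholders, and absolute timings will vary by environment.

import time
import requests

OKAPI_URL = "https://okapi.example.org"  # placeholder Okapi host
HEADERS = {
    "X-Okapi-Tenant": "fs09000000",
    "X-Okapi-Token": "<JWT obtained from /authn/login>",
    "Accept": "application/json",
}

def time_barcode_lookup(path: str, barcode: str) -> float:
    # Return round-trip time in milliseconds for one barcode lookup.
    start = time.perf_counter()
    resp = requests.get(
        f"{OKAPI_URL}{path}",
        params={"query": f"barcode=={barcode}"},
        headers=HEADERS,
        timeout=30,
    )
    resp.raise_for_status()
    return (time.perf_counter() - start) * 1000

business_ms = time_barcode_lookup("/inventory/items", "28705124")
storage_ms = time_barcode_lookup("/item-storage/items", "28705124")
print(f"business path: {business_ms:.0f} ms, storage path: {storage_ms:.0f} ms, "
      f"delta: {business_ms - storage_ms:.0f} ms")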
Database
Database CPU is relatively stable with fewer than 10 users, but at 20 users it spikes aggressively and consumes all available CPU.
Virtual Users | CPU Range (%) |
---|---|
1 | 2% |
5 | 4%-5% |
8 | 2% |
20 | 100% |
Database Slow queries
There were no slow queries for this API call. The queries observed were mostly count_estimate queries for looking up an item by barcode, which took 60 ms on average. For example:
SELECT count_estimate('SELECT jsonb,id FROM fs09000000_mod_inventory_storage.item WHERE lower(f_unaccent(item.jsonb->>''barcode'')) LIKE lower(f_unaccent(''28705124''))')
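To sanity-check the observation that the estimate costs more than the data query itself, both statements can be timed directly against the database. The sketch below is hypothetical: it assumes direct read access to the RDS reader via psycopg2, uses placeholder connection settings, and assumes count_estimate() lives in the tenant schema as the logged statement suggests.

import time
import psycopg2

# Placeholder connection settings, not the test environment's actual credentials.
conn = psycopg2.connect(host="rds-reader.example.org", dbname="folio", user="folio", password="***")

# The item query that count_estimate() wraps, taken from the logged statement above.
ITEM_SQL = (
    "SELECT jsonb,id FROM fs09000000_mod_inventory_storage.item "
    "WHERE lower(f_unaccent(item.jsonb->>'barcode')) LIKE lower(f_unaccent('28705124'))"
)

def timed_ms(cur, sql, params=None):
    # Run one query and return its round-trip time in milliseconds.
    start = time.perf_counter()
    cur.execute(sql, params)
    cur.fetchall()
    return (time.perf_counter() - start) * 1000

with conn, conn.cursor() as cur:
    # count_estimate() is defined in the tenant schema, so put it on the search path.
    cur.execute("SET search_path TO fs09000000_mod_inventory_storage, public")
    estimate_ms = timed_ms(cur, "SELECT count_estimate(%s)", (ITEM_SQL,))
    item_ms = timed_ms(cur, ITEM_SQL)
    print(f"count_estimate: {estimate_ms:.1f} ms, item query: {item_ms:.1f} ms")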
CPU Utilization
The following shows the CPU utilization percentage of the ECS services running the mod-inventory and mod-inventory-storage modules. Utilization rises sharply from 1 to 5 users and then gradually improves (declines) as the number of users increases further.
Module | 1 user Average % | 1 user Range % | 5 users Average % | 5 users Range % | 8 users Average % | 8 users Range % | 20 users Average % | 20 users Range % |
---|---|---|---|---|---|---|---|---|
mod-inventory | 13 | 12-13 | 24 | 22-24 | 23.5 | 22-23.5 | 23 | 22-23 |
mod-inventory-storage | 41 | 39-41 | 80 | 75-80 | 75 | 71-75 | 36 | 36-37.5 |
mod-inventory appears to behave erratically at times; it should be examined more closely in a longevity test.
The graph below shows mod-circulation CPU usage. The written numbers indicate the number of virtual users that participated in each test, and the arrows point to when the tests were running. Observations: CPU usage increased over time during each test run. Immediately after each run the CPU utilization came down, but not back to the starting point, which indicates some kind of leak. For the 20-user run there was a huge spike about 5 minutes into the test; at that time the container was running out of memory (see the memory graph in the next section), so things became unstable.
Memory
Memory usage for mod-inventory and mod-inventory-storage was stable throughout the runs; apart from an occasional spike, it remained consistent over each 30-minute run.
Module | 1 user (Average) | 5 users (Average) | 8 users (Average) | 20 users (Average) |
---|---|---|---|---|
mod-inventory | 55% | 63% | 63% | 70% |
mod-inventory-storage | 55% | 55% | 57% | 65% |
Recommended Improvements
- The following JIRAs have been created:
- RMB-724
- RMB-718
Appendix
- Details/raw test run data: PERF-79.xlsx