In Progress
Jira: PERF-100
Success criteria:
- Scripts to generate the data are checked in.
- A report showing all the queries and times.
- Analysis of why a query with the * token is slow or fast, if possible.
Overview:
In PERF-83 it was observed that a CQL query with the * token at the end resulted in a faster response time, despite a much higher database usage.
There are many entities of different statuses in FOLIO. It'd be interesting to see if this pattern holds true for other statuses and CQL queries.
There are 15 K records in the DB with Closed status and 5 K records with Open status.
In this user story, run the following tests:
GET /circulation/loans?limit=2147483647&query=(itemId==${current_item} and status.name=="Closed*")
GET /circulation/requests?(itemId==${item_id} and (status=="Open - Awaiting pickup*" or status=="Open - Awaiting delivery*"))
GET /circulation/requests?(requesterId==${userID} and pickupServicePointId=${servicePoints_Id} and status=="Open - Awaiting pickup*")
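The pairing of each query with its `*` variant can be sketched as below. This is a minimal illustration, not the checked-in test script: the helper name `loans_path` and the placeholder item ID are hypothetical, and percent-encoding the CQL string is an assumption about what an HTTP client needs, since the requests above are shown unencoded.

```python
from urllib.parse import quote


def loans_path(item_id: str, status: str, wildcard: bool) -> str:
    """Build the /circulation/loans path for one item (hypothetical helper).

    When wildcard is True, the "*" token is appended to the status value,
    which is the only difference between the two query variants under test.
    """
    term = status + "*" if wildcard else status
    cql = f'(itemId=={item_id} and status.name=="{term}")'
    # safe="" percent-encodes every reserved character in the CQL string,
    # including "(", "=", '"', the space and "*".
    return "/circulation/loans?limit=2147483647&query=" + quote(cql, safe="")


if __name__ == "__main__":
    # Placeholder ID for illustration only; the tests substitute a real item ID.
    item = "00000000-0000-0000-0000-000000000000"
    print(loans_path(item, "Closed", wildcard=False))
    print(loans_path(item, "Closed", wildcard=True))
```

Generating both variants from one function keeps the item ID and status identical within a pair, so any response time difference can be attributed to the `*` token alone.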
Version:
OKAPI - Version 3.1.1
...
Test | Virtual Users | Query | Duration (sec) | Ramp Up (sec) |
---|---|---|---|---|
1 | 8 | /circulation/loans?limit=2147483647&query=(itemId==${current_item} and status.name=="Closed") | 1800 | 80 |
2 | 20 | /circulation/loans?limit=2147483647&query=(itemId==${current_item} and status.name=="Closed") | 1800 | 200 |
3 | 8 | /circulation/loans?limit=2147483647&query=(itemId==${current_item} and status.name=="Closed*") | 1800 | 80 |
4 | 20 | /circulation/loans?limit=2147483647&query=(itemId==${current_item} and status.name=="Closed*") | 1800 | 200 |
5 | 8 | /circulation/requests?(itemId==[current_item] and (status=="Open - Not yet filled" or status=="Open - Awaiting delivery")) | 1800 | 80 |
6 | 20 | /circulation/requests?(itemId==[current_item] and (status=="Open - Not yet filled" or status=="Open - Awaiting delivery")) | 1800 | 200 |
7 | 8 | /circulation/requests?(itemId==[item_id] and (status=="Open - Not yet filled*" or status=="Open - Awaiting delivery*")) | 1800 | 80 |
8 | 20 | /circulation/requests?(itemId==[item_id] and (status=="Open - Not yet filled*" or status=="Open - Awaiting delivery*")) | 1800 | 200 |
9 | 8 | /circulation/requests?(requesterId==[requesterID] and pickupServicePointId=[pickPoint] and status=="Open - Not yet filled") | 1800 | 80 |
10 | 20 | /circulation/requests?(requesterId==[requesterID] and pickupServicePointId=[pickPoint] and status=="Open - Not yet filled") | 1800 | 200 |
11 | 8 | /circulation/requests?(requesterId==[requesterID] and pickupServicePointId=[pickPoint] and status=="Open - Not yet filled*") | 1800 | 80 |
12 | 20 | /circulation/requests?(requesterId==[requesterID] and pickupServicePointId=[pickPoint] and status=="Open - Not yet filled*") | 1800 | 200 |
...
For instance, for the "Closed" status query (GET /circulation/loans?limit=2147483647&query=(itemId==[current_item] and status.name=="Closed")) with 8 users, the 75th percentile response time is 1.085 seconds without the "*" token (test 1) and 1.074 seconds with it (test 3), a delta of 11 ms. The 20-user tests of the same query behave the same way: the 75th percentile is 3.815 seconds without the "*" token (test 2) and 3.799 seconds with it (test 4), a delta of 16 ms.
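The delta arithmetic for the "Closed" status query can be reproduced with a few lines of Python. This is only a sketch: the timings are the 75th percentile values quoted in this report, while the dictionary layout and the `delta_ms` helper are illustrative names, not part of the test tooling.

```python
# 75th percentile response times in seconds for the "Closed" loans query,
# keyed by number of virtual users (values taken from this report).
P75_SECONDS = {
    8: {"exact": 1.085, "wildcard": 1.074},
    20: {"exact": 3.815, "wildcard": 3.799},
}


def delta_ms(exact_s: float, wildcard_s: float) -> int:
    """Return exact minus wildcard, in whole milliseconds.

    A positive value means the query without the "*" token was slower.
    """
    return round((exact_s - wildcard_s) * 1000)


if __name__ == "__main__":
    for users, timing in P75_SECONDS.items():
        diff = delta_ms(timing["exact"], timing["wildcard"])
        print(f"{users} users: delta = {diff} ms")
```

Rounding to whole milliseconds avoids floating-point noise when subtracting timings that were recorded with three decimal places.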
...
Test | Virtual Users | Request | Link | Total | OK | 75th pct (sec) | 95th pct (sec) | 99th pct (sec) | Max (sec) |
---|---|---|---|---|---|---|---|---|---|
1 | 8 | GET /circulation/loans?limit=2147483647&query=(itemId==[current_item] and status.name=="Closed") | link | 13788 | 13788 | 1.085 | 1.235 | 3.028 | 5.706 |
2 | 20 | GET /circulation/loans?limit=2147483647&query=(itemId==[current_item] and status.name=="Closed") | link | 12225 | 12143 | 3.815 | 4.736 | 6.181 | 30.171 |
3 | 8 | GET /circulation/loans?limit=2147483647&query=(itemId==[current_item] and status.name=="Closed*") | link | 13691 | 13582 | 1.074 | 1.226 | 3.205 | 32.6 |
4 | 20 | GET /circulation/loans?limit=2147483647&query=(itemId==[current_item] and status.name=="Closed*") | link | 12548 | 12548 | 3.799 | 4.715 | 5.46 | 9.462 |
5 | 8 | GET /circulation/requests?(itemId==[current_item] and (status=="Open - Not yet filled" or status=="Open - Awaiting delivery")) | link | 69925 | 69925 | 0.204 | 0.24 | 0.309 | 4.328 |
6 | 20 | GET /circulation/requests?(itemId==[current_item] and (status=="Open - Not yet filled" or status=="Open - Awaiting delivery")) | link | 86602 | 86602 | 0.423 | 0.513 | 0.661 | 15.306 |
7 | 8 | GET /circulation/requests?(itemId==[item_id] and (status=="Open - Awaiting pickup*" or status=="Open - Awaiting delivery*")) | link | 67688 | 67688 | 0.214 | 0.26 | 0.385 | 23.776 |
8 | 20 | GET /circulation/requests?(itemId==[item_id] and (status=="Open - Awaiting pickup*" or status=="Open - Awaiting delivery*")) | link | 78697 | 78697 | 0.48 | 0.607 | 0.969 | 8.646 |
9 | 8 | GET /circulation/requests?(requesterId==[requesterID] and pickupServicePointId=[pickPoint] and status=="Open - Not yet filled") | link | 66713 | 66713 | 0.215 | 0.265 | 0.384 | 31.676 |
10 | 20 | GET /circulation/requests?(requesterId==[requesterID] and pickupServicePointId=[pickPoint] and status=="Open - Not yet filled") | link | 79364 | 79364 | 0.477 | 0.603 | 0.99 | 3.834 |
11 | 8 | GET /circulation/requests?(requesterId==[requesterID] and pickupServicePointId=[pickPoint] and status=="Open - Not yet filled*") | link | 66830 | 66830 | 0.217 | 0.269 | 0.401 | 2.991 |
12 | 20 | GET /circulation/requests?(requesterId==[requesterID] and pickupServicePointId=[pickPoint] and status=="Open - Not yet filled*") | link | 78824 | 78824 | 0.482 | 0.611 | 0.923 | 4.893 |
...
GET /circulation/loans?limit=2147483647&query=(itemId==[current_item] and status.name=="Closed") 8-user test run, resource usage:
...
GET /circulation/loans?limit=2147483647&query=(itemId==[current_item] and status.name=="Closed") 20-user test run, resource usage:
...
Compared with the "Closed" status test set, DB CPU usage is much lower. This may be explained by differences in the data set: there are 15 K records in the DB with Closed status and 5 K records with Open status.
RDS Performance Insights:
...
- The results show no significant difference between queries with and without the "*" token. The difference between 75th percentile response times fluctuates from +11 ms to -5 ms.
- Response time depends on the data volume in the DB. For instance, there were 15 K Closed status records and 5 K Open status records, and the difference between the 75th percentile response times is 750%. Volume testing is therefore recommended to determine how performance degrades as data volume grows.
- The following warnings about missing indexes were logged by mod-circulation during a test run:
WARNING: Doing LIKE search without index for accounts.jsonb->>'loanId', CQL >>> SQL: loanId == 5fb2c891-64eb-4b6c-b3a2-715ad91f593b >>> lower(f_unaccent(accounts.jsonb->>'loanId')) LIKE lower(f_unaccent('5fb2c891-64eb-4b6c-b3a2-715ad91f593b'))
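To see which fields trigger these warnings most often, the log lines can be tallied with a short script. This is a rough sketch: the regular expression is inferred from the single sample line above and may not match every variant the module emits, and `fields_missing_index` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern inferred from the sample warning in this report (an assumption,
# not a documented log format).
WARNING_RE = re.compile(
    r"WARNING: Doing LIKE search without index for (?P<field>\S+), "
    r"CQL >>> SQL: (?P<cql>.+?) >>> (?P<sql>.+)"
)


def fields_missing_index(lines):
    """Count missing-index warnings per database field expression."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            counts[match.group("field")] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "WARNING: Doing LIKE search without index for accounts.jsonb->>'loanId', "
        "CQL >>> SQL: loanId == 5fb2c891-64eb-4b6c-b3a2-715ad91f593b >>> "
        "lower(f_unaccent(accounts.jsonb->>'loanId')) LIKE "
        "lower(f_unaccent('5fb2c891-64eb-4b6c-b3a2-715ad91f593b'))"
    )
    for field, count in fields_missing_index([sample]).most_common():
        print(f"{count:>6}  {field}")
```

Sorting the fields by warning count gives a quick priority list of candidate indexes to review.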
...