In Progress
- PERF-100Getting issue details... STATUS
Success criteria:
- Scripts to generate the data be checked in
- A report showing all the queries and times.
- Analysis of why a query with the * token is slow or fast, if possible.
Overview:
In PERF-83 it was observed that a CQL query with the * token at the end resulted in a faster response time, despite a much higher database usage.
There are many entities of different statuses in FOLIO. It'd be interesting to see if this pattern holds true for other statuses and CQL queries.
In this user story, run the following tests:
GET /circulation/loans?limit=2147483647&query=(itemId==${current_item} and status.name=="Closed*")
GET /circulation/requests?(itemId==${item_id} and (status=="Open - Awaiting pickup*" or status=="Open - Awaiting delivery*"))
GET /circulation/requests?(requesterId==${userID} and pickupServicePointId=${servicePoints_Id} and status=="Open - Awaiting pickup*")
Version:
OKAPI - Version 3.1.1
OKAPI modules
- Okapi (okapi-2.35.2)
- Okapi (okapi-3.1.1)
- okapi 1.9
- okapi-proxy 1.9
- Circulation Module (mod-circulation-19.0.13)
- Circulation Storage Module (mod-circulation-storage-12.0.1)
circulation 9.3
feesfines (mod-feesfines-15.8.2)
feesfines 15.2
feesfines 15.0
Test Runs:
Test | Virtual Users | Query | Duration (sec) | Rump Up (sec) |
---|---|---|---|---|
1 | 8 | /circulation/loans?limit=2147483647&query=(itemId==${current_item} and status.name=="Closed") | 1800 | 80 |
2 | 20 | /circulation/loans?limit=2147483647&query=(itemId==${current_item} and status.name=="Closed") | 200 | |
3 | 8 | /circulation/loans?limit=2147483647&query=(itemId==${current_item} and status.name=="Closed*") | 80 | |
4 | 20 | /circulation/loans?limit=2147483647&query=(itemId==${current_item} and status.name=="Closed*") | 200 | |
5 | 8 | /circulation/requests?(itemId==[current_item] and (status=="Open - Not yet filled" or status=="Open - Awaiting delivery")) | 80 | |
6 | 20 | /circulation/requests?(itemId==[current_item] and (status=="Open - Not yet filled" or status=="Open - Awaiting delivery")) | 200 | |
7 | 8 | /circulation/requests?(itemId==[item_id] and (status=="Open - Not yet filled*" or status=="Open - Awaiting delivery*")) | 80 | |
8 | 20 | /circulation/requests?(itemId==[item_id] and (status=="Open - Not yet filled*" or status=="Open - Awaiting delivery*")) | 200 | |
9 | 8 | /circulation/requests?(requesterId==[requesterID] and pickupServicePointId=[pickPoint] and status=="Open - Not yet filled") | 80 | |
10 | 20 | /circulation/requests?(requesterId==[requesterID] and pickupServicePointId=[pickPoint] and status=="Open - Not yet filled") | 200 | |
11 | 8 | /circulation/requests?(requesterId==[requesterID] and pickupServicePointId=[pickPoint] and status=="Open - Not yet filled*") | 80 | |
12 | 20 | /circulation/requests?(requesterId==[requesterID] and pickupServicePointId=[pickPoint] and status=="Open - Not yet filled*") | 200 |
Results
Summary:
During test set we've defined that there is no major difference between queries includes "*" and without it for the same amount of data being retrieved.
For instance for "Closed" status query (GET /circulation/loans?limit=2147483647&query=(itemId==[current_item] and status.name=="Closed")) (test 1, 8 users)75 response time percentile without "*" token is 1.085 seconds and response time 75 percentile for same query with "*" token is 1,074 seconds so in this particular case delta is 11 ms. Same behavior can be seeing in next test for 20 users with same query: without "*" token 75 percentile of response time is 3.815 seconds and with "*" token it's 3.799, delta is 15 ms.
Test | Virtual Users | Request | Total | OK | 75th pct | 95th pct | 99th pct | Max | |
---|---|---|---|---|---|---|---|---|---|
1 | 8 | GET /circulation/loans?limit=2147483647&query=(itemId==[current_item] and status.name=="Closed") | link | 13788 | 13788 | 1.085 | 1.235 | 3.028 | 5.706 |
2 | 20 | GET /circulation/loans?limit=2147483647&query=(itemId==[current_item] and status.name=="Closed") | link | 12225 | 12143 | 3.815 | 4.736 | 6.181 | 30.171 |
3 | 8 | GET /circulation/loans?limit=2147483647&query=(itemId==[current_item] and status.name=="Closed*") | link | 13691 | 13582 | 1.074 | 1.226 | 3.205 | 32.6 |
4 | 20 | GET /circulation/loans?limit=2147483647&query=(itemId==[current_item] and status.name=="Closed*") | link | 12548 | 12548 | 3.799 | 4.715 | 5.46 | 9.462 |
5 | 8 | GET /circulation/requests?(itemId==[current_item] and (status==Open - Not yet filled or status==Open - Awaiting delivery)) | link | 69925 | 69925 | 0.204 | 0.24 | 0.309 | 4.328 |
6 | 20 | GET /circulation/requests?(itemId==[current_item] and (status==Open - Not yet filled or status==Open - Awaiting delivery)) | link | 86602 | 86602 | 0.423 | 0.513 | 0.661 | 15.306 |
7 | 8 | GET /circulation/requests?(itemId==[item_id] and (status=="Open - Awaiting pickup*" or status=="Open - Awaiting delivery*")) | link | 67688 | 67688 | 0.214 | 0.26 | 0.385 | 23.776 |
8 | 20 | GET /circulation/requests?(itemId==[item_id] and (status=="Open - Awaiting pickup*" or status=="Open - Awaiting delivery*")) | link | 78697 | 78697 | 0.48 | 0.607 | 0.969 | 8.646 |
9 | 8 | GET /circulation/requests?(requesterId==[requesterID] and pickupServicePointId=[pickPoint] and status=="Open - Not yet filled") | link | 66713 | 66713 | 0.215 | 0.265 | 0.384 | 31.676 |
10 | 20 | GET /circulation/requests?(requesterId==[requesterID] and pickupServicePointId=[pickPoint] and status=="Open - Not yet filled") | link | 79364 | 79364 | 0.477 | 0.603 | 0.99 | 3.834 |
11 | 8 | GET /circulation/requests?(requesterId==[requesterID] and pickupServicePointId=[pickPoint] and status=="Open - Not yet filled*") | link | 66830 | 66830 | 0.217 | 0.269 | 0.401 | 2.991 |
12 | 20 | GET /circulation/requests?(requesterId==[requesterID] and pickupServicePointId=[pickPoint] and status=="Open - Not yet filled*") | link | 78824 | 78824 | 0.482 | 0.611 | 0.923 | 4.893 |
On chart below you can see differences between queries includes "*" token and without it. As you may see the difference between 75% response times fluctuates from +11 ms to -5 ms. Which is not a significant deviation.
Resource usage:
Taking into account the number of tests that have been performed, moreover tests for "open" status shows us pretty much similar response times, let's take closer look of several of them:
Test set for (GET /circulation/loans?limit=2147483647&query=(itemId==[current_item] and status.name=="Closed")):
RDS CPU usage (GET /circulation/loans?limit=2147483647&query=(itemId==[current_item] and status.name=="Closed")) 8 users test run:
RDS Performance Insights:
As you can see on chart above - most of waiting is CPU time. Moreover longest waits was made by mod feesfines queries in both cases:
Note: No slow queries were found on PGHero during whole test set.
Services CPU utilization:
Note: There is a little different scale of charts (can't be updated due to AWS CloudWatch data aggregation), however the CPU usage level are the same for query included "*" token and without it.