Check-in-check-out Test Report (Ramsons) [ECS]
Test status: PASSED
Overview
- Regression testing of Check-In/Check-Out (CI/CO) fixed-load tests on an Okapi-based environment for the Ramsons ECS release.
- The purposes of CI/CO testing:
- To define response times of transaction controllers for Check-In and Check-Out
- To find any trends for resource utilization and recommend improvements
- To check how the system behaves over an extended period during the longevity test
- To compare current results with previous ones
Summary
- Common results:
- Tests #1, #2, #3, #4
- Test #5
- Comparison with Quesnelia results:
- CI/CO response times degradation (Tests #1, #2, #3, #4):
Test # | vUsers | Check-Out Controller (CO) | Check-In Controller (CI)
---|---|---|---
1 | 8 | +26.70% | +23.95%
2 | 20 | +25.76% | +26.76%
3 | 30 | +24.58% | +27.82%
4 | 75 | +39.20% | +18.32%
- CI/CO response times (test #5 - longevity test):
- 30 vUsers: ~13% improvement in the CO flow and ~8% degradation in the CI flow.
Resources
- CPU utilization
- Memory consumption
- RDS CPU utilization average
- CPU (User) usage by broker
Recommendations & Jiras
- The previous results report:
- The current ticket: PERF-983 - [Ramsons] [ECS] CI/CO
- mod-serials-management-b causes DB connection growth of ~200 connections on average. Disabling this module does not affect response times or error rate but significantly decreases the number of DB connections.
Test Runs
The following table contains test configuration information.
Test # | vUsers | Ramp-up, sec | Duration, sec |
---|---|---|---|
1 | 8 | 80 | 2700 |
2 | 20 | 200 | 2700 |
3 | 30 | 300 | 2700 |
4 | 75 | 750 | 2700 |
5 | 30 | 300 | 86400 |
Results
Errors:
- Error messages:
Response time
The tables below contain the results of the Check-In/Check-Out tests in the Ramsons release; all response times are in milliseconds.
Tests #1, #2, #3, #4
 | 8 vUsers (test #1) | | | 20 vUsers (test #2) | | | 30 vUsers (test #3) | | | 75 vUsers (test #4) | |
---|---|---|---|---|---|---|---|---|---|---|---|---
Label | #Samples | 95th pct, ms | Average, ms | #Samples | 95th pct, ms | Average, ms | #Samples | 95th pct, ms | Average, ms | #Samples | 95th pct, ms | Average, ms
Check-Out Controller | 1713 | 1450 | 1153 | 2605 | 1327 | 1123 | 2380 | 1278 | 1120 | 7290 | 1823 | 1637 |
Check-In Controller | 1295 | 838 | 590 | 1943 | 706 | 559 | 1804 | 737 | 556 | 5499 | 872 | 646 |
POST_circulation/check-out-by-barcode (Submit_barcode_checkout) | 1712 | 503 | 340 | 2604 | 450 | 324 | 2381 | 411 | 320 | 7285 | 589 | 400 |
POST_circulation/check-in-by-barcode (Submit_barcode_checkin) | 1297 | 452 | 298 | 1939 | 386 | 274 | 1796 | 383 | 269 | 5509 | 441 | 285 |
GET_circulation/loans (Submit_patron_barcode) | 1715 | 256 | 192 | 2603 | 238 | 188 | 2380 | 232 | 188 | 7282 | 378 | 243 |
GET_circulation/loans (Submit_barcode_checkout) | 1712 | 263 | 191 | 2604 | 235 | 189 | 2386 | 247 | 190 | 7289 | 383 | 242 |
GET_inventory/items (Submit_barcode_checkout) | | | | | | | | | | 7283 | 299 | 136
Test #5
30 vUsers Longevity test (test #5) | | |
---|---|---|---
Label | #Samples | 95th pct, ms | Average, ms
Check-Out Controller | 55329 | 1311 | 1095 |
Check-In Controller | 41677 | 621 | 493 |
POST_circulation/check-out-by-barcode (Submit_barcode_checkout) | 55327 | 394 | 312 |
POST_circulation/check-in-by-barcode (Submit_barcode_checkin) | 41678 | 310 | 216 |
GET_circulation/loans (Submit_patron_barcode) | 55329 | 226 | 184 |
GET_circulation/loans (Submit_barcode_checkout) | 55327 | 227 | 185 |
GET_inventory/items (Submit_barcode_checkout) | 55329 | 74 | 54 |
Comparisons
This table compares the average response times (in milliseconds) of the Ramsons and Quesnelia releases.
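Delta, ms is the Ramsons average minus the Quesnelia average, and Difference, % is that delta divided by the Quesnelia average; for example, for the Check-Out Controller at 8 vUsers: 1153 - 910 = 243 ms, and 243 / 910 ≈ 26.70%.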
 | 8 vUsers (test #1) | | | | 20 vUsers (test #2) | | | | 30 vUsers (test #3) | | | | 75 vUsers (test #4) | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
Label | Quesnelia Average | Ramsons Average | Delta, ms | Difference, % | Quesnelia Average | Ramsons Average | Delta, ms | Difference, % | Quesnelia Average | Ramsons Average | Delta, ms | Difference, % | Quesnelia Average | Ramsons Average | Delta, ms | Difference, %
Check-Out Controller | 910 | 1153 | 243 | 26.70% | 893 | 1123 | 230 | 25.76% | 899 | 1120 | 221 | 24.58% | 1176 | 1637 | 461 | 39.20% |
Check-In Controller | 476 | 590 | 114 | 23.95% | 441 | 559 | 118 | 26.76% | 435 | 556 | 121 | 27.82% | 546 | 646 | 100 | 18.32% |
Comparison of the longevity test
30 vUsers Longevity (test #5) | | | |
---|---|---|---|---
Label | Quesnelia Average, ms | Ramsons Average, ms | Delta, ms | Difference, %
Check-Out Controller | 1258 | 1095 | -163 | -12.96%
Check-In Controller | 458 | 493 | 35 | 7.64%
API requests where response times >= 100 milliseconds
API | 75 vUsers Ramsons Average, ms |
---|---|
POST_circulation/check-out-by-barcode (Submit_barcode_checkout) | 400 |
POST_circulation/check-in-by-barcode (Submit_barcode_checkin) | 285 |
GET_circulation/loans (Submit_patron_barcode) | 243 |
GET_circulation/loans (Submit_barcode_checkout) | 242 |
GET_inventory/items (Submit_barcode_checkout) | 136 |
Resources Utilization
CPU Utilization
During the 45-minute tests, CPU was utilized mostly during high load (75 vUsers): okapi - 84%; mod-authtoken spiked every 3 minutes from 5% to 30%; mod-inventory-storage - 23%; mod-inventory - 17%; mod-pubsub - 17%; nginx-okapi - 10%; mod-circulation - 10%; mod-circulation-storage - 3%.
During the longevity test, CPU was utilized mostly by okapi - 37%; mod-authtoken spiked every 3 minutes from 5% to 20%; mod-inventory - 12%; mod-pubsub - 11%; mod-circulation - 5%; mod-circulation-storage - 3%.
Tests #1, #2, #3, #4
After applying the CPU parameter = 0 in the container revisions, the charts show only relative resource utilization by modules.
Test #5
Memory Consumption
Tests #1, #2, #3, #4 and test #5
Tests #1, #2, #3, #4
Test #5
RDS CPU Utilization
RDS CPU utilized:
8 vUsers - %, 20 vUsers - %, 30 vUsers - %, 75 vUsers - %
During the longevity test CPU - %
Tests #1, #2, #3, #4
Test #5
RDS Database Connections
For the 45-minute and longevity tests RDS used max connections; without a test running it was connections.
Tests #1, #2, #3, #4
Test #5
CPU (User) usage by broker
As the MSK cluster is linked to all PTF clusters, the time range that reflects only CI/CO activity during the longevity test (test #5) is from midnight till 7 a.m.; the max consumption rate there is 10%. The impact of the other CI/CO tests (tests #1, #2, #3, #4) is also visible: the max consumption rate is 40% across all clusters.
Tests #1, #2, #3, #4
Test #5
Database load
During the 45-minute tests (#1, #2, #3, #4), the longest request is UPDATE fs09000000_mod_inventory_storage.item SET, at 38 ms/request.
During the longevity test (#5), the longest are INSERT INTO fs09000000_mod_pubsub.audit_message - 41 ms and SELECT fs09000000_mod_inventory_storage.count_estimate - 107 ms.
Another observation is a large number of UPDATE fs09000000_mod_login.auth_attempts and INSERT INTO fs09000000_mod_authtoken.refresh_tokens statements, which is new. It may be connected to the token refresh that happens every 10 minutes.
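The per-statement timings above come from the database load dashboards; as a cross-check, a minimal sketch of the same kind of query directly against PostgreSQL, assuming the pg_stat_statements extension is enabled on the RDS instance (an assumption, not part of the original report):

```sql
-- Top statements by mean execution time (requires pg_stat_statements).
-- On PostgreSQL 13+ the column is mean_exec_time; older versions use mean_time.
SELECT query,
       calls,
       round(mean_exec_time::numeric, 2) AS avg_ms
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
```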
Tests #1, #2, #3, #4
Test #5
Appendix
Infrastructure
PTF environment: rcon
DB table records size:
Modules
Methodology/Approach
Description
Testing includes a data preparation step and the test run itself.
- Data preparation for each test takes up to 20 minutes and consists of truncating the tables involved in testing, populating data, and updating item statuses.
- The test run itself depends on the duration and on the number of virtual users creating the necessary load.
In Ramsons, token expiration is set to 10 minutes by default, so to run any tests use the new login implementation from the script. Pay attention to the Backend Listener: replace the value of its application parameter to make the results visible in the Grafana dashboard (see the sketch below).
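A minimal sketch of the relevant Backend Listener fragment inside the .jmx file; the influxdbUrl and the application value rcon_cico_ramsons are placeholders, not the actual PTF settings:

```xml
<BackendListener guiclass="BackendListenerGui" testclass="BackendListener" testname="Backend Listener" enabled="true">
  <elementProp name="arguments" elementType="Arguments" guiclass="ArgumentsPanel" testclass="Arguments">
    <collectionProp name="Arguments.arguments">
      <!-- InfluxDB endpoint that feeds the Grafana dashboard (placeholder URL) -->
      <elementProp name="influxdbUrl" elementType="Argument">
        <stringProp name="Argument.name">influxdbUrl</stringProp>
        <stringProp name="Argument.value">http://influxdb.example:8086/write?db=jmeter</stringProp>
      </elementProp>
      <!-- The "application" value to replace so results are grouped under this test in Grafana -->
      <elementProp name="application" elementType="Argument">
        <stringProp name="Argument.name">application</stringProp>
        <stringProp name="Argument.value">rcon_cico_ramsons</stringProp>
      </elementProp>
    </collectionProp>
  </elementProp>
  <stringProp name="classname">org.apache.jmeter.visualizers.backend.influxdb.InfluxdbBackendListenerClient</stringProp>
</BackendListener>
```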
Module configuration recommended setup
Update the revision of the source-record-storage module to exclude the SQL statements that delete rows in marc_indexers (the WITH deleted_rows ... DELETE FROM marc_indexers mi query that runs every 30 minutes) by increasing the deletion interval:
{ "name": "srs.marcIndexers.delete.interval.seconds", "value": "86400" }
Update the mod-serials module: set the number of tasks to 0 to exclude the significant database connection growth (see the sketch below).
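One way to do this is to scale the service's desired task count to zero; a sketch assuming the ECS cluster is named rcon and the service is mod-serials-management-b (both assumptions):

```bash
# Scale the mod-serials service to zero tasks to stop its DB connection growth.
aws ecs update-service \
  --cluster rcon \
  --service mod-serials-management-b \
  --desired-count 0
```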
DB trigger setup in Ramsons
The usual PTF CI/CO data preparation script won't work in Ramsons. To solve that, disable the trigger updatecompleteupdateddate_item_insert_update before data preparation for the tenant and enable it again before the test starts.
The SQL file was updated to perform this step from the script.
Data preparation
First step
- To prepare data, establish a connection using AWS keys, then run the .sql script below from bash (take it from the title of the code block and replace [PASSWORD] with the correct password).
-- Disable trigger
ALTER TABLE fs09000000_mod_inventory_storage.item DISABLE TRIGGER updatecompleteupdateddate_item_insert_update;
TRUNCATE TABLE fs09000000_mod_patron_blocks.user_summary;
TRUNCATE TABLE fs09000000_mod_circulation_storage.loan;
TRUNCATE TABLE fs09000000_mod_circulation_storage.audit_loan;
TRUNCATE TABLE fs09000000_mod_circulation_storage.request;
TRUNCATE TABLE fs09000000_mod_circulation_storage.patron_action_session;
TRUNCATE TABLE fs09000000_mod_circulation_storage.scheduled_notice;
TRUNCATE TABLE fs09000000_mod_notify.notify_data;
UPDATE fs09000000_mod_inventory_storage.item SET jsonb = jsonb_set(jsonb, '{status, name}', '"Available"') WHERE jsonb->'status'->>'name' != 'Available';
UPDATE fs09000000_mod_users.users SET jsonb = jsonb_set(jsonb, '{active}', '"true"') WHERE jsonb->'active' != 'true';
-- Enable trigger
ALTER TABLE fs09000000_mod_inventory_storage.item ENABLE TRIGGER updatecompleteupdateddate_item_insert_update;
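A minimal invocation sketch, assuming the block above is saved as prepare_cico_data.sql and the connection details are supplied on the command line (the file name, host, user, and database are placeholders):

```bash
# Run the data preparation SQL against the RDS instance; replace [PASSWORD] first.
PGPASSWORD='[PASSWORD]' psql -h <rds-endpoint> -p 5432 -U <user> -d <database> -f prepare_cico_data.sql
```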
Second step
- Run the command from the scripts folder uploaded to the S3 bucket: ./circ-data-load.sh psql_rcp1.conf [tenant] - replace [tenant] with the tenant Id and change the parameters in the psql_rcp1.conf file to valid data.
- Troubleshooting:
- If the command is executed from a local machine you may encounter an error about the query being too long. To solve it, use PGAdmin to run the 2 long queries UPDATE ${TENANT}_mod_inventory_storage.item SET jsonb = jsonb_set(jsonb, '{status, name}', '\"Checked out\"') WHERE id IN.
- Another possible issue is incorrect encoding (on a Windows machine). To solve it, just add ENCODING 'UTF8'.
- Use the pattern: copy ${TENANT}_mod_circulation_storage.loan(id, jsonb) FROM '${LOANS}' DELIMITER E'\t' ENCODING 'UTF8'
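A sketch of running that pattern non-interactively, assuming the script uses psql's client-side \copy rather than server-side COPY (host, database, and user are placeholders):

```bash
# Load prepared loans from a tab-separated file with explicit UTF-8 encoding.
PGPASSWORD='[PASSWORD]' psql -h <rds-endpoint> -U <user> -d <database> \
  -c "\copy ${TENANT}_mod_circulation_storage.loan(id, jsonb) FROM '${LOANS}' DELIMITER E'\t' ENCODING 'UTF8'"
```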
Use the .jmx script file for the Ramsons release. If any changes were made, upload the artefacts to the S3 bucket and to the AWS load generator instance.
To start a test from the AWS instance (load generator), use the command templates below. Test locally before starting.
8 vUsers - nohup jmeter -n -t /home/ptf/testdata/RCON/PERF-983_CICO/circulation_checkInCheckOut_rcon.jmx -l rcon_8vusers.jtl -e -o /home/ptf/testdata/RCON/PERF-983_CICO/results/8vusers -JGlobal_duration=2700 -JVUSERS=8 -JRAMP_UP=80
20 vUsers - nohup jmeter -n -t /home/ptf/testdata/RCON/PERF-983_CICO/circulation_checkInCheckOut_rcon.jmx -l rcon_20vusers.jtl -e -o /home/ptf/testdata/RCON/PERF-983_CICO/results/20vusers -JGlobal_duration=2700 -JVUSERS=20 -JRAMP_UP=200
30 vUsers - nohup jmeter -n -t /home/ptf/testdata/RCON/PERF-983_CICO/circulation_checkInCheckOut_rcon.jmx -l rcon_30vusers.jtl -e -o /home/ptf/testdata/RCON/PERF-983_CICO/results/30vusers -JGlobal_duration=2700 -JVUSERS=30 -JRAMP_UP=300
75 vUsers - nohup jmeter -n -t /home/ptf/testdata/RCON/PERF-983_CICO/circulation_checkInCheckOut_rcon.jmx -l rcon_75vusers.jtl -e -o /home/ptf/testdata/RCON/PERF-983_CICO/results/75vusers -JGlobal_duration=2700 -JVUSERS=75 -JRAMP_UP=750
30 vUsers (longevity) - nohup jmeter -n -t /home/ptf/testdata/RCON/PERF-983_CICO/circulation_checkInCheckOut_rcon.jmx -l rcon_30vusers_long.jtl -e -o /home/ptf/testdata/RCON/PERF-983_CICO/results/30vusers_long -JGlobal_duration=86400 -JVUSERS=30 -JRAMP_UP=300
Test CI/CO with 8, 20, 30, 75 concurrent users for 45 minutes each.
Test CI/CO with 30 users for 24 hours to detect any trends in memory.
To create widgets in the AWS dashboard to monitor and collect parameters of the CI/CO-related modules (service CPU and Memory), use JSON definitions like the sketch below:
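The original JSON attachments are not included here; a minimal sketch of one CloudWatch dashboard widget, assuming the ECS cluster is named rcon and using a placeholder region and service:

```json
{
  "type": "metric",
  "width": 12,
  "height": 6,
  "properties": {
    "metrics": [
      [ "AWS/ECS", "CPUUtilization", "ClusterName", "rcon", "ServiceName", "mod-circulation" ],
      [ "AWS/ECS", "MemoryUtilization", "ClusterName", "rcon", "ServiceName", "mod-circulation" ]
    ],
    "view": "timeSeries",
    "stat": "Average",
    "period": 60,
    "region": "us-east-1",
    "title": "mod-circulation CPU / Memory"
  }
}
```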
File with raw data
Use the file to get the raw data and comparison tables, and to determine the response times of requests that take longer than 100 milliseconds.