Master Script normal load test - NLA report
Overview
As part of PERF-524, tests were run to answer the following questions:
- Can the current system accommodate an average load? The load model is described in "NLA load model investigation and creation".
- If not, see PERF-525.
- What happens at peak times when all workflows are running at once?
- Typical KPIs:
  - Service CPU
  - Service Memory
  - DB CPU
  - DB Memory
  - Response times
  - Duration of long workflows
- Recommendations to improve on scaling up/out modules to accommodate peak times
Summary
- The current system can accommodate an average load only without Data Import (DI). With Data Import running, several workflows fail on source-storage/records/{id}/formatted requests with 'HTTP 500 Internal Server Error. If the issue persists, please report it to EBSCO Connect.' (PERF-582), and response times for all other workflows are up to 2 times longer.
- For PERF-525, a set of tests was performed with different parameters for the DI modules. The results in the table show that with mod-source-record-manager configured with 500 database connections and mod-source-record-storage configured with 500 database connections and a 60-sec timeout, the "HTTP 500 Internal Server Error." issue was not observed, but the response time for requests such as GET /source-storage/records/id/formatted increased greatly during DI (up to 75 sec).
- Increasing the number of database connections to DB_MAXPOOLSIZE = 200 for mod-source-record-manager and mod-source-record-storage did not eliminate the 'HTTP 500 Internal Server Error. If the issue persists, please report it to EBSCO Connect.' responses, but the duration of the Data Import job decreased by a factor of 2.
- Service CPU utilization did not exceed 36%, even at the start of the test when all processes began together. Excluding the spikes at the start of the test and at the start of DI jobs, average CPU usage was about 15%. Instance CPU utilization did not exceed 22%.
- Service memory utilization was stable, and no memory leaks were suspected during tests.
- Each FYR (Fiscal close - end of FY rollover) job consumes a lot of DB CPU (each spike corresponds to one FYR job). DB CPU usage reached approximately 92%.
Recommendations & Jiras (Optional)
Jiras
- PERF-582
Test Runs & Results
Test # | Configuration | Test duration | Comments |
---|---|---|---|
1 | All workflows started at the same time | 1 hour | HTTP 500 errors during the Data Import (DI) process |
2 | Data import and data export started with a 15 min delay | 1 hour | HTTP 500 errors during the Data Import process |
3 | Data import and data export started with a 1 min delay | 1 hour | HTTP 500 errors during the Data Import process |
4 | Data import started with a 1 min delay and data export started with a 20 min delay | 35 min | HTTP 500 errors during the Data Import process |
5 | Same as Test #4, with DB_MAXPOOLSIZE increased to 200 for mod_srs and mod_srm in the Jenkins configuration | 30 min | HTTP 500 errors during the Data Import process. Data import duration decreased 2 times. |
6 | Test without Data import | 1 hour | No errors |
Test results from 1st test run (1st, 2nd and 3rd test run results are similar):
Test # | Workflow name | Total time to complete workflow: avg with DI (sec) | Total time to complete workflow: 95th pct with DI (sec) | Total time to complete workflow: avg no DI (sec) | Total time to complete workflow: 95th pct no DI (sec) | After environment improvement, no server errors observed (23-06-2023)***: with DI, avg | After environment improvement (23-06-2023)***: without DI | Time-consuming requests for each workflow during DI with the default configuration, finished with response body: HTTP 500 Internal Server Error. If the issue persists, please report it to EBSCO Connect. |
---|---|---|---|---|---|---|---|---|
1 | CICO_Checkin | 1.238 | 1.506 | 1.054 | 1.591 | 1.188 | 1.089 | |
2 | CICO_Checkout | 2.156 | 2.829 | 1.650 | 1.948 | 2.040 | 1.933 | |
3 | IO_View invoices | 0.907 | 1.305 | 0.763 | 0.913 | 1.178 | 1.015 | |
4 | IO_Create invoices | 1.433 | 1.815 | 1.174 | 1.370 | 1.689 | 1.795 | |
5 | IO_Edit invoices | 1.983 | 2.422 | 1.581 | 1.897 | 2.276 | 1.015 | |
6 | IO_Delete invoices | 1.070 | 1.196 | 0.804 | 0.927 | 1.248 | 1.175 | |
7 | AIE_Approving Invoices | 1.752 | 2.211 | 1.453 | 1.940 | 3.208 | 2.740 | |
8 | VAR_View Authority records | 22.037 | 30.604 | 0.289 | 0.381 | 49.268 | 0.295 | VAR_GET /source-storage/records/marc_id/formatted |
9 | VTT_View MARC tag table | 41.272 | 61.935 | 0.987 | 1.284 | 75.006 | 1.052 | VTT_GET source-storage/records/{id}/formatted *2 |
10 | VH_View holdings records | 27.328 | 33.579 | 1.526 | 1.922 | 34.645 | 1.291 | VH_GET source-storage/records/{id}/formatted |
11 | VB_View Bib | 22.851 | 31.634 | 0.841 | 1.168 | 42.175 | 0.900 | VB_GET source-storage/records/{id}/formatted |
12 | PRO_View patron records | 0.672 | 1.118 | 0.566 | 0.883 | 0.359 | 0.638 | |
13 | PRO_Delete patron records | 0.892 | 1.336 | 0.638 | 1.070 | 0.728 | 0.763 | |
14 | PRO_Update patron records | 1.386 | 2.097 | 1.043 | 1.625 | 1.166 | 1.157 | |
15 | PRO_Create patron records | 1.547 | 1.979 | 1.098 | 1.261 | 1.571 | 1.286 | |
16 | LO_View Ledger | 0.122 | 0.458 | 0.050 | 0.088 | 0.125 | 0.047 | |
17 | LO_Create ledger | 0.684 | 0.840 | 0.616 | 0.761 | 0.542 | 0.629 | |
18 | LO_Edit ledger | 0.076 | 0.094 | 0.054 | 0.085 | 0.698 | 0.047 | |
19 | LO_Delete a ledger | 0.080 | 0.129 | 0.046 | 0.080 | 0.073 | 0.050 | |
20 | DE_Export bib "Default instances export job profile" | 11 sec (5000 records) | - | 5 sec (5000 records) | - | 6 sec (5000 records) | 5 sec (5000 records) | |
21 | DE_Export holdings "Default holdings export job profile" | 3 min 16 sec (5000 records) | - | 26 sec (5000 records) | - | 28 sec (5000 records) | 27 sec (5000 records) | |
22 | DE_Export authority records "Default authority export job profile" | 8 sec (5000 records) | - | 3 sec (5000 records) | - | 3 sec (5000 records) | 3 sec (5000 records) | |
23 | DI "DISC HRID match" | 1 sec (1 record) | - | - | - | 1 sec | - | |
24 | DI "DS LA edeposit records update" | 17 min 5 sec | - | - | - | 3 min 9 sec | - | |
25 | DI "DISC New edeposit records" | - | - | - | - | 13 sec | - | |
26 | DI "DISC New NON edeposit records" | 4 sec (5 records) | - | - | - | 3 sec | - | |
27 | IRO_View item records | 24.771 | 32.327 | 1.289 | 1.649 | 37.804 | 1.449 | IRO_GET source-storage/records/{id}/formatted |
28 | IRO_update item records | 17.226 | 33.950 | 0.998 | 1.250 | - | 1.033 | IRO_GET source-storage/records/{id}/formatted |
29 | IRO_delete item records | 23.686 | 31.497 | 0.927 | 1.099 | 43.122 | 1.042 | IRO_GET source-storage/records/{id}/formatted |
30 | MPS_Monitoring Pick Slips and Requests GET /circulation/requests | 0.434 | 0.527 | 0.359 | 0.480 | 0.324 | 0.384 | |
31 | MPS_Monitoring Pick Slips and Requests GET /circulation/pick-slips/ | 0.106 | 0.312 | 0.112 | 0.256 | 0.086 | 0.116 | |
32 | MPS_Monitoring Pick Slips and Requests | 0.297 | 0.297 | 0.303 | 0.303 | 0.136 | 0.184 | |
33 | ULR_Users loan renewal | 1.899 | 2.395 | 1.467 | 1.661 | 1.939 | 1.564 | |
34 | ILR_Item-level requests | 0.835 | 1.267 | 0.669 | 0.973 | 0.722 | 0.717 | |
35 | VRO_View vendor records | 1.315 | 2.305 | 0.713 | 1.165 | 0.379 | 0.773 | |
36 | VRO_Edit vendor records | 7.429 | 9.980 | 5.199 | 6.190 | 8.109 | 6.036 | |
37 | VRO_Create vendor records | 1.436 | 1.829 | 1.064 | 1.200 | 1.579 | 1.189 | |
38 | VRO_Delete vendor records | 0.635 | 0.885 | 0.412 | 0.522 | 0.266 | 0.458 | |
39 | POO_Create purchase orders | 1.974 | 2.393 | 1.625 | 1.733 | 1.726 | 1.840 | |
40 | POO_View purchase orders | 1.502 | 1.603 | 1.205 | 1.435 | 0.809 | 1.006 | |
41 | POO_Edit purchase orders | 3.118 | 3.914 | 2.076 | 2.984 | 1.095 | 1.829 | |
42 | POO_Delete purchase orders | 2.036 | 3.387 | 1.432 | 1.830 | 0.779 | 1.110 | |
43 | RIH_Retrieving instances and holdings | 19.737 | 30.084 | 0.035 | 0.073 | 12.515 | 0.039 | RIH_GET source-storage/source-records |
44 | ETT_Edit MARC tag table | 118.391 | 128.508 | 3.424 | 4.257 | 87.175 | 3.599 | ETT_GET /source-storage/records/instance_id/formatted ETT_GET /records-editor/records |
45 | FYR_Fiscal close - end of FY rollover | 10 min 30 sec | - | 11 min | - | - | 13 min | |
46 | VIR_Blacklight: View an inventory record JMeter script | 1.030 | 1.602 | 0.821 | 1.042 | 0.934 | 0.839 | |
47 | BLS_Blacklight: Create a Request JMeter script | 1.336 | 1.605 | 1.122 | 1.404 | 1.139 | 1.156 | |
48 | PRV_Blacklight: Create a View Patron record JMeter script | 0.106 | 0.136 | 0.073 | 0.110 | 0.078 | 0.073 | |
49 | VIH_View instance holdings details | 21.241 | 31.698 | 1.456 | 1.572 | 47.835 | 1.535 | VIH_GET /source-storage/records/instanceId/formatted |
* Workflows whose response times or durations are marked in red are those that are at least 2 times higher with Data Import jobs running than without them.
*** Test run results with the configuration of mod-source-record-manager with 500 database connections count and mod-source-record-storage with 500 database connections count and 60-sec timeout.
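The "at least 2 times higher" comparison can be reproduced from the table. The following is a minimal sketch in Python that uses a handful of average response times copied from the first test run; the dictionary, the function names and the 2.0 threshold parameter are illustrative assumptions, not part of the test tooling.

```python
# Average response times in seconds, copied from the first test run table:
# workflow -> (avg with DI, avg without DI). Only a sample of rows is included.
AVG_RESPONSE_SEC = {
    "CICO_Checkin": (1.238, 1.054),
    "IO_View invoices": (0.907, 0.763),
    "VAR_View Authority records": (22.037, 0.289),
    "VTT_View MARC tag table": (41.272, 0.987),
    "LO_View Ledger": (0.122, 0.050),
    "VRO_Edit vendor records": (7.429, 5.199),
    "ETT_Edit MARC tag table": (118.391, 3.424),
}


def slowdown_factor(with_di: float, without_di: float) -> float:
    """How many times slower a workflow is while Data Import is running."""
    return with_di / without_di


def degraded_workflows(times: dict, threshold: float = 2.0) -> dict:
    """Return workflows whose with-DI average is at least `threshold` times
    the no-DI average, mapped to their slowdown factor."""
    result = {}
    for name, (with_di, without_di) in times.items():
        factor = slowdown_factor(with_di, without_di)
        if factor >= threshold:
            result[name] = round(factor, 1)
    return result


if __name__ == "__main__":
    for name, factor in degraded_workflows(AVG_RESPONSE_SEC).items():
        print(f"{name}: {factor}x slower with DI")
```

For the sample rows above, the flagged workflows are VAR_View Authority records, VTT_View MARC tag table, LO_View Ledger and ETT_Edit MARC tag table; the check-in, invoice and vendor workflows stay below the threshold.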
Throughput graphs
The graphs for Test #1, Test #3 and Test #4 are almost the same.
Memory Utilization
Memory utilization of the most memory-consuming modules & DI modules:
- mod-inventory - 98%
- mod-orders - 73%
- mod-source-record-manager - 44%
- mod-source-record-storage - 23.5%
- mod-inventory-storage - 31%
- data-import - 22%
- mod-di-converter-storage - 31%
This graph represents the memory usage during the first 3 test runs and shows that no memory leak is suspected for any of the modules.
Service CPU Utilization
CPU usage did not exceed 36% for any module. Spikes in the CPU usage of the DI modules are visible at the beginning of the Data Import jobs. Excluding the DI spikes, average CPU usage was about 15%.
Test# 1 - Test# 4 DI duration - 17 min.
Test# 5 DI duration - 8 min.
Most CPU-consuming modules:
- mod-inventory - 36%
- mod-authtoken - 25%
- mod-data-import - 16%
- mod-di-converter-storage - 15.6%
- mod-quick-marc - 15%
- mod-finance-storage - 14%
- nginx-okapi - 14%
- okapi - 11%
- mod-tags - 10.6%
- others - usage less than 10%
  - mod-source-record-manager - up to 7%
  - mod-source-record-storage - up to 4%
Instance CPU Utilization
RDS CPU Utilization
Each FYR (Fiscal close - end of FY rollover) job consumes a lot of DB CPU (each spike here corresponds to one FYR job).
DB CPU usage reaches approximately 92%.
RDS Database Connections
Test #1 - Test #4, during the part of the test with the DI job: 620 connections.
Test #5, during the part of the test with the DI job: 820 connections (an additional 200 connections were allocated for the DI modules).
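To put these connection counts in context, they can be compared with the max_connections value of the db.r6g.xlarge instance listed in the Infrastructure appendix (2731). The sketch below is illustrative only: the constant and function names are assumptions, and it assumes all observed connections are counted against a single database instance.

```python
# max_connections for db.r6g.xlarge, taken from the Infrastructure appendix.
MAX_CONNECTIONS = 2731

# Observed connection counts during the DI part of the tests.
BASELINE_CONNECTIONS = 620   # Test #1 - Test #4
INCREASED_CONNECTIONS = 820  # Test #5 (larger pools for the DI modules)


def connection_usage(observed: int, max_connections: int = MAX_CONNECTIONS) -> float:
    """Fraction of the instance's max_connections that is in use."""
    return observed / max_connections


if __name__ == "__main__":
    extra = INCREASED_CONNECTIONS - BASELINE_CONNECTIONS
    print(f"Additional connections in Test #5: {extra}")
    print(f"Test #1 - Test #4 usage: {connection_usage(BASELINE_CONNECTIONS):.0%}")
    print(f"Test #5 usage: {connection_usage(INCREASED_CONNECTIONS):.0%}")
```

Under that assumption, both configurations stay well below the instance limit (roughly 23% and 30% of max_connections).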
Appendix
Infrastructure
Record counts:
- mod_source_record_storage.marc_records_lb = 7300919
- mod_source_record_storage.raw_records_lb = 7300919
- mod_source_record_storage.records_lb = 7300919
- mod_source_record_storage.marc_indexers = 245032159 (all records)
- mod_source_record_storage.marc_indexers with field_no 010 = 1008129
- mod_source_record_storage.marc_indexers with field_no 035 = 8968420
- mod_inventory_storage.authority = 852215
- mod_inventory_storage.holdings_record = 6091403
- mod_inventory_storage.instance = 5581816
- mod_inventory_storage.item = 5705915
PTF environment ncp3
- 9 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 2 database instances, one reader and one writer

Name | API Name | Memory GiB | vCPUs | max_connections |
---|---|---|---|---|
R6G Extra Large | db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731 |

- MSK ptf-kakfa-3
  - 4 m5.2xlarge brokers in 2 zones
  - Apache Kafka version 2.8.0
  - EBS storage volume per broker 300 GiB
  - auto.create.topics.enable=true
  - log.retention.minutes=480
  - default.replication.factor=3
- Kafka topics partitioning: 2 partitions for DI topics
Modules memory and CPU parameters
Modules | Version | Task Definition | Running Tasks | CPU | Memory | MemoryReservation | MaxMetaspaceSize | Xmx |
---|---|---|---|---|---|---|---|---|
mod-inventory-storage | 26.0.0 | 1 | 2 | 1024 | 2208 | 1952 | 384 | 1440 |
mod-inventory | 20.0.4 | 1 | 2 | 1024 | 2880 | 2592 | 512 | 1814 |
mod-tags | 2.0.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-gobi | 2.6.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 700 |
mod-remote-storage | 2.0.2 | 1 | 2 | 1024 | 4920 | 4472 | 512 | 3960 |
mod-invoice-storage | 5.6.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 700 |
edge-sip2 | 3.0.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-users-bl | 7.5.0 | 1 | 2 | 512 | 1440 | 1152 | 128 | 922 |
edge-rtac | 2.6.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-feesfines | 18.2.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-rtac | 3.5.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-erm-usage-harvester | 4.3.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-search | 2.0.1 | 1 | 2 | 400 | 2592 | 2480 | 1024 | 1440 |
mod-service-interaction | 2.2.2 | 1 | 2 | 256 | 2048 | 1844 | 512 | 1290 |
edge-ncip | 1.8.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-authtoken | 2.13.0 | 1 | 2 | 512 | 1440 | 1152 | 128 | 922 |
mod-permissions | 6.3.1 | 2 | 2 | 512 | 1684 | 1544 | 512 | 1024 |
mod-circulation-storage | 16.0.0 | 1 | 2 | 1024 | 1536 | 1440 | 512 | 896 |
mod-ncip | 1.13.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-pubsub | 2.9.1 | 1 | 2 | 1024 | 1536 | 1440 | 512 | 922 |
edge-orders | 2.8.1 | 1 | 2 | 1024 | 1536 | 1440 | 512 | 922 |
mod-circulation | 23.5.4 | 1 | 2 | 1536 | 2880 | 2592 | 128 | 700 |
edge-caiasoft | 2.0.0 | 1 | 2 | 128 | 1024 | 896 | - | - |
mod-data-export | 4.7.1 | 1 | 1 | 1024 | 1024 | 896 | 128 | 768 |
mod-organizations-storage | 4.5.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 700 |
mod-source-record-storage | 5.6.5 | 1 | 2 | 2048 | 5600 | 5000 | 512 | 3600 |
mod-copycat | 1.4.0 | 1 | 2 | 896 | 1024 | 896 | 128 | 768 |
mod-bulk-operations | 1.0.5 | 1 | 2 | 1024 | 3072 | 2600 | 512 | 1536 |
mod-quick-marc | 3.0.0 | 1 | 1 | 128 | 2288 | 2176 | 512 | 1664 |
mod-audit | 2.7.0 | 1 | 2 | 1024 | 1024 | 896 | 128 | 768 |
mod-oai-pmh | 3.11.3 | 1 | 2 | 1024 | 2248 | 2000 | 512 | 1440 |
edge-connexion | 1.0.6 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-kb-ebsco-java | 3.13.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-patron | 5.5.2 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-email | 1.15.3 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-password-validator | 3.0.0 | 1 | 2 | 128 | 1440 | 1298 | 512 | 768 |
mod-login | 7.9.0 | 1 | 2 | 1024 | 1440 | 1298 | 512 | 768 |
mod-data-export-worker | 3.0.12 | 1 | 2 | 1024 | 3072 | 2600 | 512 | 2048 |
mod-agreements | 5.5.2 | 1 | 2 | 128 | 3096 | 2580 | 512 | 2048 |
edge-oai-pmh | 2.6.1 | 1 | 2 | 1024 | 1512 | 1360 | 512 | 1440 |
mod-eusage-reports | 1.3.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-orders-storage | 13.5.0 | 1 | 2 | 512 | 1024 | 896 | 128 | 700 |
mod-notify | 3.0.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-source-record-manager | 3.6.2 | 1 | 2 | 2048 | 5600 | 5000 | 512 | 3600 |
mod-di-converter-storage | 2.0.2 | 2 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-template-engine | 1.18.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-user-import | 3.7.2 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-finance-storage | 8.4.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 700 |
mod-users | 19.1.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-sender | 1.10.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-graphql | 1.11.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-licenses | 4.3.1 | 1 | 2 | 128 | 2480 | 2312 | 512 | 1792 |
mod-invoice-b | 5.6.2 | 1 | 2 | 512 | 1440 | 1152 | 128 | 922 |
mod-event-config | 2.5.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-calendar | 2.4.2 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-erm-usage | 4.5.2 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-patron-blocks | 1.8.0 | 1 | 2 | 1024 | 1024 | 896 | 128 | 768 |
mod-data-import | 2.7.1 | 1 | 1 | 256 | 2048 | 1844 | 512 | 1292 |
mod-ebsconet | 2.0.0 | 1 | 2 | 128 | 1248 | 1024 | 256 | 700 |
edge-dematic | 2.0.0 | 1 | 2 | 128 | 1024 | 896 | - | - |
mod-task-list | 5.0.1 | 1 | 1 | 128 | 1024 | 896 | 128 | 768 |
mod-courses | 1.4.7 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-inventory-update | 3.0.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-login-saml | 2.6.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-orders | 12.6.6 | 1 | 2 | 1024 | 2048 | 1440 (Recommended to change to 1544) | 512 | 1024 |
mod-configuration | 5.9.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-organizations | 1.7.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 700 |
mod-notes | 5.0.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 322 |
mod-finance | 4.7.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 700 |
mod-data-export-spring | 2.0.1 | 1 | 1 | 256 | 2048 | 1844 | 256 | 1292 |
edge-patron | 4.11.0 | 1 | 2 | 256 | 1024 | 896 | 128 | 768 |
okapi | 5.0.1 | 2 | 3 | 1024 | 1684 | 1440 | 512 | 922 |
nginx-okapi | 2022.03.02 | 1 | 2 | 128 | 1024 | 896 | - | - |
pub-okapi | 2022.03.02 | 1 | 2 | 128 | 1024 | 896 | - | 768 |
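The memory columns in the table above can be cross-checked with a rough rule of thumb. The sketch below assumes that a module's heap limit (Xmx) plus MaxMetaspaceSize should fit under the container's hard limit (the Memory column); this is a simplification, since a JVM also uses other off-heap memory, and the class and function names are hypothetical. Only a sample of rows from the table is included, with values in the units used in the table.

```python
from typing import List, NamedTuple


class ModuleMemory(NamedTuple):
    name: str
    memory: int         # "Memory" column (container hard limit)
    max_metaspace: int  # "MaxMetaspaceSize" column
    xmx: int            # "Xmx" column (maximum heap size)


# Sample rows copied from the modules table.
MODULES = [
    ModuleMemory("mod-inventory", 2880, 512, 1814),
    ModuleMemory("mod-source-record-storage", 5600, 512, 3600),
    ModuleMemory("mod-source-record-manager", 5600, 512, 3600),
    ModuleMemory("mod-data-import", 2048, 512, 1292),
    ModuleMemory("mod-orders", 2048, 512, 1024),
    ModuleMemory("edge-oai-pmh", 1512, 512, 1440),
]


def jvm_headroom(module: ModuleMemory) -> int:
    """Memory left under the hard limit after heap and metaspace maximums."""
    return module.memory - (module.xmx + module.max_metaspace)


def over_limit(modules: List[ModuleMemory]) -> List[str]:
    """Names of modules whose Xmx + MaxMetaspaceSize exceeds the hard limit."""
    return [m.name for m in modules if jvm_headroom(m) < 0]


if __name__ == "__main__":
    for m in MODULES:
        print(f"{m.name}: headroom {jvm_headroom(m)}")
    print("Over limit:", over_limit(MODULES))
```

Among the sampled rows, only edge-oai-pmh has Xmx plus MaxMetaspaceSize above its Memory value under this simplified assumption; whether that matters in practice depends on the module's actual memory usage, which this report did not measure for that module.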
Methodology/Approach
JMeter scripts were used to test the baseline for normal NLA library usage.
Tested with different DI delays:
- From test start
- 1 min delay
- 20 min delay
- without DI
Data was gathered for 2 periods: with and without data import.
- DI - data import
- FYR - Fiscal close - end of FY rollover