IN PROGRESS
Overview
In the scope of PERF-531, tests need to be run to answer the following questions:
- Can the current system accommodate a high use case load?
- If not PERF-525
- What happens at peak times when all workflows are running at once?
- Typical KPIs:
- Service CPU
- Service Memory
- DB CPU
- DB Memory
- Response times
- Durations of long workflows
- Recommendations to improve on scaling up/out modules to accommodate peak times
Summary
Tested without Data Import due to PERF-582.
- The current system can accommodate an average load only without Data Import. With Data Import, several workflows fail with 'HTTP 500 Internal Server Error. If the issue persists, please report it to EBSCO Connect.' on source-storage/records/{id}/formatted requests (PERF-582), and the general response time for all other workflows is up to 2 times longer.
- Service CPU utilization did not exceed 36%, even at the beginning of the test when all processes started together. Excluding the spikes at the beginning of the test and at the start of DI jobs, average CPU usage was about 15%. Instance CPU utilization did not exceed 22%.
- Service memory utilization was stable, and no memory leaks were suspected during tests.
- Each FYR (Fiscal close - end of FY rollover) job consumes a lot of DB CPU (each spike in the RDS CPU Utilization graph corresponds to one FYR job). DB CPU usage reached approximately 92%.
- Increasing the number of database connections to DB_MAXPOOLSIZE = 200 for mod-source-record-manager and mod-source-record-storage did not reduce the 'HTTP 500 Internal Server Error. If the issue persists, please report it to EBSCO Connect.' errors, but the duration of the Data Import job decreased by a factor of two.
Recommendations & Jiras (Optional)
Jiras
- PERF-582
Test Runs & Results
Test # | Configuration | Test duration | Comments |
---|---|---|---|
1 | All workflows started at the same time. Test without Data import | 1 hour | NO server errors |
2 | All workflows started at the same time. Test without Data import | 1 hour | NO server errors |
3 | All workflows started at the same time. Test without Data import | 1 hour | NO server errors |
Test results from the 1st test run (results of the 1st, 2nd and 3rd test runs are similar). Values are the total time it takes to complete each workflow:
Test # | Workflow name | Avg no DI (sec) high load | 95th pct no DI (sec) high load | Avg no DI (sec) normal load | 95th pct no DI (sec) normal load | Avg diff |
---|---|---|---|---|---|---|
1 | Checkin | 1.054 | 1.591 | |||
2 | Checkout | 1.650 | 1.948 | |||
3 | View invoices | 0.763 | 0.913 | |||
4 | Create invoices | 1.174 | 1.370 | |||
5 | Edit invoices | 1.581 | 1.897 | |||
6 | Delete invoices | 0.804 | 0.927 | |||
7 | Approving Invoices | 1.453 | 1.940 | |||
8 | View Authority records | 0.289 | 0.381 | |||
9 | View MARC tag table | 0.987 | 1.284 | |||
10 | View holdings records | 1.526 | 1.922 | |||
11 | View Bib | 0.841 | 1.168 | |||
12 | View patron records | 0.566 | 0.883 | |||
13 | Delete patron records | 0.638 | 1.070 | |||
14 | Update patron records | 1.043 | 1.625 | |||
15 | Create patron records | 1.098 | 1.261 | |||
16 | View Ledger | 0.050 | 0.088 | |||
17 | Create ledger | 0.616 | 0.761 | |||
18 | Edit ledger | 0.054 | 0.085 | |||
19 | Delete a ledger | 0.046 | 0.080 | |||
20 | Export bib "Default instances export job profile" | - | 5 sec (5000 records) | - | ||
21 | Export holdings "Default holdings export job profile" | - | 26 sec (5000 records) | - | ||
22 | Export authority records "Default authority export job profile" | - | 3 sec (5000 records) | - | ||
23 | DI "DISC HRID match" | - | - | - | - | |
24 | DI "DS LA edeposit records update" | - | - | - | - | |
25 | DI "DISC New edeposit records" | - | - | - | - | |
26 | DI "DISC New NON edeposit records" | - | - | - | - | |
27 | View item records | 1.289 | 1.649 | |||
28 | Update item records | 0.998 | 1.250 | |||
29 | Delete item records | 0.927 | 1.099 | |||
30 | Monitoring Pick Slips and Requests GET /circulation/requests | 0.359 | 0.480 | |||
31 | Monitoring Pick Slips and Requests GET /circulation/pick-slips/ | 0.112 | 0.256 | |||
32 | Monitoring Pick Slips and Requests | 0.303 | 0.303 | |||
33 | Users loan renewal | 1.467 | 1.661 | |||
34 | Item-level requests | 0.669 | 0.973 | |||
35 | View vendor records | 0.713 | 1.165 | |||
36 | Edit vendor records | 5.199 | 6.190 | |||
37 | Create vendor records | 1.064 | 1.200 | |||
38 | Delete vendor records | 0.412 | 0.522 | |||
39 | Create purchase orders | 1.625 | 1.733 | |||
40 | View purchase orders | 1.205 | 1.435 | |||
41 | Edit purchase orders | 2.076 | 2.984 | |||
42 | Delete purchase orders | 1.432 | 1.830 | |||
43 | Retrieving instances and holdings | 0.035 | 0.073 | |||
44 | Edit MARC tag table | 3.424 | 4.257 | |||
45 | Fiscal close - end of FY rollover | - | 11 min | - | ||
46 | Blacklight: View an inventory record JMeter script | 0.821 | 1.042 | |||
47 | Blacklight: Create a Request JMeter script | 1.122 | 1.404 | |||
48 | Blacklight: Create a View Patron record JMeter script | 0.073 | 0.110 |
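The Avg and 95th pct columns above can be reproduced from raw per-request durations. The following is a minimal sketch, assuming the nearest-rank definition of a percentile (the report does not say which definition the tooling used). The function names and the sample durations are hypothetical and are not taken from the test runs.

```python
import math
from statistics import mean


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # 1-based rank of the smallest value that covers pct percent of the samples.
    rank = max(math.ceil(pct * len(ordered) / 100), 1)
    return ordered[rank - 1]


def summarize(samples):
    """Return the average and 95th percentile of a list of durations in seconds."""
    return {
        "avg": round(mean(samples), 3),
        "p95": percentile_nearest_rank(samples, 95),
    }


# Made-up durations in seconds, for illustration only (NOT measured values).
example_durations = [0.9, 1.0, 1.1, 1.0, 1.2, 0.8, 1.6, 1.0, 1.1, 0.9]
summary = summarize(example_durations)
print(summary)
```

Other percentile definitions (for example, linear interpolation) can give slightly different values for small sample sets, so figures computed this way may not match a given reporting tool exactly.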
Throughput graphs
The graphs for Test #1, Test #2 and Test #3 are almost the same.
Memory Utilization
This graph shows the memory usage during the first 3 test runs. No memory leak is suspected for any of the modules.
Service CPU Utilization
CPU usage did not exceed 36% for any module. Spikes in CPU usage of the DI modules are visible at the beginning of the Data Import jobs. Excluding the DI spikes, average CPU usage was about 15%.
Most CPU-consuming modules:
- mod-inventory - 36%
- mod-authtoken - 25%
- mod-data-import - 16%
- mod-di-converter-storage - 15.6%
- mod-quick-marc - 15%
- mod-finance-storage - 14%
- nginx-okapi - 14%
- okapi - 11%
- mod-tags - 10.6%
- others - usage less than 10%
- mod-source-record-manager - up to 7%
- mod-source-record-storage - up to 4%
Instance CPU Utilization
RDS CPU Utilization
Each FYR (Fiscal close - end of FY rollover) job consumes a lot of DB CPU (each spike here corresponds to one FYR job).
DB CPU usage reached approximately 92%.
RDS Database Connections
Test #1 to Test #3: 420 connections.
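To put the connection count in context, it can be compared with the max_connections value of 2731 listed for db.r6g.xlarge in the Infrastructure section. The snippet below is a small arithmetic sketch. It assumes that the 420 connections were counted against a single instance, which the report does not state, and the function name is hypothetical.

```python
def connection_usage_pct(used, max_connections):
    """Return used connections as a percentage of the configured limit."""
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    return round(100 * used / max_connections, 1)


observed_connections = 420   # from the RDS Database Connections section
max_connections = 2731       # db.r6g.xlarge limit from the Infrastructure section

usage = connection_usage_pct(observed_connections, max_connections)
print(f"{usage}% of max_connections in use")  # 15.4% of max_connections in use
```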
Appendix
Infrastructure
PTF environment: ncp3
- 9 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 2 database instances: one reader and one writer
Name | API Name | Memory GiB | vCPUs | max_connections |
---|---|---|---|---|
R6G Extra Large | db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731 |
- MSK ptf-kakfa-3
- 4 m5.2xlarge brokers in 2 zones
- Apache Kafka version 2.8.0
- EBS storage volume per broker: 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3
- Kafka topics partitioning: 2 partitions for DI topics
Modules memory and CPU parameters
Modules | Version | Task Definition | Running Tasks | CPU | Memory | MemoryReservation | MaxMetaspaceSize | Xmx |
---|---|---|---|---|---|---|---|---|
mod-inventory-storage | 26.0.0 | 1 | 2 | 1024 | 2208 | 1952 | 384 | 1440 |
mod-inventory | 20.0.4 | 1 | 2 | 1024 | 2880 | 2592 | 512 | 1814 |
mod-tags | 2.0.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-gobi | 2.6.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 700 |
mod-remote-storage | 2.0.2 | 1 | 2 | 1024 | 4920 | 4472 | 512 | 3960 |
mod-invoice-storage | 5.6.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 700 |
edge-sip2 | 3.0.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-users-bl | 7.5.0 | 1 | 2 | 512 | 1440 | 1152 | 128 | 922 |
edge-rtac | 2.6.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-feesfines | 18.2.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-rtac | 3.5.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-erm-usage-harvester | 4.3.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-search | 2.0.1 | 1 | 2 | 400 | 2592 | 2480 | 1024 | 1440 |
mod-service-interaction | 2.2.2 | 1 | 2 | 256 | 2048 | 1844 | 512 | 1290 |
edge-ncip | 1.8.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-authtoken | 2.13.0 | 1 | 2 | 512 | 1440 | 1152 | 128 | 922 |
mod-permissions | 6.3.1 | 2 | 2 | 512 | 1684 | 1544 | 512 | 1024 |
mod-circulation-storage | 16.0.0 | 1 | 2 | 1024 | 1536 | 1440 | 512 | 896 |
mod-ncip | 1.13.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-pubsub | 2.9.1 | 1 | 2 | 1024 | 1536 | 1440 | 512 | 922 |
edge-orders | 2.8.1 | 1 | 2 | 1024 | 1536 | 1440 | 512 | 922 |
mod-circulation | 23.5.4 | 1 | 2 | 1536 | 2880 | 2592 | 128 | 700 |
edge-caiasoft | 2.0.0 | 1 | 2 | 128 | 1024 | 896 | - | - |
mod-data-export | 4.7.1 | 1 | 1 | 1024 | 1024 | 896 | 128 | 768 |
mod-organizations-storage | 4.5.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 700 |
mod-source-record-storage | 5.6.5 | 1 | 2 | 2048 | 5600 | 5000 | 512 | 3600 |
mod-copycat | 1.4.0 | 1 | 2 | 896 | 1024 | 896 | 128 | 768 |
mod-bulk-operations | 1.0.5 | 1 | 2 | 1024 | 3072 | 2600 | 512 | 1536 |
mod-quick-marc | 3.0.0 | 1 | 1 | 128 | 2288 | 2176 | 512 | 1664 |
mod-audit | 2.7.0 | 1 | 2 | 1024 | 1024 | 896 | 128 | 768 |
mod-oai-pmh | 3.11.3 | 1 | 2 | 1024 | 2248 | 2000 | 512 | 1440 |
edge-connexion | 1.0.6 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-kb-ebsco-java | 3.13.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-patron | 5.5.2 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-email | 1.15.3 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-password-validator | 3.0.0 | 1 | 2 | 128 | 1440 | 1298 | 512 | 768 |
mod-login | 7.9.0 | 1 | 2 | 1024 | 1440 | 1298 | 512 | 768 |
mod-data-export-worker | 3.0.12 | 1 | 2 | 1024 | 3072 | 2600 | 512 | 2048 |
mod-agreements | 5.5.2 | 1 | 2 | 128 | 3096 | 2580 | 512 | 2048 |
edge-oai-pmh | 2.6.1 | 1 | 2 | 1024 | 1512 | 1360 | 512 | 1440 |
mod-eusage-reports | 1.3.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-orders-storage | 13.5.0 | 1 | 2 | 512 | 1024 | 896 | 128 | 700 |
mod-notify | 3.0.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-source-record-manager | 3.6.2 | 1 | 2 | 2048 | 5600 | 5000 | 512 | 3600 |
mod-di-converter-storage | 2.0.2 | 2 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-template-engine | 1.18.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-user-import | 3.7.2 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-finance-storage | 8.4.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 700 |
mod-users | 19.1.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-sender | 1.10.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-graphql | 1.11.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-licenses | 4.3.1 | 1 | 2 | 128 | 2480 | 2312 | 512 | 1792 |
mod-invoice-b | 5.6.2 | 1 | 2 | 512 | 1440 | 1152 | 128 | 922 |
mod-event-config | 2.5.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-calendar | 2.4.2 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-erm-usage | 4.5.2 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-patron-blocks | 1.8.0 | 1 | 2 | 1024 | 1024 | 896 | 128 | 768 |
mod-data-import | 2.7.1 | 1 | 1 | 256 | 2048 | 1844 | 512 | 1292 |
mod-ebsconet | 2.0.0 | 1 | 2 | 128 | 1248 | 1024 | 256 | 700 |
edge-dematic | 2.0.0 | 1 | 2 | 128 | 1024 | 896 | - | - |
mod-task-list | 5.0.1 | 1 | 1 | 128 | 1024 | 896 | 128 | 768 |
mod-courses | 1.4.7 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-inventory-update | 3.0.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-login-saml | 2.6.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-orders | 12.6.6 | 1 | 2 | 1024 | 2048 | 1440 | 512 | 1024 |
mod-configuration | 5.9.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-organizations | 1.7.0 | 1 | 2 | 128 | 1024 | 896 | 128 | 700 |
mod-notes | 5.0.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 322 |
mod-finance | 4.7.1 | 1 | 2 | 128 | 1024 | 896 | 128 | 700 |
mod-data-export-spring | 2.0.1 | 1 | 1 | 256 | 2048 | 1844 | 256 | 1292 |
edge-patron | 4.11.0 | 1 | 2 | 256 | 1024 | 896 | 128 | 768 |
okapi | 5.0.1 | 2 | 3 | 1024 | 1684 | 1440 | 512 | 922 |
nginx-okapi | 2022.03.02 | 1 | 2 | 128 | 1024 | 896 | - | - |
pub-okapi | 2022.03.02 | 1 | 2 | 128 | 1024 | 896 | - | 768 |
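The table above can be sanity-checked by comparing each module's JVM limits with its container memory. The following is a rough sketch only: it assumes Memory, MaxMetaspaceSize and Xmx are expressed in the same unit (the table does not state units), and it ignores other JVM memory areas such as thread stacks and direct buffers. The class and function names are hypothetical. The four rows are copied from the table.

```python
from typing import NamedTuple


class ModuleMemory(NamedTuple):
    name: str
    memory: int          # container hard limit ("Memory" column)
    max_metaspace: int   # "MaxMetaspaceSize" column
    xmx: int             # "Xmx" column (maximum heap)


def jvm_headroom(module):
    """Return container memory left after maximum heap plus maximum metaspace."""
    return module.memory - (module.xmx + module.max_metaspace)


# A few rows copied from the table above.
modules = [
    ModuleMemory("mod-inventory", 2880, 512, 1814),
    ModuleMemory("mod-source-record-storage", 5600, 512, 3600),
    ModuleMemory("edge-oai-pmh", 1512, 512, 1440),
    ModuleMemory("mod-data-import", 2048, 512, 1292),
]

for module in modules:
    print(f"{module.name}: headroom {jvm_headroom(module)}")

# Modules whose configured maxima add up to more than the container limit.
over_limit = [m.name for m in modules if jvm_headroom(m) < 0]
```

A negative headroom means only that the configured maxima exceed the container limit. It does not show that a module actually reached those maxima during the tests.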
Methodology/Approach
JMeter scripts were used to test the baseline for normal NLA library usage.
Tested with different DI delays:
- From test start
- 1 min delay
- 20 min delay
- without DI
Data was gathered from 2 periods: with and without data import.
- DI - data import
- FYR - Fiscal close - end of FY rollover