Member tenants sharing local instances
Overview
This document contains the results of testing Sharing Local Instances (SLI) for MARC source records. https://folio-org.atlassian.net/browse/PERF-755
After the improvement, the fixed version mod-inventory:20.1.7-SNAPSHOT.487 was deployed on the PCON cluster. https://folio-org.atlassian.net/browse/MODINV-950
Summary
The duration of a single SLI process is about the same on all tenants cs00000int_0001-cs00000int_0004, averaging about 2 seconds. With 2 parallel SLIs the duration is about 2.2 seconds, with 3 parallel SLIs about 2.15 seconds, and with 4 parallel SLIs about 2.1 seconds.
No memory leak is suspected for SLI modules. Memory consumption was quite low in comparison to the "before-test" state.
Maximum CPU utilization was about 17%, on mod-inventory and mod-quick-marc.
RDS CPU utilization ranged from about 15% for 1 VU up to 20% for 4 VU.
Recommendations and Jiras
Test SLI in parallel with other workflows
Test results
Test parameters: inventory.sharing.di.status.poll.interval.seconds = 1 and inventory.sharing.di.status.poll.number = 5 (default)
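The two parameters read like the settings of a bounded polling loop: wait a fixed interval, check the Data Import status, and give up after a fixed number of attempts. This report does not describe how mod-inventory actually implements the wait, so the following is only a hypothetical sketch of that general pattern. The function name, the `COMMITTED` status value and the wait-before-check order are all assumptions made for illustration.

```python
import time
from typing import Callable


def wait_for_import(
    check_status: Callable[[], str],
    interval_seconds: float = 1,   # mirrors ...di.status.poll.interval.seconds
    poll_number: int = 5,          # mirrors ...di.status.poll.number
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll check_status() up to poll_number times, sleeping before each check.

    Returns True as soon as the (hypothetical) status "COMMITTED" is seen,
    False if every attempt is used up first.
    """
    for _ in range(poll_number):
        sleep(interval_seconds)
        if check_status() == "COMMITTED":
            return True
    return False


# Demo with a fake status source and a no-op sleep, so nothing really waits.
statuses = iter(["IN_PROGRESS", "COMMITTED"])
finished = wait_for_import(lambda: next(statuses), interval_seconds=1,
                           poll_number=5, sleep=lambda seconds: None)
print("finished:", finished)
```

Under these assumptions the longest possible wait is the interval multiplied by the poll number, so a larger interval raises both the typical and the worst-case duration.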
Test 1. 1 Virtual user working sequentially on each of the 4 tenants.
Tenant | TTL REQ, COUNT | THRGHPT, REQ/SEC | ERRORS, COUNT | MIN, MS | MEDIAN, MS | PCT95, MS | MAX, MS |
|---|---|---|---|---|---|---|---|
cs00000int_0001 | 100 | 0.298 | 4 | 1081 | 2255 | 2611 | 3719 |
cs00000int_0002 | 100 | 0.389 | 0 | 1006 | 2218 | 2579 | 3159 |
cs00000int_0003 | 100 | 0.395 | 1 | 602 | 2076 | 2150 | 15233 |
cs00000int_0004 | 100 | 0.422 | 1 | 554 | 1940 | 2183 | 2989 |
Test 2. 2 Virtual users working in parallel on 2 tenants.
Tenant | TTL REQ, COUNT | THRGHPT, REQ/SEC | ERRORS, COUNT | MIN, MS | MEDIAN, MS | PCT95, MS | MAX, MS |
|---|---|---|---|---|---|---|---|
cs00000int_0001 | 100 | 0.364 | 6 | 1005 | 2466 | 2689 | 16047 |
cs00000int_0002 | 100 | 0.397 | 0 | 1019 | 2212 | 2583 | 2900 |
Test 3. 3 Virtual users working in parallel on 3 tenants.
Tenant | TTL REQ, COUNT | THRGHPT, REQ/SEC | ERRORS, COUNT | MIN, MS | MEDIAN, MS | PCT95, MS | MAX, MS |
|---|---|---|---|---|---|---|---|
cs00000int_0001 | 100 | 0.334 | 4 | 530 | 2280 | 2641 | 3304 |
cs00000int_0002 | 100 | 0.329 | 1 | 1001 | 2383 | 2567 | 15985 |
cs00000int_0003 | 100 | 0.385 | 2 | 520 | 1910 | 2160 | 2955 |
Test 4. 4 Virtual users working in parallel on 4 tenants.
Tenant | TTL REQ, COUNT | THRGHPT, REQ/SEC | ERRORS, COUNT | MIN, MS | MEDIAN, MS | PCT95, MS | MAX, MS |
|---|---|---|---|---|---|---|---|
cs00000int_0001 | 100 | 0.313 | 4 | 1096 | 2357 | 2661 | 3437 |
cs00000int_0002 | 100 | 0.31 | 0 | 973 | 2254 | 2584 | 3237 |
cs00000int_0003 | 100 | 0.357 | 3 | 517 | 1884 | 2117 | 2724 |
cs00000int_0004 | 100 | 0.321 | 1 | 1098 | 2283 | 2916 | 3145 |
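To compare the four tests at a glance, the per-tenant rows can be reduced to one line per test. The sketch below copies the MEDIAN and ERRORS columns from the tables above and computes the mean of the per-tenant medians and the overall error rate. It is a mean of medians, so it can differ slightly from averages computed over the raw samples. The names `RESULTS` and `summarize` are illustrative only.

```python
from statistics import mean

# (median_ms, errors, total_requests) per tenant, copied from the Test 1-4 tables.
RESULTS = {
    "Test 1 (1 VU)": [(2255, 4, 100), (2218, 0, 100), (2076, 1, 100), (1940, 1, 100)],
    "Test 2 (2 VU)": [(2466, 6, 100), (2212, 0, 100)],
    "Test 3 (3 VU)": [(2280, 4, 100), (2383, 1, 100), (1910, 2, 100)],
    "Test 4 (4 VU)": [(2357, 4, 100), (2254, 0, 100), (1884, 3, 100), (2283, 1, 100)],
}


def summarize(rows):
    """Return (mean of per-tenant medians in ms, error rate in percent)."""
    mean_median = mean(row[0] for row in rows)
    errors = sum(row[1] for row in rows)
    total = sum(row[2] for row in rows)
    return mean_median, 100.0 * errors / total


for name, rows in RESULTS.items():
    mean_median, error_rate = summarize(rows)
    print(f"{name}: mean median {mean_median:.0f} ms, error rate {error_rate:.1f}%")
```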
Additional tests were performed to check the duration of SLI with inventory.sharing.di.status.poll.interval.seconds set to 2 and 3, keeping inventory.sharing.di.status.poll.number = 5 (default). The results for interval = 1 are repeated for comparison.
Test 1.a 1 Virtual user working sequentially on each of the 4 tenants.
Mod-inventory parameters: di.status.poll.number=5 in all runs; di.status.poll.interval.seconds=1, 2 or 3 as noted in the column headers.
Tenant | TTL REQ, COUNT | MEDIAN, MS (poll interval 1 s) | PCT95, MS (poll interval 1 s) | MEDIAN, MS (poll interval 2 s) | PCT95, MS (poll interval 2 s) | MEDIAN, MS (poll interval 3 s) | PCT95, MS (poll interval 3 s) |
|---|---|---|---|---|---|---|---|
cs00000int_0001 | 100 | 2255 | 2611 | 3348 | 3873 | 4227 | 4672 |
cs00000int_0002 | 100 | 2218 | 2579 | 3324 | 3784 | 3820 | 4702 |
cs00000int_0003 | 100 | 2076 | 2150 | 2980 | 3184 | 3956 | 4146 |
cs00000int_0004 | 100 | 1940 | 2183 | 2669 | 2951 | 3926 | 4145 |
Test 2.a 2 Virtual users working in parallel on 2 tenants.
Mod-inventory parameters: di.status.poll.number=5 in all runs; di.status.poll.interval.seconds=1, 2 or 3 as noted in the column headers.
Tenant | TTL REQ, COUNT | MEDIAN, MS (poll interval 1 s) | PCT95, MS (poll interval 1 s) | MEDIAN, MS (poll interval 2 s) | PCT95, MS (poll interval 2 s) | MEDIAN, MS (poll interval 3 s) | PCT95, MS (poll interval 3 s) |
|---|---|---|---|---|---|---|---|
cs00000int_0001 | 100 | 2466 | 2689 | 3190 | 3725 | 4107 | 4922 |
cs00000int_0002 | 100 | 2212 | 2583 | 3013 | 3762 | 3934 | 4816 |
Test 3.a 3 Virtual users working in parallel on 3 tenants.
Mod-inventory parameters: di.status.poll.number=5 in all runs; di.status.poll.interval.seconds=1, 2 or 3 as noted in the column headers.
Tenant | TTL REQ, COUNT | MEDIAN, MS (poll interval 1 s) | PCT95, MS (poll interval 1 s) | MEDIAN, MS (poll interval 2 s) | PCT95, MS (poll interval 2 s) | MEDIAN, MS (poll interval 3 s) | PCT95, MS (poll interval 3 s) |
|---|---|---|---|---|---|---|---|
cs00000int_0001 | 100 | 2280 | 2641 | 3017 | 3701 | 4020 | 4639 |
cs00000int_0002 | 100 | 2383 | 2567 | 3340 | 3753 | 3734 | 4710 |
cs00000int_0003 | 100 | 1910 | 2160 | 2939 | 3154 | 3891 | 4222 |
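The effect of the poll interval on duration can be quantified directly from the three tables above. The sketch below copies the median columns for intervals of 1, 2 and 3 seconds and estimates how many milliseconds the median grows per extra second of interval. The names `MEDIANS_MS` and `added_ms_per_interval_second` are illustrative only.

```python
from statistics import mean

# (median at 1 s, median at 2 s, median at 3 s) in ms, one tuple per tenant row,
# copied from the Test 1.a, 2.a and 3.a tables.
MEDIANS_MS = [
    (2255, 3348, 4227), (2218, 3324, 3820), (2076, 2980, 3956), (1940, 2669, 3926),  # Test 1.a
    (2466, 3190, 4107), (2212, 3013, 3934),                                          # Test 2.a
    (2280, 3017, 4020), (2383, 3340, 3734), (1910, 2939, 3891),                      # Test 3.a
]


def added_ms_per_interval_second(rows):
    """Mean growth of the median per extra second of poll interval (1 s -> 3 s)."""
    return mean((at_3s - at_1s) / 2 for at_1s, _at_2s, at_3s in rows)


slope = added_ms_per_interval_second(MEDIANS_MS)
print(f"{slope:.0f} ms of extra median duration per extra second of poll interval")
```

On this data the growth comes out at roughly 0.9 seconds per added second of interval, which suggests that the poll interval accounts for most of the SLI duration.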
Resource Utilization Test 1,2,3,4.
Below are the resource utilization graphs for all tests.
Memory Utilization
Memory usage was stable over the 4 tests, and no memory leak is suspected for any module. The graph shows the 10 most memory-consuming services.
Service CPU Utilization
CPU utilization increased only during the SLI process, and all modules returned to their baseline values after all SLI processes had finished.
RDS CPU Utilization
For 1 VU, average RDS CPU utilization was about 14% for all 4 tenants; for 2 VU about 15%, for 3 VU about 17% and for 4 VU about 20%.
RDS Database Connections
The average number of DB connections before the test was 450. During SLI it was about 510 for 1 VU, 520 for 2 VU, 530 for 3 VU and 540 for 4 VU.
Average active sessions (AAS)
Database load sliced by SQL
Errors
Response of a failed request. All of the failed requests returned the same error:
{"errors":[{"message":"ERROR: duplicate key value violates unique constraint \"uq_instance_id_source_tenant_id_target_tenant_id\"\n Detail: Key (instance_id, source_tenant_id, target_tenant_id)=(cf2a6947-e1bb-4e8f-ad43-f3ecd600b8f4, cs00000int_0004, cs00000int) already exists.","type":"-1","code":"VALIDATION_ERROR"}]}
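When triaging failed requests, it can help to pull the constraint name and the conflicting key out of the response body instead of reading the message by eye. The sketch below parses the response shown above. The function name and the regular expressions are illustrative, and they assume the message keeps the PostgreSQL "duplicate key value violates unique constraint" wording seen here.

```python
import json
import re

# Response body copied verbatim from the failed request above.
RESPONSE_BODY = r'''{"errors":[{"message":"ERROR: duplicate key value violates unique constraint \"uq_instance_id_source_tenant_id_target_tenant_id\"\n Detail: Key (instance_id, source_tenant_id, target_tenant_id)=(cf2a6947-e1bb-4e8f-ad43-f3ecd600b8f4, cs00000int_0004, cs00000int) already exists.","type":"-1","code":"VALIDATION_ERROR"}]}'''

CONSTRAINT_RE = re.compile(r'unique constraint "([^"]+)"')
KEY_RE = re.compile(r"Key \(([^)]*)\)=\(([^)]*)\)")


def parse_duplicate_key_error(body: str):
    """Return the constraint name and key columns of the first error, or None."""
    message = json.loads(body)["errors"][0]["message"]
    constraint = CONSTRAINT_RE.search(message)
    key = KEY_RE.search(message)
    if constraint is None or key is None:
        return None  # not a duplicate-key message in the expected format
    columns = [c.strip() for c in key.group(1).split(",")]
    values = [v.strip() for v in key.group(2).split(",")]
    return {"constraint": constraint.group(1), "key": dict(zip(columns, values))}


print(parse_duplicate_key_error(RESPONSE_BODY))
```

Here the key identifies the instance, the source tenant (cs00000int_0004) and the target tenant (cs00000int) for which a sharing record already exists.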
Appendix
Environment: PCON
Record parameters on each of the tenants
Tenant cs00000int_0001: Number of shared instances 1695139 and not shared 606035. Source MARC = 2185072 and FOLIO = 116102;
Tenant cs00000int_0002: Number of shared instances 1695139 and not shared 1009666. Source MARC = 2559671 and FOLIO = 145134;
Tenant cs00000int_0003: Number of shared instances 1695139 and not shared 800515. Source MARC = 2380417 and FOLIO= 115237;
Tenant cs00000int_0004: Number of shared instances 1695139 and not shared 787757. Source MARC = 2367659 and FOLIO= 115237;
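The record counts above can be cross-checked: on each tenant, shared plus not-shared instances should equal MARC plus FOLIO instances, since both pairs partition the same set of instances. The sketch below runs that check on the figures listed, reading the shared count for cs00000int_0001 as 1695139, the same as for the other tenants. The dictionary layout and names are illustrative only.

```python
# Instance counts per tenant, copied from the record parameters above.
RECORDS = {
    "cs00000int_0001": {"shared": 1695139, "not_shared": 606035, "marc": 2185072, "folio": 116102},
    "cs00000int_0002": {"shared": 1695139, "not_shared": 1009666, "marc": 2559671, "folio": 145134},
    "cs00000int_0003": {"shared": 1695139, "not_shared": 800515, "marc": 2380417, "folio": 115237},
    "cs00000int_0004": {"shared": 1695139, "not_shared": 787757, "marc": 2367659, "folio": 115237},
}


def totals(record):
    """Return (total by sharing status, total by source) for one tenant."""
    return record["shared"] + record["not_shared"], record["marc"] + record["folio"]


for tenant, record in RECORDS.items():
    by_sharing, by_source = totals(record)
    status = "consistent" if by_sharing == by_source else "MISMATCH"
    print(f"{tenant}: {by_sharing} by sharing, {by_source} by source -> {status}")
```

Both totals agree for all four tenants.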
PTF - environment ncp5
10 m6g.2xlarge EC2 instances located in US East (N. Virginia) us-east-1
1 database instance, writer
Instance class: db.r6g.xlarge, vCPU: 4, RAM: 32 GB
Infrastructure
Mod-inventory:20.1.7-SNAPSHOT is a release of mod-inventory:20.1.6 plus changes that add two configuration options. The parameters can be changed in the task definition:
"name":"JAVA_OPTS", -Dinventory.sharing.di.status.poll.interval.seconds=2
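When the setting is changed repeatedly between test runs, the JAVA_OPTS value can be generated rather than edited by hand. The sketch below is a hypothetical helper, not part of any FOLIO tooling. It assumes the task definition stores environment variables as name/value pairs, that the existing JAVA_OPTS value is a space-separated list of flags (the `-XX:MaxRAMPercentage=66.0` base value is a placeholder), and that the poll number is passed as a `-D` system property in the same way as the poll interval.

```python
def set_system_property(java_opts: str, key: str, value) -> str:
    """Return java_opts with -Dkey=value set, replacing any existing -Dkey=... flag."""
    prefix = f"-D{key}="
    # Drop any previous setting of the same property, keep all other flags in order.
    flags = [flag for flag in java_opts.split() if not flag.startswith(prefix)]
    flags.append(f"{prefix}{value}")
    return " ".join(flags)


# Placeholder base value; the real JAVA_OPTS of the module is not given in this report.
java_opts = "-XX:MaxRAMPercentage=66.0 -Dinventory.sharing.di.status.poll.interval.seconds=1"
java_opts = set_system_property(java_opts, "inventory.sharing.di.status.poll.interval.seconds", 2)
java_opts = set_system_property(java_opts, "inventory.sharing.di.status.poll.number", 5)

# Assumed name/value shape of an environment entry in the task definition.
env_entry = {"name": "JAVA_OPTS", "value": java_opts}
print(env_entry)
```

Replacing an existing flag instead of appending a second one avoids depending on how the JVM resolves a property that is defined twice.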