IN PROGRESS
...
The purpose of this test is to determine how concurrent data import (DI) affects the duration of DI jobs on the central tenant, and to check for possible issues during a smoke test with a 50K DI Create job running concurrently on all 3 tenants.
Ticket:
Summary
Data import duration approximately doubled in the 10K and 25K tests when the number of concurrent jobs on different tenants was increased. This trend
Test Runs
Test # | Scenario | Load level | Comment |
---|---|---|---|
1 | DI MARC Bib Create | 10K, 25K concurrently (with 5 min pause) on 2 and 3 tenants | |
2 | DI MARC Bib Update | 10K, 25K concurrently (with 5 min pause) on 2 and 3 tenants | |
3 | DI MARC Bib Create | 50K concurrently on 3 tenants - smoke test | |
...
Comparison
Service CPU Utilization
DI Create jobs
DI Update jobs
Service Memory Utilization
DI Create jobs
DI Update jobs
DB CPU Utilization
RDS CPU utilization was 97% for all Create jobs and 94% for Update jobs
Create jobs
Update jobs
DB Connections
DB connections for Create jobs: 710 on 2 tenants, 870 on 3 tenants.
DB connections for Update jobs: 630 on 2 tenants, 785 on 3 tenants.
Each additional job processed concurrently on a different tenant requires approximately 150 more DB connections.
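The per-tenant figure can be checked against the two pairs of connection counts reported above. The sketch below is illustrative only; the variable and function names are hypothetical, and it assumes the reported numbers are peak connection counts for the 2-tenant and 3-tenant runs.

```python
# Connection counts copied from the two lines above,
# keyed by the number of tenants running jobs concurrently.
first_pair = {2: 710, 3: 870}
second_pair = {2: 630, 3: 785}


def per_tenant_delta(connections):
    """Extra DB connections observed when going from 2 to 3 concurrent tenants."""
    return connections[3] - connections[2]


deltas = [per_tenant_delta(first_pair), per_tenant_delta(second_pair)]
average_delta = sum(deltas) / len(deltas)

print(deltas)         # [160, 155]
print(average_delta)  # 157.5
```

Both deltas land slightly above the rounded figure of 150 connections per additional concurrent job.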
Create jobs
Update jobs
DB load
Appendix
Errors & Exceptions
...
index [pcp1_instance_fs09000000], id [f7aea9b8-614e-4050-9dbd-e2f8a884c06b], message [OpenSearchException[OpenSearch exception [type=circuit_breaking_exception, reason=[parent] Data too large, data for [indices:data/write/bulk[s]] would be [16502737514/15.3gb], which is larger than the limit of [16320875724/15.1gb], real usage: [16499671264/15.3gb], new bytes reserved: [3066250/2.9mb], usages [request=0/0b, fielddata=0/0b, in_flight_requests=3103382/2.9mb]]]]
org.folio.search.exception.SearchOperationException: Failed to perform elasticsearch request [index=pcp1_contributor_fs09000000, type=bulkApi, message: 30,000 milliseconds timeout on connection http-outgoing-265 [ACTIVE]]
WARN essageBatchProcessor Failed to process batch, attempting to process resources one by one
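The byte counts in the circuit_breaking_exception message above can be related to each other with simple arithmetic. This is a minimal sketch using only the numbers quoted in that message; the variable names are hypothetical.

```python
# Figures copied from the circuit_breaking_exception message (all in bytes).
real_usage = 16_499_671_264       # "real usage"
new_bytes_reserved = 3_066_250    # "new bytes reserved"
limit = 16_320_875_724            # parent breaker limit

# The "would be" value in the message is real usage plus the new reservation.
would_be = real_usage + new_bytes_reserved
overage = would_be - limit

print(would_be)                    # 16502737514
print(overage)                     # 181861790
print(real_usage > limit)          # True
print(round(overage / 2**20, 1))   # overage in MiB
```

Per these figures, real usage already exceeded the parent breaker limit before the 2.9 MB bulk request was added, so the request itself was not the main contributor.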
Infrastructure
PTF environment: pcp1
- 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 2 database instances, writer/reader
Name | Memory GiB | vCPUs | max_connections |
---|---|---|---|
db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731 |
- MSK tenant
  - 4 m5.2xlarge brokers in 2 zones
  - Apache Kafka version 2.8.0
  - EBS storage volume per broker 300 GiB
  - auto.create.topics.enable=true
  - log.retention.minutes=480
  - default.replication.factor=3
...