The purpose of this document is to assess reindexing performance on a consortium environment and to measure reindex time and index size.
Recommendations & Jiras
- Original ticket to test: PERF-635
- Additional info: ElasticSearch Reindex Performance Recommendations
Test Summary
- Reindexing process for consortium environment takes:
- 3 hours for 3 tenants in parallel (1.7M instances);
- 2 hours for central tenant reindexing (1.2M instances);
- 1 hour for secondary tenant reindexing (353K instances);
- 20 minutes for secondary tenant reindexing (202K instances).
Reindexing 3 tenants in parallel takes the same time as reindexing the main tenant only. If reindexing runs sequentially, starting with the secondary tenants, the main tenant may take longer afterwards (about 25 minutes more). A possible reason is lower CPU utilization (mod-search: 44% against 38%; mod-inventory-storage: 20% against 19%).
...
- Duration depends not only on the number of instances but also on their type (source). The cs00000int_0001 tenant has many shared instances (343K) but far fewer unshared ones (10K) compared to cs00000int_0002 (200K unshared). Data details can be found here: Data structure
- High CPU utilization was observed on the nginx-okapi module: up to 413% during the 3-tenant test.
- CPU utilization for mod-inventory-storage reached 102% during the test on the central tenant. mod-search CPU utilization was about 13-27% during all the tests.
- No memory leaks suspected.
Test Runs / Results
Test # | Instances number | Test conditions (reindexing on Poppy release, consortium environment) | Duration * | Notes |
---|---|---|---|---|
1. 2023-11-28 09:20 - 12:20 UTC | 1766108 | In parallel: 3 tenants | 3 hours | |
2. 2023-11-29 08:50 - 10:50 UTC | 1212927 | Sequential: cs00000int | 2 hours | |
3. 2023-11-29 14:05 - 15:05 UTC | 353179 | Sequential: cs00000int_0001 | 1 hour | |
4. 2023-11-29 15:22 - 15:42 UTC | 200002 | Sequential: cs00000int_0002 | 20 min | |
*Duration depends not only on the number of instances but also on their type (source). Data details can be found here: Data structure
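To make the runs easier to compare, the average throughput can be derived from the instance counts and durations in the table. The sketch below is only an illustration: the durations are the rounded values reported above, and the names `TEST_RUNS`, `instances_per_minute` and `THROUGHPUT` are hypothetical.

```python
# Test runs copied from the table above: (label, instances, duration in minutes).
# Durations are the rounded values from the "Duration" column.
TEST_RUNS = [
    ("Test 1: 3 tenants in parallel", 1766108, 180),
    ("Test 2: cs00000int", 1212927, 120),
    ("Test 3: cs00000int_0001", 353179, 60),
    ("Test 4: cs00000int_0002", 200002, 20),
]


def instances_per_minute(instances: int, minutes: int) -> float:
    """Average indexing throughput for one run."""
    return instances / minutes


# Throughput per run, keyed by the run label.
THROUGHPUT = {
    label: instances_per_minute(instances, minutes)
    for label, instances, minutes in TEST_RUNS
}

for label, rate in THROUGHPUT.items():
    print(f"{label}: {rate:,.0f} instances/min")
```

With these inputs, Test 3 comes out at roughly 5.9K instances per minute while the other runs are close to 10K, which is consistent with the note that duration depends on the instance source and not only on the instance count.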
Indexing size
...
All the data in the tables below was captured after each test finished.
In parallel: 3 tenants

health | status | index | uuid | pri | rep | docs.count | docs.deleted | store.size | pri.store.size |
---|---|---|---|---|---|---|---|---|---|
green | open | pcon_instance_subject_cs00000int | oo6lG2KjRBm68SlF4cQf-A | 4 | 2 | 851506955467 | 188613 | 3.4gb | 1.1gb |
green | open | pcon_contributor_cs00000int | D9izLpcOQWmqZFJhl6kyyA | 4 | 2 | 855300929045 | 231525 | 1.8gb | 685mb |
green | open | pcon_authority_cs00000int | 5CbWdgFrQSSBRlcNHPk7-A | 4 | 2 | 0 | 0 | 2.4kb | 832b |
green | open | pcon_instance_cs00000int | 6i5vkM6kRHOOu4ahRqESkA | 4 | 2 | 12041901422889 | 1498 | 23.7gb | 7.8gb |
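When reading the sizes above, note that `store.size` covers primary and replica shards, while `pri.store.size` covers primaries only. With `rep` = 2, the total should be roughly three times the primary size. The sketch below checks this against the reported values. It assumes 1024-based units, and the helper `parse_size` and the names `INDEX_SIZES` and `RATIOS` are hypothetical. The empty authority index is left out.

```python
# Byte multipliers for size suffixes; 1024-based units are an assumption.
UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3}


def parse_size(text: str) -> float:
    """Convert a size string such as '3.4gb' or '685mb' to bytes."""
    text = text.strip().lower()
    # Check two-letter suffixes first so that 'gb' is not mistaken for 'b'.
    for suffix in ("gb", "mb", "kb", "b"):
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * UNITS[suffix]
    raise ValueError(f"unrecognized size: {text!r}")


# (store.size, pri.store.size) copied from the table above.
INDEX_SIZES = {
    "pcon_instance_subject_cs00000int": ("3.4gb", "1.1gb"),
    "pcon_contributor_cs00000int": ("1.8gb", "685mb"),
    "pcon_instance_cs00000int": ("23.7gb", "7.8gb"),
}

# Ratio of total size to primary size; about 3 is expected for 2 replicas.
RATIOS = {
    name: parse_size(total) / parse_size(primary)
    for name, (total, primary) in INDEX_SIZES.items()
}

for name, ratio in RATIOS.items():
    print(f"{name}: total/primary = {ratio:.2f}")
```

Because the reported sizes are rounded, the ratios are approximate rather than exactly 3.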
Results from get-request for reindex monitoring:
...
Test #2 (cs00000int main tenant, sequential)
Maximum CPU utilization:
...
Test #3 and #4 (cs00000int_0001, cs00000int_0002 secondary tenants, sequential)
Maximum CPU utilization:
...
Test #2 (cs00000int main tenant, sequential)
Memory utilization:
mod-search - 37% → 50%
...
Test #3 and #4 (cs00000int_0001, cs00000int_0002 secondary tenants, sequential)
Memory utilization:
mod-search - 33% → 49%
...
Test #2 (cs00000int main tenant, sequential)
Maximum DB CPU utilization - 56%
...
Test #3 and #4 (cs00000int_0001, cs00000int_0002 secondary tenants, sequential)
Maximum DB CPU utilization - 36%
...
Test #2 (cs00000int main tenant, sequential)
Test #3 and #4 (cs00000int_0001, cs00000int_0002 secondary tenants, sequential)
Open Search CPU
Test #1 (3 tenants in parallel)
...
Test #2 (cs00000int main tenant, sequential)
Maximum CPU utilization - 47%
...
Test #3 and #4 (cs00000int_0001, cs00000int_0002 secondary tenants, sequential)
Maximum CPU utilization - 53%
Open Search
...
Indexing Data Rate
Test #1 (3 tenants in parallel)
...
Test #3 and #4 (cs00000int_0001, cs00000int_0002 secondary tenants, sequential)
Open Search
...
Indexing Latency
...
Test #1 (3 tenants in parallel)
Test #2 (cs00000int main tenant, sequential)
Test #3 and #4 (cs00000int_0001, cs00000int_0002 secondary tenants, sequential)
Appendix
Infrastructure
PTF-environment pcon
- 10 m6g.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 2 instances of db.r6g.xlarge database instances, one reader and one writer
- MSK ptf-kakfa-3
- 4 m5.2xlarge brokers in 2 zones
Apache Kafka version 2.8.0
EBS storage volume per broker 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3
Data structure
Tenant | Source | Instance number | Instances sum |
---|---|---|---|
cs00000int | FOLIO | 115035 | 1212927 |
cs00000int | MARC | 1097892 | 1212927 |
cs00000int_0001 | CONSORTIUM-FOLIO | 38712 | 353179 |
cs00000int_0001 | CONSORTIUM-MARC | 304467 | 353179 |
cs00000int_0001 | FOLIO | 1000 | 353179 |
cs00000int_0001 | MARC | 9000 | 353179 |
cs00000int_0002 | CONSORTIUM-MARC | 4 | 200002 |
cs00000int_0002 | FOLIO | 30000 | 200002 |
cs00000int_0002 | MARC | 169998 | 200002 |
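The shared and unshared totals quoted in the Test Summary can be reproduced from this table. The sketch below treats sources prefixed with `CONSORTIUM-` as shared instances, which is an interpretation based on the summary figures (343K shared and 10K unshared for cs00000int_0001). The names `INSTANCES` and `split_shared` are hypothetical.

```python
# Instance counts per tenant and source, copied from the table above.
INSTANCES = {
    "cs00000int": {"FOLIO": 115035, "MARC": 1097892},
    "cs00000int_0001": {
        "CONSORTIUM-FOLIO": 38712,
        "CONSORTIUM-MARC": 304467,
        "FOLIO": 1000,
        "MARC": 9000,
    },
    "cs00000int_0002": {"CONSORTIUM-MARC": 4, "FOLIO": 30000, "MARC": 169998},
}


def split_shared(sources: dict[str, int]) -> tuple[int, int]:
    """Return (shared, unshared) counts for one tenant.

    Assumption: a 'CONSORTIUM-' source prefix marks a shared instance.
    """
    shared = sum(n for src, n in sources.items() if src.startswith("CONSORTIUM-"))
    unshared = sum(n for src, n in sources.items() if not src.startswith("CONSORTIUM-"))
    return shared, unshared


for tenant, sources in INSTANCES.items():
    shared, unshared = split_shared(sources)
    print(f"{tenant}: shared={shared}, unshared={unshared}, total={shared + unshared}")

# Total across all three tenants, as used in the parallel test.
TOTAL = sum(sum(sources.values()) for sources in INSTANCES.values())
print(f"all tenants: {TOTAL}")
```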
Module versions
Methodology/Approach
- Use consortium cluster for testing (pcon in our case).
- Configure the environment in accordance with the Infrastructure parameters, matching the configuration that FSE commonly uses.
- Run the reindex and record the indexing time and index size. See Steps for testing process#Reindex for details.
- Compare results.
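The "run reindex" step can be sketched as a pair of HTTP requests: one to start the reindex and one to poll its status. The endpoint paths and request body below are assumptions based on the mod-search and mod-inventory-storage documentation and should be checked against the module versions in use. `OKAPI_URL`, the token placeholder and both helper functions are hypothetical. The code only builds the requests and does not send them.

```python
import json
from urllib.request import Request

OKAPI_URL = "https://okapi.example.org"  # hypothetical gateway address


def _headers(tenant: str, token: str) -> dict[str, str]:
    """Common Okapi headers for a tenant-scoped call."""
    return {
        "Content-Type": "application/json",
        "x-okapi-tenant": tenant,
        "x-okapi-token": token,
    }


def build_reindex_request(tenant: str, token: str) -> Request:
    """Build (but do not send) the request that starts an instance reindex.

    Assumed endpoint: POST /search/index/inventory/reindex
    """
    body = json.dumps({"recreateIndex": True, "resourceName": "instance"})
    return Request(
        f"{OKAPI_URL}/search/index/inventory/reindex",
        data=body.encode("utf-8"),
        headers=_headers(tenant, token),
        method="POST",
    )


def build_status_request(tenant: str, token: str, job_id: str) -> Request:
    """Build (but do not send) the request that polls a reindex job.

    Assumed endpoint: GET /instance-storage/reindex/{job_id}
    """
    return Request(
        f"{OKAPI_URL}/instance-storage/reindex/{job_id}",
        headers=_headers(tenant, token),
        method="GET",
    )


# Example: the request that would start a reindex of the central tenant.
REQUEST = build_reindex_request("cs00000int", "<token>")
print(REQUEST.get_method(), REQUEST.full_url)
```

For a sequential run, the same request would be built once per tenant (cs00000int, cs00000int_0001, cs00000int_0002), waiting for each job to finish before starting the next.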