Overview
This document assesses reindexing performance in a consortium environment by measuring reindex time and index size.
Recommendations & Jiras
- Original ticket to test: PERF-635
- Additional info: ElasticSearch Reindex Performance Recommendations
Test Summary
- The reindexing process takes:
- 3 hours for 3 tenants in parallel
- 2 hours for central tenant reindexing (1.2M instances)
- 1 hour for secondary tenant (353K instances)
- 20 minutes for secondary tenant (200K instances)
Reindexing 3 tenants in parallel takes the same time as reindexing the main tenant only. If the tenants are reindexed sequentially, starting with the secondary tenants, the main tenant may take longer afterwards (about 25 minutes more). A possible reason is lower CPU utilization (mod-search: 44% against 38%; mod-inventory-storage: 20% against 19%).
A comparison of interim index sizes showed that most of the reindexing time was spent on contributor and instance_subject indexing (at least 7 hours out of 9 hours 45 minutes).
Test Runs / Results
Test # | Instances number | Test conditions (reindexing on Poppy release, consortium environment) | Duration * | Notes |
---|---|---|---|---|
1. 2023-11-28 09:20 - 12:20 UTC | 1766108 | In parallel: 3 tenants | 3 hours | |
2. 2023-11-29 08:50 - 10:50 UTC | 1212927 | Sequential: cs00000int | 2 hours | |
3. 2023-11-29 14:05 - 15:05 UTC | 353179 | Sequential: cs00000int_0001 | 1 hour | |
4. 2023-11-29 15:22 - 15:42 UTC | 200002 | Sequential: cs00000int_0002 | 20 min | |
*Duration depends not only on the number of instances but also on their type (source). Data details can be found in the Data structure section.
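To make the runs easier to compare, the durations can be converted to throughput. The sketch below uses only the instance counts and the rounded durations from the table above; the variable and function names are illustrative and not part of any test tooling.

```python
# Throughput per run, computed from the instance counts and the
# rounded durations reported in the Test Runs / Results table.
runs = [
    # (label, instances, duration in minutes)
    ("Test 1, parallel, 3 tenants", 1_766_108, 180),
    ("Test 2, cs00000int", 1_212_927, 120),
    ("Test 3, cs00000int_0001", 353_179, 60),
    ("Test 4, cs00000int_0002", 200_002, 20),
]


def instances_per_minute(instances: int, minutes: int) -> float:
    """Average number of instances reindexed per minute."""
    return instances / minutes


throughput = {label: instances_per_minute(n, m) for label, n, m in runs}

# Combined wall-clock time of the three sequential runs (tests 2-4).
sequential_total_minutes = sum(m for label, _, m in runs if "parallel" not in label)

for label, rate in throughput.items():
    print(f"{label}: {rate:,.0f} instances/min")
print(f"Sequential runs combined: {sequential_total_minutes} min")
```

Because the durations are rounded, the rates are approximate. The lower rate for test 3 is in line with the footnote that duration also depends on the instance source.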
Indexing size
health | status | index | uuid | pri | rep | docs.count | docs.deleted | store.size | pri.store.size |
---|---|---|---|---|---|---|---|---|---|
In parallel: 3 tenants | |||||||||
green | open | pcon_instance_subject_cs00000int | oo6lG2KjRBm68SlF4cQf-A | 4 | 2 | 851506 | 188613 | 3.4gb | 1.1gb |
green | open | pcon_contributor_cs00000int | D9izLpcOQWmqZFJhl6kyyA | 4 | 2 | 855300 | 231525 | 1.8gb | 685mb |
green | open | pcon_authority_cs00000int | 5CbWdgFrQSSBRlcNHPk7-A | 4 | 2 | 0 | 0 | 2.4kb | 832b |
green | open | pcon_instance_cs00000int | 6i5vkM6kRHOOu4ahRqESkA | 4 | 2 | 1204190 | 1498 | 23.7gb | 7.8gb |
Results from get-request for reindex monitoring:
Tenant | Reindex id | Get request reindex |
---|---|---|
cs00000int | 5dfca883-6236-438c-b4a4-bfd2274cdc0b | 1212891 |
cs00000int_0001 | 86a89e41-2858-4a9c-9e58-578ccf677413 | 353179 |
cs00000int_0002 | 01ec9821-aca8-4a28-8c02-844807191ca9 | 200002 |
SUM | | 1766072 |
health | status | index | uuid | pri | rep | docs.count | docs.deleted | store.size | pri.store.size |
---|---|---|---|---|---|---|---|---|---|
Sequential: cs00000int | |||||||||
green | open | pcon_instance_subject_cs00000int | XMqrUxTkTNKz4rIS1fi9Ug | 4 | 2 | 862411 | 106300 | 3.2gb | 1gb |
green | open | pcon_contributor_cs00000int | TGA0HARfRLGXMDuFyQcmfg | 4 | 2 | 835533 | 162792 | 1.5gb | 560.8mb |
green | open | pcon_authority_cs00000int | 5CbWdgFrQSSBRlcNHPk7-A | 4 | 2 | 0 | 0 | 2.4kb | 832b |
green | open | pcon_instance_cs00000int | R_Goc_w2T8CRnfvfbbsxHg | 4 | 2 | 1212891 | 0 | 25.9gb | 8.5gb |
Sequential: cs00000int_0001 | |||||||||
green | open | pcon_instance_subject_cs00000int | XMqrUxTkTNKz4rIS1fi9Ug | 4 | 2 | 865366 | 83466 | 4.6gb | 1.5gb |
green | open | pcon_contributor_cs00000int | TGA0HARfRLGXMDuFyQcmfg | 4 | 2 | 839478 | 90380 | 2.3gb | 806mb |
green | open | pcon_authority_cs00000int | 5CbWdgFrQSSBRlcNHPk7-A | 4 | 2 | 0 | 0 | 2.4kb | 832b |
green | open | pcon_instance_cs00000int | R_Goc_w2T8CRnfvfbbsxHg | 4 | 2 | 1222891 | 161584 | 28.5gb | 9.4gb |
Sequential: cs00000int_0002 | |||||||||
green | open | pcon_instance_subject_cs00000int | XMqrUxTkTNKz4rIS1fi9Ug | 4 | 2 | 955467 | 126534 | 4.5gb | 1.2gb |
green | open | pcon_contributor_cs00000int | TGA0HARfRLGXMDuFyQcmfg | 4 | 2 | 929045 | 189949 | 2.7gb | 1gb |
green | open | pcon_authority_cs00000int | 5CbWdgFrQSSBRlcNHPk7-A | 4 | 2 | 0 | 0 | 2.4kb | 832b |
green | open | pcon_instance_cs00000int | R_Goc_w2T8CRnfvfbbsxHg | 4 | 2 | 1422889 | 161588 | 32.6gb | 10.9gb |
Results from get-request for reindex monitoring:
Tenant | Reindex id | Get request reindex |
---|---|---|
cs00000int | bb944cf4-b99f-4aa3-b13e-f5c92dc630ed | 1212891 |
cs00000int_0001 | cf943d63-50db-4085-9629-783d7acdc67b | 353179 |
cs00000int_0002 | c62e2662-6a21-47fd-9bb8-eea11364c2c1 | 200002 |
SUM | | 1766072 |
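The index tables above have the column layout of the OpenSearch `_cat/indices` output. As a minimal sketch, assuming rows are pasted in the pipe-separated form used on this page and that sizes use binary units (1kb = 1024 bytes), the total store size of the parallel run can be compared with the final state of the sequential runs. The helper names are hypothetical.

```python
import re

UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}
COLUMNS = ["health", "status", "index", "uuid", "pri", "rep",
           "docs.count", "docs.deleted", "store.size", "pri.store.size"]


def size_to_bytes(text: str) -> float:
    """Convert a size such as '23.7gb' or '832b' to bytes."""
    match = re.fullmatch(r"([\d.]+)(tb|gb|mb|kb|b)", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    value, unit = match.groups()
    return float(value) * UNITS[unit]


def parse_row(line: str) -> dict:
    """Split one pipe-separated table row into a column-name -> value dict."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    if len(cells) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} cells, got {len(cells)}")
    return dict(zip(COLUMNS, cells))


def total_store_gb(lines: list) -> float:
    """Sum the store.size column of the given rows, in GB."""
    total = sum(size_to_bytes(parse_row(line)["store.size"]) for line in lines)
    return total / UNITS["gb"]


# Rows copied from the "In parallel: 3 tenants" table.
parallel_rows = [
    "green | open | pcon_instance_subject_cs00000int | oo6lG2KjRBm68SlF4cQf-A | 4 | 2 | 851506 | 188613 | 3.4gb | 1.1gb |",
    "green | open | pcon_contributor_cs00000int | D9izLpcOQWmqZFJhl6kyyA | 4 | 2 | 855300 | 231525 | 1.8gb | 685mb |",
    "green | open | pcon_authority_cs00000int | 5CbWdgFrQSSBRlcNHPk7-A | 4 | 2 | 0 | 0 | 2.4kb | 832b |",
    "green | open | pcon_instance_cs00000int | 6i5vkM6kRHOOu4ahRqESkA | 4 | 2 | 1204190 | 1498 | 23.7gb | 7.8gb |",
]

# Rows copied from the last sequential table (after cs00000int_0002).
sequential_final_rows = [
    "green | open | pcon_instance_subject_cs00000int | XMqrUxTkTNKz4rIS1fi9Ug | 4 | 2 | 955467 | 126534 | 4.5gb | 1.2gb |",
    "green | open | pcon_contributor_cs00000int | TGA0HARfRLGXMDuFyQcmfg | 4 | 2 | 929045 | 189949 | 2.7gb | 1gb |",
    "green | open | pcon_authority_cs00000int | 5CbWdgFrQSSBRlcNHPk7-A | 4 | 2 | 0 | 0 | 2.4kb | 832b |",
    "green | open | pcon_instance_cs00000int | R_Goc_w2T8CRnfvfbbsxHg | 4 | 2 | 1422889 | 161588 | 32.6gb | 10.9gb |",
]

print(f"Parallel run, total store size: {total_store_gb(parallel_rows):.1f} GB")
print(f"Sequential runs, final store size: {total_store_gb(sequential_final_rows):.1f} GB")
```

The same `parse_row` output can be used to track `docs.deleted` between snapshots; the snapshot timing is not recorded on this page, so the totals are only indicative.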
Service CPU Utilization
Test #1 (3 tenants in parallel)
Maximum CPU utilization:
nginx-okapi - 413%
mod-inventory-storage - 95%
okapi - 73%
mod-search - 27%
Test #2 (cs00000int main tenant, sequential)
Maximum CPU utilization:
nginx-okapi - 332%
mod-inventory-storage - 102%
okapi - 63%
mod-search - 27%
Test #3 and #4 (cs00000int_0001, cs00000int_0002 secondary tenants, sequential)
Maximum CPU utilization:
nginx-okapi - 285%
mod-inventory-storage - 98%
okapi - 63%
mod-search - 13%
Memory Utilization
Test #1 (3 tenants in parallel)
Memory utilization:
mod-search - 38% → 50%
mod-inventory-storage - 11% → 31%
Test #2 (cs00000int main tenant, sequential)
Memory utilization:
mod-search - 37% → 50%
mod-inventory-storage - 13% → 21%
Test #3 and #4 (cs00000int_0001, cs00000int_0002 secondary tenants, sequential)
Memory utilization:
mod-search - 33% → 49%
mod-inventory-storage - 26% → 31%
DB CPU Utilization
Test #1 (3 tenants in parallel)
Maximum DB CPU utilization - 37%
Test #2 (cs00000int main tenant, sequential)
Maximum DB CPU utilization - 56%
Test #3 and #4 (cs00000int_0001, cs00000int_0002 secondary tenants, sequential)
Maximum DB CPU utilization - 36%
DB Connections
Test #1 (3 tenants in parallel)
Test #2 (cs00000int main tenant, sequential)
Test #3 and #4 (cs00000int_0001, cs00000int_0002 secondary tenants, sequential)
OpenSearch CPU
Test #1 (3 tenants in parallel)
Maximum CPU utilization - 57%
Test #2 (cs00000int main tenant, sequential)
Maximum CPU utilization - 47%
Test #3 and #4 (cs00000int_0001, cs00000int_0002 secondary tenants, sequential)
Maximum CPU utilization - 53%
OpenSearch Indexing Data Rate
Test #1 (3 tenants in parallel)
Test #2 (cs00000int main tenant, sequential)
Test #3 and #4 (cs00000int_0001, cs00000int_0002 secondary tenants, sequential)
OpenSearch Indexing Latency
Test #1 (3 tenants in parallel)
Test #2 (cs00000int main tenant, sequential)
Test #3 and #4 (cs00000int_0001, cs00000int_0002 secondary tenants, sequential)
Appendix
Infrastructure
PTF-environment pcon
- 10 m6g.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 2 db.r6g.xlarge database instances, one reader and one writer
- MSK ptf-kakfa-3
- 4 m5.2xlarge brokers in 2 zones
Apache Kafka version 2.8.0
EBS storage volume per broker 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3
Data structure
Tenant | Source | Instance number | Instances sum |
---|---|---|---|
cs00000int | FOLIO | 115035 | 1212927 |
cs00000int | MARC | 1097892 | |
cs00000int_0001 | CONSORTIUM-FOLIO | 38712 | 353179 |
cs00000int_0001 | CONSORTIUM-MARC | 304467 | |
cs00000int_0001 | FOLIO | 1000 | |
cs00000int_0001 | MARC | 9000 | |
cs00000int_0002 | CONSORTIUM-MARC | 4 | 200002 |
cs00000int_0002 | FOLIO | 30000 | |
cs00000int_0002 | MARC | 169998 | |
Methodology/Approach
- Use consortium cluster for testing (pcon in our case).
- Configure the environment according to the Infrastructure parameters so that it matches the setup FSE commonly uses.
- Run the reindex and record the indexing time and index size. See Steps for testing process#Reindex for details.
- Compare the results.
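The reindex and monitoring steps can be sketched as two HTTP requests. This is an illustration under assumptions, not the procedure from the linked steps page: the endpoint paths (`/search/index/inventory/reindex` and `/instance-storage/reindex/{id}`), the request body, and the Okapi header names are assumed from the public mod-search and mod-inventory-storage APIs and should be checked against the deployed versions. `OKAPI_URL` and the function names are hypothetical. The code only builds the requests and does not send them.

```python
import json
import urllib.request

OKAPI_URL = "https://okapi.example.org"  # hypothetical gateway address


def okapi_headers(tenant: str, token: str) -> dict:
    """Headers assumed to be required by Okapi for tenant-scoped calls."""
    return {
        "x-okapi-tenant": tenant,
        "x-okapi-token": token,
        "Content-Type": "application/json",
    }


def build_reindex_request(tenant: str, token: str,
                          recreate_index: bool = True) -> urllib.request.Request:
    """Build (but do not send) the request that starts a reindex for one tenant."""
    # Assumed body: recreate the index and reindex the "instance" resource.
    body = json.dumps({"recreateIndex": recreate_index, "resourceName": "instance"})
    return urllib.request.Request(
        f"{OKAPI_URL}/search/index/inventory/reindex",
        data=body.encode("utf-8"),
        headers=okapi_headers(tenant, token),
        method="POST",
    )


def build_status_request(tenant: str, token: str,
                         reindex_id: str) -> urllib.request.Request:
    """Build (but do not send) the request that polls a reindex job by id."""
    return urllib.request.Request(
        f"{OKAPI_URL}/instance-storage/reindex/{reindex_id}",
        headers=okapi_headers(tenant, token),
        method="GET",
    )


request = build_reindex_request("cs00000int", "<token>")
print(request.get_method(), request.full_url)
```

Passing a built request to `urllib.request.urlopen` would send it. For a parallel run, one reindex request per tenant would be issued without waiting for the others to finish; for a sequential run, the status request would be polled until the job completes before the next tenant is started.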