mod-search: Test Reindexing of Instances (Poppy) multi-tenant (non-consortia)
Overview
- The purpose of this document is to identify how ECS-related changes affect reindexing in a multi-tenant environment, and to calculate the reindex time and index size.
Recommendations & Jiras
- Original ticket to test: PERF-635.
- Additional info: ElasticSearch Reindex Performance Recommendations
- Consider optimization of contributor and instance_subject reindexing
Test Summary
Reindexing on 3 tenants in parallel takes about the same time as reindexing on the main tenant only. When reindexing is started sequentially, beginning with the secondary tenants, the main tenant afterwards takes longer: about 23 minutes more (10 hours 8 min against 9 hours 45 min). A possible reason is lower CPU utilization in the sequential run (mod-search: 38% against 44% in the parallel run; mod-inventory-storage: 19% against 20%).
Comparing interim index sizes showed that most of the reindexing time was taken by contributor and instance_subject indexing (at least 7 hours out of 9 hours 45 min).
Test Runs / Results
Test # | Date and time (UTC) | Instances number | Test conditions (reindexing on Poppy release) | Duration | Notes |
---|---|---|---|---|---|
1 | 2023-10-12 08:47 - 18:32 | | In parallel: 3 tenants | 9 hours 45 min | |
2 | 2023-10-13 09:04 - 09:07 | 100,032 | In sequential: fs09000002 | 3 min | |
3 | 2023-10-13 09:15 - 09:18 | 100,055 | In sequential: fs09000003 | 3 min | |
4 | 2023-10-13 09:29 - 19:37 | 10,733,729 | In sequential: fs09000000 | 10 hours 8 min | |
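The durations in the table can be re-derived from the start and end timestamps. The following is a minimal sketch of that arithmetic; the `RUNS` list and the helper name are illustrative and simply copy the values from the table above.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M"

# (test number, start, end) in UTC, copied from the Test Runs table.
RUNS = [
    (1, "2023-10-12 08:47", "2023-10-12 18:32"),  # parallel, 3 tenants
    (2, "2023-10-13 09:04", "2023-10-13 09:07"),  # sequential, fs09000002
    (3, "2023-10-13 09:15", "2023-10-13 09:18"),  # sequential, fs09000003
    (4, "2023-10-13 09:29", "2023-10-13 19:37"),  # sequential, fs09000000
]


def duration(start: str, end: str) -> timedelta:
    """Return the elapsed time between two timestamps in FMT format."""
    return datetime.strptime(end, FMT) - datetime.strptime(start, FMT)


durations = {n: duration(s, e) for n, s, e in RUNS}

# Main tenant reindexed alone (Test #4) against the parallel run (Test #1).
main_only_delta = durations[4] - durations[1]

# Pure reindexing time of the three sequential runs (idle gaps excluded).
sequential_total = durations[2] + durations[3] + durations[4]

for n, d in durations.items():
    print(f"Test #{n}: {d}")
print("Test #4 minus Test #1:", main_only_delta)
print("Sequential total minus parallel:", sequential_total - durations[1])
```

The sequential total only sums the time each run was active; the pauses between the sequential runs are not counted.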
Indexing size
Test #1. Interim results recorded during reindexing on 3 tenants are shown below. About 30 minutes after the start, all instances on the secondary tenants were already indexed; only ocp2_contributor_fs09000002 was still 1 record short (368 against 369). About 2 hours after the start, docs.count for the instance index on the main tenant already showed all records, while reindexing continued for ocp2_contributor_fs09000000 and ocp2_instance_subject_fs09000000.
So we may conclude that most of the reindexing time is spent on contributor and instance_subject indexing (at least 7 hours out of 9 hours 45 min).
ALL tenants. Each row holds four snapshots of pri, rep, docs.count, docs.deleted, store.size and pri.store.size, taken (left to right) at: Thu, 12 Oct 2023 09:18:08 GMT (after start); Thu, 12 Oct 2023 10:01:46 GMT; Thu, 12 Oct 2023 10:53:10 GMT; Fri, 13 Oct 2023 08:07:06 GMT (after finish).
health | status | index | uuid | pri | rep | docs.count | docs.deleted | store.size | pri.store.size | pri | rep | docs.count | docs.deleted | store.size | pri.store.size | pri | rep | docs.count | docs.deleted | store.size | pri.store.size | pri | rep | docs.count | docs.deleted | store.size | pri.store.size | |||
green | open | ocp2_instance_fs09000000 | NyClr4AYR9CUqEPeQOuzkg | 4 | 2 | 4215098 | 0 | 60.8gb | 20.5gb | 4 | 2 | 8161694 | 0 | 105.5gb | 35.1gb | 4 | 2 | 10733729 | 0 | 140.1gb | 46.5gb | 4 | 2 | 10733729 | 0 | 124.2gb | 41.3gb | |||
green | open | ocp2_instance_fs09000003 | QoBX9DqsR7S8XLJ-VpFPUA | 4 | 2 | 100055 | 3 | 554.8mb | 181.5mb | 4 | 2 | 100055 | 3 | 554.8mb | 181.5mb | 4 | 2 | 100055 | 3 | 554.8mb | 181.5mb | 4 | 2 | 100055 | 3 | 554.8mb | 181.5mb | |||
green | open | ocp2_instance_fs09000002 | 86NVpQwqSAWeUmmd1506Gg | 4 | 2 | 100032 | 0 | 550.3mb | 180mb | 4 | 2 | 100032 | 0 | 550.3mb | 180mb | 4 | 2 | 100032 | 0 | 550.3mb | 180mb | 4 | 2 | 100032 | 0 | 550.3mb | 180mb | |||
green | open | .kibana_1 | 9Q4bvyKCRpiiwcPNcLFs9g | 1 | 2 | 1 | 0 | 15.5kb | 5.1kb | 1 | 2 | 1 | 0 | 15.5kb | 5.1kb | 1 | 2 | 1 | 0 | 15.5kb | 5.1kb | 1 | 2 | 1 | 0 | 15.5kb | 5.1kb | |||
green | open | ocp2_authority_fs09000003 | RGz9CTCoT7CoT3H3ge5gIA | 4 | 2 | 0 | 0 | 2.4kb | 832b | 4 | 2 | 0 | 0 | 2.4kb | 832b | 4 | 2 | 0 | 0 | 2.4kb | 832b | 4 | 2 | 0 | 0 | 2.4kb | 832b | |||
green | open | .opensearch-observability | kPAJ8TqaR06AQZFYekHeyA | 1 | 2 | 0 | 0 | 624b | 208b | 1 | 2 | 0 | 0 | 624b | 208b | 1 | 2 | 0 | 0 | 624b | 208b | 1 | 2 | 0 | 0 | 624b | 208b | |||
green | open | ocp2_authority_fs09000002 | vFMI3x8bTuqp5HkAL3jk8A | 4 | 2 | 0 | 0 | 2.4kb | 832b | 4 | 2 | 0 | 0 | 2.4kb | 832b | 4 | 2 | 0 | 0 | 2.4kb | 832b | 4 | 2 | 0 | 0 | 2.4kb | 832b | |||
green | open | ocp2_authority_fs09000000 | YCS4y0GFTfy3EpbJm99X0g | 4 | 2 | 0 | 0 | 2.4kb | 832b | 4 | 2 | 0 | 0 | 2.4kb | 832b | 4 | 2 | 0 | 0 | 2.4kb | 832b | 4 | 2 | 0 | 0 | 2.4kb | 832b | |||
green | open | ocp2_contributor_fs09000000 | rmhGKwfISJiayaxyO8C03w | 4 | 2 | 1967140 | 192787 | 8.1gb | 2.5gb | 4 | 2 | 3057421 | 640057 | 11.8gb | 4.2gb | 4 | 2 | 3249679 | 711694 | 36.1gb | 12.8gb | 4 | 2 | 4076098 | 332226 | 18.8gb | 5.1gb | |||
green | open | ocp2_instance_subject_fs09000000 | N4X98pwsTRGs9ZiO8ia-0A | 4 | 2 | 1837177 | 375006 | 18.2gb | 6.1gb | 4 | 2 | 2643916 | 541576 | 32.5gb | 11.2gb | 4 | 2 | 3848684 | 800020 | 14gb | 4.4gb | 4 | 2 | 4633985 | 614340 | 12.1gb | 4.1gb | |||
green | open | ocp2_contributor_fs09000003 | aBDxrr83SpaxBlqORzFdUQ | 4 | 2 | 372 | 81 | 1gb | 249.2mb | 4 | 2 | 372 | 62 | 517.4mb | 169.2mb | 4 | 2 | 372 | 62 | 391.8mb | 43.6mb | 4 | 2 | 372 | 62 | 391.8mb | 43.6mb | |||
green | open | ocp2_instance_subject_fs09000002 | 0CwHeq53T3yHkY62fDoNyw | 4 | 2 | 90 | 0 | 608.2kb | 58.2kb | 4 | 2 | 90 | 0 | 608.2kb | 58.2kb | 4 | 2 | 90 | 0 | 608.2kb | 58.2kb | 4 | 2 | 90 | 0 | 608.2kb | 58.2kb | |||
green | open | ocp2_contributor_fs09000002 | 0XXn4kfrTAG5Wpdouwyaxg | 4 | 2 | 368 | 100 | 600.2mb | 194.9mb | 4 | 2 | 369 | 383 | 1.1gb | 338.9mb | 4 | 2 | 369 | 161 | 460.8mb | 90.5mb | 4 | 2 | 369 | 161 | 460.8mb | 90.5mb | |||
green | open | ocp2_instance_subject_fs09000003 | Dsibsz4NQ2WxvQQviAO4zA | 4 | 2 | 95 | 0 | 391.4kb | 59.5kb | 4 | 2 | 95 | 0 | 391.4kb | 59.5kb | 4 | 2 | 95 | 0 | 391.4kb | 59.5kb | 4 | 2 | 95 | 0 | 391.4kb | 59.5kb |
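The snapshot values for ocp2_instance_fs09000000 can be turned into approximate indexing rates, and into the time left for contributor and instance_subject indexing once all instances were present. This is a rough sketch only: it assumes the run started at exactly 08:47:00 UTC and that sizes use 1024-based units; the helper names are illustrative. The rate for the last interval is a lower bound, because the instance index may have been complete before the 10:53:10 snapshot was taken.

```python
import re
from datetime import datetime, timedelta

UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def size_to_bytes(text: str) -> float:
    """Convert a size such as '60.8gb' or '832b' to bytes (1024-based units assumed)."""
    match = re.fullmatch(r"([\d.]+)(b|kb|mb|gb|tb)", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return float(match.group(1)) * UNITS[match.group(2)]


START = datetime(2023, 10, 12, 8, 47)    # Test #1 start, seconds assumed to be :00
FINISH = datetime(2023, 10, 12, 18, 32)  # Test #1 end

# (snapshot time, docs.count, pri.store.size) for ocp2_instance_fs09000000
SNAPSHOTS = [
    (datetime(2023, 10, 12, 9, 18, 8), 4_215_098, "20.5gb"),
    (datetime(2023, 10, 12, 10, 1, 46), 8_161_694, "35.1gb"),
    (datetime(2023, 10, 12, 10, 53, 10), 10_733_729, "46.5gb"),
]


def rates(start, snapshots):
    """Yield (interval end, documents per second) for each consecutive interval."""
    prev_time, prev_docs = start, 0
    for when, docs, _size in snapshots:
        seconds = (when - prev_time).total_seconds()
        yield when, (docs - prev_docs) / seconds
        prev_time, prev_docs = when, docs


for when, rate in rates(START, SNAPSHOTS):
    print(f"up to {when:%H:%M:%S}: {rate:,.0f} docs/s")

# Time between the first snapshot with all instances present and the end of the run.
remaining = FINISH - SNAPSHOTS[-1][0]
share = remaining / (FINISH - START)
print(f"remaining after instances were complete: {remaining} ({share:.0%} of the run)")

last_size_gib = size_to_bytes(SNAPSHOTS[-1][2]) / 1024**3
print(f"primary store size at that point: {last_size_gib:.1f} GiB")
```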
Test #2
fs09000002
health | status | index | uuid | pri | rep | docs.count | docs.deleted | store.size | pri.store.size |
green | open | ocp2_instance_fs09000002 | Pd2I7_Q8Rj2bNgSP4vSzIg | 4 | 2 | 100032 | 0 | 579mb | 190.3mb |
green | open | ocp2_authority_fs09000002 | vFMI3x8bTuqp5HkAL3jk8A | 4 | 2 | 0 | 0 | 2.4kb | 832b |
green | open | ocp2_instance_subject_fs09000002 | 3vfuyqMBRlGtmblChed_LA | 4 | 2 | 94 | 5 | 413.9kb | 136.5kb |
green | open | ocp2_contributor_fs09000002 | xgXcKSDWR3W2IYC6JiFqSA | 4 | 2 | 369 | 1238 | 696.8mb | 165.4mb |
Test #3
fs09000003
health | status | index | uuid | pri | rep | docs.count | docs.deleted | store.size | pri.store.size |
green | open | ocp2_instance_fs09000003 | YjbGBqXpRG6ap03WlRFokw | 4 | 2 | 100055 | 0 | 579.8mb | 190.8mb |
green | open | ocp2_authority_fs09000003 | RGz9CTCoT7CoT3H3ge5gIA | 4 | 2 | 0 | 0 | 2.4kb | 832b |
green | open | ocp2_contributor_fs09000003 | O4SdQSXqT_y2UapCuer6GQ | 4 | 2 | 372 | 61 | 853mb | 259.1mb |
green | open | ocp2_instance_subject_fs09000003 | 2P6hQELJTWKYqvJ5l-5jTw | 4 | 2 | 98 | 6 | 413.6kb | 136.3kb |
Test #4
fs09000000
health | status | index | uuid | pri | rep | docs.count | docs.deleted | store.size | pri.store.size |
green | open | ocp2_instance_fs09000000 | VIviDunkSkinkiZJuMjKUQ | 4 | 2 | 10733729 | 0 | 119.6gb | 39.9gb |
green | open | ocp2_authority_fs09000000 | YCS4y0GFTfy3EpbJm99X0g | 4 | 2 | 0 | 0 | 2.4kb | 832b |
green | open | ocp2_contributor_fs09000000 | 2kdpF4fZTaGtFmG_ypXq6A | 4 | 2 | 4666659 | 612503 | 12.2gb | 4.1gb |
green | open | ocp2_instance_subject_fs09000000 | q3wktVUgTsmElTq6Iv-p2A | 4 | 2 | 4097011 | 658839 | 17.8gb | 6.1gb |
Memory Usage
Test #1
mod-search: max. 70% during the first hour, 60% during the second hour
mod-inventory-storage: avg. 28%
Tests #2 and #3 (secondary tenants)
mod-search: max. 50%
mod-inventory-storage: avg. 26%
Test #4 main tenant
mod-search: max. 70% during the first hour, 60% during the second hour
mod-inventory-storage: avg. 28%
Memory consumption
Test # | Module | Initial spike |
---|---|---|
1 | mod-search | 70% |
1 | mod-inventory-storage | 28% |
2 | mod-search | 50% |
2 | mod-inventory-storage | 26% |
3 | mod-search | 50% |
3 | mod-inventory-storage | 26% |
4 | mod-search | 70% |
4 | mod-inventory-storage | 28% |
CPU Utilization
Test #1
mod-search - 44%
mod-inventory-storage - 20%
Tests #2 and #3 (secondary tenants)
mod-search - 10%
mod-inventory-storage - 10%
Test #4 main tenant
mod-search - 38%
mod-inventory-storage - 19%
CPU utilization
Test # | Module | Initial spike |
---|---|---|
1 | mod-search | 44% |
1 | mod-inventory-storage | 20% |
2 | mod-search | 10% |
2 | mod-inventory-storage | 10% |
3 | mod-search | 10% |
3 | mod-inventory-storage | 10% |
4 | mod-search | 38% |
4 | mod-inventory-storage | 19% |
RDS CPU Utilization
Test #1
Utilization max. - 72%
Tests #2 and #3 (secondary tenants)
Utilization max. - 42%
Test #4 main tenant
Utilization max. - 65%
Open Search KPIs
CPU utilization
Test #1
Max. 89%
Tests #2 and #3 (secondary tenants)
Max. 32%
Test #4 main tenant
Max. 88%
Indexing rate
[Graphs not reproduced: Test #1; Tests #2 and #3 (secondary tenants); Test #4 (main tenant)]
Indexing latency
[Graphs not reproduced: Test #1; Tests #2 and #3 (secondary tenants); Test #4 (main tenant)]
Appendix
Infrastructure
PTF environment: ocp2
- 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 2 db.r6g.xlarge database instances: one reader and one writer
- MSK ptf-kakfa-3
- 4 m5.2xlarge brokers in 2 zones
Apache Kafka version 2.8.0
EBS storage volume per broker 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3
Number of instances
ocp2 - 10M
Kafka partitioning
Topic (Poppy) | Partitions |
---|---|
inventory.item | 50 |
inventory.instance | 50 |
inventory.holdings-record | 50 |
inventory.bound-with | 50 |
inventory.authorit | 50 |
search.instance-contributor | 50 |
Modules memory and CPU parameters:
Modules | Version | Task Definition | Running Tasks | CPU | Memory | MemoryReservation | MaxMetaspaceSize | Xmx |
---|---|---|---|---|---|---|---|---|
mod-search Poppy | 3.0.0-SNAPSHOT.156 | 6 | 8 | 2048 | 2592 | 2480 | 512 | 1440 |
mod-inventory-storage Poppy | 26.1.0-SNAPSHOT.685 | 11 | 2 | 2048 | 4096 | 3690 | 512 | 3076 |
mod-search Orchid | 2.1.0-SNAPSHOT.108 | 3 | 8 | 400 | 2592 | 2480 | 1024 | 1440 |
mod-inventory-storage Orchid | 26.1.0-SNAPSHOT.644 | 3 | 2 | 1024 | 1952 | 2208 | 512 | 1440 |
Methodology/Approach
- Use PTF's "Bugfest" Poppy cluster, which has 10M records, for testing (2 times)
- Configure the environment according to the Infrastructure parameters so that it matches the setup FSE commonly uses
- Reindex on the new Poppy environment and record the indexing time and index size
- Compare the results of the sequential and parallel starts.
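The difference between the parallel and sequential starts can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used for these tests: the gateway URL is hypothetical, the endpoint path and request body follow what the mod-search README describes for a full reindex (treat them as assumptions to verify against the deployed version), and `wait_until_done` is a hypothetical callback that blocks until a tenant's reindex has finished, for example by polling docs.count.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence
from urllib.request import Request, urlopen

OKAPI_URL = "https://okapi.example.org"           # hypothetical gateway URL
REINDEX_PATH = "/search/index/inventory/reindex"  # assumed mod-search endpoint


def build_reindex_request(tenant: str, token: str) -> Request:
    """Build (but do not send) a reindex request for one tenant."""
    body = json.dumps({"recreateIndex": True, "resourceName": "instance"}).encode()
    return Request(
        OKAPI_URL + REINDEX_PATH,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-okapi-tenant": tenant,
            "x-okapi-token": token,
        },
    )


def send(request: Request) -> int:
    """Send the request and return the HTTP status code."""
    with urlopen(request) as response:
        return response.status


def reindex_parallel(
    tenants: Sequence[str],
    tokens: Dict[str, str],
    sender: Callable[[Request], int] = send,
) -> List[int]:
    """Submit the reindex request for all tenants at the same time."""
    with ThreadPoolExecutor(max_workers=len(tenants)) as pool:
        return list(
            pool.map(lambda t: sender(build_reindex_request(t, tokens[t])), tenants)
        )


def reindex_sequential(
    tenants: Sequence[str],
    tokens: Dict[str, str],
    wait_until_done: Callable[[str], None],
    sender: Callable[[Request], int] = send,
) -> List[int]:
    """Submit one tenant at a time, waiting for each to finish before the next."""
    statuses = []
    for tenant in tenants:
        statuses.append(sender(build_reindex_request(tenant, tokens[tenant])))
        wait_until_done(tenant)  # hypothetical: block until this tenant is reindexed
    return statuses
```

Passing the sender as a parameter keeps the sketch testable without a live environment; in the sequential variant the tenant order (secondary tenants first, main tenant last) is simply the order of the `tenants` argument.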