Overview
- The purpose of this document is to identify how ECS-related changes affect reindexing in a multi-tenant environment, and to calculate the reindex time and index size.
Recommendations & Jiras
- Original ticket to test: PERF-635.
- Additional info: ElasticSearch Reindex Performance Recommendations
Test Summary
Test Runs / Results
Test # | DB instances | Test conditions (reindexing on Poppy release) | Duration | Notes |
---|---|---|---|---|
1. 2023-10-12 08:47 - 18:32 UTC | | In parallel: 3 tenants | 9 hours 45 min | |
2. 2023-10-13 09:04 - 09:07 UTC | 100032 | Sequential: fs09000002 | 3 min | |
3. 2023-10-13 09:15 - 09:18 UTC | 100055 | Sequential: fs09000003 | 3 min | |
4. 2023-10-13 09:29 - 19:37 UTC | 10,733,729 | Sequential: fs09000000 | 10 hours 8 min | |
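The durations in the table can be cross-checked from the start and end timestamps, and an average rate derived from the instance counts. The following is a minimal sketch, not part of the original test tooling: the names `RUNS`, `duration_seconds` and `throughput` are hypothetical, and the rate is only a rough figure because each run also builds the contributor and subject indices while the count covers instance records only.

```python
from datetime import datetime

# Start/end timestamps (UTC) and instance counts copied from the table above.
# Test #1 is omitted because the table gives no record count for it.
RUNS = {
    2: ("2023-10-13 09:04", "2023-10-13 09:07", 100_032),
    3: ("2023-10-13 09:15", "2023-10-13 09:18", 100_055),
    4: ("2023-10-13 09:29", "2023-10-13 19:37", 10_733_729),
}

FMT = "%Y-%m-%d %H:%M"


def duration_seconds(start: str, end: str) -> float:
    """Wall-clock time between two timestamps, in seconds."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


def throughput(start: str, end: str, records: int) -> float:
    """Average instance records per second over the whole run."""
    return records / duration_seconds(start, end)


for test_no, (start, end, records) in RUNS.items():
    minutes = duration_seconds(start, end) / 60
    rate = throughput(start, end, records)
    print(f"Test #{test_no}: {minutes:.0f} min, ~{rate:.0f} instance records/s")
```

For Test #4 this works out to 608 minutes (10 hours 8 min) and roughly 294 instance records per second. The rates for Tests #2 and #3 are coarse, since their timestamps are only minute-accurate over a 3-minute run.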
Indexing size
Test #1: Interim results captured while reindexing the 3 tenants in parallel. About 30 minutes after the start, all instances on the secondary tenants were already indexed; only ocp2_contributor_fs09000002 was still missing 1 record. About 2 hours after the start, docs.count for instances on the main tenant had reached the full record count, but indexing of ocp2_contributor_fs09000000 and ocp2_instance_subject_fs09000000 was still in progress.
We may therefore conclude that most of the reindexing time is spent on contributor indexing (at least 7 hours out of 9 hours 45 min).
ALL tenants. Each row lists four snapshots, left to right (pri | rep | docs.count | docs.deleted | store.size | pri.store.size per snapshot): after start at Thu, 12 Oct 2023 09:18:08 GMT, 10:01:46 GMT and 10:53:10 GMT; after finish at Fri, 13 Oct 2023 08:07:06 GMT.
health | status | index | uuid | pri | rep | docs.count | docs.deleted | store.size | pri.store.size | pri | rep | docs.count | docs.deleted | store.size | pri.store.size | pri | rep | docs.count | docs.deleted | store.size | pri.store.size | pri | rep | docs.count | docs.deleted | store.size | pri.store.size | |||
green | open | ocp2_instance_fs09000000 | NyClr4AYR9CUqEPeQOuzkg | 4 | 2 | 4215098 | 0 | 60.8gb | 20.5gb | 4 | 2 | 8161694 | 0 | 105.5gb | 35.1gb | 4 | 2 | 10733729 | 0 | 140.1gb | 46.5gb | 4 | 2 | 10733729 | 0 | 124.2gb | 41.3gb | |||
green | open | ocp2_instance_fs09000003 | QoBX9DqsR7S8XLJ-VpFPUA | 4 | 2 | 100055 | 3 | 554.8mb | 181.5mb | 4 | 2 | 100055 | 3 | 554.8mb | 181.5mb | 4 | 2 | 100055 | 3 | 554.8mb | 181.5mb | 4 | 2 | 100055 | 3 | 554.8mb | 181.5mb | |||
green | open | ocp2_instance_fs09000002 | 86NVpQwqSAWeUmmd1506Gg | 4 | 2 | 100032 | 0 | 550.3mb | 180mb | 4 | 2 | 100032 | 0 | 550.3mb | 180mb | 4 | 2 | 100032 | 0 | 550.3mb | 180mb | 4 | 2 | 100032 | 0 | 550.3mb | 180mb | |||
green | open | .kibana_1 | 9Q4bvyKCRpiiwcPNcLFs9g | 1 | 2 | 1 | 0 | 15.5kb | 5.1kb | 1 | 2 | 1 | 0 | 15.5kb | 5.1kb | 1 | 2 | 1 | 0 | 15.5kb | 5.1kb | 1 | 2 | 1 | 0 | 15.5kb | 5.1kb | |||
green | open | ocp2_authority_fs09000003 | RGz9CTCoT7CoT3H3ge5gIA | 4 | 2 | 0 | 0 | 2.4kb | 832b | 4 | 2 | 0 | 0 | 2.4kb | 832b | 4 | 2 | 0 | 0 | 2.4kb | 832b | 4 | 2 | 0 | 0 | 2.4kb | 832b | |||
green | open | .opensearch-observability | kPAJ8TqaR06AQZFYekHeyA | 1 | 2 | 0 | 0 | 624b | 208b | 1 | 2 | 0 | 0 | 624b | 208b | 1 | 2 | 0 | 0 | 624b | 208b | 1 | 2 | 0 | 0 | 624b | 208b | |||
green | open | ocp2_authority_fs09000002 | vFMI3x8bTuqp5HkAL3jk8A | 4 | 2 | 0 | 0 | 2.4kb | 832b | 4 | 2 | 0 | 0 | 2.4kb | 832b | 4 | 2 | 0 | 0 | 2.4kb | 832b | 4 | 2 | 0 | 0 | 2.4kb | 832b | |||
green | open | ocp2_authority_fs09000000 | YCS4y0GFTfy3EpbJm99X0g | 4 | 2 | 0 | 0 | 2.4kb | 832b | 4 | 2 | 0 | 0 | 2.4kb | 832b | 4 | 2 | 0 | 0 | 2.4kb | 832b | 4 | 2 | 0 | 0 | 2.4kb | 832b | |||
green | open | ocp2_contributor_fs09000000 | rmhGKwfISJiayaxyO8C03w | 4 | 2 | 1967140 | 192787 | 8.1gb | 2.5gb | 4 | 2 | 3057421 | 640057 | 11.8gb | 4.2gb | 4 | 2 | 3249679 | 711694 | 36.1gb | 12.8gb | 4 | 2 | 4076098 | 332226 | 18.8gb | 5.1gb | |||
green | open | ocp2_instance_subject_fs09000000 | N4X98pwsTRGs9ZiO8ia-0A | 4 | 2 | 1837177 | 375006 | 18.2gb | 6.1gb | 4 | 2 | 2643916 | 541576 | 32.5gb | 11.2gb | 4 | 2 | 3848684 | 800020 | 14gb | 4.4gb | 4 | 2 | 4633985 | 614340 | 12.1gb | 4.1gb | |||
green | open | ocp2_contributor_fs09000003 | aBDxrr83SpaxBlqORzFdUQ | 4 | 2 | 372 | 81 | 1gb | 249.2mb | 4 | 2 | 372 | 62 | 517.4mb | 169.2mb | 4 | 2 | 372 | 62 | 391.8mb | 43.6mb | 4 | 2 | 372 | 62 | 391.8mb | 43.6mb | |||
green | open | ocp2_instance_subject_fs09000002 | 0CwHeq53T3yHkY62fDoNyw | 4 | 2 | 90 | 0 | 608.2kb | 58.2kb | 4 | 2 | 90 | 0 | 608.2kb | 58.2kb | 4 | 2 | 90 | 0 | 608.2kb | 58.2kb | 4 | 2 | 90 | 0 | 608.2kb | 58.2kb | |||
green | open | ocp2_contributor_fs09000002 | 0XXn4kfrTAG5Wpdouwyaxg | 4 | 2 | 368 | 100 | 600.2mb | 194.9mb | 4 | 2 | 369 | 383 | 1.1gb | 338.9mb | 4 | 2 | 369 | 161 | 460.8mb | 90.5mb | 4 | 2 | 369 | 161 | 460.8mb | 90.5mb | |||
green | open | ocp2_instance_subject_fs09000003 | Dsibsz4NQ2WxvQQviAO4zA | 4 | 2 | 95 | 0 | 391.4kb | 59.5kb | 4 | 2 | 95 | 0 | 391.4kb | 59.5kb | 4 | 2 | 95 | 0 | 391.4kb | 59.5kb | 4 | 2 | 95 | 0 | 391.4kb | 59.5kb |
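The snapshot comparison described for Test #1 can be automated. The sketch below assumes the snapshots were collected as whitespace-separated text in the `_cat/indices?v` layout shown in the table; the function names and the two sample snapshots (a subset of the 10:53:10 and after-finish columns for the main tenant) are illustrative only.

```python
def parse_cat_indices(text: str) -> dict[str, dict[str, str]]:
    """Parse whitespace-separated `_cat/indices?v` output into {index: row}."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    rows = {}
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        rows[row["index"]] = row
    return rows


def still_growing(earlier: dict, later: dict) -> list[str]:
    """Names of indices whose docs.count changed between two snapshots."""
    return sorted(
        name
        for name, row in later.items()
        if name in earlier
        and int(row["docs.count"]) != int(earlier[name]["docs.count"])
    )


# Sample rows copied from the table above (main tenant only).
SNAPSHOT_1053 = """
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open ocp2_instance_fs09000000 NyClr4AYR9CUqEPeQOuzkg 4 2 10733729 0 140.1gb 46.5gb
green open ocp2_contributor_fs09000000 rmhGKwfISJiayaxyO8C03w 4 2 3249679 711694 36.1gb 12.8gb
green open ocp2_instance_subject_fs09000000 N4X98pwsTRGs9ZiO8ia-0A 4 2 3848684 800020 14gb 4.4gb
"""

SNAPSHOT_FINISH = """
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open ocp2_instance_fs09000000 NyClr4AYR9CUqEPeQOuzkg 4 2 10733729 0 124.2gb 41.3gb
green open ocp2_contributor_fs09000000 rmhGKwfISJiayaxyO8C03w 4 2 4076098 332226 18.8gb 5.1gb
green open ocp2_instance_subject_fs09000000 N4X98pwsTRGs9ZiO8ia-0A 4 2 4633985 614340 12.1gb 4.1gb
"""

growing = still_growing(
    parse_cat_indices(SNAPSHOT_1053), parse_cat_indices(SNAPSHOT_FINISH)
)
print(growing)
```

On this sample the instance index is already complete at 10:53:10, while the contributor and instance_subject indices keep growing until the end of the run. This matches the observation above.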
Test #2
fs09000002
health | status | index | uuid | pri | rep | docs.count | docs.deleted | store.size | pri.store.size |
green | open | ocp2_instance_fs09000002 | Pd2I7_Q8Rj2bNgSP4vSzIg | 4 | 2 | 100032 | 0 | 579mb | 190.3mb |
green | open | ocp2_authority_fs09000002 | vFMI3x8bTuqp5HkAL3jk8A | 4 | 2 | 0 | 0 | 2.4kb | 832b |
green | open | ocp2_instance_subject_fs09000002 | 3vfuyqMBRlGtmblChed_LA | 4 | 2 | 94 | 5 | 413.9kb | 136.5kb |
green | open | ocp2_contributor_fs09000002 | xgXcKSDWR3W2IYC6JiFqSA | 4 | 2 | 369 | 1238 | 696.8mb | 165.4mb |
Test #3
fs09000003
health | status | index | uuid | pri | rep | docs.count | docs.deleted | store.size | pri.store.size |
green | open | ocp2_instance_fs09000003 | YjbGBqXpRG6ap03WlRFokw | 4 | 2 | 100055 | 0 | 579.8mb | 190.8mb |
green | open | ocp2_authority_fs09000003 | RGz9CTCoT7CoT3H3ge5gIA | 4 | 2 | 0 | 0 | 2.4kb | 832b |
green | open | ocp2_contributor_fs09000003 | O4SdQSXqT_y2UapCuer6GQ | 4 | 2 | 372 | 61 | 853mb | 259.1mb |
green | open | ocp2_instance_subject_fs09000003 | 2P6hQELJTWKYqvJ5l-5jTw | 4 | 2 | 98 | 6 | 413.6kb | 136.3kb |
Test #4
fs09000000
health | status | index | uuid | pri | rep | docs.count | docs.deleted | store.size | pri.store.size |
green | open | ocp2_instance_fs09000000 | VIviDunkSkinkiZJuMjKUQ | 4 | 2 | 10733729 | 0 | 119.6gb | 39.9gb |
green | open | ocp2_authority_fs09000000 | YCS4y0GFTfy3EpbJm99X0g | 4 | 2 | 0 | 0 | 2.4kb | 832b |
green | open | ocp2_contributor_fs09000000 | 2kdpF4fZTaGtFmG_ypXq6A | 4 | 2 | 4666659 | 612503 | 12.2gb | 4.1gb |
green | open | ocp2_instance_subject_fs09000000 | q3wktVUgTsmElTq6Iv-p2A | 4 | 2 | 4097011 | 658839 | 17.8gb | 6.1gb |
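When reading the size columns, note that store.size covers primaries plus replicas, so with rep = 2 it should be close to three times pri.store.size once indexing and segment merging have settled. The following sketch checks that ratio for the Test #4 rows; `to_bytes` and `copies` are hypothetical helper names, and the unit table assumes the binary (1024-based) units used by the `_cat` API.

```python
import re

# Binary multipliers for the size suffixes that appear in the tables.
UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def to_bytes(size: str) -> float:
    """Convert a size string such as '119.6gb' or '832b' into bytes."""
    match = re.fullmatch(r"([\d.]+)(b|kb|mb|gb|tb)", size.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised size: {size!r}")
    return float(match.group(1)) * UNITS[match.group(2)]


def copies(store_size: str, pri_store_size: str) -> float:
    """Ratio of total store size to primary store size."""
    return to_bytes(store_size) / to_bytes(pri_store_size)


# store.size / pri.store.size pairs copied from the Test #4 table above.
TEST_4 = {
    "ocp2_instance_fs09000000": ("119.6gb", "39.9gb"),
    "ocp2_contributor_fs09000000": ("12.2gb", "4.1gb"),
    "ocp2_instance_subject_fs09000000": ("17.8gb", "6.1gb"),
}

for index, (total, primary) in TEST_4.items():
    print(f"{index}: {copies(total, primary):.2f} copies")
```

All three ratios come out just under 3, as expected for 1 primary plus 2 replicas. Ratios that deviate noticeably, as in some mid-run columns of the Test #1 table, suggest the index was still being written or merged when the snapshot was taken.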
Memory Usage
Test #1
mod-search: max. 70% during the first hour, 60% during the second hour
mod-inventory-storage: avg. 28%
Tests #2, 3 (secondary tenants)
mod-search: max. 50%
mod-inventory-storage: avg. 26%
Test #4 (main tenant)
mod-search: max. 70% during the first hour, 60% during the second hour
mod-inventory-storage: avg. 28%
Memory consumption
Test # | Module | Initial spike |
---|---|---|
1 | mod-search | 70% |
1 | mod-inventory-storage | 28% |
2 | mod-search | 50% |
2 | mod-inventory-storage | 26% |
3 | mod-search | 50% |
3 | mod-inventory-storage | 26% |
4 | mod-search | 70% |
4 | mod-inventory-storage | 28% |
CPU Utilization
Test #1
mod-search - 44%
mod-inventory-storage - 20%
Tests #2, 3 (secondary tenants)
mod-search - 10%
mod-inventory-storage - 10%
Test #4 main tenant
mod-search - 10%
mod-inventory-storage - 10%
CPU utilization
Test # | Module | Initial spike |
---|---|---|
1 | mod-search | 44% |
1 | mod-inventory-storage | 20% |
2 | mod-search | 10% |
2 | mod-inventory-storage | 10% |
3 | mod-search | 10% |
3 | mod-inventory-storage | 10% |
4 | mod-search | 38% |
4 | mod-inventory-storage | 19% |
RDS CPU Utilization
Test #1
Utilization max. - 72%
Tests #2, 3 (secondary tenants)
Utilization max. - 42%
Test #4
Utilization max. - 65%
OpenSearch KPIs
Indexing rate Poppy
Test #2
Duration: 9 hours 47 min
Test #3,4
Test #5
Test #6
Indexing rate Orchid
Duration: 11 hours 10 min
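The two durations reported above (Poppy: 9 hours 47 min, Orchid: 11 hours 10 min) can be compared directly. This is a small sketch of the arithmetic, assuming both runs used a comparable data set and environment; `parse_duration` is a hypothetical helper for the duration format used in this report.

```python
import re
from datetime import timedelta


def parse_duration(text: str) -> timedelta:
    """Parse strings such as '9 hours 47 min' or '3 min' into a timedelta."""
    match = re.fullmatch(r"(?:(\d+)\s*hours?)?\s*(?:(\d+)\s*min)?", text.strip())
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes)


poppy = parse_duration("9 hours 47 min")
orchid = parse_duration("11 hours 10 min")

saved = orchid - poppy
relative = saved / orchid  # dividing two timedeltas yields a float

print(f"Poppy run is {saved} shorter ({relative:.1%} of the Orchid duration)")
```

The difference is 1 hour 23 min, about 12% of the Orchid duration.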
Indexing latency Poppy
Test #2
After 7 hours of indexing, latency grew from 35 ms to 55 ms within 1 minute
Max.: 79.2 ms
Avg.: 39.6 ms
Test #3,4
Test #5
Test #6
Indexing latency Orchid
Appendix
Infrastructure
Tests #2,3,4
PTF - environment ocp2
- 8 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 2 db.r6.xlarge database instances: one reader and one writer
- MSK ptf-kakfa-3
- 4 m5.2xlarge brokers in 2 zones
Apache Kafka version 2.8.0
EBS storage volume per broker 310 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3
Tests #5,6
PTF - environment ocp2
- 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 2 db.r6g.xlarge database instances: one reader and one writer
- MSK ptf-kakfa-3
- 4 m5.2xlarge brokers in 2 zones
Apache Kafka version 2.8.0
EBS storage volume per broker 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3
Number of instances
ocp2 - 10M
Kafka partitioning
Topic | Orchid | Poppy |
---|---|---|
inventory.item | 50 | 50 |
inventory.instance | 50 | 50 |
inventory.holdings-record | 50 | 50 |
inventory.bound-with | 50 | 50 |
inventory.authorit | 50 | 50 |
search.instance-contributor | 50 | 50 |
Modules memory and CPU parameters: tests #2,3,4
Modules | Version | Task Definition | Running Tasks | CPU | Memory | MemoryReservation | MaxMetaspaceSize | Xmx |
---|---|---|---|---|---|---|---|---|
mod-search Poppy | 3.0.0-SNAPSHOT.151 | 6 | 8 | 2048 | 2592 | 2480 | 512 | 1440 |
mod-inventory-storage Poppy | 26.1.0-SNAPSHOT.685 | 11 | 2 | 2048 | 4096 | 3690 | 512 | 3076 |
mod-search Orchid | 2.1.0-SNAPSHOT.108 | 3 | 8 | 400 | 2592 | 2480 | 1024 | 1440 |
mod-inventory-storage Orchid | 26.1.0-SNAPSHOT.644 | 3 | 2 | 1024 | 1952 | 2208 | 512 | 1440 |
Modules memory and CPU parameters: tests #5,6
Modules | Version | Task Definition | Running Tasks | CPU | Memory | MemoryReservation | MaxMetaspaceSize | Xmx |
---|---|---|---|---|---|---|---|---|
mod-search Poppy | 3.0.0-SNAPSHOT.156 | 13 | 8 | 2048 | 2592 | 2480 | 512 | 1440 |
mod-inventory-storage Poppy | 26.1.0-SNAPSHOT.685 | 11 | 2 | 2048 | 4096 | 3690 | 512 | 3076 |
Methodology/Approach
- Test on PTF's "Bugfest" Poppy cluster, which has 10M records (2 times)
- Configure the environment according to the Infrastructure parameters so that it matches the setup FSE commonly uses
- Reindex on the new Poppy environment and record the indexing time and index size
- Compare with results of https://issues.folio.org/browse/PERF-430