Steps:
Create a namespace with the bugfest dataset
To index a large dataset (such as bugfest) successfully, you need an environment (namespace) with the following resources:
- DB: RDS
- Kafka: Shared (AWS MSK). A built-in Kafka is also possible, with at least 50 GB of disk space
- OpenSearch: Shared (AWS OpenSearch)
Use the Project job to provision a namespace restored from an RDS snapshot.
If needed, upgrade the environment to the latest version.
Check Kafka topics
Before starting indexation, ensure that the topics used by the modules responsible for indexation have 50 partitions each.
You can check this in Kafka UI.
Topics:
- inventory.instance
- search.instance-contributor
- search.instance-subject
Pic. 1 Example "Kafka UI topics & partitions"
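The partition check can also be scripted once the partition counts are known. The sketch below is a minimal, hypothetical helper: it takes a mapping of topic name to partition count (copied from Kafka UI or fetched with whatever admin client you use) and reports the indexation topics that are missing or do not have 50 partitions. Matching by suffix is an assumption made here, because full topic names in an environment may carry an environment or tenant prefix; all function and variable names are illustrative only.

```python
# Hypothetical helper: the names and the suffix-matching rule are assumptions,
# not part of any FOLIO or Kafka tooling.
REQUIRED_PARTITIONS = 50
INDEXATION_TOPICS = (
    "inventory.instance",
    "search.instance-contributor",
    "search.instance-subject",
)


def find_misconfigured_topics(partition_counts, required=REQUIRED_PARTITIONS):
    """Return {name: problem} for indexation topics that need attention.

    partition_counts maps a full topic name to its partition count.
    A missing topic is reported under its expected suffix.
    """
    problems = {}
    for suffix in INDEXATION_TOPICS:
        # A topic matches if it is the suffix itself or ends with ".<suffix>".
        matches = {
            name: count
            for name, count in partition_counts.items()
            if name == suffix or name.endswith("." + suffix)
        }
        if not matches:
            problems[suffix] = "topic not found"
            continue
        for name, count in matches.items():
            if count != required:
                problems[name] = f"has {count} partitions, expected {required}"
    return problems


if __name__ == "__main__":
    # Example counts with a made-up "env.tenant." prefix.
    counts = {
        "env.tenant.inventory.instance": 50,
        "env.tenant.search.instance-contributor": 10,
    }
    for name, problem in find_misconfigured_topics(counts).items():
        print(f"{name}: {problem}")
```

An empty result means all three topics were found with the expected partition count.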
Scale-up OpenSearch
Indexation is a heavy process with high CPU and memory consumption, so the shared AWS OpenSearch service must be scaled up before indexation starts:
r6g.xlarge → r6g.2xlarge
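If the domain is resized through the AWS API rather than the console, the change amounts to a single configuration update. The sketch below only builds the request arguments. The domain name is a placeholder, and the `.search` suffix follows the naming AWS uses for OpenSearch Service instance types. Sending the request with boto3's `update_domain_config` appears only in a comment, as an assumption about your tooling rather than a description of how the shared service is actually managed.

```python
def build_resize_request(domain_name, instance_type="r6g.2xlarge"):
    """Build keyword arguments for an OpenSearch Service domain config update."""
    if not domain_name:
        raise ValueError("domain_name must not be empty")
    # OpenSearch Service instance types carry a ".search" suffix,
    # e.g. "r6g.2xlarge.search".
    if not instance_type.endswith(".search"):
        instance_type += ".search"
    return {
        "DomainName": domain_name,
        "ClusterConfig": {"InstanceType": instance_type},
    }


if __name__ == "__main__":
    request = build_resize_request("my-shared-domain")  # placeholder name
    print(request)
    # Assumed usage with boto3 (requires AWS credentials and permissions):
    #   import boto3
    #   boto3.client("opensearch").update_domain_config(**request)
```

Because the shared service may be used by other environments, it is worth agreeing on the resize with whoever owns it before applying the change.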
Scale-up backend modules
For better performance, scale up the following backend modules.
You can do this in Rancher, in the Deployment section.
Modules:
- mod-search (1 → 4)
- mod-inventory-storage (1 → 2)
Pic. 2 Example "Backend module scale up"
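Rancher is the route described here. For those who prefer the command line, the same scaling could be expressed with `kubectl scale`. The sketch below only builds and prints the commands; it assumes the Kubernetes deployments are named exactly after the modules and uses a placeholder namespace, so check both against your environment before running anything.

```python
import shlex

# Target replica counts taken from the module list above.
TARGET_REPLICAS = {
    "mod-search": 4,
    "mod-inventory-storage": 2,
}


def build_scale_commands(namespace, replicas=TARGET_REPLICAS):
    """Return one `kubectl scale` argument list per module."""
    return [
        ["kubectl", "scale", "deployment", module,
         f"--replicas={count}", "--namespace", namespace]
        for module, count in replicas.items()
    ]


if __name__ == "__main__":
    # "my-namespace" is a placeholder; deployment names are assumed
    # to match the module names.
    for command in build_scale_commands("my-namespace"):
        print(shlex.join(command))
```

Printing the commands first, instead of executing them directly, leaves room to review them before they touch the cluster.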