Overview

This document assesses reindexing performance on the Ramsons release. It reports the reindex duration and the size of the reindexed data.

Implementation of the feature

UXPROD-4892

Jira ticket:

PERF-984

...

  • The reindex completed in 3 hours 4 minutes (db.r6g.8xlarge) for about 13.8 million instances across all tenants. This is a new feature: the reindex was started on the central tenant and ran for all tenants in parallel. The reindex time meets the requirement (expected response time: the whole reindexing procedure should take under 6 hours).

  • Service CPU utilization was up to 60% for mod-search and 5% for mod-inventory-storage. For all other services CPU did not exceed 4%.

  • Memory utilization was stable and no memory leaks or OOM issues were observed.

  • RDS CPU utilization peaked at about 28% for db.r6g.8xlarge.
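The reported duration and the 6-hour requirement can be cross-checked from the start and end timestamps recorded for test 1. The following is a minimal sketch; the variable names are illustrative, and the rate it prints is a simple average over the whole run, not a figure measured by the test.

```python
from datetime import datetime, timedelta

# Values taken from test run 1 in this report.
START = datetime.fromisoformat("2024-10-22T13:02:35")
END = datetime.fromisoformat("2024-10-22T16:06:18")
INSTANCES = 13_777_503
REQUIREMENT = timedelta(hours=6)  # "whole reindexing procedure should take under 6 hours"

duration = END - START
# Average instances indexed per second over the whole run (derived, not measured).
avg_rate = INSTANCES / duration.total_seconds()

print(f"duration: {duration}")
print(f"within requirement: {duration < REQUIREMENT}")
print(f"average rate: {avg_rate:.0f} instances/s")
```

The exact difference is 3:03:43, which the report rounds to 3 hours 4 minutes.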

Test Runs / Results

  • Test #: 1

  • Start time: 2024-10-22T13:02:35

  • End time: 2024-10-22T16:06:18

  • Instances number: 13,777,503 *

  • Test conditions (reindexing on Ramsons release, consortium environment): in parallel, all tenants

  • Duration: 3 hours 4 minutes

  • Notes:

    • mod-search:

      • task count = 4

      • Mem Hard Limit = 2592

      • Mem Soft Limit = 2480

      • Xmx = -XX:MaxRAMPercentage=85.0

    • mod-inventory-storage: task count = 4

    • OpenSearch data node instances scaled up to r6g.4xlarge.search
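For orientation, the mod-search memory settings in the notes can be turned into an approximate maximum heap size. This is a rough sketch under two assumptions that the report does not state: that the memory limits are expressed in MiB, and that the JVM treats the hard limit as the total memory available to the container. The function name is illustrative.

```python
# Assumption: the limits from the notes are in MiB, and the JVM sees the
# hard limit as the container's total memory.
HARD_LIMIT_MIB = 2592
SOFT_LIMIT_MIB = 2480
MAX_RAM_PERCENTAGE = 85.0  # from -XX:MaxRAMPercentage=85.0


def max_heap_mib(limit_mib: float, percentage: float) -> float:
    """Approximate maximum heap size implied by -XX:MaxRAMPercentage."""
    return limit_mib * percentage / 100.0


heap = max_heap_mib(HARD_LIMIT_MIB, MAX_RAM_PERCENTAGE)
# Whatever is left under the hard limit is available to non-heap memory
# (metaspace, thread stacks, direct buffers).
non_heap_headroom = HARD_LIMIT_MIB - heap

print(f"approx. max heap: {heap:.1f} MiB")
print(f"headroom for non-heap memory: {non_heap_headroom:.1f} MiB")
```

Under these assumptions, the heap ceiling is about 2203 MiB, leaving roughly 389 MiB below the hard limit.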

...

Subrange of the reindexing process, 13:02 - 16:06 UTC. This graph was added to show in more detail the behavior that is aggregated in the graph above.

...

CPU utilization percentage for all data nodes

...

Memory usage percentage for all data nodes

Average JVM Memory Pressure

...

Maximum memory utilization (SysMemoryUtilization)

...

Appendix

Infrastructure

PTF-environment rcon

  • 9 m6g.2xlarge EC2 instances located in US East (N. Virginia), us-east-1

  • 1 db.r6g.8xlarge database instance (writer)

  • MSK - fse-tenant
    4 kafka.m7g.xlarge brokers in 2 zones

    • Apache Kafka version 3.7.x

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true

    • log.retention.minutes=480

    • default.replication.factor=3

  • OpenSearch ptf-reindex-testcluster

    • OpenSearch version 2.13

    • Data nodes

      • Availability Zone(s) - 2-AZ without standby

      • Instance type - r6g.4xlarge.search

      • Number of nodes - 4

      • EBS volume size (GiB) - 300

      • Provisioned IOPS - 3000

      • Provisioned Throughput (MiB/s) - 250

    • Dedicated master nodes
      Enabled - No

...

  • Use consortium cluster for testing (rcon in our case).

  • Configure the environment according to the Infrastructure parameters and the requirements in ticket PERF-889.

  • The reindex process was started from the JMeter script with a POST request to /search/index/instance-records/reindex/full, without any parameters, on the central tenant. For all other tenants in the consortium cluster, the reindex is performed automatically.

  • After the reindex, get the indexing time and size from GET /search/index/instance-records/reindex/status

  • The script is available at http://github.com/folio-org/perf-testing/mod-search
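The two requests described above can also be issued outside JMeter. The following Python sketch is an illustration only, not the JMeter script used in the test. The base URL, tenant ID, and token are hypothetical placeholders, the `x-okapi-tenant` and `x-okapi-token` headers are assumed to be how the environment authenticates requests, and the status response is assumed to be JSON and is returned without interpreting its fields.

```python
import json
import urllib.request

# Endpoints taken from the steps above.
REINDEX_FULL_PATH = "/search/index/instance-records/reindex/full"
REINDEX_STATUS_PATH = "/search/index/instance-records/reindex/status"


def build_request(base_url, path, method, tenant, token):
    """Build a request for one of the reindex endpoints (nothing is sent here)."""
    headers = {
        "x-okapi-tenant": tenant,  # assumed tenant header
        "x-okapi-token": token,    # assumed auth header
        "Accept": "application/json",
    }
    return urllib.request.Request(
        base_url.rstrip("/") + path, method=method, headers=headers
    )


def start_full_reindex(base_url, tenant, token):
    """POST without a body or parameters; returns the HTTP status code."""
    req = build_request(base_url, REINDEX_FULL_PATH, "POST", tenant, token)
    with urllib.request.urlopen(req) as resp:
        return resp.status


def get_reindex_status(base_url, tenant, token):
    """GET the status endpoint and return the decoded JSON body as-is."""
    req = build_request(base_url, REINDEX_STATUS_PATH, "GET", tenant, token)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A run would call `start_full_reindex(...)` once against the central tenant and then call `get_reindex_status(...)` periodically until the reported state shows completion.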