Overview

The primary objective of testing was to evaluate the performance of the Baseline MCPT Environment configuration while attempting to optimize costs by adjusting instance types and reducing the number of instances. The tests were designed to compare the performance outcomes across different configurations, including variations in instance types and counts within multiple Auto Scaling Groups (ASGs). By systematically modifying these variables, the goal was to maintain or improve the performance observed in the baseline configuration while achieving cost efficiency.

...

Test #	Description	Status
Test 1	CPU=2 was set for all services. Two Auto Scaling Groups were used: 3 c7g.large instances for the okapi service and 10 r7g.xlarge instances for the other services.	Completed
Test 2	CPU=2 was set for all services. Two Auto Scaling Groups were used: 3 c7g.large instances for the okapi service and 10 m6g.2xlarge instances for the other modules.	Completed
Test 3	CPU=2 was set for all services. Two Auto Scaling Groups were used: 3 c7g.large instances for the okapi service and 10 m6g.2xlarge instances for the other services (Repeat Test 2).	Completed
Test 4	CPU=2 was set for all modules. Two Auto Scaling Groups were used: 6 c7g.large instances (3 for the okapi service and 3 for the mod-courses, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, and mod-data-import services) and 10 r7g.xlarge instances for the other modules.	Completed
Test 5	CPU=2 was set for all modules. Two Auto Scaling Groups were used: 6 c7g.large instances (3 for the okapi service and 3 for the mod-courses, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, and mod-data-import services) and 10 r7g.xlarge instances for the other modules (Repeat Test 4).	Completed
Test 6	CPU=2 was set for all modules. Two Auto Scaling Groups were used: 6 c7g.large instances (3 for the okapi service and 3 for the mod-courses, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, and mod-data-import services) and 10 r7g.xlarge instances for the other modules. The test was run without the DE_Exporting MARC Bib records custom workflow, DE_Exporting MARC Bib records workflow, and OPIH_/oai/records workflow (Repeat Test 5).	Completed
Test 7	CPU=2 was set for all modules. Two Auto Scaling Groups were used: 6 c7g.large instances (3 for the okapi service and 3 for the mod-courses, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, and mod-data-import services) and 10 r7g.xlarge instances for the other modules. The test was run without the DE_Exporting MARC Bib records custom workflow, DE_Exporting MARC Bib records workflow, and OPIH_/oai/records workflow (Repeat Test 6).	Completed

Test Results

This table contains durations for all Workflows.

...

Service CPU Utilization

Here okapi used 42k% of the CPU allocated by the CPU=2 parameter.

...

No module shows any sign of a memory leak; memory usage shows a stable trend.

Kafka metrics

DB CPU Utilization

DB CPU averaged 99% during the ERW: Exporting Receiving Information workflow.

...

No module shows any sign of a memory leak; memory usage shows a stable trend.

Kafka metrics

DB CPU Utilization

DB CPU averaged 99% during the ERW: Exporting Receiving Information workflow.

...

No module shows any sign of a memory leak; memory usage shows a stable trend.

Kafka metrics

DB CPU Utilization

DB CPU peaked at 99%.

...

Service CPU Utilization

Here okapi used 43k% of the CPU allocated by the CPU=2 parameter.

...

No module shows any sign of a memory leak; memory usage shows a stable trend.

Kafka metrics

DB CPU Utilization

DB CPU was 99%.

...

Service CPU Utilization

Here okapi used 46k% of the CPU allocated by the CPU=2 parameter.

...

No module shows any sign of a memory leak; memory usage shows a stable trend.

Kafka metrics

DB CPU Utilization

DB CPU was 99%.

...

Service CPU Utilization

Here okapi used 46k% of the CPU allocated by the CPU=2 parameter.

...

No module shows any sign of a memory leak; memory usage shows a stable trend.

Kafka metrics

DB CPU Utilization

DB CPU was 99%.

...

Test №7 - 8

Introduction:

Test 7: The Baseline MCPT Environment configuration was applied with CPU=2 set for all modules. Two Auto Scaling Groups were used: 6 c7g.large instances (3 for the okapi service and 3 for the mod-courses, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, and mod-data-import services) and 10 r7g.xlarge instances for the other modules. The Fixed Load (average case) MOBIUS test was run without the DE_Exporting MARC Bib records custom workflow, DE_Exporting MARC Bib records workflow, and OPIH_/oai/records workflow (Repeat Test 6).
Test 8: The Baseline MCPT Environment configuration was applied with CPU=2 set for all modules. Two Auto Scaling Groups were used: 6 c7g.large instances (3 for the okapi service and 3 for the mod-courses, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, and mod-data-import services) and 10 r7g.xlarge instances for the other modules. The Fixed Load (average case) MOBIUS test was run.

Objective: The objective of these tests was to validate the consistency of performance observed in Test 5 and Test 6. This was achieved by repeating the same configuration.

Results: We see performance improvements for Test 8.

Instance CPU Utilization

Service CPU Utilization

Here okapi used 48k% of the CPU allocated by the CPU=2 parameter.
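To put a figure such as 48k% in absolute terms, the sketch below converts a utilization percentage into vCPUs. It assumes the services run on Amazon ECS, where utilization is reported relative to the CPU units reserved for the task and 1024 units equal one vCPU; the report does not state this explicitly, and the function name is illustrative.

```python
# Assumption: ECS-style accounting, where 1024 CPU units equal one vCPU and
# utilization is measured against the CPU units reserved for the task.
CPU_UNITS_PER_VCPU = 1024


def used_vcpus(utilization_percent: float, reserved_units: int) -> float:
    """Convert utilization relative to reserved CPU units into vCPUs used."""
    used_units = reserved_units * utilization_percent / 100
    return used_units / CPU_UNITS_PER_VCPU


# okapi in Tests 7-8: 48k% of a CPU=2 reservation.
okapi_vcpus = used_vcpus(48_000, reserved_units=2)
print(f"{okapi_vcpus:.2f} vCPU")
```

Under that assumption, the very large percentage mostly reflects how small the CPU=2 reservation is, rather than an unusually heavy load on okapi.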

Service Memory Utilization

No module shows any sign of a memory leak; memory usage shows a stable trend.

Kafka metrics

DB CPU Utilization

DB CPU was 99%.

DB Connections

Max number of DB connections was 6010.

DB load

Top SQL-queries

Test №9 - 10

Introduction:

Test 9: The Baseline MCPT Environment configuration was applied with CPU=2 set for all modules. Two Auto Scaling Groups were used: 6 c7g.large instances (3 for the okapi service and 3 for the mod-courses, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, and mod-data-import services) and 10 r7g.xlarge instances for the other modules. The Fixed Load (average case) MOBIUS test was run without the DE_Exporting MARC Bib records custom workflow, DE_Exporting MARC Bib records workflow, and OPIH_/oai/records workflow (Repeat Test 7 after terminating the r7g.xlarge instances).
Test 10: The Baseline MCPT Environment configuration was applied with CPU=2 set for all modules. Two Auto Scaling Groups were used: 6 c7g.large instances (3 for the okapi service and 3 for the mod-courses, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, and mod-data-import services) and 10 r7g.xlarge instances for the other modules. The Fixed Load (average case) MOBIUS test was run (Repeat Test 8 after terminating the r7g.xlarge instances).

Objective: The objective of these tests was to validate the consistency of performance observed in Test 7 and Test 8. This was achieved by repeating the same configuration and applying new random task locations per instance after terminating the r7g.xlarge instances.

Results: Performance was 50% worse for several workflows after terminating the r7g.xlarge instances and applying new random task locations per instance.
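As a minimal sketch of how such a degradation figure can be derived from two workflow durations, the helper below computes the relative change against a reference run. The function name and the sample durations are hypothetical and are not taken from the test results.

```python
def percent_change(baseline_s: float, current_s: float) -> float:
    """Relative change in duration versus the baseline, in percent (positive = slower)."""
    if baseline_s <= 0:
        raise ValueError("baseline duration must be positive")
    return (current_s - baseline_s) / baseline_s * 100


# Hypothetical durations in seconds, chosen only to illustrate the arithmetic.
example = percent_change(baseline_s=120.0, current_s=180.0)
print(f"{example:+.0f}%")
```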

Instance CPU Utilization

Service CPU Utilization

Here okapi used 48k% of the CPU allocated by the CPU=2 parameter.

Service Memory Utilization

No module shows any sign of a memory leak; memory usage shows a stable trend.

Kafka metrics

DB CPU Utilization

DB CPU was 99%.

DB Connections

Max number of DB connections was 5150.

DB load

Top SQL-queries

...

Test №11

Introduction: The Baseline MCPT Environment configuration was applied with CPU=2 set for all services. Two Auto Scaling Groups were used: 3 c7g.large instances for the okapi service and 11 r7g.xlarge instances for the other services. The Fixed Load (average case) MOBIUS test was run.

Objective: The goal of this test was to replace three c7g.large instances with one additional r7g.xlarge instance for all services, while retaining three c7g.large instances specifically for the okapi service.

Results: The performance was worse compared to when we used the additional three c7g.large instances for services with smaller workloads.

Instance CPU Utilization

Service CPU Utilization

Here okapi used 45k% of the CPU allocated by the CPU=2 parameter.

Service Memory Utilization

No module shows any sign of a memory leak; memory usage shows a stable trend.

Kafka metrics

DB CPU Utilization

DB CPU was 99%.

DB Connections

Max number of DB connections was 5900.

DB load

Top SQL-queries

Test №12 - 13

Introduction: The Baseline MCPT Environment configuration was applied with CPU=2 set for all services. Two Auto Scaling Groups were used: 3 c7g.large instances for the okapi service and 11 r7g.xlarge instances for the other services. The Fixed Load (average case) MOBIUS test was run.

Objective: The objective of these tests was to validate the performance degradation observed in Test 11 by repeating the same configuration.

Results: We confirmed performance degradation with this configuration.

Instance CPU Utilization

Service CPU Utilization

Here okapi used 43k% of the CPU allocated by the CPU=2 parameter.

Service Memory Utilization

No module shows any sign of a memory leak; memory usage shows a stable trend.

Kafka metrics

DB CPU Utilization

DB CPU was 99%.

DB Connections

Max number of DB connections was 6200.

DB load

Top SQL-queries

...

Test №14

Introduction: The Baseline MCPT Environment configuration was applied with CPU=2 set for all services. Two Auto Scaling Groups were used: 3 c7g.large instances for the okapi service and 11 r7g.xlarge instances for the other services. The Fixed Load (average case) MOBIUS test was run.

Objective: The objective of this test was to validate the performance degradation observed in Test 11 by repeating the same configuration.

Results: We confirmed performance degradation with this configuration.

Instance CPU Utilization

Service CPU Utilization

Here okapi used 45k% of the CPU allocated by the CPU=2 parameter.

Service Memory Utilization

No module shows any sign of a memory leak; memory usage shows a stable trend.

Kafka metrics

DB CPU Utilization

DB CPU was 99%.

DB Connections

Max number of DB connections was 5900.

DB load

Top SQL-queries

...

Test №15

Introduction: The Baseline MCPT Environment configuration was applied with CPU=2 set for all services. Two Auto Scaling Groups were used: 3 c7g.large instances for the okapi and mod-authtoken services and 11 r7g.xlarge instances for the other services. The Fixed Load (average case) MOBIUS test was run.

Objective:

Results: We confirmed performance degradation with this configuration.

Instance CPU Utilization

Service CPU Utilization

Here okapi used 38k% of the CPU allocated by the CPU=2 parameter.

Service Memory Utilization

No module shows any sign of a memory leak; memory usage shows a stable trend.

Kafka metrics

DB CPU Utilization

DB CPU was 99%.

DB Connections

Max number of DB connections was 5150.

DB load

Top SQL-queries

...

Test №16 - 17

Introduction: Test 16: The Baseline MCPT Environment configuration was applied with CPU=2 set for all services. Two Auto Scaling Groups were used: 3 c7g.large instances for the okapi service and 5 r7g.2xlarge instances for the other services. The Fixed Load (average case) MOBIUS test was run. Test 17: Repeat of Test 16 with the same environment configuration.

Objective: The objective of these tests was to reduce the number of instances in the main group from 10 to 5 while using higher-capacity r7g.2xlarge instances instead of r7g.xlarge.
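For a rough sense of what each fleet provides, the sketch below totals vCPUs and memory per configuration. The per-instance figures are the published AWS specifications for these instance types; the fleet labels and grouping are illustrative, and no pricing is included because the report does not state any.

```python
# vCPUs and memory (GiB) per instance, from the public AWS instance type specifications.
INSTANCE_SPECS = {
    "c7g.large": (2, 4),
    "m6g.2xlarge": (8, 32),
    "r7g.xlarge": (4, 32),
    "r7g.2xlarge": (8, 64),
}

# Fleets described in this report: {instance type: count}. Labels are illustrative.
FLEETS = {
    "Baseline QCP1": {"m6g.2xlarge": 10},
    "Tests 7-10": {"c7g.large": 6, "r7g.xlarge": 10},
    "Tests 11-15": {"c7g.large": 3, "r7g.xlarge": 11},
    "Tests 16-17": {"c7g.large": 3, "r7g.2xlarge": 5},
}


def fleet_capacity(fleet: dict[str, int]) -> tuple[int, int]:
    """Return total (vCPUs, memory in GiB) for a fleet."""
    vcpus = sum(INSTANCE_SPECS[kind][0] * count for kind, count in fleet.items())
    memory = sum(INSTANCE_SPECS[kind][1] * count for kind, count in fleet.items())
    return vcpus, memory


for name, fleet in FLEETS.items():
    vcpus, memory = fleet_capacity(fleet)
    print(f"{name}: {vcpus} vCPU, {memory} GiB")
```

The Test 16-17 fleet has the fewest vCPUs of the four while keeping memory close to the 10-instance r7g.xlarge fleet it replaces.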

Results: We observed performance improvements in Test 16 compared to Test 8, and by repeating the same configuration in Test 17, we confirmed these performance improvements.

Instance CPU Utilization

Service CPU Utilization

Here okapi used 45k% of the CPU allocated by the CPU=2 parameter.

Service Memory Utilization

No module shows any sign of a memory leak; memory usage shows a stable trend.

Kafka metrics

DB CPU Utilization

DB CPU was 99%.

DB Connections

Max number of DB connections was 6100.

DB load

Top SQL-queries

...

PTF - Baseline QCP1 environment configuration

10 m6g.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
1 database instance, writer

Name	Memory (GiB)	vCPUs
db.r6g.xlarge	32	4
OpenSearch ptf-test
- Data nodes
  - Instance type - r6g.2xlarge.search
  - Number of nodes - 4
  - Version: OpenSearch_2_7_R20240502
- Dedicated master nodes
  - Instance type - r6g.large.search
  - Number of nodes - 3
MSK fse-tenant
- 2 kafka.m7g.xlarge brokers in 2 zones
- Apache Kafka version 3.7.x
- EBS storage volume per broker 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3
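The broker settings above can be sanity-checked with a few lines of code. The sketch below parses the three listed properties, converts the retention period to hours, and compares the default replication factor with the broker count. The variable and function names are illustrative; the only outside fact it relies on is that Kafka cannot place more replicas of a partition than there are brokers, so the listed values may be worth confirming against the actual cluster.

```python
# Broker settings as listed for the MSK cluster above.
MSK_PROPERTIES = """
auto.create.topics.enable=true
log.retention.minutes=480
default.replication.factor=3
"""
BROKER_COUNT = 2  # "2 brokers" in the list above


def parse_properties(text: str) -> dict[str, str]:
    """Parse key=value lines, skipping blank lines."""
    pairs = (line.split("=", 1) for line in text.splitlines() if line.strip())
    return {key.strip(): value.strip() for key, value in pairs}


props = parse_properties(MSK_PROPERTIES)
retention_hours = int(props["log.retention.minutes"]) / 60
replication_factor = int(props["default.replication.factor"])

print(f"retention: {retention_hours:g} h")
if replication_factor > BROKER_COUNT:
    # Kafka cannot place more replicas of a partition than there are brokers.
    print(f"replication factor {replication_factor} exceeds {BROKER_COUNT} brokers")
```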

...

Baseline QCP1 Environment configuration: the parameter srs.marcIndexers.delete.interval.seconds=86400 was set for mod-source-record-storage, and the number of tasks to launch for the mod-marc-migrations-b service was set to zero. Instance type: m6g.2xlarge. Instance count: 10. Database: db.r6g.xlarge. Amazon OpenSearch Service ptf-test: r6g.2xlarge.search (4 nodes).

...

Version	Old Version 5	New Version 6
Changes made by	Stanislav Nehrii	Stanislav Nehrii
Saved on	Sep 16, 2024	Sep 16, 2024


