
Overview

  • The primary objective of testing was to evaluate the performance of the Baseline MCPT Environment configuration while attempting to optimize costs by adjusting instance types and reducing the number of instances. The tests were designed to compare the performance outcomes across different configurations, including variations in instance types and counts within multiple Auto Scaling Groups (ASGs). By systematically modifying these variables, the goal was to maintain or improve the performance observed in the baseline configuration while achieving cost efficiency.

...

Test # | Description | Status
Test 1 | CPU=2 was set for all services, used two autoscaling groups, 1st with 3 Instance Type: c7g.large for okapi service and 10 Instance Type: r7g.xlarge for others services. Instance type: m6g.2xlarge Instances count: 10. | Completed
Test 2 | CPU=2 was set for all services, used two autoscaling groups, 1st with 3 Instance Type: c7g.large for okapi service and 10 Instance Type: m6g.2xlarge for others modules Instance type: m6g.2xlarge Instances count: 10 (Repeat Test 1) | Completed
Test 3 | CPU=2 was set for all services, used Used two autoscaling groups, 1st with 3 Instance Type: c7g.large for okapi service and 10 5 Instance Type: m6g.2xlarge for others services (Repeat Test 2) for others services. | Completed
Test 4 | CPU=2 was set for all modules, used two autoscaling groups, 1st with 6 Instance Type: c7g.large, 3 of them for okapi service and other 3 for mod-cources, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, mod-data-import services and 10 Instance Type: r7g.xlarge for others modules Used two autoscaling groups, 1st with 3 Instance Type: c7g.large for okapi service and 5 Instance Type: m6g.2xlarge for others services (Repeat Test 3). | Completed
Test 5 | CPU=2 was set for all modules, used two autoscaling groups, 1st with 6 3 Instance Type: c7g.large, 3 of them for okapi service and other 3 for mod-cources, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, mod-data-import services and 10 service and 5 Instance Type: r7g.xlarge for others modules (Repeat Test 4) for others modules. | Completed
Test 6 | CPU=2 was set for all modules, used two autoscaling groups, 1st with 6 3 Instance Type: c7g.large, 3 of them for okapi service and other 3 for mod-cources, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, mod-data-import services and 10 service and 5 Instance Type: r7g r7g.xlarge for others for others modules . Test was run without DE_Exporting MARC Bib records custom workflow, DE_Exporting MARC Bib records workflow, and OPIH_/oai/records workflow workflows (Repeat Test 5). | Completed
Test 7 | CPU=2 was set for all modules was set for all modules except CPU=2048 for mod-search, used two autoscaling groups, 1st with 6 3 Instance Type: c7g.large, 3 of them for okapi service and other 3 for mod-cources, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, mod-data-import services and 10 service and 5 Instance Type: r7g r7g.xlarge for others modules. Test was run without DE_Exporting MARC Bib records custom workflow, DE_Exporting MARC Bib records workflow, and OPIH_/oai/records workflow workflows (Repeat Test 6) for others modules. | Completed

Test Results

This table contains durations for all Workflows. 

...

Service CPU Utilization

Here we can see that okapi used 42k% of the CPU power of parameter CPU=2.
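A utilization figure such as 42k% is easier to read once converted to absolute CPU. The sketch below does that conversion under one assumption that the report does not state explicitly: that CPU=2 is a reservation in CPU units where 1024 units equal one vCPU, and that the percentage is measured against that reservation. The function names are illustrative only.

```python
# Assumption (not stated in the report): CPU=2 is a reservation in CPU units,
# 1024 units = 1 vCPU, and utilization is reported relative to the reservation.
UNITS_PER_VCPU = 1024


def used_cpu_units(reserved_units: float, utilization_percent: float) -> float:
    """Convert utilization relative to a reservation into absolute CPU units."""
    return reserved_units * utilization_percent / 100.0


def units_to_vcpus(units: float) -> float:
    """Convert CPU units into vCPUs under the 1024-units-per-vCPU assumption."""
    return units / UNITS_PER_VCPU


if __name__ == "__main__":
    units = used_cpu_units(reserved_units=2, utilization_percent=42_000)
    print(f"{units:.0f} CPU units = {units_to_vcpus(units):.2f} vCPU")
    # prints "840 CPU units = 0.82 vCPU"
```

If the assumption holds, a very large percentage against a tiny reservation simply means the service consumed a bit under one vCPU, not that it was overloaded.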

...

No module shows signs of a memory leak. Memory usage shows a stable trend.



Kafka metrics




DB CPU Utilization

DB CPU averaged 99% during ERW: Exporting Receiving Information.

...

No module shows signs of a memory leak. Memory usage shows a stable trend.



Kafka metrics



DB CPU Utilization

DB CPU averaged 99% during ERW: Exporting Receiving Information.

...

No module shows signs of a memory leak. Memory usage shows a stable trend.



Kafka metrics




DB CPU Utilization

DB CPU peaked at 99%.

...

Service CPU Utilization

Here we can see that okapi used 43k% of the CPU power of parameter CPU=2.

...

No module shows signs of a memory leak. Memory usage shows a stable trend.


Kafka metrics






DB CPU Utilization

DB CPU was 99%.

...

Service CPU Utilization

Here we can see that okapi used 46k% of the CPU power of parameter CPU=2.

...

No module shows signs of a memory leak. Memory usage shows a stable trend.



Kafka metrics




DB CPU Utilization

DB CPU was 99%.

...

Service CPU Utilization

Here we can see that okapi used 46k% of the CPU power of parameter CPU=2.

...

No module shows signs of a memory leak. Memory usage shows a stable trend.


Kafka metrics




DB CPU Utilization

DB CPU was 99%.

...

Test №7 - 8

Introduction:  

  • Test 7: The Baseline MCPT Environment configuration was applied with CPU=2 set for all modules. Two autoscaling groups were used: the first with 6 c7g.large instances (3 for the okapi service and 3 for the mod-courses, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, and mod-data-import services), the second with 10 r7g.xlarge instances for the other modules. The Fixed Load (average case without the DE_Exporting MARC Bib records custom workflow, DE_Exporting MARC Bib records workflow, and OPIH_/oai/records workflow) MOBIUS test was run (repeat of Test 6).
  • Test 8: The Baseline MCPT Environment configuration was applied with CPU=2 set for all modules. Two autoscaling groups were used: the first with 6 c7g.large instances (3 for the okapi service and 3 for the mod-courses, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, and mod-data-import services), the second with 10 r7g.xlarge instances for the other modules. The Fixed Load (average case) MOBIUS test was run.

Objective: The objective of these tests was to validate the consistency of performance observed in Test 5 and Test 6. This was achieved by repeating the same configuration.

Results: We see performance improvements for Test 8.

Instance CPU Utilization


Service CPU Utilization

Here we can see that okapi used 48k% of the CPU power of parameter CPU=2.


Service Memory Utilization

No module shows signs of a memory leak. Memory usage shows a stable trend.



Kafka metrics





DB CPU Utilization

DB CPU was 99%.


DB Connections

Max number of DB connections was 6010.


DB load


Top SQL-queries



Test №9 - 10

Introduction: 

  • Test 9: The Baseline MCPT Environment configuration was applied with CPU=2 set for all modules. Two autoscaling groups were used: the first with 6 c7g.large instances (3 for the okapi service and 3 for the mod-courses, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, and mod-data-import services), the second with 10 r7g.xlarge instances for the other modules. The Fixed Load (average case without the DE_Exporting MARC Bib records custom workflow, DE_Exporting MARC Bib records workflow, and OPIH_/oai/records workflow) MOBIUS test was run (repeat of Test 7 after terminating the r7g.xlarge instances).
  • Test 10: The Baseline MCPT Environment configuration was applied with CPU=2 set for all modules. Two autoscaling groups were used: the first with 6 c7g.large instances (3 for the okapi service and 3 for the mod-courses, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, and mod-data-import services), the second with 10 r7g.xlarge instances for the other modules. The Fixed Load (average case) MOBIUS test was run (repeat of Test 8 after terminating the r7g.xlarge instances).

Objective: The objective of these tests was to validate the consistency of performance observed in Test 7 and Test 8. This was achieved by repeating the same configuration and applying new random task locations per instance after terminating the r7g.xlarge instances.

Results: Performance results were worse by 50% for several workflows after terminating the r7g.xlarge instances and applying new random task locations per instance.
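A degradation figure like this is typically derived by comparing per-workflow durations between two runs. The sketch below shows one way to compute it. The workflow names and durations are hypothetical placeholders, not values from these tests, and the 50% threshold is used only to mirror the wording above.

```python
def percent_change(baseline_s: float, candidate_s: float) -> float:
    """Relative change in duration; positive values mean the candidate is slower."""
    if baseline_s <= 0:
        raise ValueError("baseline duration must be positive")
    return (candidate_s - baseline_s) / baseline_s * 100.0


def degraded_workflows(
    baseline: dict[str, float],
    candidate: dict[str, float],
    threshold_percent: float = 50.0,
) -> dict[str, float]:
    """Return workflows present in both runs that slowed by at least the threshold."""
    result = {}
    for name, base in baseline.items():
        if name not in candidate:
            continue
        change = percent_change(base, candidate[name])
        if change >= threshold_percent:
            result[name] = change
    return result


if __name__ == "__main__":
    # Hypothetical durations in seconds, for illustration only.
    before = {"workflow_a": 2.0, "workflow_b": 4.0}
    after = {"workflow_a": 3.0, "workflow_b": 4.2}
    print(degraded_workflows(before, after))  # prints {'workflow_a': 50.0}
```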

Instance CPU Utilization


Service CPU Utilization

Here we can see that okapi used 48k% of the CPU power of parameter CPU=2.


Service Memory Utilization

No module shows signs of a memory leak. Memory usage shows a stable trend.



Kafka metrics






DB CPU Utilization

DB CPU was 99%.


DB Connections

Max number of DB connections was 5150.


DB load


Top SQL-queries


...

Test №11

Introduction: The Baseline MCPT Environment configuration was applied with CPU=2 set for all services. Two autoscaling groups were used: the first with 3 c7g.large instances for the okapi service, the second with 11 r7g.xlarge instances for the other services. The Fixed Load (average case) MOBIUS test was run.

Objective: The goal of this test was to replace three c7g.large instances with one additional r7g.xlarge instance for all services, while retaining three c7g.large instances specifically for the okapi service.

Results: The performance was worse compared to when we used the additional three c7g.large instances for services with smaller workloads.

Instance CPU Utilization


Service CPU Utilization

Here we can see that okapi used 45k% of the CPU power of parameter CPU=2.


Service Memory Utilization

No module shows signs of a memory leak. Memory usage shows a stable trend.



Kafka metrics





DB CPU Utilization

DB CPU was 99%.


DB Connections

Max number of DB connections was 5900.


DB load


Top SQL-queries



Test №12 - 13

Introduction: The Baseline MCPT Environment configuration was applied with CPU=2 set for all services. Two autoscaling groups were used: the first with 3 c7g.large instances for the okapi service, the second with 11 r7g.xlarge instances for the other services. The Fixed Load (average case) MOBIUS test was run.

Objective: The objective of this test was to validate the degradation of performance observed in Test 11 by repeating the same configuration.

Results: We confirmed performance degradation with this configuration.

Instance CPU Utilization


Service CPU Utilization

Here we can see that okapi used 43k% of the CPU power of parameter CPU=2.


Service Memory Utilization

No module shows signs of a memory leak. Memory usage shows a stable trend.



Kafka metrics







DB CPU Utilization

DB CPU was 99%.


DB Connections

Max number of DB connections was 6200.


DB load


Top SQL-queries



...

Test №14

Introduction: The Baseline MCPT Environment configuration was applied with CPU=2 set for all services. Two autoscaling groups were used: the first with 3 c7g.large instances for the okapi service, the second with 11 r7g.xlarge instances for the other services. The Fixed Load (average case) MOBIUS test was run.

Objective: The objective of this test was to validate the degradation of performance observed in Test 11 by repeating the same configuration.

Results: We confirmed performance degradation with this configuration.

Instance CPU Utilization


Service CPU Utilization

Here we can see that okapi used 45k% of the CPU power of parameter CPU=2.


Service Memory Utilization

No module shows signs of a memory leak. Memory usage shows a stable trend.



Kafka metrics





DB CPU Utilization

DB CPU was 99%.


DB Connections

Max number of DB connections was 5900.


DB load


Top SQL-queries


...

Test №15

Introduction: The Baseline MCPT Environment configuration was applied with CPU=2 set for all services. Two autoscaling groups were used: the first with 3 c7g.large instances for the okapi and mod-authtoken services, the second with 11 r7g.xlarge instances for the other services. The Fixed Load (average case) MOBIUS test was run.

Objective: 

Results: We confirmed performance degradation with this configuration.

Instance CPU Utilization


Service CPU Utilization

Here we can see that okapi used 38k% of the CPU power of parameter CPU=2.


Service Memory Utilization

No module shows signs of a memory leak. Memory usage shows a stable trend.



Kafka metrics





DB CPU Utilization

DB CPU was 99%.


DB Connections

Max number of DB connections was 5150.


DB load


Top SQL-queries


...

Test №16 - 17

Introduction: 

  • Test 16: The Baseline MCPT Environment configuration was applied with CPU=2 set for all services. Two autoscaling groups were used: the first with 3 c7g.large instances for the okapi service, the second with 5 r7g.2xlarge instances for the other services. The Fixed Load (average case) MOBIUS test was run.
  • Test 17: Repeat of Test 16 with the same environment configuration.

Objective: The objective of these tests was to reduce the number of instances in the main group from 10 to 5 while using higher-capacity r7g.2xlarge instances instead of r7g.xlarge.

Results: We observed performance improvements in Test 16 compared to Test 8, and by repeating the same configuration in Test 17, we confirmed these performance improvements.
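To make the capacity trade-off behind this change concrete, the sketch below totals vCPUs and memory per instance group. The per-instance figures are the commonly published AWS specifications for these types and are entered here as assumptions to verify against current AWS documentation; the helper name `total_capacity` is illustrative. No pricing is included because the report does not give any.

```python
# Assumed per-instance specs as (vCPUs, memory in GiB); verify against AWS docs.
INSTANCE_SPECS = {
    "c7g.large": (2, 4),
    "r7g.xlarge": (4, 32),
    "r7g.2xlarge": (8, 64),
    "m6g.2xlarge": (8, 32),
}


def total_capacity(group: dict[str, int]) -> tuple[int, int]:
    """Sum vCPUs and memory (GiB) for a mapping of instance type -> count."""
    vcpus = sum(INSTANCE_SPECS[t][0] * n for t, n in group.items())
    memory = sum(INSTANCE_SPECS[t][1] * n for t, n in group.items())
    return vcpus, memory


if __name__ == "__main__":
    main_group_before = {"r7g.xlarge": 10}   # main group in the earlier tests
    main_group_after = {"r7g.2xlarge": 5}    # main group in Tests 16 and 17
    baseline = {"m6g.2xlarge": 10}           # baseline environment

    for label, group in [
        ("10 x r7g.xlarge", main_group_before),
        ("5 x r7g.2xlarge", main_group_after),
        ("10 x m6g.2xlarge", baseline),
    ]:
        vcpus, memory = total_capacity(group)
        print(f"{label}: {vcpus} vCPU, {memory} GiB")
```

Under these assumed specs, halving the instance count while doubling the instance size keeps the main group's total vCPU and memory unchanged, so any performance difference would come from factors such as task placement rather than raw capacity.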

Instance CPU Utilization


Service CPU Utilization

Here we can see that okapi used 45k% of the CPU power of parameter CPU=2.


Service Memory Utilization

No module shows signs of a memory leak. Memory usage shows a stable trend.



Kafka metrics





DB CPU Utilization

DB CPU was 99%.


DB Connections

Max number of DB connections was 6100.


DB load


Top SQL-queries


...

PTF - Baseline QCP1 environment configuration

  • 10 m6g.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
  • 1 database  instance, writer


    Name | Memory (GiB) | vCPUs
    db.r6g.xlarge | 32 | 4


  • Open Search ptf-test 
    • Data nodes
      • Instance type - r6g.2xlarge.search
      • Number of nodes - 4
      • Version: OpenSearch_2_7_R20240502
    • Dedicated master nodes
      • Instance type - r6g.large.search
      • Number of nodes - 3
  • MSK fse-tenant
    • brokers, kafka.m7g.xlarge brokers in 2 zones
    • Apache Kafka version 3.7.x 

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3

...

Baseline QCP1 Environment configuration: the parameter srs.marcIndexers.delete.interval.seconds=86400 was set for mod-source-record-storage, and the number of tasks to launch for service mod-marc-migrations-b was set to zero. Instance type: m6g.2xlarge. Instances count: 10. Database: db.r6g.xlarge. Amazon OpenSearch Service: ptf-test, r6g.2xlarge.search (4 nodes).

...