PTF - Performance Instance Resources Optimization - QCP1


Overview

  • The primary objective of testing was to evaluate the performance of the Baseline MCPT Environment configuration while attempting to optimize costs by adjusting instance types and reducing the number of instances. The tests were designed to compare the performance outcomes across different configurations, including variations in instance types and counts within multiple Auto Scaling Groups (ASGs). By systematically modifying these variables, the goal was to maintain or improve the performance observed in the baseline configuration while achieving cost efficiency.

https://folio-org.atlassian.net/browse/PERF-962 

Summary

  • Through a series of experiments involving different placement strategies, instance types, and total instance counts, we found that the performance remained consistent when using these configurations: 

    • three c7g.large instances dedicated to the okapi service alongside five r7g.2xlarge instances for all other services, with the CPU parameter set to 2 for all services.

    • five r7g.2xlarge instances for all services, with the CPU parameter set to 2 for all services.

  • The optimized environment configurations offer a 20-40% cost reduction compared to the existing setup, making them a more economical option without compromising performance.

  • The configuration with three c7g.large instances for the okapi service and five r7g.2xlarge instances for all other services showed the best performance across all experiments.

  • In fact, some workflows perform better with this new setup than on the current infrastructure.

  • CPU utilization at the EC2 level has improved: it is now around 30-60%, compared with under 20% previously.
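The rise in EC2-level CPU utilization can be put in context by comparing the aggregate capacity of each fleet. The sketch below is illustrative only: the vCPU and memory figures per instance type are assumptions taken from AWS's published instance specifications rather than from this report, and the names `SPECS`, `FLEETS`, and `capacity` are hypothetical.

```python
# (vCPU, memory in GiB) per instance type.
# Assumption: values come from AWS's published specs, not from this report.
SPECS = {
    "m6g.2xlarge": (8, 32),
    "c7g.large": (2, 4),
    "r7g.2xlarge": (8, 64),
}

# Instance counts per configuration, as listed in this report.
FLEETS = {
    "QCP1 baseline": {"m6g.2xlarge": 10},
    "MCPT baseline": {"m6g.2xlarge": 14},
    "Optimized, two ASGs": {"c7g.large": 3, "r7g.2xlarge": 5},
    "Optimized, one ASG": {"r7g.2xlarge": 5},
}


def capacity(fleet):
    """Return (total vCPU, total memory in GiB) for a fleet."""
    vcpu = sum(SPECS[itype][0] * count for itype, count in fleet.items())
    mem = sum(SPECS[itype][1] * count for itype, count in fleet.items())
    return vcpu, mem


for name, fleet in FLEETS.items():
    vcpu, mem = capacity(fleet)
    print(f"{name}: {vcpu} vCPU, {mem} GiB")
```

Under these assumptions the two-ASG fleet provides 46 vCPUs against 80 for the QCP1 baseline, with similar total memory (332 GiB against 320 GiB). That is consistent with higher CPU utilization for the same workload.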

AWS Configuration Costs

| Cluster | Instance Type | Cost per Month (USD) | Number of Instances | Total Cost per Cluster (USD) |
|---|---|---|---|---|
| QCP1 | m6g.2xlarge | $221.76 | 10 | $2,217.60 |
| MCPT | m6g.2xlarge | $221.76 | 14 | $3,104.64 |
| Optimized Infrastructure, Two Auto Scaling Groups | c7g.large | $52.20 | 3 | $1,698.84 |
| | r7g.2xlarge | $308.45 | 5 | |
| Optimized Infrastructure, One Auto Scaling Group | r7g.2xlarge | $308.45 | 5 | $1,542.25 |
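The totals in the cost table follow from multiplying each per-instance monthly price by the instance count and summing per cluster. A minimal sketch of that arithmetic, using the prices and counts listed above (the names `PRICE`, `CLUSTERS`, and `monthly_cost` are hypothetical):

```python
from decimal import Decimal

# Monthly price per instance (USD), as listed in the cost table.
PRICE = {
    "m6g.2xlarge": Decimal("221.76"),
    "c7g.large": Decimal("52.20"),
    "r7g.2xlarge": Decimal("308.45"),
}

# Instance counts per cluster configuration, as listed in the cost table.
CLUSTERS = {
    "QCP1": {"m6g.2xlarge": 10},
    "MCPT": {"m6g.2xlarge": 14},
    "Optimized, two ASGs": {"c7g.large": 3, "r7g.2xlarge": 5},
    "Optimized, one ASG": {"r7g.2xlarge": 5},
}


def monthly_cost(fleet):
    """Sum price * count over every instance type in the fleet."""
    return sum((PRICE[itype] * count for itype, count in fleet.items()), Decimal("0"))


for name, fleet in CLUSTERS.items():
    print(f"{name}: ${monthly_cost(fleet):,.2f}")
```

Computed this way, the two-group total comes out one cent above the $1,698.84 shown in the table, presumably because the listed per-instance prices are rounded.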

 

Cost Comparison (Before vs After)

| Cluster | Previous Total Cost (USD) | New Total Cost (USD) | Percentage Saving (%) |
|---|---|---|---|
| QCP1 | $2,217.60 | $1,698.84 | 23.39% |
| MCPT | $3,104.64 | $1,698.84 | 45.28% |
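The percentage saving is the cost difference expressed as a share of the previous cost. A short sketch that reproduces the figures above from the reported totals (the function and variable names are hypothetical):

```python
def percent_saving(previous, new):
    """Saving as a percentage of the previous monthly cost."""
    return (previous - new) / previous * 100


# Totals in USD, taken from the cost comparison table.
PREVIOUS = {"QCP1": 2217.60, "MCPT": 3104.64}
NEW_TWO_ASG = 1698.84

for cluster, prev in PREVIOUS.items():
    print(f"{cluster}: {percent_saving(prev, NEW_TWO_ASG):.2f}%")
```

The same function can be applied to the one-group total of $1,542.25 to compare that option against either baseline.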

Test Runs

| Test # | Description | Status |
|---|---|---|
| Test 1 | Instance type: m6g.2xlarge; instance count: 10. | Completed |
| Test 2 | Instance type: m6g.2xlarge; instance count: 10 (repeat of Test 1). | Completed |
| Test 3 | Two Auto Scaling Groups: 3 c7g.large instances for the okapi service and 5 m6g.2xlarge instances for all other services. | Completed |
| Test 4 | Two Auto Scaling Groups: 3 c7g.large instances for the okapi service and 5 m6g.2xlarge instances for all other services (repeat of Test 3). | Completed |
| Test 5 | CPU=2 set for all modules. Two Auto Scaling Groups: 3 c7g.large instances for the okapi service and 5 r7g.xlarge instances for all other modules. | Completed |
| Test 6 | CPU=2 set for all modules. Two Auto Scaling Groups: 3 c7g.large instances for the okapi service and 5 r7g.xlarge instances for all other modules (repeat of Test 5). | Completed |
| Test 7 | CPU=2 set for all modules except mod-search (CPU=2048). Two Auto Scaling Groups: 3 c7g.large instances for the okapi service and 5 r7g.xlarge instances for all other modules. | Completed |
| Test 8 | CPU=2 set for all modules. One Auto Scaling Group with 5 c7g.large instances for all services. | |
| Test 9 | CPU=2 set for all modules. One Auto Scaling Group with 5 c7g.large instances for all services (repeat of Test 8). | |
| Test 10 | CPU=2 set for all modules except mod-search (CPU=2048). One Auto Scaling Group with 5 c7g.large instances for all services. | |
| Test 11 | CPU=2 set for all modules except mod-search (CPU=2048). One Auto Scaling Group with 5 c7g.large instances for all services (repeat of Test 10). | |
| Test 12 | CPU=2 set for all modules except mod-search (CPU=2048). One Auto Scaling Group with 5 c7g.large instances for all services (repeat of Test 11). | |
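For analysis it can help to hold the test matrix as data and group each run with its repeats. The sketch below is one possible encoding, not the tooling used for these tests: `RunConfig`, `RUNS`, and `group_repeats` are hypothetical names, and the instance types and counts are transcribed from the test descriptions exactly as written.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RunConfig:
    number: int
    main_asg: Tuple[str, int]                    # (instance type, count) for most services
    okapi_asg: Optional[Tuple[str, int]] = None  # dedicated okapi group, if any
    cpu: Optional[int] = None                    # CPU value set for all modules, if any
    mod_search_cpu: Optional[int] = None         # CPU override for mod-search, if any
    repeat_of: Optional[int] = None              # number of the test this run repeats


OKAPI = ("c7g.large", 3)

RUNS = [
    RunConfig(1, ("m6g.2xlarge", 10)),
    RunConfig(2, ("m6g.2xlarge", 10), repeat_of=1),
    RunConfig(3, ("m6g.2xlarge", 5), OKAPI),
    RunConfig(4, ("m6g.2xlarge", 5), OKAPI, repeat_of=3),
    RunConfig(5, ("r7g.xlarge", 5), OKAPI, cpu=2),
    RunConfig(6, ("r7g.xlarge", 5), OKAPI, cpu=2, repeat_of=5),
    RunConfig(7, ("r7g.xlarge", 5), OKAPI, cpu=2, mod_search_cpu=2048),
    RunConfig(8, ("c7g.large", 5), cpu=2),
    RunConfig(9, ("c7g.large", 5), cpu=2, repeat_of=8),
    RunConfig(10, ("c7g.large", 5), cpu=2, mod_search_cpu=2048),
    RunConfig(11, ("c7g.large", 5), cpu=2, mod_search_cpu=2048, repeat_of=10),
    RunConfig(12, ("c7g.large", 5), cpu=2, mod_search_cpu=2048, repeat_of=11),
]


def group_repeats(runs):
    """Map each original test number to the list of runs that share its setup."""
    by_number = {run.number: run for run in runs}
    groups = {}
    for run in runs:
        root = run
        # Follow the repeat chain back to the first run of this configuration.
        while root.repeat_of is not None:
            root = by_number[root.repeat_of]
        groups.setdefault(root.number, []).append(run.number)
    return groups


print(group_repeats(RUNS))
```

Grouping runs this way makes it straightforward to average the results of repeated tests before comparing configurations.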

 

Test Results

This table contains durations for all Workflows. 

Each cell shows average response time (milliseconds) / errors. The DATA IMPORT and DATA EXPORT rows show durations (h:mm:ss).

| Workflows | Test 1 | Test 2 (Repeat 1) | Test 3 | Test 4 (Repeat 3) | Test 5 | Test 6 (Repeat 5) | Test 7 | Test 8 | Test 9 (Repeat 8) | Test 10 | Test 11 (Repeat 10) | Test 12 (Repeat 11) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DATA IMPORT | 0:52:03 | 0:44:55 | 0:46:07 | 0:47:09 | 0:51:41 | 0:58:35 | 1:00:03 | 0:43:53 | 0:47:06 | 0:55:30 | 0:45:46 | 0:44:09 |
| DATA EXPORT | 0:58:11 | 0:44:43 | 0:47:41 | 0:50:32 | 0:38:59 | 0:45:53 | 0:48:26 (not finished for main) | 0:45:41 | 0:48:49 | 0:56:49 | 0:42:16 | 0:44:26 |
| CICO_TC_Check-In Controller | 1163 / 0% | 948 / 0% | 932 / 0% | 958 / 0% | 849 / 0% | 895 / 0% | 940 / 0% | 912 / 0% | 993 / 0% | 1176 / 0% | 892 / 0% | 967 / 0% |
| CICO_TC_Check-Out Controller | 1697 / 0% | 1481 / 0% | 1408 / 0% | 1428 / 0% | 1318 / 0% | 1318 / 0% | 1367 / 0% | 1345 / 0% | 1445 / 0% | 1675 / 0% | 1371 / 0% | 1467 / 0% |
| DE_Exporting MARC Bib records workflow | 2528 / 0% | 3818 / 0% | 3675 / 0% | 2830 / 0% | 1918 / 0% | 2223 / 0% | 1865 / 0% | 3363 / 0% | 2420 / 0% | 1872 / 0% | 3398 / 0% | 5033 / 0% |

ILR_TC: Create ILR

1023

0%

874