PTF - Performance Instance Resources Optimization - MCPT
- 1 Overview
- 2 Summary
- 3 Test Runs
- 4 Test Results
- 5 Comparison
- 6 Test №1
- 7 Test №2
- 8 Test №3
- 9 Test №4
- 10 Test №5
- 10.1 Instance CPU Utilization
- 10.2 Service CPU Utilization
- 10.3 Service Memory Utilization
- 10.4 Kafka metrics
- 10.5 DB CPU Utilization
- 10.6 DB Connections
- 10.7 DB load
- 10.8 Top SQL-queries
- 11 Test №6
- 11.1 Instance CPU Utilization
- 11.2 Service Memory Utilization
- 11.3 Kafka metrics
- 11.4 DB CPU Utilization
- 11.5 DB Connections
- 11.6 DB load
- 11.7 Top SQL-queries
- 12 Test №7 - 8
- 12.1 Instance CPU Utilization
- 12.2 Service CPU Utilization
- 12.3 Service Memory Utilization
- 12.4 Kafka metrics
- 12.5 DB CPU Utilization
- 12.6 DB Connections
- 12.7 DB load
- 12.8 Top SQL-queries
- 13 Test №9 - 10
- 13.1 Instance CPU Utilization
- 13.2 Service CPU Utilization
- 13.3 Service Memory Utilization
- 13.4 Kafka metrics
- 13.5 DB CPU Utilization
- 13.6 DB Connections
- 13.7 DB load
- 13.8 Top SQL-queries
- 14 Test №11
- 14.1 Instance CPU Utilization
- 14.2 Service CPU Utilization
- 14.3 Service Memory Utilization
- 14.4 Kafka metrics
- 14.5 DB CPU Utilization
- 14.6 DB Connections
- 14.7 DB load
- 14.8 Top SQL-queries
- 15 Test №12 - 13
- 15.1 Instance CPU Utilization
- 15.2 Service CPU Utilization
- 15.3 Service Memory Utilization
- 15.4 Kafka metrics
- 15.5 DB CPU Utilization
- 15.6 DB Connections
- 15.7 DB load
- 15.8 Top SQL-queries
- 16 Test №14
- 16.1 Instance CPU Utilization
- 16.2 Service CPU Utilization
- 16.3 Service Memory Utilization
- 16.4 Kafka metrics
- 16.5 DB CPU Utilization
- 16.6 DB Connections
- 16.7 DB load
- 16.8 Top SQL-queries
- 17 Test №15
- 17.1 Instance CPU Utilization
- 17.2 Service CPU Utilization
- 17.3 Service Memory Utilization
- 17.4 Kafka metrics
- 17.5 DB CPU Utilization
- 17.6 DB Connections
- 17.7 DB load
- 17.8 Top SQL-queries
- 18 Test №16 - 17
- 18.1 Instance CPU Utilization
- 18.2 Service CPU Utilization
- 18.3 Service Memory Utilization
- 18.4 Kafka metrics
- 18.5 DB CPU Utilization
- 18.6 DB Connections
- 18.7 DB load
- 18.8 Top SQL-queries
- 19 Appendix
- 19.1 Infrastructure
- 20 Methodology/Approach
Overview
The primary objective of testing was to evaluate the performance of the Baseline MCPT Environment configuration while attempting to optimize costs by adjusting instance types and reducing the number of instances. The tests were designed to compare the performance outcomes across different configurations, including variations in instance types and counts within multiple Auto Scaling Groups (ASGs). By systematically modifying these variables, the goal was to maintain or improve the performance observed in the baseline configuration while achieving cost efficiency.
https://folio-org.atlassian.net/browse/PERF-961
Summary
Through a series of experiments involving different placement strategies, instance types, and total instance counts, we found that performance remained consistent with the following two configurations:
- Six c7g.large instances (three dedicated to the okapi service and three allocated to the mod-courses, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, and mod-data-import services) alongside ten r7g.xlarge instances for all other services, with the CPU parameter set to 2 for all services.
- Three c7g.large instances dedicated to the okapi service alongside five r7g.2xlarge instances for all other services, with the CPU parameter set to 2 for all services.
Notably, both environment configurations offer a 40% cost reduction compared to the existing setup, making them more economical options without compromising performance.
The configuration with three c7g.large instances for the okapi service and five r7g.2xlarge instances for all other services showed the best performance across all experiments, so more tests will be run with it in this ticket.
The tests had a 100% error rate for the AIE_TC: Create Invoices, AIE_TC: Invoices Approve, AIE_TC: Paying Invoices, TC: Receiving-an-Order-Line, Unreceiving-a-Piece and Unreceiving-a-Piece workflows because the test data was not regenerated.
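The cost reduction quoted in this summary is a relative difference between the hourly cost of two instance fleets. A minimal Python sketch of that arithmetic follows. The function names, the prices, and the baseline cost are hypothetical placeholders, since this section lists neither AWS prices nor the baseline fleet; substitute real values before drawing conclusions.

```python
def fleet_hourly_cost(instances, hourly_price):
    """Sum count * price over a fleet given as {instance_type: count}."""
    return sum(count * hourly_price[itype] for itype, count in instances.items())


def cost_reduction_pct(baseline_cost, candidate_cost):
    """Relative saving of the candidate fleet versus the baseline, in percent."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return 100.0 * (baseline_cost - candidate_cost) / baseline_cost


# Placeholder prices in arbitrary relative units (NOT real AWS prices).
placeholder_price = {"c7g.large": 1.0, "r7g.2xlarge": 6.0}

# Fleet taken from the summary: three c7g.large plus five r7g.2xlarge.
candidate = fleet_hourly_cost({"c7g.large": 3, "r7g.2xlarge": 5}, placeholder_price)

# Placeholder baseline cost (the real baseline fleet is not given in this section).
placeholder_baseline = 50.0

print(f"candidate cost: {candidate}")
print(f"reduction: {cost_reduction_pct(placeholder_baseline, candidate):.1f}%")
```

Keeping the price table as an input makes it easy to rerun the comparison when prices or the region change.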
Test Runs
Test # | Description | Status |
|---|---|---|
Test 1 | CPU=2 set for all services. Two Auto Scaling groups: the first with 3 c7g.large instances for the okapi service, the second with 10 r7g.xlarge instances for all other services. | Completed |
Test 2 | CPU=2 set for all services. Two Auto Scaling groups: the first with 3 c7g.large instances for the okapi service, the second with 10 m6g.2xlarge instances for all other services. | Completed |
Test 3 | CPU=2 set for all services. Two Auto Scaling groups: the first with 3 c7g.large instances for the okapi service, the second with 10 m6g.2xlarge instances for all other services (repeat of Test 2). | Completed |
Test 4 | CPU=2 set for all services. Two Auto Scaling groups: the first with 6 c7g.large instances (3 for the okapi service and 3 for the mod-courses, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, and mod-data-import services), the second with 10 r7g.xlarge instances for all other services. | Completed |
Test 5 | CPU=2 set for all services. Two Auto Scaling groups: the first with 6 c7g.large instances (3 for the okapi service and 3 for the mod-courses, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, and mod-data-import services), the second with 10 r7g.xlarge instances for all other services (repeat of Test 4). | Completed |
Test 6 | CPU=2 set for all services. Two Auto Scaling groups: the first with 6 c7g.large instances (3 for the okapi service and 3 for the mod-courses, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, and mod-data-import services), the second with 10 r7g.xlarge instances for all other services. The test was run without the DE_Exporting MARC Bib records custom workflow, DE_Exporting MARC Bib records workflow, and OPIH_/oai/records workflow. | Completed |
Test 7 | CPU=2 set for all services. Two Auto Scaling groups: the first with 6 c7g.large instances (3 for the okapi service and 3 for the mod-courses, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, and mod-data-import services), the second with 10 r7g.xlarge instances for all other services. The test was run without the DE_Exporting MARC Bib records custom workflow, DE_Exporting MARC Bib records workflow, and OPIH_/oai/records workflow (repeat of Test 6). | Completed |
Test 8 | CPU=2 set for all services. Two Auto Scaling groups: the first with 6 c7g.large instances (3 for the okapi service and 3 for the mod-courses, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, and mod-data-import services), the second with 10 r7g.xlarge instances for all other services. | Completed |
Test 9 | CPU=2 set for all services. Two Auto Scaling groups: the first with 6 c7g.large instances (3 for the okapi service and 3 for the mod-courses, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, and mod-data-import services), the second with 10 r7g.xlarge instances for all other services. The test was run without the DE_Exporting MARC Bib records custom workflow, DE_Exporting MARC Bib records workflow, and OPIH_/oai/records workflow (repeat of Test 7 after terminating the r7g.xlarge instances). | Completed |
Test 10 | CPU=2 set for all services. Two Auto Scaling groups: the first with 6 c7g.large instances (3 for the okapi service and 3 for the mod-courses, mod-sender, mod-tasks-list, mod-gobi, edge-dematic, mod-erm-usage, mod-eusage-reports, mod-notify, and mod-data-import services), the second with 10 r7g.xlarge instances for all other services (repeat of Test 8 after terminating the r7g.xlarge instances). | Completed |
Test 11 | CPU=2 set for all services. Two Auto Scaling groups: the first with 3 c7g.large instances for the okapi service, the second with 11 r7g.xlarge instances for all other services. | Completed |
Test 12 | CPU=2 set for all services. Two Auto Scaling groups: the first with 3 c7g.large instances for the okapi service, the second with 11 r7g.xlarge instances for all other services (repeat of Test 11). | Completed |
Test 13 | CPU=2 set for all services. Two Auto Scaling groups: the first with 3 c7g.large instances for the okapi service, the second with 11 r7g.xlarge instances for all other services (repeat of Test 11). | Completed |
Test 14 | CPU=2 set for all services. Two Auto Scaling groups: the first with 3 c7g.large instances for the okapi service, the second with 11 r7g.xlarge instances for all other services (repeat of Test 11 after terminating the 11 instances). | Completed |
Test 15 | CPU=2 set for all services. Two Auto Scaling groups: the first with 3 c7g.large instances for the okapi and mod-authtoken services, the second with 11 r7g.xlarge instances for all other services. | Completed |
Test 16 | CPU=2 set for all services. Two Auto Scaling groups: the first with 3 c7g.large instances for the okapi service, the second with 5 r7g.2xlarge instances for all other services. | Completed |
Test 17 | CPU=2 set for all services. Two Auto Scaling groups: the first with 3 c7g.large instances for the okapi service, the second with 5 r7g.2xlarge instances for all other services (repeat of Test 16). | Completed |
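To compare the fleets in the table above at a glance, the total vCPU and memory capacity of each layout can be tallied. The following Python sketch does that. The per-instance figures are the published AWS specifications for these instance types; the `Fleet` class, the `INSTANCE_SPECS` table, and the grouping of tests into layouts are illustrative choices, not part of the test tooling.

```python
from dataclasses import dataclass

# (vCPUs, memory in GiB) per instance, from the public AWS instance type specs.
INSTANCE_SPECS = {
    "c7g.large": (2, 4),
    "r7g.xlarge": (4, 32),
    "r7g.2xlarge": (8, 64),
    "m6g.2xlarge": (8, 32),
}


@dataclass
class Fleet:
    name: str
    instances: dict  # instance type -> instance count

    def totals(self):
        """Return (total vCPUs, total memory in GiB) for the fleet."""
        vcpus = sum(INSTANCE_SPECS[t][0] * n for t, n in self.instances.items())
        mem_gib = sum(INSTANCE_SPECS[t][1] * n for t, n in self.instances.items())
        return vcpus, mem_gib


# Instance counts as listed in the Test Runs table.
FLEETS = [
    Fleet("Test 1", {"c7g.large": 3, "r7g.xlarge": 10}),
    Fleet("Tests 2-3", {"c7g.large": 3, "m6g.2xlarge": 10}),
    Fleet("Tests 4-10", {"c7g.large": 6, "r7g.xlarge": 10}),
    Fleet("Tests 11-15", {"c7g.large": 3, "r7g.xlarge": 11}),
    Fleet("Tests 16-17", {"c7g.large": 3, "r7g.2xlarge": 5}),
]

for fleet in FLEETS:
    vcpus, mem_gib = fleet.totals()
    print(f"{fleet.name}: {vcpus} vCPUs, {mem_gib} GiB")
```

The tally covers raw capacity only; it says nothing about how services are placed on instances, which the tests vary independently.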
Test Results
This table shows the average response time and the error rate of each workflow in every test.
Workflows | Test 1, avg. response time (ms) | Test 1, errors | Test 2, avg. response time (ms) | Test 2, errors | Test 3 (repeat of Test 2), avg. response time (ms) | Test 3 (repeat of Test 2), errors | Test 4, avg. response time (ms) | Test 4, errors | Test 5 (repeat of Test 4), avg. response time (ms) | Test 5 (repeat of Test 4), errors | Test 6, avg. response time (ms) | Test 6, errors | Test 7 (repeat of Test 6), avg. response time (ms) | Test 7 (repeat of Test 6), errors | Test 8, avg. response time (ms) | Test 8, errors | Test 9 (repeat of Test 7 after terminate), avg. response time (ms) | Test 9 (repeat of Test 7 after terminate), errors | Test 10 (repeat of Test 8 after terminate), avg. response time (ms) | Test 10 (repeat of Test 8 after terminate), errors | Test 11, avg. response time (ms) | Test 11, errors | Test 12, avg. response time (ms) | Test 12, errors | Test 13, avg. response time (ms) | Test 13, errors | Test 14 (after terminate), avg. response time (ms) | Test 14 (after terminate), errors | Test 15, avg. response time (ms) | Test 15, errors | Test 16, avg. response time (ms) | Test 16, errors | Test 17, avg. response time (ms) | Test 17, errors |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
AIE_TC: Create Invoices | 10227 | 100% | 7425 | 100% | 7645 | 100.00% | 8563 | 100.00% | 8132 | 100.00% | 7383 | 100.00% | 7575 | 100.00% | 7359 | 100.00% | 10061 | 100.00% | 11269 | 100.00% | 8607 | 100% | 11352 | 100% | 8293 | 100% | 9397 | 100% | 10687 | 100% | 8138 | 100% | 7063 | 100% |
AIE_TC: Invoices Approve | 4649 | 100% | 3281 | 100% | 3252 | 100.00% | 3442 | 100.00% | 3175 | 100.00% | 3081 | 100.00% | 2966 | 100.00% | 3100 | 100.00% | 5464 | 100.00% | 7132 | 100.00% | 3438 | 100% | 4792 | 100% | 3506 | 100% | 4052 | 100% | 5297 | 100% | 3054 | 100% | 2873 | 100% |
AIE_TC: Paying Invoices | 5686 | 100% | 3375 | 100% | 3481 | 100.00% | 3439 | 100.00% | 3250 | 100.00% | 3079 | 100.00% | 2948 | 100.00% | 3115 | 100.00% | 4504 | 100.00% | 4245 | 100.00% | 3580 | 100% | 4635 | 100% | 3548 | 100% | 4387 | 100% | 6265 | 100% | 3093 | 100% | 2811 | 100% |
CICO_TC_Check-In Controller | 3828 | 0% | ||||||||||||||||||||||||||||||||