[Orchid] Check-IN/Check-OUT Capacity test

Overview

The goal of this test is to determine the capacity of the Check-In/Check-Out activities.

PERF-459

Summary

  • Test #1 and Test #2 show approximately the same saturation point: ~134 vUsers
  • Start of the performance degradation zone (Test #1 and Test #2, respectively):
    • check-in: ~110 vUsers and ~98 vUsers
    • check-out: ~75 vUsers and ~77 vUsers

Recommendations

  • Up to ~75 vUsers can be used for benchmark testing, if needed and subject to the business non-functional requirements
  • Run fixed-load tests to confirm that the Check-In/Check-Out activities behave reliably under the 'comfort' load


Test Runs 

Test # | Test Conditions | Duration | Load generator size (recommended) | Load generator memory, GiB (recommended)
1 | CI/CO >200 int. 1 min | 125 min | t3.medium | 1
2 | CI/CO >200 int. 1 min | 125 min | t3.medium | 1

Results

Response Times 

Grafana: 

Test #1 http://carrier-io.int.folio.ebsco.com/grafana/d/elIt9zCnz/jmeter-performance-test-copy?orgId=1&from=1677754860000&to=1677765659000&var-percentile=95&var-test_type=baseline&var-test=circulation_checkInCheckOut_orchid&var-env=int&var-grouping=1s&var-low_limit=250&var-high_limit=750&var-db_name=jmeter&var-sampler_type=All&var-Request=Check-Out%20Controller

~134 vUsers peak: capacity point (~683 op). Throughput growth slowed and response errors appeared (5xx status codes from 'okapi' because related modules were unavailable). See Grafana snapshot Test #1 - Throughput.

Based on the investigation of the variability of test results:

  • AVG/Median check-in: [400;500] ms
  • AVG/Median check-out: [700;800] ms

~110 vUsers, check-in: potential start of the performance degradation zone. Response time increased, and a significant share of requests took longer than 500 ms (see Grafana snapshot Test #1 - Response time heatmap, option 'Check-in').

~75 vUsers, check-out: potential start of the performance degradation zone. Response time increased, and a significant share of requests took longer than 800 ms (see Grafana snapshot Test #1 - Response time heatmap, option 'Check-out').
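The start of the degradation zone is described above as the load at which a significant share of requests exceeds a response-time threshold. A minimal Python sketch of that rule is shown here. It is an illustration only, not the procedure used for this report: the function name, the `min_share` cut-off of 20% (the report does not quantify "significant"), and the sample data are all assumptions.

```python
from collections import defaultdict


def degradation_start(samples, threshold_ms, min_share=0.2):
    """Return the lowest vUser level where the share of slow requests
    reaches min_share, or None if no level qualifies.

    samples: iterable of (active_vusers, response_time_ms) pairs.
    min_share: assumed definition of "significant" (hypothetical value).
    """
    totals = defaultdict(int)
    slow = defaultdict(int)
    for vusers, response_ms in samples:
        totals[vusers] += 1
        if response_ms > threshold_ms:
            slow[vusers] += 1
    for vusers in sorted(totals):
        if slow[vusers] / totals[vusers] >= min_share:
            return vusers
    return None


# Made-up samples for illustration; not taken from the test runs.
samples = [
    (50, 420), (50, 460),
    (100, 480), (100, 495),
    (110, 510), (110, 620),
    (120, 700),
]
print(degradation_start(samples, threshold_ms=500))
```

With real data, the samples would come from the JMeter results behind the Grafana heatmap, using 500 ms for check-in and 800 ms for check-out.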

Test #2 http://carrier-io.int.folio.ebsco.com/grafana/d/elIt9zCnz/jmeter-performance-test-copy?orgId=1&from=1677824973753&to=1677834380143&var-percentile=95&var-test_type=baseline&var-test=circulation_checkInCheckOut_orchid&var-env=int&var-grouping=1m&var-low_limit=250&var-high_limit=750&var-db_name=jmeter&var-sampler_type=All&var-Request=Check-In%20Controller

~134 vUsers peak: capacity point (~680 op). Throughput growth slowed and response errors appeared (5xx status codes from 'okapi' because related modules were unavailable). See Grafana snapshot Test #2 - Throughput.

Based on the investigation of the variability of test results:

  • AVG/Median check-in: [400;500] ms
  • AVG/Median check-out: [700;800] ms

~98 vUsers, check-in: potential start of the performance degradation zone. Response time increased, and a significant share of requests took longer than 500 ms (see Grafana snapshot Test #2 - Response time heatmap, option 'Check-in').

~62 vUsers, check-out: potential start of the performance degradation zone. Response time increased, and a significant share of requests took longer than 800 ms (see Grafana snapshot Test #2 - Response time heatmap, option 'Check-out').
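The capacity point in both tests is identified as the load at which throughput growth slows down. One way to express that idea is sketched below in Python. It is a hedged illustration, not the method used to read the Grafana charts: the function name, the `min_gain_ratio` of 0.5, and the data series are assumptions.

```python
def capacity_point(series, min_gain_ratio=0.5):
    """Return the vUser level after which throughput gain per added vUser
    drops below min_gain_ratio of the initial gain, or None.

    series: list of (vusers, throughput) pairs sorted by vusers.
    min_gain_ratio: assumed cut-off for "growth has slowed" (hypothetical).
    """
    if len(series) < 3:
        return None
    (v0, t0), (v1, t1) = series[0], series[1]
    baseline = (t1 - t0) / (v1 - v0)  # gain per vUser at low load
    if baseline <= 0:
        return None
    for (va, ta), (vb, tb) in zip(series[1:], series[2:]):
        gain = (tb - ta) / (vb - va)
        if gain < baseline * min_gain_ratio:
            return va
    return None


# Made-up series for illustration; not taken from the test runs.
series = [(25, 130), (50, 260), (75, 390), (100, 520), (125, 650), (150, 660)]
print(capacity_point(series))
```

A check like this only flags where growth flattens; the 5xx errors noted above still need to be confirmed separately in the error-rate panel.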

Memory Utilization

Test №1

mod-circulation memory utilization grew slightly, from 58.4% to 67.5%

Test №2

CPU Utilization 

Test №1

nginx-okapi: from 0.1% to 1,012%

mod-users-b: from 0.5% to 254.6%

Test №2

nginx-okapi: from 0.1% to 853.7%

mod-users-b: from 0.6% to 308%

RDS CPU Utilization 

Test №1

DB Writer: up to 81.6%

Test №2


DB Writer: up to 79.12%

Appendix

Infrastructure

PTF environment: ncp5

  • 8 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
  • 2 db.r6.xlarge database instances: a writer and a reader
  • MSK ptf-kakfa-3
    • 4 kafka.m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0
    • EBS storage volume per broker: 300 GiB
    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3


Modules memory and CPU parameters:

Module | SoftLimit | XMX | Revision | Version | desiredCount | CPUUnits | RWSplitEnabled | HardLimit | Metaspace | MaxMetaspaceSize
mod-inventory-storage-b | 1952 | 1440 | 3 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-inventory-storage:26.1.0-SNAPSHOT.644 | 2 | 1024 | False | 2208 | 384 | 512
mod-inventory-b | 2592 | 1814 | 7 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-inventory:20.0.0-SNAPSHOT.392 | 2 | 1024 | False | 2880 | 384 | 512
okapi-b | 1440 | 922 | 1 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/okapi:5.1.0-SNAPSHOT.1352 | 3 | 1024 | False | 1684 | 384 | 512
mod-source-record-storage-b | 1440 | 908 | 4 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-source-record-storage:5.7.0-SNAPSHOT.170 | 2 | 1024 | False | 1536 | 384 | 512
mod-source-record-manager-b | 3688 | 2048 | 4 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-source-record-manager:3.6.0-SNAPSHOT.197 | 2 | 1024 | False | 4096 | 384 | 512
mod-data-import-cs-b | 896 | 768 | 1 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-data-import-converter-storage:1.16.0-SNAPSHOT.132 | 0 | 128 | False | 1024 | 88 | 128
mod-data-import-b | 1844 | 1292 | 4 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-data-import:2.7.0-SNAPSHOT.101 | 1 | 256 | False | 2048 | 384 | 512
mod-quick-marc-b | 2176 | 1664 | 3 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-quick-marc:3.0.0-SNAPSHOT.79 | 1 | 128 | False | 2288 | 384 | 512
mod-feesfines-b | 896 | 768 | 3 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-feesfines:18.3.0-SNAPSHOT.141 | 2 | 128 | False | 1024 | 88 | 128
mod-patron-blocks-b | 896 | 768 | 4 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-patron-blocks:1.9.0-SNAPSHOT.90 | 2 | 1024 | False | 1024 | 88 | 128
mod-pubsub-b | 1440 | 922 | 4 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-pubsub:2.10.0-SNAPSHOT.124 | 2 | 1024 | False | 1536 | 384 | 512
mod-authtoken-b | 1152 | 922 | 3 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-authtoken:2.14.0-SNAPSHOT.238 | 2 | 512 | False | 1440 | 88 | 128
mod-circulation-storage-b | 1440 | 896 | 3 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-circulation-storage:16.1.0-SNAPSHOT.305 | 2 | 1024 | False | 1536 | 384 | 512
mod-circulation-b | 896 | 768 | 3 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-circulation:23.5.0-SNAPSHOT.556 | 2 | 1024 | False | 1024 | 88 | 128
mod-configuration-b | 896 | 768 | 3 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-configuration:5.9.2-SNAPSHOT.291 | 2 | 128 | False | 1024 | 88 | 128
mod-users-bl-b | 1152 | 922 | 4 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-users-bl:7.6.0-SNAPSHOT.230 | 2 | 512 | False | 1440 | 88 | 128
mod-remote-storage-b | 5000 | 3960 | 3 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-remote-storage:2.0.0-SNAPSHOT.83 | 1 | 2024 | False | 5200 | 1024 | 1024
mod-users-b | 896 | 768 | 4 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-users:19.2.0-SNAPSHOT.584 | 2 | 128 | False | 1024 | 88 | 128

Methodology/Approach

  1. Executed check-ins and check-outs with up to 200 concurrent vUsers, with a 1 min interval between vUser starts. (On carrier-io, use the artifact "circulation_checkInCheckOut_orchid.zip".)
  2. Scripts used:
    1. DB Refresh - checkin-checkout-db-restore.sql 
    2. DB Update - circ-data-load_item-level-requests.sh
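The ramp-up described in step 1 (one additional vUser per interval, up to a maximum) can be sketched in Python as follows. This is a rough model for planning only. It assumes the first vUser starts at minute 0 and that vUsers never stop once started. The actual JMeter thread-group settings are in the artifact and are not reproduced here, and the function name is hypothetical.

```python
def active_vusers(elapsed_min, max_vusers=200, interval_min=1.0):
    """Approximate number of active vUsers after elapsed_min minutes.

    Assumes (not confirmed by the report) that the first vUser starts at
    minute 0 and one more vUser is added every interval_min minutes.
    """
    if elapsed_min < 0:
        return 0
    return min(max_vusers, int(elapsed_min // interval_min) + 1)


# Print the modelled load at a few points in time.
for minute in (0, 30, 74, 500):
    print(minute, active_vusers(minute))
```

A model like this helps map a timestamp on a Grafana chart back to the approximate load at that moment.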

Additional Screenshots of graphs or charts

Test #1

Test #2