Check-in-check-out Test Report (Poppy)

Overview

The goal of this test is to investigate the behaviour of Check-In/Check-Out activities during fixed-load tests.

PERF-702

Summary

Check-In/Check-Out tests on Poppy were carried out with 8, 20, 30, and 75 users for 45 minutes each, and with 30 users for 14 hours (longevity test).

45 minute tests

  • Among the 45-minute Poppy tests, the best response times were observed with 20 users: average CI 0.45 sec, CO 0.805 sec. Under the highest load (75 users), response times degraded by ~30% on average.
  • For the 45-minute tests with 20 users there is no significant difference between Poppy and Orchid. For tests with more vUsers, CI and CO improved on Poppy; the cause of this improvement should be investigated.

Longevity test

  • Within the Poppy release, the longevity test with 30 vUsers was slower than the 45-minute test: CI 0.471 sec (+4.90%), CO 1.025 sec (+27.01%). CI/CO response times nevertheless remained stable throughout the long fixed load.
  • Comparing longevity results of Orchid vs Poppy: CI response times decreased from 1.061 sec to 0.471 sec and CO from 3.233 sec to 1.025 sec.
  • Database utilization shows a growing trend during the longevity test.

Common notes

  • The main API call, POST checkout-by-barcode, degraded slightly compared to Orchid (+5.07% at the 75th percentile with 8 users).
  • Memory consumption spikes at the beginning of testing usually follow a cluster start and then stabilize. In this case, however, a memory leak in the mod-pubsub module is suspected.
  • In all tests, spikes lasting 5-10 seconds were observed for all requests about once every minute, with response times reaching up to 6 seconds.

Recommendations

  • Investigate mod-pubsub module behaviour (PERF-721).
  • Investigate response time differences between releases. Run an additional test on the Orchid release with 30 vUsers for 45 minutes and compare it with the Poppy results to find the source of the changes.

Test Runs 

The following table contains information about a test model and related Grafana snapshots:

Test # | Test Conditions | vUsers | Ramp-up, sec | Duration, sec | Grafana
1 | circulation_checkInCheckOut_Poppy_3 | 8 | 80 | 2700 | http://carrier-io.int.folio.ebsco.com/grafana/d/SqzWB26nk/jmeter-performance-check-in-check-out?orgId=1&from=1698852592867&to=1698855933000&var-percentile=95&var-test_type=baseline&var-test=circulation_checkInCheckOut_Poppy_3&var-env=int&var-grouping=1s&var-low_limit=250&var-high_limit=750&var-db_name=jmeter&var-sampler_type=All
2 | circulation_checkInCheckOut_Poppy_3 | 20 | 200 | 2700 | http://carrier-io.int.folio.ebsco.com/grafana/d/SqzWB26nk/jmeter-performance-check-in-check-out?orgId=1&from=1698856470216&to=1698859223000&var-percentile=95&var-test_type=baseline&var-test=circulation_checkInCheckOut_Poppy_3&var-env=int&var-grouping=1s&var-low_limit=250&var-high_limit=750&var-db_name=jmeter&var-sampler_type=All
3 | circulation_checkInCheckOut_Poppy_3 | 30 | 300 | 2700 | http://carrier-io.int.folio.ebsco.com/grafana/d/SqzWB26nk/jmeter-performance-check-in-check-out?orgId=1&from=1698861207442&to=1698864080572&var-percentile=95&var-test_type=baseline&var-test=circulation_checkInCheckOut_Poppy_3&var-env=int&var-grouping=1s&var-low_limit=250&var-high_limit=750&var-db_name=jmeter&var-sampler_type=All
4 | circulation_checkInCheckOut_Poppy_3 | 75 | 750 | 2700 | http://carrier-io.int.folio.ebsco.com/grafana/d/SqzWB26nk/jmeter-performance-check-in-check-out?orgId=1&from=1698865254355&to=1698868102672&var-percentile=95&var-test_type=baseline&var-test=circulation_checkInCheckOut_Poppy_3&var-env=int&var-grouping=1s&var-low_limit=250&var-high_limit=750&var-db_name=jmeter&var-sampler_type=All
5 (Retest) | circulation_checkInCheckOut_Poppy_3 | 75 | 750 | 2700 | http://carrier-io.int.folio.ebsco.com/grafana/d/SqzWB26nk/jmeter-performance-check-in-check-out?orgId=1&from=1698915649265&to=1698918502961&var-percentile=95&var-test_type=baseline&var-test=circulation_checkInCheckOut_Poppy_3&var-env=int&var-grouping=1s&var-low_limit=250&var-high_limit=750&var-db_name=jmeter&var-sampler_type=All
6 | circulation_checkInCheckOut_Poppy_3 | 30 | 300 | 86400 | http://carrier-io.int.folio.ebsco.com/grafana/d/SqzWB26nk/jmeter-performance-check-in-check-out?orgId=1&from=1699029000000&to=1699087095584&var-percentile=95&var-test_type=baseline&var-test=circulation_checkInCheckOut_Poppy_3&var-env=int&var-grouping=1s&var-low_limit=250&var-high_limit=750&var-db_name=jmeter&var-sampler_type=All
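
The `from` and `to` query parameters in the Grafana links are Unix epoch timestamps in milliseconds, so the snapshot window of each run can be read directly from the link. A minimal Python sketch using only the standard library; the helper name `snapshot_window_seconds` is illustrative, and the URL is the test 1 link shortened to its `orgId`, `from`, and `to` parameters:

```python
from urllib.parse import urlparse, parse_qs


def snapshot_window_seconds(url: str) -> float:
    """Return the length of the Grafana time window in seconds."""
    query = parse_qs(urlparse(url).query)
    start_ms = int(query["from"][0])  # epoch milliseconds
    end_ms = int(query["to"][0])      # epoch milliseconds
    return (end_ms - start_ms) / 1000.0


# Test 1 link, shortened to the parameters that matter here.
url = ("http://carrier-io.int.folio.ebsco.com/grafana/d/SqzWB26nk/"
       "jmeter-performance-check-in-check-out"
       "?orgId=1&from=1698852592867&to=1698855933000")

window = snapshot_window_seconds(url)
print(f"{window:.0f} s ({window / 60:.1f} min)")
```

For test 1 this gives a window of about 3340 seconds, somewhat wider than the configured 2700-second duration.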

Results

The following table contains test results and their comparison with the baseline (Poppy, 8 vUsers).

  • Within the Poppy release, the longevity test compared to the 45-minute test with 30 users shows almost the same average Check-In time, 0.471 sec (+4.90%), but a slower average Check-Out, 1.025 sec (+27.01%).
  • Within Poppy, average response times in the 75-user retest grew by about 30% for Check-In (+29.85%) and 34% for Check-Out (+34.13%) compared to the 8-user test.

Errors:

  • Error messages: POST_circulation/check-out-by-barcode (Submit_barcode_checkout)_POST_422. The cause is repeated use of barcodes of items that were already checked out.

Response time

Users | Controller | Requests | 75th pct, sec | 95th pct, sec | Average, sec
8 users | Check-In (baseline) | 1400 | 0.445 | 0.599 | 0.479
8 users | Check-Out (baseline) | 1807 | 0.814 | 1.016 | 0.832
20 users | Check-In | 3439 | 0.426 (-4.27%) | 0.503 (-16.03%) | 0.45 (-6.05%)
20 users | Check-Out | 4432 | 0.785 (-3.56%) | 0.931 (-8.37%) | 0.805 (-3.25%)
30 users | Check-In | 4873 | 0.422 (-5.17%) | 0.488 (-18.53%) | 0.449 (-6.26%)
30 users | Check-Out | 6759 | 0.786 (-3.44%) | 0.912 (-10.24%) | 0.807 (-3%)
75 users (with nginx okapi issues) | Check-In | 10953 | 0.592 (+33.03%) | 0.733 (+22.37%) | 0.57 (+19%)
75 users (with nginx okapi issues) | Check-Out | 14532 | 1.137 (+39.68%) | 1.378 (+35.63%) | 1.059 (+27.28%)
75 users (retest) | Check-In | 11046 | 0.641 (+44.04%) | 0.785 (+31.05%) | 0.622 (+29.85%)
75 users (retest) | Check-Out | 14478 | 1.182 (+45.21%) | 1.453 (+43.01%) | 1.116 (+34.13%)
30 users, longevity test (compared to 30 vUsers) | Check-In | 110679 | 0.444 (+5.21%) | 0.504 (+3.28%) | 0.471 (+4.90%)
30 users, longevity test (compared to 30 vUsers) | Check-Out | 147181 | 1.068 (+35.88%) | 1.195 (+31.03%) | 1.025 (+27.01%)
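
The percentages in parentheses are relative changes against the baseline row. A short Python sketch of that arithmetic, using 75th percentile Check-In values from the table; the function name `pct_change` is illustrative:

```python
def pct_change(baseline: float, value: float) -> float:
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


# 75th percentile Check-In times in seconds, taken from the table.
baseline_8_users = 0.445

print(f"20 users: {pct_change(baseline_8_users, 0.426):+.2f}%")
print(f"75 users (retest): {pct_change(baseline_8_users, 0.641):+.2f}%")
```

Negative values mean the response time is lower than the baseline, positive values mean it is higher.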

This table contains general information about the tests: total and failed requests, average requests per second, and the 95th percentile of response time.

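Sampler Type | 8 users | 20 users | 30 users | 75 users (with fails) | 75 users (retest) | 30 users (longevity)
Total Requests | 113765 | 276089 | 415857 | 908767 | 907566 | 9170318
Average RPS | 42 | 104 | 154 | 337 | 337 | 158
Min, ms | 0 | 0 | 0 | 0 | 0 | 0
Median, ms | 7 | 7 | 7 | 9 | 9 | 7
Percentile 95, ms | 176 | 167 | 164 | 239 | 251 | 177

Failed Requests (the per-test values are run together in the source and cannot be split reliably): 121141016198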
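
As a rough cross-check, the average RPS can be approximated as total requests divided by the test duration. The sketch below assumes the configured 2700-second duration is the right divisor; the reported values may have been computed over a slightly different window, so small deviations are expected. The function name `average_rps` is illustrative:

```python
def average_rps(total_requests: int, duration_sec: float) -> float:
    """Average requests per second over the whole test."""
    return total_requests / duration_sec


# 8-user test: 113765 total requests over a 2700-second run.
rps = average_rps(113765, 2700)
print(f"{rps:.1f} requests/sec")
```

For the 8-user test this gives about 42 requests per second, in line with the reported value.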

Comparisons

The table contains comparison results for Poppy and Orchid releases:

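Test Conditions | vUsers | Action | 95th pct, sec | Average, sec
circulation_checkInCheckOut_orchid (baseline) | 8 | Check-in | 1.027 | 0.534
circulation_checkInCheckOut_orchid (baseline) | 8 | Check-out | 1.201 | 0.811
circulation_checkInCheckOut_Poppy_3 | 8 | Check-in | 0.599 (-41.67%) | 0.479 (-10.30%)
circulation_checkInCheckOut_Poppy_3 | 8 | Check-out | 1.016 (-15.40%) | 0.832 (+2.59%)
circulation_checkInCheckOut_orchid (baseline) | 20 | Check-in | 0.812 | 0.513
circulation_checkInCheckOut_orchid (baseline) | 20 | Check-out | 1.130 | 0.798
circulation_checkInCheckOut_Poppy_3 | 20 | Check-in | 0.503 (-38.05%) | 0.45 (-12.28%)
circulation_checkInCheckOut_Poppy_3 | 20 | Check-out | 0.931 (-17.61%) | 0.805 (+0.88%)
circulation_checkInCheckOut_orchid (baseline) | 30 | Check-in | 1.708 | 0.834
circulation_checkInCheckOut_orchid (baseline) | 30 | Check-out | 2.77 | 1.59
circulation_checkInCheckOut_Poppy_3 | 30 | Check-in | 0.488 (-71.43%) | 0.449 (-46.16%)
circulation_checkInCheckOut_Poppy_3 | 30 | Check-out | 0.912 (-67.08%) | 0.807 (-49.25%)
circulation_checkInCheckOut_orchid (baseline) | 75 | Check-in | 1.566 | 0.825
circulation_checkInCheckOut_orchid (baseline) | 75 | Check-out | 3.142 | 1.960
circulation_checkInCheckOut_Poppy_3 | 75 | Check-in | 0.785 (-49.87%) | 0.622 (-24.61%)
circulation_checkInCheckOut_Poppy_3 | 75 | Check-out | 1.453 (-53.76%) | 1.116 (-43.06%)
circulation_checkInCheckOut_orchid (baseline) Longevity | 30 | Check-in | 0.550 | 1.061
circulation_checkInCheckOut_orchid (baseline) Longevity | 30 | Check-out | 1.597 | 3.233
circulation_checkInCheckOut_Poppy_3 Longevity | 30 | Check-in | 0.504 (-8.36%) | 0.471 (-55.61%)
circulation_checkInCheckOut_Poppy_3 Longevity | 30 | Check-out | 1.195 (-25.17%) | 1.025 (-68.30%)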

The following table contains information about the response time of the most important requests for CI/CO activities and their comparison with Orchid's results:

API | 8 users Orchid, 75 pct, ms | 8 users Poppy, 75 pct, ms | 20 users Orchid, 75 pct, ms | 20 users Poppy, 75 pct, ms
POST checkout-by-barcode | 276 | 290 (+5.07%) | 264 | 281 (+6.44%)
POST checkin-by-barcode | 258 | 181 (-29.84%) | 245 | 174 (-28.98%)
GET circulation/loans | 140 | 139 (-0.71%) | 136 | 132 (-2.94%)


Resources Utilization

Graphs per test: Memory Consumption; CPU Utilization; RDS CPU Utilization / CPU (User) usage by broker; RDS Performance.

Test 1 (8 VU)
  • CPU (User) usage by broker did not exceed 23.3%.

Tests 2 (20 VU), 3 (30 VU) and 4 (75 VU, failed): no notes recorded.

Test 5 (75 VU, retest)
  • The first graph shows slow growth for the mod-pubsub service. This trend is clearly visible in the longevity test results.
  • nginx-okapi CPU utilization rose to 800% 10 minutes after the test started.
  • The graph without nginx-okapi shows the highest CPU utilization from the mod-users module, 111%. okapi-b runs at about 80%, and the remaining modules do not exceed 65%.
  • RDS CPU utilization averaged 60% and did not exceed 70% in spikes.
  • CPU (User) usage by broker did not exceed 40%.

Test 6 (30 VU)
  • mod-pubsub grew up to 53%.
  • CPU (User) usage by broker did not exceed 23% and dropped to 13% at 22:10 UTC.

Appendix

Infrastructure

PTF - environment ncp5
  • 8 m6i.2xlarge EC2 instances located in US East (N. Virginia) us-east-1
  • 2 db.r6.xlarge database instances: writer and reader
  • MSK ptf-kakfa-3
    • 4 kafka.m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0
    • EBS storage volume per broker 300 GiB
    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3

PTF - environment pcp1
  • 10 m6i.2xlarge EC2 instances located in US East (N. Virginia) us-east-1
  • 2 db.r6g.xlarge database instances (writer/reader), memory 32 GiB, 4 vCPUs, max. connections: 2731
  • MSK tenant
    • 4 m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0
    • EBS storage volume per broker 300 GiB
    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3

Modules memory and CPU parameters

Module | Task Def. Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft Limit | CPU units | Xmx | MetaspaceSize | MaxMetaspaceSize | R/W split enabled
pcp1-pvt
mod-remote-storage | 10 | mod-remote-storage:3.0.0 | 2 | 4920 | 4472 | 1024 | 3960 | 512 | 512 | FALSE
mod-authtoken | 9 | mod-authtoken:2.14.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 | FALSE
mod-configuration | 9 | mod-configuration:5.9.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-users-bl | 9 | mod-users-bl:7.6.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 | FALSE
mod-inventory-storage | 10 | mod-inventory-storage:27.0.0 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 | FALSE
mod-circulation-storage | 10 | mod-circulation-storage:17.1.0 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 | FALSE
mod-inventory | 9 | mod-inventory:20.1.0 | 2 | 2880 | 2592 | 1024 | 1814 | 384 | 512 | FALSE
mod-circulation | 10 | mod-circulation:24.0.0 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 | FALSE
mod-pubsub | 9 | mod-pubsub:2.11.0 | 2 | 1536 | 1440 | 1024 | 922 | 384 | 512 | FALSE
mod-patron-blocks | 9 | mod-patron-blocks:1.9.0 | 2 | 1024 | 896 | 1024 | 768 | 88 | 128 | FALSE
nginx-okapi | 9 | nginx-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 | FALSE
okapi-b | 10 | okapi:5.0.1 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512 | FALSE
mod-feesfines | 10 | mod-feesfines:19.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
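
One way to read these parameters is to compare each container's hard memory limit with the JVM ceilings configured for it. The sketch below is an illustration only and is not part of the test tooling. It assumes that the limits, Xmx, and MaxMetaspaceSize are all expressed in MiB, and it ignores other JVM memory areas such as thread stacks, direct buffers, and the code cache, so the result is only an upper bound on the spare room. The class and method names are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class ModuleMemory:
    name: str
    hard_limit: int      # container hard limit (assumed MiB)
    xmx: int             # maximum heap size (assumed MiB)
    max_metaspace: int   # MaxMetaspaceSize (assumed MiB)

    def headroom(self) -> int:
        """Hard limit minus heap and metaspace ceilings."""
        return self.hard_limit - (self.xmx + self.max_metaspace)


# Values taken from the module parameters listed above.
modules = [
    ModuleMemory("mod-pubsub", 1536, 922, 512),
    ModuleMemory("mod-circulation", 2880, 1814, 512),
    ModuleMemory("okapi-b", 1684, 922, 512),
]

for m in modules:
    print(f"{m.name}: {m.headroom()} headroom")
```

Under these assumptions mod-pubsub has the smallest margin of the three, which may be worth keeping in mind when reviewing its memory growth.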

Methodology/Approach

  • Run data preparation script before each CI/CO test
  • Update the .jmx script file for the Poppy release
  • Create an artefact and upload it to carrier-io
  • Use Jenkins job to change parameters and run tests
  • Test CI/CO with 8, 20, 30, 75 concurrent users for 45 minutes each. 
  • Test CI/CO with 30 users for 24 hours to detect any trends in memory.
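
The memory trend that the longevity run is meant to reveal can be quantified with a least-squares slope over periodic memory samples. A minimal Python sketch, assuming memory utilization has been exported as (hours since start, percent) pairs; the function name `linear_slope` is illustrative, and the sample values are invented for the example and are not measurements from this test:

```python
def linear_slope(points):
    """Least-squares slope of y over x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var = sum((x - mean_x) ** 2 for x, _ in points)
    return cov / var


# Hypothetical samples: (hours since test start, memory utilization in %).
samples = [(0, 40.0), (2, 41.0), (4, 42.0), (6, 43.0), (8, 44.0)]

slope = linear_slope(samples)
print(f"memory grows by {slope:.2f} percentage points per hour")
```

A slope that stays clearly positive over the whole run, rather than flattening after warm-up, would be consistent with a leak and would justify a closer look at the module.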