[Nolana] Check-IN + title-level requests retest

Overview

The goal of this test is to assess the performance of circulation check-in for items with 10 title-level requests (TLRs) each. The difference from the previous testing is that indexes were added to mod_circulation_storage.request and mod_circulation_storage.actual_cost_record in scope of CIRCSTORE-402: Add an index for requests.instanceId field (Closed).

Previous test report: [Nolana] Check-IN + title-level requests

Ticket: PERF-568: Retest PERF-465 (Closed)

Summary

  • Load tests showed that check-in performance still degrades significantly for items with 10 TLRs each compared with items without TLRs. Response time also increased after the indexes were added; see Response time comparison.

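  • Resource monitoring showed that, during verification tests, CPU utilization increased significantly for mod-users (from 22% to 51% in the 25-user test) and for RDS (from 15% to 72% in the 25-user test).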

  • Query plan analysis for the top SQL queries showed that the indexes were not used in query processing.

Recommendations & Jiras

As the added indexes did not change the query plans of the most CPU-consuming queries, CIRCSTORE-402: Add an index for requests.instanceId field (Closed) should be reviewed. The tests should be repeated once fixes are available.

Test Runs 

| Test # | Test Conditions | Duration | Load generator size (recommended) | Load generator Memory (GiB) (recommended) | Notes |
|---|---|---|---|---|---|
| 1. | Baseline, Check-in with 1, 8, 25 users | 30 min | t3.medium | 3 | Without TLR |
| 2. | Verification, Check-in with 1, 8, 25 users | 30 min | t3.medium | 3 | With 10 TLR per item |

Results

Response Times

Baseline (items without TLR)

1 user

8 users

25 users

Verification (items with 10 TLRs)

1 user

8 users

25 users

 

Response time comparison

Items without TLR and items with TLR (both after indexes were added)

 

| User quantity | Check-in response time 95prc, sec: Baseline (items without TLR) | Check-in response time 95prc, sec: Verification (items with 10 TLR each) | Degradation, sec | Degradation, % |
|---|---|---|---|---|
| 1 user | 1.686 | 2.219 | 0.533 | 31% |
| 8 users | 0.498 | 1.221 | 0.723 | 145% |
| 25 users | 0.588 | 2.333 | 1.745 | 296% |
| 25 users (rerun, with ANALYZE operation before the test) | 0.622 | 2.222 | 1.6 | 257% |

Tests without indexes and with indexes added (both for items with TLR)

 

| User quantity | Check-in response time 95prc, sec: Baseline (before fix) | Check-in response time 95prc, sec: Verification (after fix) | Degradation, sec | Degradation, % |
|---|---|---|---|---|
| 1 user | 0.953 | 2.219 | 1.266 | 132% |
| 8 users | 0.782 | 1.221 | 0.439 | 56% |
| 25 users | 2.001 | 2.333 | 0.332 | 16% |
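The degradation columns in the two comparisons above follow from the 95th percentile response times. The sketch below recomputes them in Python. It assumes that the percentage is the difference divided by the baseline and truncated to a whole number; that rule is inferred from the reported figures and is not stated in the report. The function and variable names are illustrative only.

```python
import math


def degradation(baseline_s, verification_s):
    """Return (difference in seconds, percent of baseline truncated to an integer)."""
    delta = round(verification_s - baseline_s, 3)
    percent = math.floor(delta / baseline_s * 100)  # assumption: truncated, not rounded
    return delta, percent


# 95th percentile check-in response times in seconds, copied from the comparisons above.
without_vs_with_tlr = {
    "1 user": (1.686, 2.219),
    "8 users": (0.498, 1.221),
    "25 users": (0.588, 2.333),
    "25 users (rerun)": (0.622, 2.222),
}
before_vs_after_indexes = {
    "1 user": (0.953, 2.219),
    "8 users": (0.782, 1.221),
    "25 users": (2.001, 2.333),
}

for title, rows in (
    ("Items without TLR vs items with TLR", without_vs_with_tlr),
    ("Before vs after indexes", before_vs_after_indexes),
):
    print(title)
    for users, (baseline, verification) in rows.items():
        delta, percent = degradation(baseline, verification)
        print(f"  {users}: +{delta} sec, +{percent}%")
```

Keeping the raw values in one place makes it easy to re-derive both columns when a test is rerun.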

Service CPU Utilization

Baseline (items without TLR)

1 user

8 users

25 users

Verification (items with 10 TLRs)

During verification tests, CPU utilization for mod-users increased significantly. In the 25-user test it rose from 22% to 51%.

1 user

8 users

25 users


Memory Utilization

Baseline (items without TLR)

1 user

8 users

25 users

Verification (items with 10 TLRs)

1 user

8 users

25 users


DB CPU Utilization

Baseline (items without TLR)

1 user

8 users

25 users

Verification (items with 10 TLRs)

During verification tests, RDS CPU utilization increased significantly. In the 25-user test it rose from 15% to 72%.

1 user

8 users

25 users


DB Connections

Baseline (items without TLR)

1 user

8 users

25 users

Verification (items with 10 TLRs)

1 user

8 users

25 users

 

DB load

Baseline (items without TLR)

1 user

8 users

25 users

Verification (items with 10 TLRs)

1 user

8 users

25 users

Top-SQL

Baseline (items without TLR)

25 users

Verification (items with 10 TLRs)

During verification tests, two SQL queries moved to the top of the Top SQL list:

SELECT [tenant]_mod_circulation_storage.count_estimate(?)

SELECT jsonb,id FROM [tenant]_mod_circulation_storage.request WHERE ((lower(f_unaccent(request.jsonb->>?)) LIKE lower(f_unaccent(?))) AND ((((CASE WHEN length(lower(?)) <= ? THEN left(lower(request.jsonb->>?),?) LIKE lower(?) ELSE left(lower(request.jsonb->>?),?) LIKE left(lower(?),?) AND lower(request.jsonb->>?) LIKE lower(?) END) OR (CASE WHEN length(lower(?)) <= ? THEN left(lower(request.jsonb->>?),?) LIKE lower(?) ELSE left(lower(request.jsonb->>?),?) LIKE left(lower(?),?) AND lower(reques

 

25 users
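One way to check whether a query plan uses a given index is to scan the text output of PostgreSQL EXPLAIN for index scan and sequential scan nodes. The sketch below is a minimal helper for that. The sample plan is invented for illustration and was not captured from the test environment; the helper name summarize_plan is hypothetical.

```python
import re

# Node labels as they appear in PostgreSQL's text-format EXPLAIN output.
INDEX_NODE = re.compile(
    r"(?:Index (?:Only )?Scan(?: Backward)? using|Bitmap Index Scan on) (\S+)"
)
SEQ_NODE = re.compile(r"Seq Scan on (\S+)")


def summarize_plan(plan_text):
    """Return the indexes used and the tables scanned sequentially in a plan."""
    return {
        "indexes": sorted(set(INDEX_NODE.findall(plan_text))),
        "seq_scans": sorted(set(SEQ_NODE.findall(plan_text))),
    }


# Hypothetical plan text, for illustration only (not from the test environment).
SAMPLE_PLAN = """\
Limit  (cost=0.00..1234.56 rows=10 width=1024)
  ->  Seq Scan on request  (cost=0.00..98765.43 rows=100 width=1024)
        Filter: (...)
"""

summary = summarize_plan(SAMPLE_PLAN)
print(summary)
if "request_instanceid_idx" not in summary["indexes"]:
    print("request_instanceid_idx is not used by this plan")
```

A plan that lists the request table under seq_scans and shows no index names would be consistent with the finding that the added indexes are not used.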

Appendix

Infrastructure

PTF environment: ncp3

  • 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1

  • 2 db.r6.xlarge database instances: one writer and one reader

  • MSK ptf-kakfa-3 [ kafka configurations]

    • 4 kafka.m5.2xlarge brokers in 2 zones

    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true

    • log.retention.minutes=480

    • default.replication.factor=3

The environment has not changed since the previous testing, except that two indexes were added: request_instanceid_idx on [tenant]_mod_circulation_storage.request and actual_cost_record_expirationdate_idx on [tenant]_mod_circulation_storage.actual_cost_record.

Modules memory and CPU parameters:

| Modules | Version | Task Definition | Running Tasks | CPU | Memory (Soft/Hard limits) | MaxMetaspaceSize | Xmx |
|---|---|---|---|---|---|---|---|
| okapi | 4.14.7 | 1 | 3 | 1024 | 1440/1684 | 512 | 922 |
| mod-feesfines | 18.1.1 | 3 | 2 | 128 | 896/1024 | 128 | 768 |
| mod-patron-blocks | 1.7.1 | 4 | 2 | 1024 | 896/1024 | 128 | 768 |
| mod-pubsub | 2.7.0 | 4 | 2 | 1024 | 1440/1536 | 512 | 922 |
| mod-authtoken | 2.12.0 | 3 | 2 | 512 | 1152/1440 | 128 | 922 |
| mod-circulation-storage | 15.0.2 | 3 | 2 | 1024 | 1440/1536 | 512 | 896 |
| mod-circulation | 23.3.2 | 3 | 2 | 1024 | 896/1024 | 128 | 768 |
| mod-configuration | 5.9.0 | 3 | 2 | 128 | 896/1024 | 128 | 768 |
| mod-inventory | 19.0.1 | 10 | 2 | 1024 | 2592/2880 | 512 | 1814 |
| mod-inventory-storage | 25.0.3 | 3 | 2 | 1024 | 1952/2208 | 512 | 1440 |
| mod-users | 19.0.0 | 4 | 2 | 128 | 896/1024 | 128 | 768 |
| mod-remote-storage | 1.7.1 | 3 | 2 | 128 | 1692/1872 | 512 | 1178 |

Methodology/Approach

  1. Run the necessary commands to return the database to its initial state before each test run. Wait several minutes before starting the test.

  2. Check out the items with the JMeter script Create_TLR.jmx (with the "Create_TLR" step disabled).

  3. Conduct the baseline: run check-in load tests with different numbers of users.

  4. Conduct the verification: repeat the tests with the same approach, but before each test also generate 10 TLRs for each item by running the JMeter script Create_TLR.jmx with both the Check-in and Create_TLR steps enabled. Important: if indexes were added, run "ANALYZE table name" on the affected tables so that the query planner has up-to-date statistics when it considers the new indexes.

  5. Compare the test results.

Note: use the same list of items for the Create_TLR.jmx script and the Check-in script. Also, select items from instances that have 1 item per instance.
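The ANALYZE step can be prepared as a short list of statements for the two tables that received indexes. The sketch below only builds the SQL text and does not connect to a database. The tenant value "mytenant" is a hypothetical placeholder for [tenant], and the helper names are illustrative. The second helper queries pg_indexes, a standard PostgreSQL catalog view, to confirm which indexes exist on a table.

```python
# Tables that received new indexes, as listed in the Infrastructure section.
INDEXED_TABLES = ("request", "actual_cost_record")


def analyze_statements(tenant, tables=INDEXED_TABLES):
    """Build one ANALYZE statement per table in the tenant's circulation storage schema."""
    schema = f"{tenant}_mod_circulation_storage"
    return [f"ANALYZE {schema}.{table};" for table in tables]


def index_check_query(tenant, table):
    """Build a query that lists the indexes defined on a table."""
    schema = f"{tenant}_mod_circulation_storage"
    return (
        "SELECT indexname, indexdef FROM pg_indexes "
        f"WHERE schemaname = '{schema}' AND tablename = '{table}';"
    )


# "mytenant" is a hypothetical tenant id used only for this example.
for statement in analyze_statements("mytenant"):
    print(statement)
print(index_check_query("mytenant", "request"))
```

The printed statements can be run in any SQL client against the writer instance before the verification test starts.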

Grafana dashboard

Baseline (items without TLR)

1 user

http://carrier-io.int.folio.ebsco.com/grafana/d/SqzWB26nk/jmeter-performance-check-in-check-out?orgId=1&from=1692358118565&to=1692360062513&var-percentile=95&var-test_type=fix_load&var-test=circulation_checkIn_nolana3&var-env=int&var-grouping=1s&var-low_limit=250&var-high_limit=750&var-db_name=jmeter&var-sampler_type=All

8 users

http://carrier-io.int.folio.ebsco.com/grafana/d/SqzWB26nk/jmeter-performance-check-in-check-out?orgId=1&from=1692362736626&to=1692364709022&var-percentile=95&var-test_type=fix_load&var-test=circulation_checkIn_nolana3&var-env=int&var-grouping=1s&var-low_limit=250&var-high_limit=750&var-db_name=jmeter&var-sampler_type=All

25 users

http://carrier-io.int.folio.ebsco.com/grafana/d/SqzWB26nk/jmeter-performance-check-in-check-out?orgId=1&from=1692607012293&to=1692609532152&var-percentile=95&var-test_type=fix_load&var-test=circulation_checkIn_nolana3&var-env=int&var-grouping=1s&var-low_limit=250&var-high_limit=750&var-db_name=jmeter&var-sampler_type=All

Verification (items with 10 TLRs)

1 user

http://carrier-io.int.folio.ebsco.com/grafana/d/SqzWB26nk/jmeter-performance-check-in-check-out?orgId=1&from=1692636096938&to=1692638161401&var-percentile=95&var-test_type=fix_load&var-test=circulation_checkIn_nolana3&var-env=int&var-grouping=1s&var-low_limit=250&var-high_limit=750&var-db_name=jmeter&var-sampler_type=All

8 users

http://carrier-io.int.folio.ebsco.com/grafana/d/SqzWB26nk/jmeter-performance-check-in-check-out?orgId=1&from=1692677697470&to=1692680390829&var-percentile=95&var-test_type=fix_load&var-test=circulation_checkIn_nolana3&var-env=int&var-grouping=1s&var-low_limit=250&var-high_limit=750&var-db_name=jmeter&var-sampler_type=All

25 users

http://carrier-io.int.folio.ebsco.com/grafana/d/SqzWB26nk/jmeter-performance-check-in-check-out?orgId=1&from=1692700591022&to=1692703522031&var-percentile=95&var-test_type=fix_load&var-test=circulation_checkIn_nolana3&var-env=int&var-grouping=1s&var-low_limit=250&var-high_limit=750&var-db_name=jmeter&var-sampler_type=All
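Each dashboard link above encodes its time window in the from and to query parameters as Unix timestamps in milliseconds. The sketch below decodes them to UTC and reports the window length, which helps match a link to a test run. The example uses the baseline 1 user link with its query string shortened to the two relevant parameters; the function name dashboard_window is illustrative.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlparse


def dashboard_window(url):
    """Return (start, end, length in minutes) for a dashboard link's time range."""
    query = parse_qs(urlparse(url).query)
    start_ms = int(query["from"][0])
    end_ms = int(query["to"][0])
    start = datetime.fromtimestamp(start_ms / 1000, tz=timezone.utc)
    end = datetime.fromtimestamp(end_ms / 1000, tz=timezone.utc)
    return start, end, (end_ms - start_ms) / 60000


# Baseline, 1 user (query string shortened to the from/to parameters).
BASELINE_1_USER = (
    "http://carrier-io.int.folio.ebsco.com/grafana/d/SqzWB26nk/"
    "jmeter-performance-check-in-check-out"
    "?orgId=1&from=1692358118565&to=1692360062513"
)

start, end, minutes = dashboard_window(BASELINE_1_USER)
print(f"{start:%Y-%m-%d %H:%M:%S} UTC to {end:%Y-%m-%d %H:%M:%S} UTC, {minutes:.1f} min")
```

The decoded window is slightly longer than the 30 min test duration, presumably because the dashboard range was selected with some margin around the run.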