MaxRAMPercentage vs Xmx with Data Import and Check-ins Check-outs (Poppy)

Overview

This document contains the results of testing different module configurations in which Xmx in JAVA_OPTS is replaced with MaxRAMPercentage. The main KPIs for these tests are module memory consumption, Data Import durations, and Check-in / Check-out response times.

Ticket: PERF-893

Summary

  • Replacing Xmx with MaxRAMPercentage does not affect the main KPIs. Comparing DI durations between configurations #1 and #2, the deltas do not exceed 4 minutes (apart from DI test #1, at 00:12:18), which is acceptable.
  • After changing MaxRAMPercentage from the default values to the recommended 66%, Check-in / Check-out response times during data import increased by 15% for CI and 7% for CO. Data import durations were also affected: about 8 minutes difference in DI test #6.
  • Comparing DI durations between configurations #2 and #3 (with limited MaxRAMPercentage), DI is slower in configuration #3. The previously known issue with mod-inventory (container stops because of growing CPU utilization) appears much sooner in configuration #3 with limited MaxRAMPercentage.
  • CPU utilization top three modules:
    Service | Configuration #1 | Configuration #2 | Configuration #3
    mod-inventory-b | 254.27 | 305.56 | 216.98
    mod-quick-marc-b | 120.83 | 58.49 | 87.44
    nginx-okapi | 91.73 | 56.1 | 95.49
  • Memory consumption top three modules:
    Service | Configuration #1 | Configuration #2 | Configuration #3
    mod-inventory-b | 99.4 | 105.84 | 102.32
    mod-search-b | 96.43 | 89.1 | 96.81
    mod-users-b | 63.08 | 61.58 | 60.63
  • Service CPU utilization changed between tests because of the cluster restarts
  • RDS CPU utilization was close to 96% during CI/CO + Data import 
  • RDS DB Connections were close to 600 during CI/CO + Data import 

Test Runs 

Test / Configuration # | Scenario | Load level | JAVA_OPTS
1 | 20 vUsers longevity CI/CO + DI MARC Bib Create | 12 Data Import tests with 100K sequentially (with 5 min pause) | -Xmx
2 | 20 vUsers longevity CI/CO + DI MARC Bib Create | 12 Data Import tests with 100K sequentially (with 5 min pause) | -XX:MaxRAMPercentage, Default
3 | 20 vUsers longevity CI/CO + DI MARC Bib Create | 12 Data Import tests with 100K sequentially (with 5 min pause) | -XX:MaxRAMPercentage, Recommended - 66%
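
For context on the JAVA_OPTS column: -Xmx fixes the maximum heap size, while -XX:MaxRAMPercentage sets it as a percentage of the memory available to the JVM. The sketch below is only an illustration of that arithmetic. It assumes the JVM applies the percentage to the container memory limit and that this limit corresponds to the Mem Hard Limit listed in the Appendix (2880 for mod-inventory, with Xmx 1814); both points are assumptions, and `max_heap` is a hypothetical helper name.

```python
def max_heap(memory_limit: float, max_ram_percentage: float) -> float:
    """Approximate maximum heap implied by -XX:MaxRAMPercentage.

    Assumption: the JVM applies the percentage to the container memory limit.
    The result is in the same unit as memory_limit.
    """
    return memory_limit * max_ram_percentage / 100.0


# mod-inventory values from the Appendix table: Mem Hard Limit 2880, Xmx 1814.
# Treating the hard limit as the JVM-visible limit is an assumption.
heap_at_66 = max_heap(2880, 66.0)
print(f"Heap at 66% of 2880: {heap_at_66:.1f} (baseline Xmx: 1814)")
```

Under these assumptions the 66% setting would allow a slightly larger heap than the baseline Xmx for this module; the actual value depends on which limit the container runtime exposes to the JVM.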

Test Results

Comparison

The following table compares test results 

Configuration #1: BASELINE, Xmx, 16 hours CI/CO
Configuration #2: default MaxRAMPercentage, 24 hours CI/CO
Configuration #3: recommended MaxRAMPercentage, 66%, 24 hours CI/CO

Test # | Configuration #1, hh:mm:ss | Configuration #2, hh:mm:ss | delta, configuration #2 - #1 | Configuration #3, hh:mm:ss | delta, configuration #3 - #1
DI 100k Create #11 | | 01:39:09 | 4 1k jobs stopped | 00:55:58 |
DI 100k Create #10 | | 01:03:19 | | 00:57:00 |
DI 100k Create #9 | | 00:59:40 | | 00:56:10 |
DI 100k Create #8 | 00:55:01 | 00:58:47 | 00:03:46 | 00:55:23 |
DI 100k Create #7 | 00:54:43 | 00:58:08 | 00:03:26 | 02:46:27 | 3 1k jobs stopped
DI 100k Create #6 | 00:55:25 | 00:56:09 | 00:00:44 | 01:03:07 | 00:07:42
DI 100k Create #5 | 00:54:45 | 00:58:19 | 00:03:33 | 01:03:37 | 00:08:52
DI 100k Create #4 | 00:55:38 | 00:54:27 | 00:01:11 | 00:59:11 | 00:03:33
DI 100k Create #3 | 00:53:30 | 00:54:36 | 00:01:06 | 00:58:34 | 00:05:04
DI 100k Create #2 | 00:53:16 | 00:53:03 | 00:00:13 | 00:57:59 | 00:04:43
DI 100k Create #1 | 00:52:51 | 01:05:10 | 00:12:18 | 01:05:42 | 00:12:50

DI 100k Create #12: 00:56:32. These results were recorded after one mod-inventory container stopped and a new container was created.
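
The deltas in the table above can be recomputed from the hh:mm:ss durations. The following is a minimal Python sketch, not part of the original test tooling; the helper names `parse_hms` and `signed_delta` are hypothetical. It returns a signed difference, whereas the table appears to list magnitudes only (for example, DI #4 in configuration #2 was faster than the baseline).

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Parse an hh:mm:ss duration such as '00:58:47'."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def signed_delta(candidate: str, baseline: str) -> str:
    """Return candidate - baseline as a signed hh:mm:ss string."""
    diff = int((parse_hms(candidate) - parse_hms(baseline)).total_seconds())
    sign = "-" if diff < 0 else "+"
    diff = abs(diff)
    return f"{sign}{diff // 3600:02d}:{diff % 3600 // 60:02d}:{diff % 60:02d}"


# Durations taken from the comparison table (configuration #2 vs #1).
print(signed_delta("00:58:47", "00:55:01"))  # DI 100k Create #8
print(signed_delta("00:54:27", "00:55:38"))  # DI 100k Create #4
```

Recomputed values can differ from the table by one second in some rows, which would be consistent with the original deltas having been calculated from unrounded timestamps.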

Results

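from test #8. Response times in ms; DI + is with Data Import running, DI - is without.

Request | Configuration #1, DI + | Configuration #1, DI - | Configuration #2, DI + | Configuration #2, DI - | Configuration #3, DI + | Configuration #3, DI - | Comparison CI/CO + DI between configuration #3 and #2, %
CI | 1220 | 538 | 1078 | 486 | 1240 | 470 | 15%
CO | 2311 | 978 | 1967 | 1041 | 2121 | 965 | 7%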
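
The percentages in the last column can be reproduced as the relative change of the CI/CO + DI response times between configuration #3 and configuration #2. This is a minimal sketch, assuming the values 1240 and 1078 (CI) and 2121 and 1967 (CO) as read from the results above; `percent_change` is a hypothetical helper name.

```python
def percent_change(new: float, old: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# CI/CO response times with Data Import running (DI +).
ci_change = percent_change(1240, 1078)  # configuration #3 vs #2, check-in
co_change = percent_change(2121, 1967)  # configuration #3 vs #2, check-out

print(f"CI: {ci_change:.1f}%")
print(f"CO: {co_change:.1f}%")
```

The check-out figure comes out slightly below 8%, which the report lists as 7%.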

Detailed CICO response time for CICO with DI

Request* | Response time (avg, sec), Pure CICO | Response time (avg, sec), CICO + 100K MARC BIB Create
Check-Out Controller | 0.965 | 2.121
Check-In Controller | 0.47 | 1.24
POST_circulation/check-out-by-barcode (Submit_barcode_checkout) | 0.314 | 0.879
POST_circulation/check-in-by-barcode (Submit_barcode_checkin) | 0.208 | 0.73
GET_circulation/loans (Submit_barcode_checkout) | 0.142 | 0.259
GET_circulation/loans (Submit_patron_barcode) | 0.1 | 0.203
GET_inventory/items (Submit_barcode_checkin) | 0.055 | 0.182

*Top requests were taken where response times >= 100ms.

Service CPU Utilization

 CPU utilization all tests
AVERAGES

Service | Configuration #1 | Configuration #2 | Configuration #3
mod-inventory-b | 254.27 | 305.56 | 216.98
mod-quick-marc-b | 120.83 | 58.49 | 87.44
nginx-okapi | 91.73 | 56.1 | 95.49
mod-di-converter-storage-b | 74 | 33.08 | 80.35
okapi-b | 45.69 | 32.3 | 44.08
mod-source-record-storage-b | 39.4 | 19.14 | 48.36
mod-users-b | 38.27 | 33.1 | 33.79
mod-source-record-manager-b | 29.54 | 15.23 | 32.39
mod-inventory-storage-b | 21.93 | 13.17 | 20.09
mod-authtoken-b | 19.67 | 15.99 | 19.56
mod-configuration-b | 17.72 | 16.55 | 14.97
mod-feesfines-b | 17.12 | 16.71 | 15.97
mod-pubsub-b | 10.19 | 9.52 | 9.74
mod-permissions-b | 8.07 | 4.73 | 6.74
pub-okapi | 7.74 | 8.41 | 6.47
mod-circulation-storage-b | 3.72 | 3.08 | 3.28
mod-circulation-b | 3.03 | 3.05 | 2.95
mod-password-validator-b | 2.71 | 2.4 | 2.65
mod-data-import-b | 1.81 | 2.95 | 1.98

Configuration #1

Configuration #2

Configuration #3

Service Memory Utilization

 Memory utilization all tests
AVERAGES

Service | Configuration #1 | Configuration #2 | Configuration #3
mod-inventory-b | 99.4 | 105.84 | 102.32
mod-search-b | 96.43 | 89.1 | 96.81
mod-users-b | 63.08 | 61.58 | 60.63
mod-source-record-storage-b | 51.69 | 44.76 | 60.69
mod-permissions-b | 46.5 | 47.38 | 47.24
okapi-b | 45 | 44.86 | 50.14
mod-pubsub-b | 42.57 | 42.43 | 51.67
mod-data-import-b | 41.87 | 41.54 | 50.16
mod-di-converter-storage-b | 38.02 | 39.78 | 39.01
MAXIMUM+B47:B65 | 37.1 | 43.43 | 45.35
mod-feesfines-b | 33.2 | 33.2 | 33.43
mod-inventory-storage-b | 30.73 | 46.72 | 37.57
mod-quick-marc-b | 29.74 | 31.84 | 29.61
mod-configuration-b | 28.8 | 28.85 | 29.21
mod-circulation-b | 25.77 | 25.28 | 26.93
mod-authtoken-b | 21.52 | 21.45 | 21.45
mod-circulation-storage-b | 21.05 | 19.83 | 25.98
nginx-okapi | 4.88 | 4.97 | 4.97
pub-okapi | 4.58 | 4.58 | 4.58
MAXIMUM

Service | Configuration #1 | Configuration #2 | Configuration #3
mod-inventory-b | 107.68 | 106.21 | 107.65
mod-search-b | 100.22 | 81.1 | 100.16
mod-users-b | 62.89 | 60.27 | 62.83
okapi-b | 62.64 | 37.74 | 62.64
mod-pubsub-b | 59.38 | 34.72 | 59.24
mod-source-record-manager-b | 52.08 | 29.94 | 52.08
mod-permissions-b | 51.36 | 44.69 | 51.17
mod-inventory-storage-b | 50.08 | 46.19 | 50.16
mod-source-record-storage-b | 49.54 | 48.02 | 49.54
mod-data-import-b | 41.81 | 41.65 | 41.81
mod-di-converter-storage-b | 39.84 | 38.62 | 39.96
mod-feesfines-b | 33.48 | 33.11 | 33.48
mod-quick-marc-b | 32.33 | 32.08 | 32.49
mod-configuration-b | 29.24 | 28.46 | 29.25
mod-circulation-b | 26.27 | 24.42 | 26.27
mod-circulation-storage-b | 23.03 | 22.8 | 22.96
mod-authtoken-b | 21.79 | 21.44 | 21.79
nginx-okapi | 5.13 | 5.02 | 5.13
pub-okapi | 4.69 | 4.58 | 4.69

Configuration #1

Configuration #2

Configuration #3


DB CPU Utilization

Configuration #1

Configuration #2

Configuration #3

DB Connections

Configuration #1

Configuration #2

Configuration #3

DB load

Configuration #1

Configuration #2

Configuration #3


Appendix

Infrastructure

PTF - environment pcp1

  • 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
  • 1 database instance, writer

    Name | Memory GiB | vCPUs | max_connections
    db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731

  • MSK tenant
    • 4 m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0
    • EBS storage volume per broker 300 GiB
    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3

pcp1-pvt, Thu Apr 04 09:40:28 UTC 2024

Module | Task Def. Revision | Task Def. Revision with MaxRAMPercentage = by default | Task Def. Revision with MaxRAMPercentage = 66.0 | Module Version | Task Count | Mem Hard Limit | Mem Soft limit | CPU units | Xmx | MetaspaceSize | MaxMetaspaceSize | MaxRAMPercentage | Tests MaxRAMPercentage = recommended (66%)
mod-data-import | 20 | 23 | 24 | mod-data-import:3.0.7 | 1 | 2048 | 1844 | 256 | 1292 | 384 | 512 | 70.1 | 66.0
mod-authtoken | 16 | 17 | 18 | mod-authtoken:2.14.1 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 | 80.0 | 66.0
mod-configuration | 10 | 12 | 13 | mod-configuration:5.9.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | 85.7 | 66.0
mod-users-bl | 10 | 11 | 12 | mod-users-bl:7.6.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 | 80.0 | 66.0
mod-inventory-storage | 15 | 17 | 18 | mod-inventory-storage:27.0.4 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 | 83.4 | 66.0
mod-circulation-storage | 14 | 15 | 16 | mod-circulation-storage:17.1.7 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 | 70.0 | 66.0
mod-source-record-storage | 18 | 19 | 20 | mod-source-record-storage:5.7.5 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | 70.0 | 66.0
mod-inventory | 15 | 16 | 17 | mod-inventory:20.1.8 | 2 | 2880 | 2592 | 1024 | 1814 | 384 | 512 | 70.0 | 66.0
mod-di-converter-storage | 18 | 19 | 20 | mod-di-converter-storage:2.1.5 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | 85.7 | 66.0
mod-circulation | 14 | 15 | 16 | mod-circulation:24.0.11 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 | 70.0 | 66.0
mod-pubsub | 13 | 14 | 15 | mod-pubsub:2.11.3 | 2 | 1536 | 1440 | 1024 | 922 | 384 | 512 | 64.0 | 66.0
mod-users | 34 | 35 | 36 | mod-users:19.3.0-SNAPSHOT.677 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | 85.7 | 66.0
mod-patron-blocks | 10 | 11 | 12 | mod-patron-blocks:1.9.0 | 2 | 1024 | 896 | 1024 | 768 | 88 | 128 | 85.7 | 66.0
mod-source-record-manager | 17 | 19 | 20 | mod-source-record-manager:3.7.8 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | 70.0 | 66.0
nginx-okapi | 9 | | | nginx-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 | 0.0 |
okapi-b | 11 | 12 | 13 | okapi:5.1.2 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512 | 64.0 | 66.0
mod-feesfines | 11 | 12 | 13 | mod-feesfines:19.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | 85.7 | 66.0
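
The values in the MaxRAMPercentage column appear to match Xmx divided by the memory soft limit, expressed as a percentage. This relationship is inferred from the numbers in the table and is not stated by the report. A minimal sketch that checks it for three rows; `xmx_as_percentage` is a hypothetical helper name.

```python
def xmx_as_percentage(xmx: int, soft_limit: int) -> float:
    """Express a fixed Xmx as a percentage of the memory soft limit."""
    return round(xmx / soft_limit * 100, 1)


# (Xmx, Mem Soft limit) pairs copied from the module table.
rows = {
    "mod-data-import": (1292, 1844),
    "mod-inventory-storage": (3076, 3690),
    "mod-pubsub": (922, 1440),
}

for module, (xmx, soft_limit) in rows.items():
    print(module, xmx_as_percentage(xmx, soft_limit))
```

For these rows the computed values agree with the table (70.1, 83.4, and 64.0).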

Methodology/Approach

  1. Run CI/CO for 24 hours.
  2. Run the DI script with 12 DIs sequentially after the ramp-up period for CI/CO.
  3. After getting the baseline, repeat CI/CO with changed module revisions that carry the needed JAVA_OPTS configuration (Xmx replaced with MaxRAMPercentage).
  4. Restart the cluster before tests to get consistent results.

Additional Screenshots of graphs or charts

Spreadsheet with notes


Leonid Kolesnykov
May 7, 2024
