MaxRAMPercentage vs Xmx with Data Import and Check-ins Check-outs (Poppy)
Overview
This document contains the results of testing different module configurations in which Xmx in JAVA_OPTS is replaced with MaxRAMPercentage. The main KPIs for these tests are memory consumption of modules, Data import durations, and Check-in / Check-out response times.
Ticket: PERF-893
Summary
- Replacing Xmx with MaxRAMPercentage does not affect the main KPIs. Comparing DI durations between configurations #1 and #2, the deltas do not exceed 4 minutes (apart from the first DI run, 00:12:18), which is acceptable.
- After changing MaxRAMPercentage from the default values to the recommended 66%, Check-in / Check-out response times during data import increased by 15% for CI and by 7% for CO. Data import durations were also affected: there is an 8-minute difference in DI test #6.
- Comparing DI durations between configurations #2 and #3 (with limited MaxRAMPercentage), DI is slower in configuration #3. The previously known issue with mod-inventory (a container stops because of growing CPU usage) appears much earlier in configuration #3 with limited MaxRAMPercentage.
- CPU utilization top three modules:
Service | Configuration #1 | Configuration #2 | Configuration #3 |
---|---|---|---|
mod-inventory-b | 254.27 | 305.56 | 216.98 |
mod-quick-marc-b | 120.83 | 58.49 | 87.44 |
nginx-okapi | 91.73 | 56.1 | 95.49 |
- Memory consumption top three modules:
Service | Configuration #1 | Configuration #2 | Configuration #3 |
---|---|---|---|
mod-inventory-b | 99.4 | 105.84 | 102.32 |
mod-search-b | 96.43 | 89.1 | 96.81 |
mod-users-b | 63.08 | 61.58 | 60.63 |
- Service CPU utilization changed between tests because of the cluster restarts
- RDS CPU utilization was close to 96% during CI/CO + Data import
- RDS DB Connections were close to 600 during CI/CO + Data import
Test Runs
Test / Configuration # | Scenario | Load level | JAVA_OPTS |
---|---|---|---|
1 | 20 vUsers longevity CI/CO + DI MARC Bib Create | 12 Data Import tests with 100K sequentially (with 5 min pause) | Xmx (baseline) |
2 | 20 vUsers longevity CI/CO + DI MARC Bib Create | 12 Data Import tests with 100K sequentially (with 5 min pause) | -XX:MaxRAMPercentage, default values |
3 | 20 vUsers longevity CI/CO + DI MARC Bib Create | 12 Data Import tests with 100K sequentially (with 5 min pause) | -XX:MaxRAMPercentage, Recommended - 66% |
Test Results
Comparison
The following table compares test results
Configuration #1 BASELINE, XMX | Configuration #2 default MaxRAMPercentage | Configuration #3 recommended MaxRAMPercentage, 66% | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
16 hours CICO | Test # | hh:mm:ss | 24 hours CI/CO | Test # | hh:mm:ss | delta, configuration #2 - #1 | 24 hours CI/CO | Test # | hh:mm:ss | delta, configuration #3 - #1 |
DI 100k Create #12 | 00:56:32 | These results after 1 container for mod-inventory stopped. A new 1 container was created. | ||||||||
DI 100k Create #11 | 01:39:09 | 4 1k jobs stopped | DI 100k Create #11 | 00:55:58 | ||||||
DI 100k Create #10 | 01:03:19 | DI 100k Create #10 | 00:57:00 | |||||||
DI 100k Create #9 | 00:59:40 | DI 100k Create #9 | 00:56:10 | |||||||
DI 100k Create #8 | 00:55:01 | DI 100k Create #8 | 00:58:47 | 00:03:46 | DI 100k Create #8 | 00:55:23 | ||||
DI 100k Create #7 | 00:54:43 | DI 100k Create #7 | 00:58:08 | 00:03:26 | DI 100k Create #7 | 02:46:27 | 3 1k jobs stopped | |||
DI 100k Create #6 | 00:55:25 | DI 100k Create #6 | 00:56:09 | 00:00:44 | DI 100k Create #6 | 01:03:07 | 00:07:42 | |||
DI 100k Create #5 | 00:54:45 | DI 100k Create #5 | 00:58:19 | 00:03:33 | DI 100k Create #5 | 01:03:37 | 00:08:52 | |||
DI 100k Create #4 | 00:55:38 | DI 100k Create #4 | 00:54:27 | 00:01:11 | DI 100k Create #4 | 00:59:11 | 00:03:33 | |||
DI 100k Create #3 | 00:53:30 | DI 100k Create #3 | 00:54:36 | 00:01:06 | DI 100k Create #3 | 00:58:34 | 00:05:04 | |||
DI 100k Create #2 | 00:53:16 | DI 100k Create #2 | 00:53:03 | 00:00:13 | DI 100k Create #2 | 00:57:59 | 00:04:43 | |||
DI 100k Create #1 | 00:52:51 | DI 100k Create #1 | 01:05:10 | 00:12:18 | DI 100k Create #1 | 01:05:42 | 00:12:50 | |||
Results from test #8 | DI + | DI - | DI + | DI - | DI + | DI - | Comparison CI/CO + DI between configuration #3 and #2, % | |||
CI | 1220 | 538 | CI | 1078 | 486 | CI | 1240 | 470 | 15% | |
CO | 2311 | 978 | CO | 1967 | 1041 | CO | 2121 | 965 | 7% |
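The delta and percentage columns above are plain arithmetic on the measured values. A minimal sketch of how they can be recomputed is shown here, using the DI 100k Create #6 durations and the "DI +" response times for configurations #2 and #3 from the table. The helper names (`parse_hms`, `format_hms`, `percent_change`) are illustrative and not part of any test tooling.

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Parse an hh:mm:ss duration string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def format_hms(delta: timedelta) -> str:
    """Format a timedelta as hh:mm:ss, prefixing negative values with '-'."""
    total = int(delta.total_seconds())
    sign = "-" if total < 0 else ""
    total = abs(total)
    return f"{sign}{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def percent_change(baseline: float, value: float) -> float:
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


# DI 100k Create #6 durations, copied from the comparison table.
di6 = {"config1": "00:55:25", "config2": "00:56:09", "config3": "01:03:07"}
baseline = parse_hms(di6["config1"])
delta_2_vs_1 = format_hms(parse_hms(di6["config2"]) - baseline)
delta_3_vs_1 = format_hms(parse_hms(di6["config3"]) - baseline)

# CI/CO response times during data import ("DI +"), configuration #2 vs #3.
ci_change = percent_change(1078, 1240)
co_change = percent_change(1967, 2121)

print(delta_2_vs_1, delta_3_vs_1)
print(f"CI: {ci_change:.1f}%  CO: {co_change:.1f}%")
```

With signed deltas, DI #4 and DI #2 come out negative for configuration #2 (it was faster than the baseline there), while the table lists only the magnitude. The CO change computes to roughly 7.8%, which the table appears to report truncated as 7%.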
Detailed CICO response time for CICO with DI
Request* | Pure CICO, response time (avg, sec) | CICO + 100K MARC BIB Create, response time (avg, sec) |
---|---|---|
Check-Out Controller | 0.965 | 2.121 |
Check-In Controller | 0.47 | 1.24 |
POST_circulation/check-out-by-barcode (Submit_barcode_checkout) | 0.314 | 0.879 |
POST_circulation/check-in-by-barcode (Submit_barcode_checkin) | 0.208 | 0.73 |
GET_circulation/loans (Submit_barcode_checkout) | 0.142 | 0.259 |
GET_circulation/loans (Submit_patron_barcode) | 0.1 | 0.203 |
GET_inventory/items (Submit_barcode_checkin) | 0.055 | 0.182 |
*Top requests were taken where response times >= 100ms.
Service CPU Utilization
Configuration #1
Configuration #2
Configuration #3
Service Memory Utilization
Configuration #1
Configuration #2
Configuration #3
DB CPU Utilization
Configuration #1
Configuration #2
Configuration #3
DB Connections
Configuration #1
Configuration #2
Configuration #3
DB load
Configuration #1
Configuration #2
Configuration #3
Appendix
Infrastructure
PTF environment: pcp1
- 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 1 database instance, writer
Name | Memory GiB | vCPUs | max_connections |
---|---|---|---|
db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731 |
- MSK tenant
- 4 m5.2xlarge brokers in 2 zones
- Apache Kafka version 2.8.0
- EBS storage volume per broker 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3
Module | Task Def. Revision | Task Def. Revision with MaxRAMPercentage = by default | Task Def. Revision with MaxRAMPercentage = 66.0 | Module Version | Task Count | Mem Hard Limit | Mem Soft limit | CPU units | Xmx | MetaspaceSize | MaxMetaspaceSize | MaxRAMPercentage Tests | MaxRAMPercentage = recommended (66%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
pcp1-pvt | |||||||||||||
Thu Apr 04 09:40:28 UTC 2024 | |||||||||||||
mod-data-import | 20 | 23 | 24 | mod-data-import:3.0.7 | 1 | 2048 | 1844 | 256 | 1292 | 384 | 512 | 70.1 | 66.0 |
mod-authtoken | 16 | 17 | 18 | mod-authtoken:2.14.1 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 | 80.0 | 66.0 |
mod-configuration | 10 | 12 | 13 | mod-configuration:5.9.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | 85.7 | 66.0 |
mod-users-bl | 10 | 11 | 12 | mod-users-bl:7.6.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 | 80.0 | 66.0 |
mod-inventory-storage | 15 | 17 | 18 | mod-inventory-storage:27.0.4 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 | 83.4 | 66.0 |
mod-circulation-storage | 14 | 15 | 16 | mod-circulation-storage:17.1.7 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 | 70.0 | 66.0 |
mod-source-record-storage | 18 | 19 | 20 | mod-source-record-storage:5.7.5 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | 70.0 | 66.0 |
mod-inventory | 15 | 16 | 17 | mod-inventory:20.1.8 | 2 | 2880 | 2592 | 1024 | 1814 | 384 | 512 | 70.0 | 66.0 |
mod-di-converter-storage | 18 | 19 | 20 | mod-di-converter-storage:2.1.5 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | 85.7 | 66.0 |
mod-circulation | 14 | 15 | 16 | mod-circulation:24.0.11 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 | 70.0 | 66.0 |
mod-pubsub | 13 | 14 | 15 | mod-pubsub:2.11.3 | 2 | 1536 | 1440 | 1024 | 922 | 384 | 512 | 64.0 | 66.0 |
mod-users | 34 | 35 | 36 | mod-users:19.3.0-SNAPSHOT.677 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | 85.7 | 66.0 |
mod-patron-blocks | 10 | 11 | 12 | mod-patron-blocks:1.9.0 | 2 | 1024 | 896 | 1024 | 768 | 88 | 128 | 85.7 | 66.0 |
mod-source-record-manager | 17 | 19 | 20 | mod-source-record-manager:3.7.8 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | 70.0 | 66.0 |
nginx-okapi | 9 | | | nginx-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 | 0.0 | |
okapi-b | 11 | 12 | 13 | okapi:5.1.2 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512 | 64.0 | 66.0 |
mod-feesfines | 11 | 12 | 13 | mod-feesfines:19.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | 85.7 | 66.0 |
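The values in the "MaxRAMPercentage Tests" column look like the baseline Xmx expressed as a percentage of the memory soft limit. The sketch below checks that reading against four rows of the table and also estimates the heap ceiling at 66%. Two things are assumptions rather than statements from the test setup: that the memory figures are in MiB, and that the JVM takes the container hard limit as its available RAM when applying `-XX:MaxRAMPercentage`.

```python
# (module, memory soft limit, memory hard limit, Xmx, "MaxRAMPercentage Tests" value)
# Values are copied from the table above; the unit is assumed to be MiB.
MODULES = [
    ("mod-data-import", 1844, 2048, 1292, 70.1),
    ("mod-authtoken", 1152, 1440, 922, 80.0),
    ("mod-inventory", 2592, 2880, 1814, 70.0),
    ("mod-pubsub", 1440, 1536, 922, 64.0),
]


def xmx_as_percentage(xmx: int, soft_limit: int) -> float:
    """Express a fixed Xmx as a percentage of the soft limit, one decimal."""
    return round(xmx / soft_limit * 100, 1)


def max_heap(limit: int, max_ram_percentage: float) -> float:
    """Heap ceiling derived from MaxRAMPercentage for a given RAM limit."""
    return limit * max_ram_percentage / 100


for name, soft, hard, xmx, table_pct in MODULES:
    derived = xmx_as_percentage(xmx, soft)
    # Assumption: the JVM sees the hard limit as its available RAM.
    heap_66 = max_heap(hard, 66.0)
    print(f"{name}: Xmx={xmx} derived={derived}% table={table_pct}% heap@66%={heap_66:.0f}")
```

For these four rows the derived percentage matches the table value, which supports the reading but does not confirm how the values were actually chosen.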
Methodology/Approach
- Run CI/CO for 24 hours.
- Run the DI script with 12 DIs sequentially after the ramp-up period for CI/CO.
- After getting the baseline, repeat CI/CO with changed module revisions that carry the required JAVA_OPTS configuration: Xmx replaced with MaxRAMPercentage.
- Before the tests, restart the cluster to get consistent results.
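The JAVA_OPTS change between revisions can be illustrated with a small sketch. The actual JAVA_OPTS strings are not included in this report, so the baseline string below is hypothetical: it is assembled from the Xmx, MetaspaceSize and MaxMetaspaceSize values of the mod-data-import row, with an assumed `m` unit suffix. The `replace_xmx` helper is likewise illustrative.

```python
import re


def replace_xmx(java_opts: str, max_ram_percentage: float) -> str:
    """Drop any -Xmx flag and set -XX:MaxRAMPercentage instead."""
    kept = [
        opt
        for opt in java_opts.split()
        if not re.fullmatch(r"-Xmx\d+[kKmMgG]?", opt)
        and not opt.startswith("-XX:MaxRAMPercentage=")
    ]
    kept.append(f"-XX:MaxRAMPercentage={max_ram_percentage:.1f}")
    return " ".join(kept)


# Hypothetical baseline options for mod-data-import (configuration #1).
baseline_opts = "-XX:MetaspaceSize=384m -XX:MaxMetaspaceSize=512m -Xmx1292m"

config2_opts = replace_xmx(baseline_opts, 70.1)  # default value from the table
config3_opts = replace_xmx(baseline_opts, 66.0)  # recommended value

print(config2_opts)
print(config3_opts)
```

Removing `-Xmx` rather than keeping both flags matters because an explicit maximum heap size takes precedence over the percentage-based setting.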
Additional Screenshots of graphs or charts
Spreadsheet with notes
- Grafana links
- Configuration #1
- CI/CO with DI http://carrier-io.int.folio.ebsco.com/grafana/d/elIt9zCnz/jmeter-performance-with-average-latency?orgId=1&from=1712266178037&to=1712268036922&var-percentile=95&var-test_type=baseline&var-test=circulation_checkInCheckOut_Poppy_5&var-env=int&var-grouping=1s&var-low_limit=250&var-high_limit=750&var-db_name=jmeter&var-sampler_type=All&var-Request=All
- CI/CO without DI http://carrier-io.int.folio.ebsco.com/grafana/d/elIt9zCnz/jmeter-performance-with-average-latency?orgId=1&from=1712269800000&to=1712271600000&var-percentile=95&var-test_type=baseline&var-test=circulation_checkInCheckOut_Poppy_5&var-env=int&var-grouping=1s&var-low_limit=250&var-high_limit=750&var-db_name=jmeter&var-sampler_type=All&var-Request=All
- Configuration #2
- CI/CO with DI http://carrier-io.int.folio.ebsco.com/grafana/d/elIt9zCnz/jmeter-performance-with-average-latency?orgId=1&from=1712586527108&to=1712588513315&var-percentile=95&var-test_type=baseline&var-test=circulation_checkInCheckOut_Poppy_5&var-env=int&var-grouping=1s&var-low_limit=250&var-high_limit=750&var-db_name=jmeter&var-sampler_type=All&var-Request=All
- CI/CO without DI http://carrier-io.int.folio.ebsco.com/grafana/d/elIt9zCnz/jmeter-performance-with-average-latency?orgId=1&from=1712589302156&to=1712589604755&var-percentile=95&var-test_type=baseline&var-test=circulation_checkInCheckOut_Poppy_5&var-env=int&var-grouping=1s&var-low_limit=250&var-high_limit=750&var-db_name=jmeter&var-sampler_type=All&var-Request=All
- Configuration #3
- CI/CO with DI http://carrier-io.int.folio.ebsco.com/grafana/d/elIt9zCnz/jmeter-performance-with-average-latency?orgId=1&from=1712690171214&to=1712690526374&var-percentile=95&var-test_type=baseline&var-test=circulation_checkInCheckOut_Poppy_5&var-env=int&var-grouping=1s&var-low_limit=250&var-high_limit=750&var-db_name=jmeter&var-sampler_type=All&var-Request=All
- CI/CO without DI http://carrier-io.int.folio.ebsco.com/grafana/d/elIt9zCnz/jmeter-performance-with-average-latency?orgId=1&from=1712687301153&to=1712689643642&var-percentile=95&var-test_type=baseline&var-test=circulation_checkInCheckOut_Poppy_5&var-env=int&var-grouping=1s&var-low_limit=250&var-high_limit=750&var-db_name=jmeter&var-sampler_type=All&var-Request=All