Kafka Zookeeper Mode vs KRaft Mode MSK - instance type comparison

Overview

This document compares test results for the Check-in/Check-out and Data Import (MARC Bibliographic records) workflows in the Quesnelia release on two MSK clusters that use the same instance type but different metadata modes: kafka.m7g.2xlarge in Zookeeper mode against kafka.m7g.2xlarge in KRaft mode.

Tickets: PERF-921 vs PERF-936

Summary

  • Comparing kafka.m7g.2xlarge Zookeeper metadata mode against kafka.m7g.2xlarge KRaft metadata mode:
    • No difference in MSK resource utilization (CPU and Disk) between the two MSK clusters.
    • In KRaft mode, service memory utilization was lower for mod-inventory-b (by 11%) and mod-source-record-storage-b (by 9%), and higher for mod-data-import-b (by 14%). For the remaining modules the difference was about 7% or less.
    • No significant difference in service CPU utilization:
      • Service CPU utilization in Zookeeper mode: mod-inventory-b - 136%, mod-quick-marc-b - 96%, mod-di-converter-storage-b - 100%, nginx-okapi - 88%; the remaining modules used about 50% or less.
      • Service CPU utilization in KRaft mode: mod-inventory-b - 133%, mod-quick-marc-b - 75%, mod-di-converter-storage-b - 103%, nginx-okapi - 75%; the remaining modules used about 50% or less.
    • No difference in Check-in/Check-out response time.
    • Data import durations fluctuate from test to test, but the imports ran stably and without issues.

Test Runs 

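| Test # | MSK instance type | Scenario | Load level |
|---|---|---|---|
| 1 | kafka.m7g.2xlarge, zookeeper mode | CICO + DI MARC Bib Create | 8 users + 5K, 25K sequentially |
| 2 | kafka.m7g.2xlarge, zookeeper mode | DI MARC Bib Create | 5K, 25K sequentially |
| 3 | kafka.m7g.2xlarge, zookeeper mode | CICO + DI MARC Bib Update | 8 users + 5K, 25K sequentially |
| 4 | kafka.m7g.2xlarge, zookeeper mode | DI MARC Bib Update | 5K, 25K sequentially |
| 5 | kafka.m7g.2xlarge, KRaft mode | CICO + DI MARC Bib Create | 8 users + 5K, 25K sequentially |
| 6 | kafka.m7g.2xlarge, KRaft mode | DI MARC Bib Create | 5K, 25K sequentially |
| 7 | kafka.m7g.2xlarge, KRaft mode | CICO + DI MARC Bib Update | 8 users + 5K, 25K sequentially |
| 8 | kafka.m7g.2xlarge, KRaft mode | DI MARC Bib Update | 5K, 25K sequentially |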

Test Results

This table shows results of Check-In/Check-out and Data Import create and update jobs.

MSK instance: kafka.m7g.2xlarge, metadata mode - ZooKeeper

| Job profile | File size | DI Duration without CI/CO | DI Duration with CI/CO | CI with DI Average, sec | CO with DI Average, sec |
|---|---|---|---|---|---|
| PTF - Create 2 | 5k | 00:03:05 | 00:02:39 | 0.707 | 1.104 |
| PTF - Create 2 | 25k | 00:12:03 | 00:12:08 | 0.718 | 1.129 |
| PTF - Updates Success - 6 | 5k | 00:03:36 | 00:03:34 | 0.742 | 1.124 |
| PTF - Updates Success - 6 | 25k | 00:17:05 | 00:17:33 | 0.756 | 1.148 |

MSK instance: kafka.m7g.2xlarge, metadata mode - KRaft

| Job profile | File size | DI Duration without CI/CO | DI Duration with CI/CO | CI with DI Average, sec | CO with DI Average, sec |
|---|---|---|---|---|---|
| PTF - Create 2 | 5k | 00:02:49 | 00:02:39 | 0.765 | 1.118 |
| PTF - Create 2 | 25k | 00:13:31 | 00:12:04 | 0.777 | 1.186 |
| PTF - Updates Success - 6 | 5k | 00:04:36 | 00:04:31 | 0.706 | 1.095 |
| PTF - Updates Success - 6 | 25k | 00:24:07 | 00:21:50 | 0.74 | 1.16 |

Check-in/Check-out without DI

Response time, sec. MSK instance: kafka.m7g.2xlarge

| Scenario | Load level | Request | Zookeeper mode, 95 perc | Zookeeper mode, average | KRaft mode, 95 perc | KRaft mode, average |
|---|---|---|---|---|---|---|
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-in | 0.72 | 0.606 | 0.695 | 0.583 |
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-out | 1.241 | 0.969 | 1.151 | 0.944 |

Comparison

Data Import durations and Check-In/Check-Out response time comparison

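  • Response times of CI/CO with Data import are practically the same on both MSK clusters
  • Deltas in this table are calculated as Zookeeper mode minus KRaft mode, so a negative duration delta means the job took longer in KRaft mode

| Job Profile | File size | DELTA, DI without CI/CO | DELTA, DI+CI/CO | DELTA, CI with DI, sec | DELTA, CO with DI, sec |
|---|---|---|---|---|---|
| PTF - Create 2 | 5k | 00:00:16 | 00:00:00 | -0.058 | -0.014 |
| PTF - Create 2 | 25k | -00:01:28 | 00:00:04 | -0.059 | -0.057 |
| PTF - Updates Success - 6 | 5k | -00:01:00 | -00:00:57 | 0.036 | 0.029 |
| PTF - Updates Success - 6 | 25k | -00:07:02 | -00:04:17 | 0.016 | -0.012 |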
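
The duration deltas above can be reproduced from the Test Results tables. The following is a minimal Python sketch, not part of the original test tooling; the helper names (`parse_hms`, `format_signed`, `di_delta`) are hypothetical, and the sign convention (Zookeeper minus KRaft) is inferred from the reported values.

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Parse an HH:MM:SS duration string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def format_signed(delta: timedelta) -> str:
    """Format a timedelta as a signed HH:MM:SS string, as used in the table."""
    total = int(delta.total_seconds())
    sign = "-" if total < 0 else ""
    hours, rest = divmod(abs(total), 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{sign}{hours:02d}:{minutes:02d}:{seconds:02d}"


def di_delta(zookeeper: str, kraft: str) -> str:
    """Delta as Zookeeper minus KRaft; negative means KRaft took longer."""
    return format_signed(parse_hms(zookeeper) - parse_hms(kraft))


# DI durations without CI/CO, copied from the Test Results tables.
print(di_delta("00:03:05", "00:02:49"))  # 00:00:16  (Create, 5k)
print(di_delta("00:17:05", "00:24:07"))  # -00:07:02 (Update, 25k)
```

Working in whole seconds keeps the formatting of negative values simple, since a negative `timedelta` does not print in the signed HH:MM:SS form used in the table.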

Check-in/Check-out without DI

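  • Check-in/Check-out performs the same on both MSK clusters. The difference in response times is small enough to be neglected.
  • The delta in this table is calculated as KRaft mode minus Zookeeper mode for the average response time.

Response time, sec. MSK instance: kafka.m7g.2xlarge

| Scenario | Load level | Request | Zookeeper mode, 95 perc | Zookeeper mode, average | KRaft mode, 95 perc | KRaft mode, average | Delta, average |
|---|---|---|---|---|---|---|---|
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-in | 0.72 | 0.606 | 0.695 | 0.583 | -0.023 |
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-out | 1.241 | 0.969 | 1.151 | 0.944 | -0.025 |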
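
To put the absolute deltas into perspective, they can be expressed relative to the Zookeeper-mode averages. This is an illustrative Python sketch only; `relative_change_pct` is a hypothetical helper, and the input values are the averages from the table above.

```python
def relative_change_pct(zookeeper: float, kraft: float) -> float:
    """Change of the KRaft value relative to the Zookeeper value, in percent."""
    return (kraft - zookeeper) / zookeeper * 100


# Average response times (sec) from the table above: (Zookeeper, KRaft).
averages = {"Check-in": (0.606, 0.583), "Check-out": (0.969, 0.944)}

for request, (zk, kraft) in averages.items():
    change = relative_change_pct(zk, kraft)
    print(f"{request}: {kraft - zk:+.3f} sec ({change:+.1f}%)")
```

Both requests differ by only a few percent between the two clusters, which is consistent with treating the difference as negligible.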

MSK resource utilization (CPU)

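| Load scenario | Brokers | MSK instance: kafka.m7g.2xlarge Zookeeper mode | MSK instance: kafka.m7g.2xlarge KRaft mode | Delta, % |
|---|---|---|---|---|
| CICO | 1 | 10 | 9 | -1 |
| CICO | 2 | 10 | 9 | -1 |
| CICO+DI | 1 | 31 | 32 | 1 |
| CICO+DI | 2 | 32 | 30 | -2 |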

Response time

MSK instance: kafka.m7g.2xlarge Zookeeper mode

MSK instance: kafka.m7g.2xlarge Kraft mode

Service CPU Utilization

CPU utilization table

MSK instance: kafka.m7g.2xlarge Zookeeper mode vs MSK instance: kafka.m7g.2xlarge KRaft mode

MSK instance: kafka.m7g.2xlarge Zookeeper mode

| Module | CPU (CICO + 25k Create), % | CPU (CICO + 25k Update), % |
|---|---|---|
| mod-inventory-b | 115.21 | 136.94 |
| mod-quick-marc-b | 95.15 | 96.4 |
| mod-di-converter-storage-b | 81.26 | 100.43 |
| nginx-okapi | 70.58 | 88.94 |
| okapi-b | 38.89 | 50.55 |
| mod-source-record-storage-b | 31.61 | 39.13 |
| mod-users-b | 23.6 | 22.12 |
| mod-inventory-storage-b | 21.37 | 19.9 |
| mod-feesfines-b | 18.28 | 9.11 |
| mod-configuration-b | 17.6 | 10.52 |
| mod-source-record-manager-b | 17.39 | 18.27 |
| mod-authtoken-b | 17.04 | 13.37 |
| mod-dcb-b | 13.23 | 12.33 |
| mod-search-b | 7.95 | 1.83 |
| mod-pubsub-b | 6.82 | 6.49 |
| pub-okapi | 3.56 | 3.64 |
| mod-circulation-storage-b | 3.35 | 2.7 |
| mod-patron-b | 2.72 | 2.79 |
| mod-entities-links-b | 2.24 | 2.23 |
| mod-circulation-b | 1.98 | 1.8 |
| mod-data-import-b | 1.76 | 1.88 |
| edge-patron-b | 1.13 | 1.16 |
| mod-patron-blocks-b | 0.97 | 1 |
| mod-users-bl-b | 0.68 | 0.68 |
| pub-edge | 0.06 | 0.06 |
| pub-edge | 0.07 | 0.07 |

MSK instance: kafka.m7g.2xlarge KRaft mode

| Module | CPU (CICO + 25k Create), % | CPU (CICO + 25k Update), % |
|---|---|---|
| mod-inventory-b | 139.1 | 133.42 |
| mod-di-converter-storage-b | 103.49 | 96.49 |
| mod-quick-marc-b | 75.45 | 72.77 |
| nginx-okapi | 75.33 | 73.68 |
| okapi-b | 41.74 | 51.2 |
| mod-source-record-storage-b | 38.11 | 34.22 |
| mod-inventory-storage-b | 23.13 | 26.1 |
| mod-source-record-manager-b | 17.16 | 16.33 |
| mod-users-b | 9.18 | 21.82 |
| mod-dcb-b | 8.32 | 9.84 |
| mod-search-b | 7.19 | 8.53 |
| mod-pubsub-b | 4.32 | 5.66 |
| mod-configuration-b | 3.24 | 10.32 |
| mod-oa-b | 2.95 | 3.35 |
| mod-patron-b | 2.86 | 2.4 |
| mod-feesfines-b | 2.51 | 9.08 |
| mod-authtoken-b | 2.17 | 12.71 |
| mod-entities-links-b | 2.15 | 1.81 |
| mod-circulation-storage-b | 2.01 | 2.9 |
| mod-data-import-b | 1.6 | 1.58 |
| edge-patron-b | 1.13 | 1.02 |
| mod-users-bl-b | 0.61 | 1.11 |
| mod-circulation-b | 0.55 | 2.09 |
| mod-patron-blocks-b | 0.41 | 0.95 |
| pub-okapi | 0.18 | 3.98 |
| pub-edge | 0.05 | 0.12 |

DI MARC BIB Create and Update + CICO

MSK instance: kafka.m7g.2xlarge Zookeeper mode

MSK instance: kafka.m7g.2xlarge Kraft mode

Service Memory Utilization

 MSK instance: kafka.m7g.2xlarge Zookeeper mode vs MSK instance: kafka.m7g.2xlarge KRaft mode
  • Modules with a service memory consumption delta above or close to 10%:
    • In KRaft mode, memory utilization was lower for mod-inventory-b (by 11%) and mod-source-record-storage-b (by 9%), and higher for mod-data-import-b (by 14%)
  • Delta is calculated as KRaft mode minus Zookeeper mode

| Module | Memory, MSK instance: kafka.m7g.2xlarge Zookeeper mode | Memory, MSK instance: kafka.m7g.2xlarge KRaft mode | Delta |
|---|---|---|---|
| mod-dcb-b | 74.37 | 74.77 | 0.4 |
| mod-inventory-b | 70.81 | 59.63 | -11.18 |
| mod-users-b | 50.37 | 53.1 | 2.73 |
| mod-di-converter-storage-b | 46.7 | 49.67 | 2.97 |
| mod-feesfines-b | 45.44 | 44.03 | -1.41 |
| mod-inventory-storage-b | 33.4 | 30.49 | -2.91 |
| mod-source-record-storage-b | 55.53 | 46.38 | -9.15 |
| okapi-b | 42.5 | 49.68 | 7.18 |
| mod-data-import-b | 43.55 | 57.77 | 14.22 |
| mod-patron-blocks-b | 42.38 | 42.7 | 0.32 |
| mod-search-b | 45.58 | 48 | 2.42 |
| mod-users-bl-b | 45.36 | 45.82 | 0.46 |
| mod-configuration-b | 38.68 | 39.73 | 1.05 |
| mod-source-record-manager-b | 41.91 | 38.7 | -3.21 |
| mod-pubsub-b | 35.94 | 36.32 | 0.38 |
| mod-quick-marc-b | 42.6 | 36.99 | -5.61 |
| mod-patron-b | 30.52 | 30.19 | -0.33 |
| mod-entities-links-b | 34.49 | 30.56 | -3.93 |
| mod-authtoken-b | 27.32 | 27.42 | 0.1 |
| mod-circulation-b | 25.15 | 25.01 | -0.14 |
| edge-patron-b | 22.38 | 23.16 | 0.78 |
| mod-circulation-storage-b | 22.34 | 28.98 | 6.64 |
| nginx-okapi | 4.58 | 4.69 | 0.11 |
| pub-okapi | 4.46 | 4.46 | 0 |
| pub-edge | 4.41 | 4.35 | -0.06 |
| pub-edge | 4.35 | 4.35 | 0 |
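
The modules called out above can be picked from the table mechanically. Below is a small Python sketch under stated assumptions: the 9.0 cut-off for "above or close to 10%" is a hypothetical value chosen for illustration (the report does not define one), `notable_deltas` is a hypothetical helper, and only a few rows of the table are copied in.

```python
# Memory values copied from a few rows of the table above:
# module -> (Zookeeper mode, KRaft mode).
memory = {
    "mod-inventory-b": (70.81, 59.63),
    "mod-source-record-storage-b": (55.53, 46.38),
    "okapi-b": (42.5, 49.68),
    "mod-data-import-b": (43.55, 57.77),
    "mod-quick-marc-b": (42.6, 36.99),
}

# Hypothetical cut-off for "above or close to 10%"; not defined in the report.
THRESHOLD = 9.0


def notable_deltas(rows, threshold=THRESHOLD):
    """Return {module: delta} where |KRaft - Zookeeper| >= threshold."""
    result = {}
    for module, (zookeeper, kraft) in rows.items():
        delta = round(kraft - zookeeper, 2)
        if abs(delta) >= threshold:
            result[module] = delta
    return result


for module, delta in notable_deltas(memory).items():
    print(f"{module}: {delta:+.2f}")
```

With this cut-off the sketch selects mod-inventory-b, mod-source-record-storage-b and mod-data-import-b, the same three modules listed in the bullet above; okapi-b (7.18) and mod-quick-marc-b (-5.61) fall below it.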

MSK instance: kafka.m7g.2xlarge Zookeeper mode

MSK instance: kafka.m7g.2xlarge Kraft mode

DB CPU Utilization

Average DB CPU utilization was 90% during both the create and the update job tests on both MSK clusters. During the Check-In/Check-Out period without DI, DB CPU utilization was 15%.

MSK instance: kafka.m7g.2xlarge Zookeeper mode

MSK instance: kafka.m7g.2xlarge Kraft mode


DB Connections

On both MSK clusters, the average connection count was about 900 for create jobs with CI/CO, 860 for update jobs with CI/CO, and 770 for CI/CO without data import.

MSK instance: kafka.m7g.2xlarge Zookeeper mode

MSK instance: kafka.m7g.2xlarge Kraft mode

MSK instance resource utilization

 MSK resources table

MSK resource utilization (CPU)

  • No difference between two MSK clusters
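| Load scenario | Brokers | MSK instance: kafka.m7g.2xlarge Zookeeper mode | MSK instance: kafka.m7g.2xlarge KRaft mode | Delta, % |
|---|---|---|---|---|
| CICO | 1 | 10 | 9 | -1 |
| CICO | 2 | 10 | 9 | -1 |
| CICO+DI | 1 | 31 | 32 | 1 |
| CICO+DI | 2 | 32 | 30 | -2 |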

MSK resource utilization (Disk) grew gradually to 10% during tests with kafka.m5.2xlarge

Disk usage by broker

MSK instance: kafka.m7g.2xlarge Zookeeper mode

MSK instance: kafka.m7g.2xlarge Kraft mode

CPU (User) usage by broker

MSK instance: kafka.m7g.2xlarge Zookeeper mode

MSK instance: kafka.m7g.2xlarge Kraft mode

DB load

MSK instance: kafka.m7g.2xlarge Zookeeper mode


Top SQL-queries:


MSK instance: kafka.m7g.2xlarge Kraft mode


Top SQL-queries:



Appendix

Infrastructure

PTF environment: qcp1

  • 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
  • 1 database instance, writer

    | Name | Memory, GiB | vCPUs | max_connections |
    |---|---|---|---|
    | db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731 |
  • MSK perf-921-g2
    • m7g.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.2.tiered

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=2


  • MSK ptf-KRaft-mode
    • m7g.2xlarge brokers in 2 zones (total 2 brokers)
    • Apache Kafka version 3.7.x

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3
    • revision - 26
    • metadata mode - KRaft
    • Total topics: 1474
    • Total partitions: 11909

The task count for module mod-graphql was set to 0 before the test start.

Modules

 All qcp1 modules
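| Module | Task Definition Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft Limit | CPU Units | Xmx | Metaspace Size | Max Metaspace Size |
|---|---|---|---|---|---|---|---|---|---|
| mod-remote-storage | 5 | mod-remote-storage:3.2.0 | 2 | 4920 | 4472 | 1024 | 3960 | 512 | 512 |
| mod-ncip | 5 | mod-ncip:1.14.4 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-finance-storage | 5 | mod-finance-storage:8.6.0 | 2 | 1024 | 896 | 1024 | 700 | 88 | 128 |
| mod-agreements | 5 | mod-agreements:7.0.0 | 2 | 1592 | 1488 | 128 | 0 | 0 | 0 |
| mod-ebsconet | 5 | mod-ebsconet:2.2.0 | 2 | 1248 | 1024 | 128 | 700 | 128 | 256 |
| mod-organizations | 5 | mod-organizations:1.9.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-consortia | 3 | mod-consortia:1.1.0 | 2 | 5136 | 4776 | 1024 | 4416 | 384 | 512 |
| edge-sip2 | 3 | edge-sip2:3.2.0-SNAPSHOT.209 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-serials-management | 5 | mod-serials-management:1.0.0 | 2 | 2480 | 2312 | 128 | 1792 | 384 | 512 |
| mod-settings | 5 | mod-settings:1.0.3 | 2 | 1024 | 896 | 200 | 768 | 88 | 128 |
| mod-data-import | 8 | mod-data-import:3.1.0 | 1 | 2048 | 1844 | 256 | 1292 | 384 | 512 |
| edge-dematic | 5 | edge-dematic:2.2.0 | 1 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-search | 5 | mod-search:3.2.0 | 2 | 2592 | 2480 | 2048 | 1440 | 512 | 1024 |
| mod-inn-reach | 3 | mod-inn-reach:3.2.0-SNAPSHOT.86 | 2 | 3600 | 3240 | 1024 | 2880 | 512 | 1024 |
| mod-tags | 5 | mod-tags:2.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-courses | 5 | edge-courses:1.4.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-authtoken | 6 | mod-authtoken:2.15.1 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
| mod-inventory-update | 5 | mod-inventory-update:3.3.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-notify | 5 | mod-notify:3.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-configuration | 5 | mod-configuration:5.10.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-orders-storage | 5 | mod-orders-storage:13.7.0 | 2 | 1024 | 896 | 512 | 700 | 88 | 128 |
| edge-caiasoft | 5 | edge-caiasoft:2.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-login-saml | 5 | mod-login-saml:2.8.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-erm-usage-harvester | 5 | mod-erm-usage-harvester:4.5.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-licenses | 5 | mod-licenses:6.0.0 | 2 | 2480 | 2312 | 512 | 1792 | 384 | 512 |
| mod-gobi | 5 | mod-gobi:2.8.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-password-validator | 5 | mod-password-validator:3.2.0 | 2 | 1440 | 1298 | 128 | 768 | 384 | 512 |
| mod-bulk-operations | 5 | mod-bulk-operations:2.0.0 | 2 | 3072 | 2600 | 1024 | 1536 | 384 | 512 |
| mod-fqm-manager | 5 | mod-fqm-manager:2.0.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-dcb | 5 | edge-dcb:1.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-graphql | 6 | mod-graphql:1.12.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-finance | 5 | mod-finance:4.9.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-erm-usage | 5 | mod-erm-usage:4.7.0 | 2 | 2800 | 2550 | 1024 | 1800 | 384 | 512 |
| mod-batch-print | 6 | mod-batch-print:1.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-copycat | 5 | mod-copycat:1.6.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-lists | 5 | mod-lists:2.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-entities-links | 6 | mod-entities-links:3.0.0 | 2 | 2592 | 2480 | 400 | 1440 | 0 | 1024 |
| mod-permissions | 10 | mod-permissions:6.5.0 | 2 | 1684 | 1544 | 512 | 1024 | 384 | 512 |
| pub-edge | 3 | pub-edge:2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 |
| mod-orders | 5 | mod-orders:12.8.0 | 2 | 2048 | 1740 | 1024 | 1024 | 384 | 512 |
| edge-patron | 5 | edge-patron:5.1.0 | 2 | 1024 | 896 | 256 | 768 | 88 | 128 |
| edge-ncip | 5 | edge-ncip:1.9.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-inn-reach | 3 | edge-inn-reach:3.1.1-SNAPSHOT.45 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-users-bl | 5 | mod-users-bl:7.7.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
| mod-oa | 4 | mod-oa:2.1.0-SNAPSHOT.62 | 2 | 1024 | 896 | 128 | 768 | 88 | 256 |
| mod-inventory-storage | 5 | mod-inventory-storage:27.1.0 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 |
| mod-invoice | 6 | mod-invoice:5.8.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
| mod-user-import | 5 | mod-user-import:3.8.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-sender | 6 | mod-sender:1.12.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-oai-pmh | 5 | edge-oai-pmh:2.9.0 | 2 | 1512 | 1360 | 1024 | 1440 | 384 | 512 |
| mod-data-export-worker | 5 | mod-data-export-worker:3.2.1 | 2 | 3072 | 2800 | 1024 | 2048 | 384 | 512 |
| mod-rtac | 5 | mod-rtac:3.6.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-circulation-storage | 5 | mod-circulation-storage:17.2.0 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 |
| mod-calendar | 5 | mod-calendar:3.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-source-record-storage | 7 | mod-source-record-storage:5.8.0 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 |
| mod-event-config | 5 | mod-event-config:2.7.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-courses | 5 | mod-courses:1.4.10 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-circulation-item | 5 | mod-circulation-item:1.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-inventory | 6 | mod-inventory:20.2.0 | 2 | 4096 | 3688 | 1024 | 1814 | 384 | 512 |
| mod-email | 5 | mod-email:1.17.0 | 2 | 2800 | 2550 | 512 | 1800 | 384 | 512 |
| mod-pubsub | 5 | mod-pubsub:2.13.0 | 2 | 1536 | 1440 | 1024 | 922 | 384 | 512 |
| mod-circulation | 5 | mod-circulation:24.2.0 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 |
| mod-di-converter-storage | 5 | mod-di-converter-storage:2.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-rtac | 5 | edge-rtac:2.7.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-orders | 5 | edge-orders:3.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-users | 6 | mod-users:19.3.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-template-engine | 5 | mod-template-engine:1.20.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-patron-blocks | 5 | mod-patron-blocks:1.10.0 | 2 | 1024 | 896 | 1024 | 768 | 88 | 128 |
| mod-audit | 5 | mod-audit:2.9.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-fqm | 5 | edge-fqm:2.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-source-record-manager | 6 | mod-source-record-manager:3.9.0-SNAPSHOT.330 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 |
| nginx-edge | 3 | nginx-edge:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 |
| mod-quick-marc | 5 | mod-quick-marc:5.1.0 | 1 | 2288 | 2176 | 128 | 1664 | 384 | 512 |
| nginx-okapi | 3 | nginx-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 |
| okapi-b | 5 | okapi:5.3.0 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512 |
| mod-feesfines | 5 | mod-feesfines:19.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-invoice-storage | 5 | mod-invoice-storage:5.8.0 | 2 | 1872 | 1536 | 1024 | 1024 | 384 | 512 |
| mod-dcb | 6 | mod-dcb:1.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-service-interaction | 5 | mod-service-interaction:4.0.1 | 2 | 2048 | 1844 | 256 | 1290 | 384 | 512 |
| mod-data-export | 17 | mod-data-export:5.0.4 | 1 | 2048 | 1844 | 2048 | 0 | 0 | 0 |
| mod-patron | 5 | mod-patron:6.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-oai-pmh | 5 | mod-oai-pmh:3.13.0 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 |
| edge-connexion | 5 | edge-connexion:1.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-kb-ebsco-java | 5 | mod-kb-ebsco-java:4.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-notes | 5 | mod-notes:5.2.0 | 2 | 1024 | 896 | 128 | 952 | 384 | 512 |
| mod-data-export-spring | 5 | mod-data-export-spring:3.2.0 | 1 | 2048 | 1844 | 256 | 1536 | 384 | 512 |
| mod-organizations-storage | 5 | mod-organizations-storage:4.7.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-login | 5 | mod-login:7.11.0 | 2 | 1440 | 1298 | 1024 | 768 | 384 | 512 |
| pub-okapi | 3 | pub-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 |
| mod-eusage-reports | 5 | mod-eusage-reports:2.1.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |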

Methodology/Approach

  • Compare the results of the two sets of test runs (Zookeeper mode and KRaft mode)