Table of Contents
Overview In Progress

...

  • Comparing kafka.m5.2xlarge (ZooKeeper metadata mode) against kafka.m7g.2xlarge (KRaft metadata mode)
    • Resource utilization

    Test Runs 

    ...

    Test #

    ...

    Scenario

    ...

      • In tests with KRaft mode enabled, brokers use less CPU (at least 5% less, and 13% less for some brokers) during CI/CO and during DI + CI/CO; at the same time, the load is more evenly balanced across brokers than in ZooKeeper mode
      • KPIs of the main workflows do not degrade in tests that combine Check-in/Check-out with Data Import
    • Resource utilization

    Test Runs 

    Test # | MSK instance type | Scenario | Load level
    1 | kafka.m5.2xlarge | CICO + DI MARC Bib Create | 8 users + 5K, 25K sequentially
    2 | kafka.m5.2xlarge | DI MARC Bib Create | 5K, 25K sequentially
    3 | kafka.m5.2xlarge | CICO + DI MARC Bib Update | 8 users + 5K, 25K sequentially
    4 | kafka.m5.2xlarge | DI MARC Bib Update | 5K, 25K sequentially
    5 | kafka.m7g.2xlarge | CICO + DI MARC Bib Create | 8 users + 5K, 25K sequentially
    6 | kafka.m7g.2xlarge | DI MARC Bib Create | 5K, 25K sequentially
    7 | kafka.m7g.2xlarge | CICO + DI MARC Bib Update | 8 users + 5K, 25K sequentially
    8 | kafka.m7g.2xlarge | DI MARC Bib Update | 5K, 25K sequentially

    ...

    This table shows results of Check-In/Check-out and Data Import create and update jobs.

    MSK instance: kafka.m5.2xlarge, metadata mode - ZooKeeper

    Job profile | File size | DI Duration without CI/CO | DI Duration with CI/CO | CI with DI Average sec | CO with DI Average sec
    PTF - Create 2 | 5k | 00:03:45 | 00:02:44 | 0.736 | 1.16
    PTF - Create 2 | 25k | 00:14:40 | 00:13:36 | 0.787 | 1.176
    PTF - Updates Success - 6 | 5k | 00:04:43 | 00:04:18 | 0.764 | 1.153
    PTF - Updates Success - 6 | 25k | 00:20:21 | 00:21:25 | 0.767 | 1.179

    MSK instance: kafka.m7g.2xlarge, metadata mode - KRaft

    Job profile | File size | DI Duration without CI/CO | DI Duration with CI/CO | CI with DI Average sec | CO with DI Average sec
    PTF - Create 2 | 5k | 00:02:49 | 00:02:39 | 0.765 | 1.118
    PTF - Create 2 | 25k | 00:13:31 | 00:12:04 | 0.777 | 1.186
    PTF - Updates Success - 6 | 5k | 00:04:36 | 00:04:31 | 0.706 | 1.095
    PTF - Updates Success - 6 | 25k | 00:24:07 | 00:21:50 | 0.74 | 1.16


    Check-in/Check-out without DI

    ...

    • Data Import durations fluctuate within a 10% range of the baseline (the tests with ZooKeeper metadata mode)
    • CI/CO response times measured while Data Import is running are practically the same on both MSK clusters
    Job Profile | File size | DELTA, DI without CI/CO | DELTA, DI+CI/CO | DELTA, CI with DI | DELTA, CO with DI
    PTF - Create 2 | 5k | 00:00:56 | 00:00:05 | -0.029 | 0.042
    PTF - Create 2 | 25k | 00:01:09 | 00:01:32 | 0.01 | -0.01
    PTF - Updates Success - 6 | 5k | 00:00:07 | 00:00:13 | 0.058 | 0.058
    PTF - Updates Success - 6 | 25k | 00:03:46 | 00:00:25 | 0.027 | 0.019
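
    The DELTA values above can be reproduced from the two results tables. The following is a minimal Python sketch, not the tooling used for the report. It assumes that each delta is the ZooKeeper value minus the KRaft value and that duration deltas are reported as absolute values, which is what the published numbers are consistent with. The dictionary layout and the function names are illustrative only.

```python
from datetime import timedelta

def parse_hms(text: str) -> timedelta:
    """Parse an HH:MM:SS duration as it appears in the results tables."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)

def format_hms(delta: timedelta) -> str:
    """Format the absolute value of a duration as HH:MM:SS."""
    total = int(abs(delta).total_seconds())
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"

# (job profile, file size) -> (DI without CI/CO, DI with CI/CO, CI avg sec, CO avg sec)
ZOOKEEPER = {
    ("PTF - Create 2", "5k"): ("00:03:45", "00:02:44", 0.736, 1.16),
    ("PTF - Create 2", "25k"): ("00:14:40", "00:13:36", 0.787, 1.176),
    ("PTF - Updates Success - 6", "5k"): ("00:04:43", "00:04:18", 0.764, 1.153),
    ("PTF - Updates Success - 6", "25k"): ("00:20:21", "00:21:25", 0.767, 1.179),
}
KRAFT = {
    ("PTF - Create 2", "5k"): ("00:02:49", "00:02:39", 0.765, 1.118),
    ("PTF - Create 2", "25k"): ("00:13:31", "00:12:04", 0.777, 1.186),
    ("PTF - Updates Success - 6", "5k"): ("00:04:36", "00:04:31", 0.706, 1.095),
    ("PTF - Updates Success - 6", "25k"): ("00:24:07", "00:21:50", 0.74, 1.16),
}

def deltas(baseline: dict, candidate: dict) -> dict:
    """Return baseline minus candidate per row; durations as absolute HH:MM:SS."""
    rows = {}
    for key, (b_alone, b_mixed, b_ci, b_co) in baseline.items():
        c_alone, c_mixed, c_ci, c_co = candidate[key]
        rows[key] = (
            format_hms(parse_hms(b_alone) - parse_hms(c_alone)),
            format_hms(parse_hms(b_mixed) - parse_hms(c_mixed)),
            round(b_ci - c_ci, 3),
            round(b_co - c_co, 3),
        )
    return rows

DELTAS = deltas(ZOOKEEPER, KRAFT)
for (profile, size), row in DELTAS.items():
    print(profile, size, *row)
```

    Because duration deltas are absolute, the sign of a duration change has to be read from the two source tables: for example, the 25k update without CI/CO took longer on the KRaft cluster (00:24:07 against 00:20:21).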


    Check-in/Check-out without DI

    ...

    Response time

    MSK instance: kafka.m5.2xlarge

    ...

    MSK instance: kafka.m7g.2xlarge


    MSK instance resource utilization

    • In tests with KRaft mode enabled, brokers use less CPU during CI/CO and during DI + CI/CO; at the same time, the load is more evenly balanced across brokers than in ZooKeeper mode
    • The difference is at least 5%. For some brokers it reaches 13%.
    MSK resources table

    MSK resource utilization (CPU)

    Load scenario | Brokers | MSK instance: kafka.m5.2xlarge | MSK instance: kafka.m7g.2xlarge | Delta, %
    CICO | 1 | 13 | 9 | -4
    CICO | 2 | 13 | 9 | -4
    CICO+DI | 1 | 45 | 32 | -13
    CICO+DI | 2 | 34 | 30 | -4
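
    The Delta column and the "more balanced" observation can both be checked with simple arithmetic on the CPU table. The sketch below is illustrative Python, not part of the test tooling. It treats the delta as the m7g (KRaft) value minus the m5 (ZooKeeper) value, and it uses the spread between the two brokers as a rough, assumed measure of balance; the names are hypothetical.

```python
# (load scenario, broker) -> CPU utilization, %, copied from the CPU table
CPU_M5_ZOOKEEPER = {
    ("CICO", 1): 13, ("CICO", 2): 13,
    ("CICO+DI", 1): 45, ("CICO+DI", 2): 34,
}
CPU_M7G_KRAFT = {
    ("CICO", 1): 9, ("CICO", 2): 9,
    ("CICO+DI", 1): 32, ("CICO+DI", 2): 30,
}

def cpu_deltas(baseline: dict, candidate: dict) -> dict:
    """Candidate minus baseline per (scenario, broker), in percentage points."""
    return {key: candidate[key] - baseline[key] for key in baseline}

def broker_spread(cpu: dict, scenario: str) -> int:
    """Gap between the busiest and the least busy broker for one scenario."""
    values = [value for (name, _broker), value in cpu.items() if name == scenario]
    return max(values) - min(values)

CPU_DELTAS = cpu_deltas(CPU_M5_ZOOKEEPER, CPU_M7G_KRAFT)
for (scenario, broker), delta in CPU_DELTAS.items():
    print(scenario, broker, delta)

print("CICO+DI spread, ZooKeeper:", broker_spread(CPU_M5_ZOOKEEPER, "CICO+DI"))
print("CICO+DI spread, KRaft:", broker_spread(CPU_M7G_KRAFT, "CICO+DI"))
```

    Under CICO+DI the gap between brokers is 11 points on the ZooKeeper cluster and 2 points on the KRaft cluster, which is one way to quantify the more even load.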

    MSK resource utilization (Disk) was 4.6% with kafka.m5.2xlarge and 4.3% with kafka.m7g.2xlarge; the difference is negligible.

    Disk usage by broker

    MSK instance: kafka.m5.2xlarge

    [Image: disk usage by broker, kafka.m5.2xlarge]

    MSK instance: kafka.m7g.2xlarge

    [Image: disk usage by broker, kafka.m7g.2xlarge]

    CPU (User) usage by broker

    MSK instance: kafka.m5.2xlarge

    [Image: CPU (User) usage by broker, kafka.m5.2xlarge]

    MSK instance: kafka.m7g.2xlarge

    [Image: CPU (User) usage by broker, kafka.m7g.2xlarge]

DB load

MSK instance: kafka.m5.2xlarge

...

Appendix

Infrastructure

PTF environment: qcp1

  • 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
  • 1 database instance, writer

    Name | Memory GiB | vCPUs | max_connections
    db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731


  • MSK ptf-mobius-testing2
    • 2 m5.2xlarge brokers in 2 zones (total 2 brokers)
    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=2
    • revision - 2
    • metadata mode - ZooKeeper
  • MSK ptf-KRaft-mode
    • m7g.2xlarge brokers in 2 zones (total 2 brokers)
    • Apache Kafka version 3.7.x

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3
    • revision - 26
    • metadata mode - KRaft

...

All qcp1 modules

Module | Task Definition Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft Limit | CPU Units | Xmx | Metaspace Size | Max Metaspace Size
mod-remote-storage | 5 | mod-remote-storage:3.2.0 | 2 | 4920 | 4472 | 1024 | 3960 | 512 | 512
mod-ncip | 5 | mod-ncip:1.14.4 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-finance-storage | 5 | mod-finance-storage:8.6.0 | 2 | 1024 | 896 | 1024 | 700 | 88 | 128
mod-agreements | 5 | mod-agreements:7.0.0 | 2 | 1592 | 1488 | 128 | 0 | 0 | 0
mod-ebsconet | 5 | mod-ebsconet:2.2.0 | 2 | 1248 | 1024 | 128 | 700 | 128 | 256
mod-organizations | 5 | mod-organizations:1.9.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-consortia | 3 | mod-consortia:1.1.0 | 2 | 5136 | 4776 | 1024 | 4416 | 384 | 512
edge-sip2 | 3 | edge-sip2:3.2.0-SNAPSHOT.209 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-serials-management | 5 | mod-serials-management:1.0.0 | 2 | 2480 | 2312 | 128 | 1792 | 384 | 512
mod-settings | 5 | mod-settings:1.0.3 | 2 | 1024 | 896 | 200 | 768 | 88 | 128
mod-data-import | 8 | mod-data-import:3.1.0 | 1 | 2048 | 1844 | 256 | 1292 | 384 | 512
edge-dematic | 5 | edge-dematic:2.2.0 | 1 | 1024 | 896 | 128 | 768 | 88 | 128
mod-search | 5 | mod-search:3.2.0 | 2 | 2592 | 2480 | 2048 | 1440 | 512 | 1024
mod-inn-reach | 3 | mod-inn-reach:3.2.0-SNAPSHOT.86 | 2 | 3600 | 3240 | 1024 | 2880 | 512 | 1024
mod-tags | 5 | mod-tags:2.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
edge-courses | 5 | edge-courses:1.4.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-authtoken | 6 | mod-authtoken:2.15.1 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128
mod-inventory-update | 5 | mod-inventory-update:3.3.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-notify | 5 | mod-notify:3.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-configuration | 5 | mod-configuration:5.10.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-orders-storage | 5 | mod-orders-storage:13.7.0 | 2 | 1024 | 896 | 512 | 700 | 88 | 128
edge-caiasoft | 5 | edge-caiasoft:2.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-login-saml | 5 | mod-login-saml:2.8.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-erm-usage-harvester | 5 | mod-erm-usage-harvester:4.5.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-licenses | 5 | mod-licenses:6.0.0 | 2 | 2480 | 2312 | 512 | 1792 | 384 | 512
mod-gobi | 5 | mod-gobi:2.8.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-password-validator | 5 | mod-password-validator:3.2.0 | 2 | 1440 | 1298 | 128 | 768 | 384 | 512
mod-bulk-operations | 5 | mod-bulk-operations:2.0.0 | 2 | 3072 | 2600 | 1024 | 1536 | 384 | 512
mod-fqm-manager | 5 | mod-fqm-manager:2.0.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
edge-dcb | 5 | edge-dcb:1.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-graphql | 6 | mod-graphql:1.12.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-finance | 5 | mod-finance:4.9.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-erm-usage | 5 | mod-erm-usage:4.7.0 | 2 | 2800 | 2550 | 1024 | 1800 | 384 | 512
mod-batch-print | 6 | mod-batch-print:1.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-copycat | 5 | mod-copycat:1.6.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-lists | 5 | mod-lists:2.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-entities-links | 6 | mod-entities-links:3.0.0 | 2 | 2592 | 2480 | 400 | 1440 | 0 | 1024
mod-permissions | 10 | mod-permissions:6.5.0 | 2 | 1684 | 1544 | 512 | 1024 | 384 | 512
pub-edge | 3 | pub-edge:2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0
mod-orders | 5 | mod-orders:12.8.0 | 2 | 2048 | 1740 | 1024 | 1024 | 384 | 512
edge-patron | 5 | edge-patron:5.1.0 | 2 | 1024 | 896 | 256 | 768 | 88 | 128
edge-ncip | 5 | edge-ncip:1.9.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
edge-inn-reach | 3 | edge-inn-reach:3.1.1-SNAPSHOT.45 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-users-bl | 5 | mod-users-bl:7.7.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128
mod-oa | 4 | mod-oa:2.1.0-SNAPSHOT.62 | 2 | 1024 | 896 | 128 | 768 | 88 | 256
mod-inventory-storage | 5 | mod-inventory-storage:27.1.0 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512
mod-invoice | 6 | mod-invoice:5.8.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128
mod-user-import | 5 | mod-user-import:3.8.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-sender | 6 | mod-sender:1.12.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
edge-oai-pmh | 5 | edge-oai-pmh:2.9.0 | 2 | 1512 | 1360 | 1024 | 1440 | 384 | 512
mod-data-export-worker | 5 | mod-data-export-worker:3.2.1 | 2 | 3072 | 2800 | 1024 | 2048 | 384 | 512
mod-rtac | 5 | mod-rtac:3.6.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-circulation-storage | 5 | mod-circulation-storage:17.2.0 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512
mod-calendar | 5 | mod-calendar:3.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-source-record-storage | 7 | mod-source-record-storage:5.8.0 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512
mod-event-config | 5 | mod-event-config:2.7.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-courses | 5 | mod-courses:1.4.10 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-circulation-item | 5 | mod-circulation-item:1.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-inventory | 6 | mod-inventory:20.2.0 | 2 | 4096 | 3688 | 1024 | 1814 | 384 | 512
mod-email | 5 | mod-email:1.17.0 | 2 | 2800 | 2550 | 512 | 1800 | 384 | 512
mod-pubsub | 5 | mod-pubsub:2.13.0 | 2 | 1536 | 1440 | 1024 | 922 | 384 | 512
mod-circulation | 5 | mod-circulation:24.2.0 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512
mod-di-converter-storage | 5 | mod-di-converter-storage:2.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
edge-rtac | 5 | edge-rtac:2.7.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
edge-orders | 5 | edge-orders:3.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-users | 6 | mod-users:19.3.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-template-engine | 5 | mod-template-engine:1.20.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-patron-blocks | 5 | mod-patron-blocks:1.10.0 | 2 | 1024 | 896 | 1024 | 768 | 88 | 128
mod-audit | 5 | mod-audit:2.9.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
edge-fqm | 5 | edge-fqm:2.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-source-record-manager | 6 | mod-source-record-manager:3.9.0-SNAPSHOT.330 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512
nginx-edge | 3 | nginx-edge:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0
mod-quick-marc | 5 | mod-quick-marc:5.1.0 | 1 | 2288 | 2176 | 128 | 1664 | 384 | 512
nginx-okapi | 3 | nginx-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0
okapi-b | 5 | okapi:5.3.0 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512
mod-feesfines | 5 | mod-feesfines:19.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-invoice-storage | 5 | mod-invoice-storage:5.8.0 | 2 | 1872 | 1536 | 1024 | 1024 | 384 | 512
mod-dcb | 6 | mod-dcb:1.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-service-interaction | 5 | mod-service-interaction:4.0.1 | 2 | 2048 | 1844 | 256 | 1290 | 384 | 512
mod-data-export | 17 | mod-data-export:5.0.4 | 1 | 2048 | 1844 | 2048 | 0 | 0 | 0
mod-patron | 5 | mod-patron:6.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-oai-pmh | 5 | mod-oai-pmh:3.13.0 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512
edge-connexion | 5 | edge-connexion:1.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-kb-ebsco-java | 5 | mod-kb-ebsco-java:4.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-notes | 5 | mod-notes:5.2.0 | 2 | 1024 | 896 | 128 | 952 | 384 | 512
mod-data-export-spring | 5 | mod-data-export-spring:3.2.0 | 1 | 2048 | 1844 | 256 | 1536 | 384 | 512
mod-organizations-storage | 5 | mod-organizations-storage:4.7.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128
mod-login | 5 | mod-login:7.11.0 | 2 | 1440 | 1298 | 1024 | 768 | 384 | 512
pub-okapi | 3 | pub-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0
mod-eusage-reports | 5 | mod-eusage-reports:2.1.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128


Methodology/Approach

  • Populate the ptf-mobius-testing2 cluster with topics from the tenant cluster
  • Run CICO for 2 hours
  • 10 minutes after CICO starts, run DI Create - Export - Update for the 5K and 25K files
  • Run the Data Imports alone (without CICO)
  • Create a new Kafka cluster
  • Populate the new cluster with topics from the tenant cluster
  • Run CICO for 2 hours
  • 10 minutes after CICO starts, run DI Create - Export - Update for the 5K and 25K files
  • Run the Data Imports alone (without CICO)
  • Compare MSK resource utilization and the main KPIs for CICO and DI
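
The timing of one combined run (2 hours of CICO, with Data Import starting after a 10-minute delay) can be sketched as follows. This is an illustrative Python sketch only: the start time is hypothetical, the function names are not from any test tooling, and the DI jobs are assumed to run strictly one after another.

```python
from datetime import datetime, timedelta

CICO_DURATION = timedelta(hours=2)      # from the methodology
DI_START_DELAY = timedelta(minutes=10)  # from the methodology

def plan_run(cico_start: datetime) -> dict:
    """Return the key timestamps of one CICO + DI run."""
    return {
        "cico_start": cico_start,
        "di_start": cico_start + DI_START_DELAY,
        "cico_end": cico_start + CICO_DURATION,
    }

def di_fits_in_cico(plan: dict, di_durations: list) -> bool:
    """True if sequential DI jobs started at di_start finish before CICO ends."""
    return plan["di_start"] + sum(di_durations, timedelta()) <= plan["cico_end"]

# Hypothetical start time; DI durations with CI/CO taken from the ZooKeeper run.
PLAN = plan_run(datetime(2024, 1, 1, 9, 0))
DI_WITH_CICO = [
    timedelta(minutes=2, seconds=44),   # Create, 5k
    timedelta(minutes=13, seconds=36),  # Create, 25k
    timedelta(minutes=4, seconds=18),   # Update, 5k
    timedelta(minutes=21, seconds=25),  # Update, 25k
]
print(PLAN["di_start"], PLAN["cico_end"], di_fits_in_cico(PLAN, DI_WITH_CICO))
```

Export durations are not reported in this document, so they are left out of the check; with the four reported DI durations the jobs finish well inside the 2-hour CICO window.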

...