Overview

This document contains the results of testing the Check-in/Check-out and Data Import workflows for MARC Bibliographic records in the Quesnelia release with a new MSK instance type that does not require ZooKeeper instances. The main goal is to see how kafka.m7g.2xlarge in KRaft mode affects FOLIO performance. Results for the main workflows are compared across two instance types: kafka.m5.2xlarge and kafka.m7g.2xlarge.

  • Comparing kafka.m5.2xlarge, zookeeper metadata mode against kafka.m7g.2xlarge, KRaft metadata mode
      • Tests with KRaft mode enabled use less broker CPU (at least 4 percentage points less, and 13 for one broker) during CI/CO and during DI + CI/CO, and CPU utilization is more evenly balanced across brokers than in ZooKeeper mode
      • Previous test results * showed that performance is effectively the same for m5 and m7g clusters, so m7g and m7g with KRaft mode should perform the same way as well. Results for comparison **
      • Main workflows KPIs do not degrade in tests with Check-in/Check-out with Data import
      • No memory leaks. No errors during tests.
    • Resource utilization
      • Service memory usage does not differ between tests with the two MSK clusters
      • Service CPU utilization in ZooKeeper mode: mod-inventory-b - 147%, mod-quick-marc-b - 81%, mod-di-converter-storage-b - 116%, nginx-okapi - 79%; the rest of the modules used less than 50%
      • Service CPU utilization in KRaft mode: mod-inventory-b - 133%, mod-quick-marc-b - 75%, mod-di-converter-storage-b - 103%, nginx-okapi - 75%; the rest of the modules used less than 50%
      • DB CPU was at about 90% during tests with both MSK clusters

    *Kafka Zookeeper mode - Data Import with Check-ins Check-outs (Quesnelia)[non-ECS] MSK instance type comparison

    **Kafka Zookeeper Mode vs KRaft Mode MSK - instance type comparison

    Test Runs 

    Test # | MSK instance type | Scenario | Load level
    1 | kafka.m5.2xlarge | CICO + DI MARC Bib Create | 8 users + 5K, 25K sequentially
    2 | kafka.m5.2xlarge | DI MARC Bib Create | 5K, 25K sequentially
    3 | kafka.m5.2xlarge | CICO + DI MARC Bib Update | 8 users + 5K, 25K sequentially
    4 | kafka.m5.2xlarge | DI MARC Bib Update | 5K, 25K sequentially
    5 | kafka.m7g.2xlarge | CICO + DI MARC Bib Create | 8 users + 5K, 25K sequentially
    6 | kafka.m7g.2xlarge | DI MARC Bib Create | 5K, 25K sequentially
    7 | kafka.m7g.2xlarge | CICO + DI MARC Bib Update | 8 users + 5K, 25K sequentially
    8 | kafka.m7g.2xlarge | DI MARC Bib Update | 5K, 25K sequentially

    Test Results

    This table shows results of Check-In/Check-out and Data Import create and update jobs.

    MSK instance: kafka.m5.2xlarge, metadata mode - ZooKeeper
    Job profile | File size | DI Duration without CI/CO | DI Duration with CI/CO | CI with DI Average sec | CO with DI Average sec
    PTF - Create 2 | 5k | 00:03:45 | 00:02:44 | 0.736 | 1.16
    PTF - Create 2 | 25k | 00:14:40 | 00:13:36 | 0.787 | 1.176
    PTF - Updates Success - 6 | 5k | 00:04:43 | 00:04:18 | 0.764 | 1.153
    PTF - Updates Success - 6 | 25k | 00:20:21 | 00:21:25 | 0.767 | 1.179

    MSK instance: kafka.m7g.2xlarge, metadata mode - KRaft
    Job profile | File size | DI Duration without CI/CO | DI Duration with CI/CO | CI with DI Average sec | CO with DI Average sec
    PTF - Create 2 | 5k | 00:02:49 | 00:02:39 | 0.765 | 1.118
    PTF - Create 2 | 25k | 00:13:31 | 00:12:04 | 0.777 | 1.186
    PTF - Updates Success - 6 | 5k | 00:04:36 | 00:04:31 | 0.706 | 1.095
    PTF - Updates Success - 6 | 25k | 00:24:07 | 00:21:50 | 0.74 | 1.16


    Check-in/Check-out without DI

    Scenario | Load level | Request | kafka.m5.2xlarge 95 perc, sec | kafka.m5.2xlarge average, sec | kafka.m7g.2xlarge 95 perc, sec | kafka.m7g.2xlarge average, sec
    Circulation Check-in/Check-out (without Data import) | 8 users | Check-in | 0.695 | 0.587 | 0.695 | 0.583
    Circulation Check-in/Check-out (without Data import) | 8 users | Check-out | 1.148 | 0.958 | 1.151 | 0.944

    Comparison

    Data Import durations and Check-In/Check-Out response time comparison

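    • Data import durations mostly stay within about 10% of the baseline (tests with ZooKeeper metadata mode); without CI/CO, the 5k create job ran about 25% faster and the 25k update job about 19% slower with KRaft
    • Response times of CI/CO with Data import do not differ between the two MSK clusters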
    Job Profile | File size | DELTA, DI without CI/CO | DELTA, DI+CI/CO | DELTA, CI with DI | DELTA, CO with DI
    PTF - Create 2 | 5k | 00:00:56 | 00:00:05 | -0.029 | 0.042
    PTF - Create 2 | 25k | 00:01:09 | 00:01:32 | 0.01 | -0.01
    PTF - Updates Success - 6 | 5k | 00:00:07 | -00:00:13 | 0.058 | 0.058
    PTF - Updates Success - 6 | 25k | -00:03:46 | -00:00:25 | 0.027 | 0.019
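
The deltas in the table above can be reproduced from the two result tables. The following is a minimal Python sketch, assuming the delta is the ZooKeeper baseline (kafka.m5.2xlarge) minus the KRaft result (kafka.m7g.2xlarge), which matches the signs in the table. The helper names `parse_hms`, `format_delta`, and `compare` are illustrative only and not part of any FOLIO tooling. The sketch also reports the change relative to the baseline in percent.

```python
from datetime import timedelta


def parse_hms(text):
    """Parse an HH:MM:SS duration string as used in the result tables."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def format_delta(delta):
    """Format a signed duration as [-]HH:MM:SS."""
    total = int(delta.total_seconds())
    sign = "-" if total < 0 else ""
    total = abs(total)
    return f"{sign}{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def compare(baseline, candidate):
    """Return (baseline - candidate as HH:MM:SS, change in % of the baseline)."""
    base, cand = parse_hms(baseline), parse_hms(candidate)
    delta = base - cand
    relative = delta.total_seconds() / base.total_seconds() * 100
    return format_delta(delta), round(relative, 1)


# DI duration without CI/CO, copied from the result tables:
# (job profile, file size) -> (kafka.m5.2xlarge, kafka.m7g.2xlarge)
DI_WITHOUT_CICO = {
    ("PTF - Create 2", "5k"): ("00:03:45", "00:02:49"),
    ("PTF - Create 2", "25k"): ("00:14:40", "00:13:31"),
    ("PTF - Updates Success - 6", "5k"): ("00:04:43", "00:04:36"),
    ("PTF - Updates Success - 6", "25k"): ("00:20:21", "00:24:07"),
}

for (profile, size), (m5, m7g) in DI_WITHOUT_CICO.items():
    delta, relative = compare(m5, m7g)
    print(f"{profile} {size}: delta {delta} ({relative}% of baseline)")
```

A positive delta means the KRaft run was faster than the baseline; a negative delta means it was slower.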


    Check-in/Check-out without DI

    • Check-in/Check-out perform the same in both MSK clusters. The difference of response times is so small that it can be neglected.
    Scenario | Load level | Request | kafka.m5.2xlarge 95 perc, sec | kafka.m5.2xlarge average, sec | kafka.m7g.2xlarge 95 perc, sec | kafka.m7g.2xlarge average, sec | Delta, average
    Circulation Check-in/Check-out (without Data import) | 8 users | Check-in | 0.695 | 0.587 | 0.695 | 0.583 | 0.004
    Circulation Check-in/Check-out (without Data import) | 8 users | Check-out | 1.148 | 0.958 | 1.151 | 0.944 | 0.014


    MSK resource utilization (CPU)

    Load scenario | Brokers | MSK instance: kafka.m5.2xlarge | MSK instance: kafka.m7g.2xlarge | Delta, %
    CICO | 1 | 13 | 9 | -4
    CICO | 2 | 13 | 9 | -4
    CICO+DI | 1 | 45 | 32 | -13
    CICO+DI | 2 | 34 | 30 | -4

    Response time

    MSK instance: kafka.m5.2xlarge

    MSK instance: kafka.m7g.2xlarge

    Service CPU Utilization

    CPU utilization table

    MSK instance: kafka.m5.2xlarge vs MSK instance: kafka.m7g.2xlarge


    MSK instance: kafka.m5.2xlarge
    MSK instance: kafka.m7g.2xlarge
    ModuleCPU (CICO + 25k Create)CPU (CICO + 25k Update)
    ModuleCPU (CICO + 25k Create)CPU (CICO + 25k Update)
    mod-inventory-b107.84147.17
    mod-inventory-b139.1133.42
    mod-quick-marc-b79.9881.45
    mod-di-converter-storage-b103.4996.49
    mod-di-converter-storage-b75.12116.01
    mod-quick-marc-b75.4572.77
    nginx-okapi51.779.05
    nginx-okapi75.3373.68
    okapi-b27.6543.17
    okapi-b41.7451.2
    mod-source-record-storage-b24.8239.79
    mod-source-record-storage-b38.1134.22
    mod-inventory-storage-b18.5220.75
    mod-inventory-storage-b23.1326.1
    mod-source-record-manager-b16.9418.05
    mod-source-record-manager-b17.1616.33
    mod-dcb-b8.037.8
    mod-users-b9.1821.82
    mod-search-b7.871.44
    mod-dcb-b8.329.84
    mod-pubsub-b7.327.3
    mod-search-b7.198.53
    mod-users-b6.326.23
    mod-pubsub-b4.325.66
    mod-entities-links-b3.862.27
    mod-configuration-b3.2410.32
    mod-configuration-b3.613.4
    mod-oa-b2.953.35
    mod-patron-b2.862.66
    mod-patron-b2.862.4
    mod-authtoken-b2.862.13
    mod-feesfines-b2.519.08
    mod-oa-b2.82.86
    mod-authtoken-b2.1712.71
    mod-feesfines-b2.32.15
    mod-entities-links-b2.151.81
    mod-circulation-storage-b2.012.15
    mod-circulation-storage-b2.012.9
    mod-data-import-b1.61.72
    mod-data-import-b1.61.58
    edge-patron-b1.081.08
    edge-patron-b1.131.02
    mod-users-bl-b0.530.52
    mod-users-bl-b0.611.11
    mod-patron-blocks-b0.470.43
    mod-circulation-b0.552.09
    mod-circulation-b0.350.37
    mod-patron-blocks-b0.410.95
    pub-okapi0.140.15
    pub-okapi0.183.98
    pub-edge0.070.07
    pub-edge0.050.12


    DI MARC BIB Create and Update + CICO

    MSK instance: kafka.m5.2xlarge

    MSK instance: kafka.m7g.2xlarge

    Service CPU Utilization

    The CPU utilization delta shows a 20% decrease for mod-di-converter-storage-b in the update job and a 10% decrease for mod-feesfines-b. For most modules, the CPU utilization delta stays under 10%.

    Service Memory Utilization

    MSK instance: kafka.m5.2xlarge vs MSK instance: kafka.m7g.2xlarge
    • The comparison of memory resource utilization revealed no difference between tests
    Module | Memory (kafka.m5.2xlarge) | Memory (kafka.m7g.2xlarge) | Delta
    mod-oa-b | 80.7 | - | -
    mod-dcb-b | 74.61 | 74.77 | 0.16
    mod-inventory-b | 59.61 | 59.63 | 0.02
    mod-data-import-b | 57.75 | 57.77 | 0.02
    mod-users-b | 53.22 | 53.1 | -0.12
    okapi-b | 49.65 | 49.68 | 0.03
    mod-di-converter-storage-b | 49.61 | 49.67 | 0.06
    mod-search-b | 47.97 | 48 | 0.03
    mod-source-record-storage-b | 46.38 | 46.38 | 0
    mod-users-bl-b | 45.87 | 45.82 | -0.05
    mod-feesfines-b | 44.2 | 44.03 | -0.17
    mod-patron-blocks-b | 42.91 | 42.7 | -0.21
    mod-configuration-b | 39.76 | 39.73 | -0.03
    mod-source-record-manager-b | 38.71 | 38.7 | -0.01
    mod-quick-marc-b | 36.95 | 36.99 | 0.04
    mod-pubsub-b | 36.19 | 36.32 | 0.13
    mod-entities-links-b | 30.56 | 30.56 | 0
    mod-inventory-storage-b | 30.51 | 30.49 | -0.02
    mod-patron-b | 30.19 | 30.19 | 0
    mod-circulation-storage-b | 28.98 | 28.98 | 0
    mod-authtoken-b | 27.38 | 27.42 | 0.04
    mod-circulation-b | 25 | 25.01 | 0.01
    edge-patron-b | 23.16 | 23.16 | 0
    nginx-okapi | 4.69 | 4.69 | 0
    pub-okapi | 4.46 | 4.46 | 0
    pub-edge | 4.35 | 4.35 | 0
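
The statement that memory utilization shows no difference can be checked against the table above. The following is a small Python sketch, assuming the Delta column is the kafka.m7g.2xlarge value minus the kafka.m5.2xlarge value, which matches the signs in the table. The names `MEMORY`, `memory_deltas`, and `largest_change` are illustrative. mod-oa-b is left out because the table has no kafka.m7g.2xlarge value for it.

```python
# Memory utilization copied from the table above:
# module -> (kafka.m5.2xlarge, kafka.m7g.2xlarge)
MEMORY = {
    "mod-dcb-b": (74.61, 74.77),
    "mod-inventory-b": (59.61, 59.63),
    "mod-data-import-b": (57.75, 57.77),
    "mod-users-b": (53.22, 53.1),
    "okapi-b": (49.65, 49.68),
    "mod-di-converter-storage-b": (49.61, 49.67),
    "mod-search-b": (47.97, 48),
    "mod-source-record-storage-b": (46.38, 46.38),
    "mod-users-bl-b": (45.87, 45.82),
    "mod-feesfines-b": (44.2, 44.03),
    "mod-patron-blocks-b": (42.91, 42.7),
    "mod-configuration-b": (39.76, 39.73),
    "mod-source-record-manager-b": (38.71, 38.7),
    "mod-quick-marc-b": (36.95, 36.99),
    "mod-pubsub-b": (36.19, 36.32),
    "mod-entities-links-b": (30.56, 30.56),
    "mod-inventory-storage-b": (30.51, 30.49),
    "mod-patron-b": (30.19, 30.19),
    "mod-circulation-storage-b": (28.98, 28.98),
    "mod-authtoken-b": (27.38, 27.42),
    "mod-circulation-b": (25, 25.01),
    "edge-patron-b": (23.16, 23.16),
    "nginx-okapi": (4.69, 4.69),
    "pub-okapi": (4.46, 4.46),
    "pub-edge": (4.35, 4.35),
}


def memory_deltas(table):
    """Return module -> (m7g value - m5 value), rounded to two decimals."""
    return {module: round(m7g - m5, 2) for module, (m5, m7g) in table.items()}


def largest_change(table):
    """Return the module with the largest absolute delta and that delta."""
    deltas = memory_deltas(table)
    module = max(deltas, key=lambda name: abs(deltas[name]))
    return module, deltas[module]


print(largest_change(MEMORY))
```

With the values in the table, the largest absolute change is for mod-patron-blocks-b, and every other module changes by less than that.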


    MSK instance: kafka.m5.2xlarge

    MSK instance: kafka.m7g.2xlarge

    DB CPU Utilization

    Average DB CPU utilization is 90% during both create jobs and 87% during update jobs for tests with both MSK instance types. DB CPU utilization was 15% during the Check-In/Check-Out period without DI.

    MSK instance: kafka.m5.2xlarge

    MSK instance: kafka.m7g.2xlarge


    DB Connections

    The average connection count is about 900 for create jobs and 860 for update jobs with CI/CO, and about 770 for CI/CO without data import, for tests with both MSK instance types.

    MSK instance: kafka.m5.2xlarge

    MSK instance: kafka.m7g.2xlarge

    MSK instance resource utilization

    MSK resources table

    MSK resource utilization (CPU)

    • Tests with KRaft mode enabled use less broker CPU during CI/CO and during DI + CI/CO, and at the same time the load is more evenly balanced across brokers than in ZooKeeper mode
    • The difference is at least 4 percentage points. For one broker the difference is 13 percentage points.
    Load scenario | Brokers | MSK instance: kafka.m5.2xlarge | MSK instance: kafka.m7g.2xlarge | Delta, %
    CICO | 1 | 13 | 9 | -4
    CICO | 2 | 13 | 9 | -4
    CICO+DI | 1 | 45 | 32 | -13
    CICO+DI | 2 | 34 | 30 | -4
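
Both observations above (lower broker CPU and a more even load across brokers) can be read directly from the table. The following is a minimal Python sketch, assuming the Delta column is the kafka.m7g.2xlarge value minus the kafka.m5.2xlarge value in percentage points, and using the gap between the busiest and the least busy broker as a simple, illustrative measure of balance. The names `BROKER_CPU`, `deltas`, and `spread` are not taken from the report.

```python
# Broker CPU utilization in percent, copied from the table above:
# scenario -> {broker: (kafka.m5.2xlarge, kafka.m7g.2xlarge)}
BROKER_CPU = {
    "CICO": {1: (13, 9), 2: (13, 9)},
    "CICO+DI": {1: (45, 32), 2: (34, 30)},
}

M5, M7G = 0, 1  # column indexes inside each tuple


def deltas(scenario):
    """Per-broker change in percentage points (m7g minus m5)."""
    return {broker: cpu[M7G] - cpu[M5] for broker, cpu in BROKER_CPU[scenario].items()}


def spread(scenario, column):
    """Gap between the busiest and least busy broker for one cluster."""
    values = [cpu[column] for cpu in BROKER_CPU[scenario].values()]
    return max(values) - min(values)


for scenario in BROKER_CPU:
    print(scenario, deltas(scenario), spread(scenario, M5), spread(scenario, M7G))
```

A smaller spread for the kafka.m7g.2xlarge column than for the kafka.m5.2xlarge column in the same scenario indicates a more even load across brokers.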

    MSK resource utilization (Disk) was 4.6% with kafka.m5.2xlarge and 4.3% with kafka.m7g.2xlarge; the difference is negligible.

    Disk usage by broker

    MSK instance: kafka.m5.2xlarge

    MSK instance: kafka.m7g.2xlarge

    CPU (User) usage by broker

    MSK instance: kafka.m5.2xlarge

    MSK instance: kafka.m7g.2xlarge

    DB load

    MSK instance: kafka.m5.2xlarge

    Top SQL-queries:

    MSK instance: kafka.m7g.2xlarge

    Top SQL-queries:



    Appendix

    Infrastructure

    PTF environment qcp1

    • 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
    • 1 database instance, writer

      Name | Memory GiB | vCPUs | max_connections
      db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731


    • MSK ptf-mobius-testing2
      • 2 m5.2xlarge brokers in 2 zones (total 2 brokers)
      • Apache Kafka version 2.8.0

      • EBS storage volume per broker 300 GiB

      • auto.create.topics.enable=true
      • log.retention.minutes=480
      • default.replication.factor=2
      • revision - 2
      • metadata mode - ZooKeeper
      • Total topics: 1534
      • Total partitions: 12155
    • MSK ptf-KRaft-mode
      • m7g.2xlarge brokers in 2 zones (total 2 brokers)
      • Apache Kafka version 3.7.x

      • EBS storage volume per broker 300 GiB

      • auto.create.topics.enable=true
      • log.retention.minutes=480
      • default.replication.factor=3
      • revision - 26
      • metadata mode - KRaft
      • Total topics: 1474
      • Total partitions: 11909

    The task count for module mod-graphql was set to 0 before the tests started.

    All qcp1 modules


    Module | Task Definition Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft Limit | CPU Units | Xmx | Metaspace Size | Max Metaspace Size
    mod-remote-storage5mod-remote-storage:3.2.024920447210243960512512
    mod-ncip5mod-ncip:1.14.42102489612876888128
    mod-finance-storage5mod-finance-storage:8.6.021024896102470088128
    mod-agreements5mod-agreements:7.0.0215921488128000
    mod-ebsconet5mod-ebsconet:2.2.0212481024128700128256
    mod-organizations5mod-organizations:1.9.02102489612876888128
    mod-consortia3mod-consortia:1.1.025136477610244416384512
    edge-sip23edge-sip2:3.2.0-SNAPSHOT.2092102489612876888128
    mod-serials-management5mod-serials-management:1.0.02248023121281792384512
    mod-settings5mod-settings:1.0.32102489620076888128
    mod-data-import8mod-data-import:3.1.01204818442561292384512
    edge-dematic5edge-dematic:2.2.01102489612876888128
    mod-search5mod-search:3.2.0225922480204814405121024
    mod-inn-reach3mod-inn-reach:3.2.0-SNAPSHOT.86236003240102428805121024
    mod-tags5mod-tags:2.2.02102489612876888128
    edge-courses5edge-courses:1.4.02102489612876888128
    mod-authtoken6mod-authtoken:2.15.121440115251292288128
    mod-inventory-update5mod-inventory-update:3.3.02102489612876888128
    mod-notify5mod-notify:3.2.02102489612876888128
    mod-configuration5mod-configuration:5.10.02102489612876888128
    mod-orders-storage5mod-orders-storage:13.7.02102489651270088128
    edge-caiasoft5edge-caiasoft:2.2.02102489612876888128
    mod-login-saml5mod-login-saml:2.8.02102489612876888128
    mod-erm-usage-harvester5mod-erm-usage-harvester:4.5.02102489612876888128
    mod-licenses5mod-licenses:6.0.02248023125121792384512
    mod-gobi5mod-gobi:2.8.02102489612876888128
    mod-password-validator5mod-password-validator:3.2.0214401298128768384512
    mod-bulk-operations5mod-bulk-operations:2.0.023072260010241536384512
    mod-fqm-manager5mod-fqm-manager:2.0.12102489612876888128
    edge-dcb5edge-dcb:1.1.02102489612876888128
    mod-graphql6mod-graphql:1.12.12102489612876888128
    mod-finance5mod-finance:4.9.02102489612876888128
    mod-erm-usage5mod-erm-usage:4.7.022800255010241800384512
    mod-batch-print6mod-batch-print:1.1.02102489612876888128
    mod-copycat5mod-copycat:1.6.02102489612876888128
    mod-lists5mod-lists:2.0.02102489612876888128
    mod-entities-links6mod-entities-links:3.0.0225922480400144001024
    mod-permissions10mod-permissions:6.5.02168415445121024384512
    pub-edge3pub-edge:2023.06.142102489612876800
    mod-orders5mod-orders:12.8.022048174010241024384512
    edge-patron5edge-patron:5.1.02102489625676888128
    edge-ncip5edge-ncip:1.9.22102489612876888128
    edge-inn-reach3edge-inn-reach:3.1.1-SNAPSHOT.452102489612876888128
    mod-users-bl5mod-users-bl:7.7.021440115251292288128
    mod-oa4mod-oa:2.1.0-SNAPSHOT.622102489612876888256
    mod-inventory-storage5mod-inventory-storage:27.1.024096369020483076384512
    mod-invoice6mod-invoice:5.8.021440115251292288128
    mod-user-import5mod-user-import:3.8.02102489612876888128
    mod-sender6mod-sender:1.12.02102489612876888128
    edge-oai-pmh5edge-oai-pmh:2.9.021512136010241440384512
    mod-data-export-worker5mod-data-export-worker:3.2.123072280010242048384512
    mod-rtac5mod-rtac:3.6.02102489612876888128
    mod-circulation-storage5mod-circulation-storage:17.2.022880259215361814384512
    mod-calendar5mod-calendar:3.1.02102489612876888128
    mod-source-record-storage7mod-source-record-storage:5.8.025600500020483500384512
    mod-event-config5mod-event-config:2.7.02102489612876888128
    mod-courses5mod-courses:1.4.102102489612876888128
    mod-circulation-item5mod-circulation-item:1.0.02102489612876888128
    mod-inventory6mod-inventory:20.2.024096368810241814384512
    mod-email5mod-email:1.17.02280025505121800384512
    mod-pubsub5mod-pubsub:2.13.02153614401024922384512
    mod-circulation5mod-circulation:24.2.022880259215361814384512
    mod-di-converter-storage5mod-di-converter-storage:2.2.02102489612876888128
    edge-rtac5edge-rtac:2.7.12102489612876888128
    edge-orders5edge-orders:3.0.02102489612876888128
    mod-users6mod-users:19.3.12102489612876888128
    mod-template-engine5mod-template-engine:1.20.02102489612876888128
    mod-patron-blocks5mod-patron-blocks:1.10.021024896102476888128
    mod-audit5mod-audit:2.9.02102489612876888128
    edge-fqm5edge-fqm:2.0.02102489612876888128
    mod-source-record-manager6mod-source-record-manager:3.9.0-SNAPSHOT.33025600500020483500384512
    nginx-edge3nginx-edge:2023.06.1421024896128000
    mod-quick-marc5mod-quick-marc:5.1.01228821761281664384512
    nginx-okapi3nginx-okapi:2023.06.1421024896128000
    okapi-b5okapi:5.3.03168414401024922384512
    mod-feesfines5mod-feesfines:19.1.02102489612876888128
    mod-invoice-storage5mod-invoice-storage:5.8.021872153610241024384512
    mod-dcb6mod-dcb:1.1.02102489612876888128
    mod-service-interaction5mod-service-interaction:4.0.12204818442561290384512
    mod-data-export17mod-data-export:5.0.41204818442048000
    mod-patron5mod-patron:6.1.02102489612876888128
    mod-oai-pmh5mod-oai-pmh:3.13.024096369020483076384512
    edge-connexion5edge-connexion:1.2.02102489612876888128
    mod-kb-ebsco-java5mod-kb-ebsco-java:4.0.02102489612876888128
    mod-notes5mod-notes:5.2.021024896128952384512
    mod-data-export-spring5mod-data-export-spring:3.2.01204818442561536384512
    mod-organizations-storage5mod-organizations-storage:4.7.02102489612876888128
    mod-login5mod-login:7.11.02144012981024768384512
    pub-okapi3pub-okapi:2023.06.142102489612876800
    mod-eusage-reports5mod-eusage-reports:2.1.12102489612876888128


    Methodology/Approach

    • Populate the ptf-mobius-testing2 cluster with topics from the tenant cluster
    • Run CICO for 2 hours
    • 10 minutes after the start of CICO, run DI Create - Export - Update for 5k and 25k
    • Run Data Imports alone
    • Create a new Kafka cluster
    • Populate the new cluster with topics from the tenant cluster
    • Run CICO for 2 hours
    • 10 minutes after the start of CICO, run DI Create - Export - Update for 5k and 25k
    • Run Data Imports alone
    • Compare MSK resource utilization and the main KPIs for CICO & DI

    Additional/Files

    Topics:

    ptf-kafka-tenantCluster-topics_2replicationfactor_BU.csv

    Excel raw data: