Kafka Zookeeper Mode vs KRaft Mode MSK - instance type comparison

Overview

This document compares test results for the Check-in/Check-out (CI/CO) and Data Import (DI) workflows for MARC Bibliographic records in the Quesnelia release on two MSK configurations of the same instance type: kafka.m7g.2xlarge in ZooKeeper mode and kafka.m7g.2xlarge in KRaft mode.

Tickets: PERF-921: Test with kafka.m7g.2xlarge (Closed) vs. PERF-936: Kafka KRaft Mode (Closed)

Summary

  • Comparison of kafka.m7g.2xlarge in ZooKeeper metadata mode against kafka.m7g.2xlarge in KRaft metadata mode:

    • No difference in MSK resource utilization (CPU and disk) between the two MSK clusters

    • In KRaft mode, services used less memory in mod-inventory-b (by 11%) and mod-source-record-storage-b (by 9%), and more memory in mod-data-import-b (by 14%). For the remaining modules the difference was less than 7%

    • No significant difference in service CPU utilization:

      • Service CPU utilization in ZooKeeper mode: mod-inventory-b - 136%, mod-quick-marc-b - 96%, mod-di-converter-storage-b - 100%, nginx-okapi - 88%; the rest of the modules used about 50% or less

      • Service CPU utilization in KRaft mode: mod-inventory-b - 133%, mod-quick-marc-b - 75%, mod-di-converter-storage-b - 103%, nginx-okapi - 75%; the rest of the modules used about 50% or less

    • No difference in Check-in/Check-out response time

    • Data Import durations fluctuate from test to test, but the jobs run stably and complete without issues

Test Runs 

| Test # | MSK instance type | Scenario | Load level |
|---|---|---|---|
| 1 | kafka.m7g.2xlarge, ZooKeeper mode | CICO + DI MARC Bib Create | 8 users + 5K, 25K sequentially |
| 2 | kafka.m7g.2xlarge, ZooKeeper mode | DI MARC Bib Create | 5K, 25K sequentially |
| 3 | kafka.m7g.2xlarge, ZooKeeper mode | CICO + DI MARC Bib Update | 8 users + 5K, 25K sequentially |
| 4 | kafka.m7g.2xlarge, ZooKeeper mode | DI MARC Bib Update | 5K, 25K sequentially |
| 5 | kafka.m7g.2xlarge, KRaft mode | CICO + DI MARC Bib Create | 8 users + 5K, 25K sequentially |
| 6 | kafka.m7g.2xlarge, KRaft mode | DI MARC Bib Create | 5K, 25K sequentially |
| 7 | kafka.m7g.2xlarge, KRaft mode | CICO + DI MARC Bib Update | 8 users + 5K, 25K sequentially |
| 8 | kafka.m7g.2xlarge, KRaft mode | DI MARC Bib Update | 5K, 25K sequentially |

Test Results

The tables below show the results of the Check-in/Check-out and Data Import create and update jobs.

MSK instance: kafka.m7g.2xlarge, metadata mode - ZooKeeper

| Job profile | File size | DI duration without CI/CO | DI duration with CI/CO | CI with DI, average (sec) | CO with DI, average (sec) |
|---|---|---|---|---|---|
| PTF - Create 2 | 5k | 00:03:05 | 00:02:39 | 0.707 | 1.104 |
| PTF - Create 2 | 25k | 00:12:03 | 00:12:08 | 0.718 | 1.129 |
| PTF - Updates Success - 6 | 5k | 00:03:36 | 00:03:34 | 0.742 | 1.124 |
| PTF - Updates Success - 6 | 25k | 00:17:05 | 00:17:33 | 0.756 | 1.148 |

MSK instance: kafka.m7g.2xlarge, metadata mode - KRaft

| Job profile | File size | DI duration without CI/CO | DI duration with CI/CO | CI with DI, average (sec) | CO with DI, average (sec) |
|---|---|---|---|---|---|
| PTF - Create 2 | 5k | 00:02:49 | 00:02:39 | 0.765 | 1.118 |
| PTF - Create 2 | 25k | 00:13:31 | 00:12:04 | 0.777 | 1.186 |
| PTF - Updates Success - 6 | 5k | 00:04:36 | 00:04:31 | 0.706 | 1.095 |
| PTF - Updates Success - 6 | 25k | 00:24:07 | 00:21:50 | 0.74 | 1.16 |

Check-in/Check-out without DI

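| Scenario | Load level | Request | ZooKeeper mode, 95 perc (sec) | ZooKeeper mode, average (sec) | KRaft mode, 95 perc (sec) | KRaft mode, average (sec) |
|---|---|---|---|---|---|---|
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-in | 0.72 | 0.606 | 0.695 | 0.583 |
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-out | 1.241 | 0.969 | 1.151 | 0.944 |

Response times are for MSK instance kafka.m7g.2xlarge in both modes.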

Comparison

Data Import durations and Check-In/Check-Out response time comparison

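  • CI/CO response times with Data Import running differ only marginally between the two MSK clusters (by at most about 0.06 sec)

  • In the table below, DELTA is the ZooKeeper value minus the KRaft value, so a negative delta means the KRaft run took longer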

| Job Profile | File size | DELTA, DI without CI/CO | DELTA, DI+CI/CO | DELTA, CI with DI (sec) | DELTA, CO with DI (sec) |
|---|---|---|---|---|---|
| PTF - Create 2 | 5k | 00:00:16 | 00:00:00 | -0.058 | -0.014 |
| PTF - Create 2 | 25k | -00:01:28 | 00:00:04 | -0.059 | -0.057 |
| PTF - Updates Success - 6 | 5k | -00:01:00 | -00:00:57 | 0.036 | 0.029 |
| PTF - Updates Success - 6 | 25k | -00:07:02 | -00:04:17 | 0.016 | -0.012 |
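The duration deltas above match the ZooKeeper duration minus the KRaft duration from the result tables. A minimal Python sketch of that arithmetic follows; the helper names (`parse_hms`, `format_signed`, `duration_delta`) are illustrative and not part of any test tooling used for this report.

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Parse an HH:MM:SS duration string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def format_signed(delta: timedelta) -> str:
    """Format a timedelta as [-]HH:MM:SS, the notation used in the delta table."""
    total = int(delta.total_seconds())
    sign = "-" if total < 0 else ""
    hours, rest = divmod(abs(total), 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{sign}{hours:02d}:{minutes:02d}:{seconds:02d}"


def duration_delta(zookeeper: str, kraft: str) -> str:
    """Delta as assumed here: ZooKeeper duration minus KRaft duration."""
    return format_signed(parse_hms(zookeeper) - parse_hms(kraft))


# DI durations without CI/CO, taken from the result tables
print(duration_delta("00:03:05", "00:02:49"))  # PTF - Create 2, 5k: 00:00:16
print(duration_delta("00:17:05", "00:24:07"))  # PTF - Updates Success - 6, 25k: -00:07:02
```

With this sign convention a negative value means the KRaft run was the longer one.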

Check-in/Check-out without DI

  • Check-in/Check-out performs the same on both MSK clusters; the difference in response times is negligible. In this table, Delta is the KRaft average minus the ZooKeeper average.

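| Scenario | Load level | Request | ZooKeeper mode, 95 perc (sec) | ZooKeeper mode, average (sec) | KRaft mode, 95 perc (sec) | KRaft mode, average (sec) | Delta, average (sec) |
|---|---|---|---|---|---|---|---|
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-in | 0.72 | 0.606 | 0.695 | 0.583 | -0.023 |
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-out | 1.241 | 0.969 | 1.151 | 0.944 | -0.025 |

Response times are for MSK instance kafka.m7g.2xlarge in both modes.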
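To put the average deltas above in proportion, they can also be expressed relative to the ZooKeeper baseline. The sketch below is only an illustration of that calculation; treating ZooKeeper mode as the baseline and the function name `relative_change` are assumptions made here, not something the report defines.

```python
def relative_change(baseline: float, candidate: float) -> float:
    """Return the change from baseline to candidate as a percentage of baseline."""
    return (candidate - baseline) / baseline * 100.0


# Average response times in seconds: (ZooKeeper mode, KRaft mode)
averages = {
    "Check-in": (0.606, 0.583),
    "Check-out": (0.969, 0.944),
}

for request, (zookeeper, kraft) in averages.items():
    delta = kraft - zookeeper  # same sign convention as the Delta column
    percent = relative_change(zookeeper, kraft)
    print(f"{request}: delta {delta:+.3f} sec ({percent:+.1f}%)")
```

Both relative changes come out at under 4% of the baseline average.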

MSK resource utilization (CPU)

| Load scenario | Broker | MSK instance: kafka.m7g.2xlarge, ZooKeeper mode | MSK instance: kafka.m7g.2xlarge, KRaft mode | Delta, % |
|---|---|---|---|---|
| CICO | 1 | 10 | 9 | -1 |
| CICO | 2 | 10 | 9 | -1 |
| CICO+DI | 1 | 31 | 32 | 1 |
| CICO+DI | 2 | 32 | 30 | -2 |

Response time

MSK instance: kafka.m7g.2xlarge ZooKeeper mode

MSK instance: kafka.m7g.2xlarge KRaft mode

Service CPU Utilization

CPU utilization table

| Module (ZooKeeper mode) | CPU, % (CICO + 25k Create) | CPU, % (CICO + 25k Update) | Module (KRaft mode) | CPU, % (CICO + 25k Create) | CPU, % (CICO + 25k Update) |
|---|---|---|---|---|---|
| mod-inventory-b | 115.21 | 136.94 | mod-inventory-b | 139.1 | 133.42 |
| mod-quick-marc-b | 95.15 | 96.4 | mod-di-converter-storage-b | 103.49 | 96.49 |
| mod-di-converter-storage-b | 81.26 | 100.43 | mod-quick-marc-b | 75.45 | 72.77 |
| nginx-okapi | 70.58 | 88.94 | nginx-okapi | 75.33 | 73.68 |
| okapi-b | 38.89 | 50.55 | okapi-b | 41.74 | 51.2 |
| mod-source-record-storage-b | 31.61 | 39.13 | mod-source-record-storage-b | 38.11 | 34.22 |
| mod-users-b | 23.6 | 22.12 | mod-inventory-storage-b | 23.13 | 26.1 |
| mod-inventory-storage-b | 21.37 | 19.9 | mod-source-record-manager-b | 17.16 | 16.33 |
| mod-feesfines-b | 18.28 | 9.11 | mod-users-b | 9.18 | 21.82 |
| mod-configuration-b | 17.6 | 10.52 | mod-dcb-b | 8.32 | 9.84 |

mod-source-record-manager-b

17.39

18.27

 

mod-search-b