Kafka KRaft Mode - Data Import with Check-ins Check-outs (Quesnelia)[non-ECS] MSK instance type comparison

Overview

This document presents the results of testing the Check-in/Check-out (CI/CO) and Data Import (DI) workflows for MARC Bibliographic records in the Quesnelia release with a new MSK instance type that does not require ZooKeeper instances. The goal is to see how kafka.m7g.2xlarge in KRaft mode affects FOLIO performance. Results for the main workflows are compared across two instance types: kafka.m5.2xlarge and kafka.m7g.2xlarge.

Ticket: PERF-936: Kafka KRaft Mode (Closed)

Summary

  • Comparison of kafka.m5.2xlarge (ZooKeeper metadata mode) with kafka.m7g.2xlarge (KRaft metadata mode)

    • Tests with KRaft mode enabled use less broker CPU (at least 5%, and as much as 13% on some brokers) during CI/CO and during DI + CI/CO, and CPU utilization is more evenly balanced across the brokers than in ZooKeeper mode.

    • Previous test results* showed that performance is effectively the same for the m5 and m7g clusters, so m7g and m7g in KRaft mode should perform the same as well. See ** for the comparison results.

    • KPIs of the main workflows do not degrade in tests that combine Check-in/Check-out with Data Import.

    • No memory leaks. No errors during tests.

  • Resource utilization

    • Service memory usage does not differ between tests on the two MSK clusters.

    • Service CPU utilization in ZooKeeper mode: mod-inventory-b - 147%, mod-quick-marc-b - 81%, mod-di-converter-storage-b - 116%, nginx-okapi - 79%; the rest of the modules used less than 50%.

    • Service CPU utilization in KRaft mode: mod-inventory-b - 133%, mod-quick-marc-b - 75%, mod-di-converter-storage-b - 103%, nginx-okapi - 75%; the rest of the modules used less than 50%.

    • DB CPU utilization was around 90% during tests with both MSK clusters.

*Kafka Zookeeper mode - Data Import with Check-ins Check-outs (Quesnelia)[non-ECS] MSK instance type comparison

**Kafka Zookeeper Mode vs KRaft Mode MSK - instance type comparison

Test Runs 

| Test # | MSK instance type | Scenario | Load level |
|---|---|---|---|
| 1 | kafka.m5.2xlarge | CICO + DI MARC Bib Create | 8 users + 5K, 25K sequentially |
| 2 | kafka.m5.2xlarge | DI MARC Bib Create | 5K, 25K sequentially |
| 3 | kafka.m5.2xlarge | CICO + DI MARC Bib Update | 8 users + 5K, 25K sequentially |
| 4 | kafka.m5.2xlarge | DI MARC Bib Update | 5K, 25K sequentially |
| 5 | kafka.m7g.2xlarge | CICO + DI MARC Bib Create | 8 users + 5K, 25K sequentially |
| 6 | kafka.m7g.2xlarge | DI MARC Bib Create | 5K, 25K sequentially |
| 7 | kafka.m7g.2xlarge | CICO + DI MARC Bib Update | 8 users + 5K, 25K sequentially |
| 8 | kafka.m7g.2xlarge | DI MARC Bib Update | 5K, 25K sequentially |

Test Results

This table shows results of Check-In/Check-out and Data Import create and update jobs.

MSK instance: kafka.m5.2xlarge, metadata mode - ZooKeeper

| Job profile | File size | DI Duration without CI/CO | DI Duration with CI/CO | CI with DI Average sec | CO with DI Average sec |
|---|---|---|---|---|---|
| PTF - Create 2 | 5k | 00:03:45 | 00:02:44 | 0.736 | 1.16 |
| PTF - Create 2 | 25k | 00:14:40 | 00:13:36 | 0.787 | 1.176 |
| PTF - Updates Success - 6 | 5k | 00:04:43 | 00:04:18 | 0.764 | 1.153 |
| PTF - Updates Success - 6 | 25k | 00:20:21 | 00:21:25 | 0.767 | 1.179 |

MSK instance: kafka.m7g.2xlarge, metadata mode - KRaft

| Job profile | File size | DI Duration without CI/CO | DI Duration with CI/CO | CI with DI Average sec | CO with DI Average sec |
|---|---|---|---|---|---|
| PTF - Create 2 | 5k | 00:02:49 | 00:02:39 | 0.765 | 1.118 |
| PTF - Create 2 | 25k | 00:13:31 | 00:12:04 | 0.777 | 1.186 |
| PTF - Updates Success - 6 | 5k | 00:04:36 | 00:04:31 | 0.706 | 1.095 |
| PTF - Updates Success - 6 | 25k | 00:24:07 | 00:21:50 | 0.74 | 1.16 |

Check-in/Check-out without DI

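| Scenario | Load level | Request | Response time, sec, kafka.m5.2xlarge (95 perc) | Response time, sec, kafka.m5.2xlarge (average) | Response time, sec, kafka.m7g.2xlarge (95 perc) | Response time, sec, kafka.m7g.2xlarge (average) |
|---|---|---|---|---|---|---|
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-in | 0.695 | 0.587 | 0.695 | 0.583 |
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-out | 1.148 | 0.958 | 1.151 | 0.944 |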

Comparison

Data Import durations and Check-In/Check-Out response time comparison

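  • Most Data Import durations are within about 10% of the baseline (tests with ZooKeeper metadata mode). The exceptions are the runs without CI/CO for the 5k Create job (00:00:56 shorter in KRaft mode) and the 25k Update job (00:03:46 longer in KRaft mode).

  • CI/CO response times with Data Import running are practically the same on both MSK clusters (deltas of at most 0.058 sec).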

| Job Profile | File size | DELTA, DI without CI/CO | DELTA, DI+CI/CO | DELTA, CI with DI | DELTA, CO with DI |
|---|---|---|---|---|---|
| PTF - Create 2 | 5k | 00:00:56 | 00:00:05 | -0.029 | 0.042 |
| PTF - Create 2 | 25k | 00:01:09 | 00:01:32 | 0.01 | -0.01 |
| PTF - Updates Success - 6 | 5k | 00:00:07 | -00:00:13 | 0.058 | 0.058 |
| PTF - Updates Success - 6 | 25k | -00:03:46 | -00:00:25 | 0.027 | 0.019 |
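
The delta values above are consistent with subtracting the KRaft (kafka.m7g.2xlarge) result from the ZooKeeper (kafka.m5.2xlarge) baseline, so a positive delta means the KRaft run finished sooner. The following is a minimal Python sketch of that arithmetic, assuming this is how the deltas were derived; the function names are illustrative and are not part of any FOLIO or PTF tooling. The durations are copied from the Test Results tables (DI without CI/CO).

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Parse an HH:MM:SS duration string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def format_delta(delta: timedelta) -> str:
    """Format a possibly negative timedelta as [-]HH:MM:SS."""
    total = int(delta.total_seconds())
    sign = "-" if total < 0 else ""
    total = abs(total)
    return f"{sign}{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def duration_delta(baseline: str, candidate: str) -> tuple[str, float]:
    """Return (baseline - candidate) as HH:MM:SS and as a percentage of the baseline."""
    base, cand = parse_hms(baseline), parse_hms(candidate)
    delta = base - cand
    percent = 100.0 * delta.total_seconds() / base.total_seconds()
    return format_delta(delta), round(percent, 1)


# (ZooKeeper baseline, KRaft result) for DI without CI/CO, copied from the tables.
DI_WITHOUT_CICO = {
    ("PTF - Create 2", "5k"): ("00:03:45", "00:02:49"),
    ("PTF - Create 2", "25k"): ("00:14:40", "00:13:31"),
    ("PTF - Updates Success - 6", "5k"): ("00:04:43", "00:04:36"),
    ("PTF - Updates Success - 6", "25k"): ("00:20:21", "00:24:07"),
}

for (profile, size), (zookeeper, kraft) in DI_WITHOUT_CICO.items():
    print(profile, size, duration_delta(zookeeper, kraft))
```

The percentage is expressed relative to the ZooKeeper baseline, which makes it easier to judge whether a given run stays within the expected fluctuation range.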

Check-in/Check-out without DI

  • Check-in/Check-out performs the same on both MSK clusters; the difference in response times is negligible.

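| Scenario | Load level | Request | Response time, sec, kafka.m5.2xlarge (95 perc) | Response time, sec, kafka.m5.2xlarge (average) | Response time, sec, kafka.m7g.2xlarge (95 perc) | Response time, sec, kafka.m7g.2xlarge (average) | Delta (average) |
|---|---|---|---|---|---|---|---|
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-in | 0.695 | 0.587 | 0.695 | 0.583 | 0.004 |
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-out | 1.148 | 0.958 | 1.151 | 0.944 | 0.014 |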

MSK resource utilization (CPU)

| Load scenario | Brokers | MSK instance: kafka.m5.2xlarge | MSK instance: kafka.m7g.2xlarge | Delta, % |
|---|---|---|---|---|
| CICO | 1 | 13 | 9 | -4 |
| CICO | 2 | 13 | 9 | -4 |
| CICO+DI | 1 | 45 | 32 | -13 |
| CICO+DI | 2 | 34 | 30 | -4 |
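
The broker figures above can be checked with a short Python sketch. It recomputes the per-broker delta (KRaft minus ZooKeeper, in percentage points) and uses the gap between the busiest and the least busy broker as a rough indicator of balance. The spread metric and all names below are illustrative assumptions, not something the original test tooling is known to use; the CPU values are copied from the table.

```python
# Broker CPU utilization (%) per load scenario, listed as [broker 1, broker 2].
BROKER_CPU = {
    "CICO": {"kafka.m5.2xlarge": [13, 13], "kafka.m7g.2xlarge": [9, 9]},
    "CICO+DI": {"kafka.m5.2xlarge": [45, 34], "kafka.m7g.2xlarge": [32, 30]},
}


def per_broker_delta(scenario: str) -> list[int]:
    """CPU change per broker in percentage points (KRaft minus ZooKeeper)."""
    zookeeper = BROKER_CPU[scenario]["kafka.m5.2xlarge"]
    kraft = BROKER_CPU[scenario]["kafka.m7g.2xlarge"]
    return [k - z for z, k in zip(zookeeper, kraft)]


def spread(values: list[int]) -> int:
    """Gap between the busiest and least busy broker; smaller means better balanced."""
    return max(values) - min(values)


for scenario, clusters in BROKER_CPU.items():
    print(scenario, "delta per broker:", per_broker_delta(scenario))
    for instance, cpu in clusters.items():
        print("  ", instance, "spread:", spread(cpu))
```

For the CICO+DI scenario the spread between brokers is smaller on kafka.m7g.2xlarge than on kafka.m5.2xlarge, which matches the observation in the Summary that broker CPU is more balanced in KRaft mode.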

Response time

[Graph: response time, MSK instance: kafka.m5.2xlarge]

[Graph: response time, MSK instance: kafka.m7g.2xlarge]

Service CPU Utilization

CPU utilization table

| kafka.m5.2xlarge: Module | kafka.m5.2xlarge: CPU, % (CICO + 25k Create) | kafka.m5.2xlarge: CPU, % (CICO + 25k Update) | kafka.m7g.2xlarge: Module | kafka.m7g.2xlarge: CPU, % (CICO + 25k Create) | kafka.m7g.2xlarge: CPU, % (CICO + 25k Update) |
|---|---|---|---|---|---|
| mod-inventory-b | 107.84 | 147.17 | mod-inventory-b | 139.1 | 133.42 |
| mod-quick-marc-b | 79.98 | 81.45 | mod-di-converter-storage-b | 103.49 | 96.49 |
| mod-di-converter-storage-b | 75.12 | 116.01 | mod-quick-marc-b | 75.45 | 72.77 |
| nginx-okapi | 51.7 | 79.05 | nginx-okapi | 75.33 | 73.68 |
| okapi-b | 27.65 | 43.17 | okapi-b | 41.74 | 51.2 |
| mod-source-record-storage-b | 24.82 | 39.79 | mod-source-record-storage-b | 38.11 | 34.22 |
| mod-inventory-storage-b | 18.52 | 20.75 | mod-inventory-storage-b | 23.13 | 26.1 |
| mod-source-record-manager-b | 16.94 | 18.05 | mod-source-record-manager-b | 17.16 | 16.33 |
| mod-dcb-b | 8.03 | 7.8 | mod-users-b | 9.18 | 21.82 |
| mod-search-b | 7.87 | 1.44 | mod-dcb-b | 8.32 | 9.84 |

mod-pubsub-b

7.32

7.3

 

mod-search-b

7.19