Dependencies between mod-pubsub Kafka partitions and CICO performance (Orchid)


 

Overview

According to PERF-534, DI performance improved greatly when the number of partitions on the DI Kafka topics was increased to 2.

In this testing effort, we'd like to see whether increasing the partitions of mod-pubsub's Kafka topics from 1 to 2 has the same positive impact on the Check In Check Out (CICO) workflow, since many of mod-pubsub's topics are related to circulation. We will also test CICO with R/W split both enabled and disabled.

Summary

  • Increasing the mod-pubsub partitions to 2 does not appear to help much. 

  • Response times were faster for some tests and slower for others (between -20 ms and +50 ms).

  • Enabling Read/Write split brings no big benefit for CICO in these standalone tests. However, benefits may appear in real-life usage and/or when several high-DB-load workflows (such as Data Import) run at the same time, because R/W split distributes load between DB nodes.

Recommendations

The only notable observation is that the names of mod-pubsub Kafka topics include the mod-pubsub version, for example: 

  • ncp5.pub-sub.fs09000000.FEE_FINE_BALANCE_CHANGED.mod-pubsub-2.7.0

  • ncp5.pub-sub.fs09000000.FEE_FINE_BALANCE_CHANGED.mod-pubsub-2.9.1

  • ncp5.pub-sub.fs09000000.FEE_FINE_BALANCE_CHANGED.mod-pubsub-2.10.0-SNAPSHOT

*These are the same topic for different mod-pubsub versions. 

If mod-pubsub is updated frequently, the old topics may linger and accumulate unnecessarily. It may therefore be a good idea to exclude the version number from the topic naming pattern.
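To gauge how many versioned copies of each topic have accumulated, a list of topic names can be grouped by the name with the version suffix removed. The following is a minimal sketch, not part of the test tooling: the function name `group_by_base_topic` and the suffix pattern are assumptions derived only from the three example names above.

```python
import re
from collections import defaultdict

# Assumed layout, inferred from the example names in this report:
#   <env>.pub-sub.<tenant>.<EVENT_TYPE>.mod-pubsub-<version>
VERSION_SUFFIX = re.compile(
    r"\.mod-pubsub-(?P<version>\d+(?:\.\d+)*(?:-SNAPSHOT)?)$"
)


def group_by_base_topic(topic_names):
    """Map each version-less topic name to the mod-pubsub versions seen for it."""
    groups = defaultdict(list)
    for name in topic_names:
        match = VERSION_SUFFIX.search(name)
        if match is None:
            continue  # not a versioned mod-pubsub topic, ignore it
        base = name[: match.start()]
        groups[base].append(match.group("version"))
    return dict(groups)


topics = [
    "ncp5.pub-sub.fs09000000.FEE_FINE_BALANCE_CHANGED.mod-pubsub-2.7.0",
    "ncp5.pub-sub.fs09000000.FEE_FINE_BALANCE_CHANGED.mod-pubsub-2.9.1",
    "ncp5.pub-sub.fs09000000.FEE_FINE_BALANCE_CHANGED.mod-pubsub-2.10.0-SNAPSHOT",
]

grouped = group_by_base_topic(topics)
for base, versions in grouped.items():
    if len(versions) > 1:
        print(f"{base}: {len(versions)} versioned copies ({', '.join(versions)})")
```

Any base name that maps to more than one version is a candidate for cleanup of the older topics.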

 

Test Sets 

| Test # | Test Conditions | Duration | Load generator size | Load generator memory (GiB) | Notes |
|---|---|---|---|---|---|
| 1 | 8, 20, 30, 75 users CI/CO | 30 mins each | t3.large | 3 | 2 pub-sub partitions, R/W split enabled |
| 2 | 8, 20, 30, 75 users CI/CO | 30 mins each | t3.large | 3 | 1 pub-sub partition, R/W split enabled |
| 3 | 8, 20, 30, 75 users CI/CO | 30 mins each | t3.large | 3 | 1 pub-sub partition, R/W split disabled |
| 4 | 8, 20, 30, 75 users CI/CO | 30 mins each | t3.large | 3 | 2 pub-sub partitions, R/W split disabled |

Results

The tables below list response times in seconds (average, 75th percentile, and 95th percentile) for the tests (8, 20, 30, and 75 users) with 1 and 2 mod-pubsub Kafka topic partitions.

A comparison between 2 and 1 partitions is also provided. A value of +n means that the response time is n ms slower than the corresponding value from the 1-partition test, and -n means that it is n ms faster. 
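As an illustration of how such a delta is derived, the sketch below converts two response times in seconds into a signed difference in milliseconds. The helper name `delta_ms` is hypothetical, and the sample values are taken from the 8-user rows with R/W split enabled.

```python
def delta_ms(two_partitions_s, one_partition_s):
    """Signed difference in ms; positive means the 2-partition run was slower."""
    return round((two_partitions_s - one_partition_s) * 1000)


# CI, 75th percentile: 0.496 s with 2 partitions vs 0.494 s with 1 partition
ci_75 = delta_ms(0.496, 0.494)
# CI, 95th percentile: 0.556 s vs 0.560 s
ci_95 = delta_ms(0.556, 0.560)
# CO, 75th percentile: 0.784 s vs 0.746 s
co_75 = delta_ms(0.784, 0.746)

print(f"{ci_75:+d} ms, {ci_95:+d} ms, {co_75:+d} ms")
```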

With Read Write split enabled

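| R/W Split enabled | CI, 2 partitions, avg. | CI, 2 partitions, 75% | CI, 2 partitions, 95% | CI, 1 partition, avg. | CI, 1 partition, 75% | CI, 1 partition, 95% | CO, 2 partitions, avg. | CO, 2 partitions, 75% | CO, 2 partitions, 95% | CO, 1 partition, avg. | CO, 1 partition, 75% | CO, 1 partition, 95% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8 users | 0.476 | 0.496 (+2) | 0.556 (-4) | 0.476 | 0.494 | 0.560 | 0.763 (-7) | 0.784 (+38) | 0.890 (-27) | 0.770 | 0.746 | 0.917 |
| 20 users | 0.459 (-4) | 0.477 (-7) | 0.527 (-12) | 0.463 | 0.484 | 0.539 | 0.740 (-10) | 0.763 (-10) | 0.845 (-13) | 0.750 | 0.773 | 0.858 |
| 30 users | 0.456 (+16) | 0.482 (+18) | 0.530 (+24) | 0.440 | 0.464 | 0.506 | 0.747 (+8) | 0.770 (+9) | 0.848 (+6) | 0.739 | 0.761 | 0.842 |
| 75 users | 0.526 (-3) | 0.574 (-6) | 0.707 (+12) | 0.529 | 0.580 | 0.695 | 0.951 (-4) | 1.020 (-11) | 1.187 (+2) | 0.955 | 1.031 | 1.185 |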

*There is no significant difference in response times between 1 and 2 partitions of the mod-pubsub Kafka topics when R/W split is enabled. Some cases are better and some are worse, so there is no pattern, and we conclude that having 2 partitions brings no benefit in response times. 

With Read Write split disabled 

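| R/W Split Disabled | CI, 2 partitions, avg. | CI, 2 partitions, 75% | CI, 2 partitions, 95% | CI, 1 partition, avg. | CI, 1 partition, 75% | CI, 1 partition, 95% | CO, 2 partitions, avg. | CO, 2 partitions, 75% | CO, 2 partitions, 95% | CO, 1 partition, avg. | CO, 1 partition, 75% | CO, 1 partition, 95% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8 users | 0.484 (+17) | 0.497 (+20) | 0.561 (+7) | 0.467 | 0.477 | 0.554 | 0.759 (+24) | 0.769 (+26) | 0.926 (+70) | 0.735 | 0.743 | 0.856 |
| 20 users | 0.471 (+11) | 0.468 (-11) | 0.540 (+19) | 0.460 | 0.479 | 0.521 | 0.748 (+14) | 0.771 (+15) | 0.848 (+16) | 0.734 | 0.756 | 0.832 |
| 30 users | 0.446 (-7) | 0.472 (-6) | 0.520 (-2) | 0.453 | 0.479 | 0.522 | 0.727 (-10) | 0.749 (-9) | 0.824 (-5) | 0.737 | 0.758 | 0.829 |
| 75 users | 0.552 | 0.604 (+91) | 0.736 (+37) | 0.522 | 0.513 | 0.669 | 0.977 (+49) | 1.046 (+54) | 1.220 | 0.928 | 0.992 | 1.120 |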

*There is no significant difference in response times between 1 and 2 partitions of the mod-pubsub Kafka topics when R/W split is disabled. Some cases are better and some are worse, so there is no pattern, and we conclude that having 2 partitions brings no benefit in response times. 

Comparisons

Comparison between R/W split enabled and disabled with 1 and 2 partitions

The table below shows how many milliseconds we gain or lose by enabling Read/Write split on the DB. (Response times with R/W split disabled are the baseline for the comparison.)

Notable observations:

  • As shown here, there is a big difference in the CPU usage pattern with and without R/W split. For now it doesn't look like R/W split helps much, and in most cases it makes performance worse. However, there may be performance benefits in real-life usage and/or when several high-DB-load workflows (such as Data Import) run at the same time, because R/W split distributes load between DB nodes.

  • For now there is no visible pattern from which to conclude whether R/W split works better with 1 mod-pubsub partition or with 2.

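| R/W Split Disabled | CI, 2 partitions, avg. | CI, 2 partitions, 75% | CI, 2 partitions, 95% | CI, 1 partition, avg. | CI, 1 partition, 75% | CI, 1 partition, 95% | CO, 2 partitions, avg. | CO, 2 partitions, 75% | CO, 2 partitions, 95% | CO, 1 partition, avg. | CO, 1 partition, 75% | CO, 1 partition, 95% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8 users | 0.484 (-8) | 0.497 (-1) | 0.561 (-5) | 0.467 (+9) | 0.477 (+17) | 0.554 (+6) | 0.759 (+4) | 0.769 (+15) | 0.926 (-36) | 0.735 (+35) | 0.743 (+3) | 0.856 (+61) |
| 20 users | 0.471 (-17) | 0.468 (+9) | 0.540 (-13) | 0.460 (+3) | 0.479 (+5) | 0.521 (+18) | 0.748 (-8) | 0.771 (-8) | 0.848 (-3) | 0.734 (+16) | 0.756 (+17) | 0.832 (+26) |
| 30 users | 0.446 (+10) | 0.472 (+10) | 0.520 (+10) | 0.453 (-13) | 0.479 (-15) | 0.522 (-16) | 0.727 (+20) | 0.749 (+21) | 0.824 (+20) | 0.737 (+2) | 0.758 (+3) | 0.829 (+13) |
| 75 users | 0.552 (-26) | 0.604 (-30) | 0.736 (+29) | 0.522 (+7) | 0.513 (+67) | 0.669 (+26) | 0.977 (-26) | 1.046 (-26) | 1.220 (-13) | 0.928 (+27) | 0.992 (+39) | 1.120 (+65) |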

Comparison between current result vs initial results

The initial results were obtained by measuring CICO performance on snapshot versions of the modules.

As the basis for the current results, we use the CICO results with Read/Write split disabled and with one mod-pubsub Kafka partition, as these were also the input conditions for the initial testing.

 

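| Initial test | CI avg. | CI 95% | CO avg. | CO 95% |
|---|---|---|---|---|
| 8 users | 0.467 s (initial: 534 ms, -14%) | 0.554 s (initial: 1,041 ms, -87%) | 0.735 s (initial: 909 ms, -23%) | 0.856 s (initial: 1,462 ms, -70%) |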
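The percentages above appear consistent with (current - initial) / current, truncated to a whole percent. That formula is inferred from the numbers and is not stated in the report, so the sketch below should be read as an assumption; the helper name `change_vs_initial_pct` is hypothetical.

```python
def change_vs_initial_pct(current_s, initial_ms):
    """Percent difference relative to the current result, truncated toward zero.

    Assumption: the report computes (current - initial) / current.
    """
    current_ms = round(current_s * 1000)
    return int((current_ms - initial_ms) / current_ms * 100)


# 8 users, values from the table above: (current in s, initial in ms)
rows = {
    "CI avg.": (0.467, 534),
    "CI 95%": (0.554, 1041),
    "CO avg.": (0.735, 909),
    "CO 95%": (0.856, 1462),
}

percentages = {name: change_vs_initial_pct(cur, ini) for name, (cur, ini) in rows.items()}
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```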