Mobius baseline test

Overview

These tests were performed to establish baseline response times and resource usage, and to find possible issues or red flags in infrastructure and/or workflow behaviour.

Summary

  • Available memory on the 4xLarge DB ran low during the test. The first assumption was that this was causing the slowness.

  • Available memory on the 4xLarge DB is low even when the system is idle (about 30 GB of the 120 GB available on this instance type). This may be due to a high disk swap rate and the large number of indexes and schemas.

  • With the 8xLarge DB, response times for most workflows increase by 300 ms up to a few seconds; however, the error rate is lower.

  • With the 8xLarge DB, the low-RAM problem disappeared; however, the database still consumes a lot of memory in the idle state.

  • Multiple optimistic locking errors were found in the DB logs.
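For readers unfamiliar with the term, the following is a minimal, generic sketch of how an optimistic locking error arises when two clients update the same record concurrently. It is not the implementation used by the modules under test; `RecordStore`, `StaleVersionError`, and the record fields are hypothetical names introduced only for illustration.

```python
class StaleVersionError(Exception):
    """Raised when an update is based on an outdated record version."""


class RecordStore:
    """In-memory store that keeps a version counter per record (illustrative only)."""

    def __init__(self):
        self._rows = {}  # record_id -> (version, data)

    def create(self, record_id, data):
        self._rows[record_id] = (1, dict(data))

    def read(self, record_id):
        version, data = self._rows[record_id]
        return version, dict(data)

    def update(self, record_id, expected_version, data):
        current_version, _ = self._rows[record_id]
        # Reject the write if someone else updated the record since it was read.
        if current_version != expected_version:
            raise StaleVersionError(
                f"{record_id}: expected version {expected_version}, "
                f"stored version is {current_version}"
            )
        new_version = current_version + 1
        self._rows[record_id] = (new_version, dict(data))
        return new_version


store = RecordStore()
store.create("order-line-1", {"quantity": 1})

# Two clients read the same version of the record.
version_a, data_a = store.read("order-line-1")
version_b, data_b = store.read("order-line-1")

# The first write succeeds and bumps the version.
store.update("order-line-1", version_a, {"quantity": 2})

# The second write is based on a stale version and is rejected.
try:
    store.update("order-line-1", version_b, {"quantity": 3})
except StaleVersionError as exc:
    print("optimistic locking error:", exc)
```

Under concurrent load, a certain number of such rejections is expected whenever several virtual users modify the same records, so the count in the logs is most useful when compared between runs.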

 

Recommendations & Jiras

Original ticket: PERF-807 Fixed load tests of MOBIUS workflows.

Test Runs 

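| Test # | Test Conditions      | Duration | Load generator size (recommended) | Load generator Memory(GiB) (recommended) | Notes (Optional) |
|--------|----------------------|----------|-----------------------------------|------------------------------------------|------------------|
| 1      | Mobius baseline test | 30 mins  | m5a.xlarge                        | 9                                        | 4xLarge DB       |
| 2      | Mobius baseline test | 30 mins  | m5a.xlarge                        | 9                                        | 8xLarge DB       |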

Results

| Flow                   | Test 1 4xLarge DB response time | Test 1 error count | Test 2 8xLarge DB response time | Test 2 error count |
|------------------------|---------------------------------|--------------------|---------------------------------|--------------------|
| CI                     | 6.42                            | 1                  | 6.707                           | 0                  |
| CO                     | 10.52                           | 4                  | 10.820                          | 0                  |
| create invoice         | 34.73                           | 0                  | 35.422                          | 0                  |
| approve invoice        | 33.13                           | 0                  | 33.008                          | 0                  |
| pay invoice            | 31.119                          | 0                  | 31.354                          | 0                  |
| share local instance   | 11.018                          | 43                 | 11.500                          | 19                 |
| export custom workflow | 353.296                         | 0                  | 327.27                          | 0                  |
| export default         | 336.563                         | 0                  | 338.833                         | 0                  |
| view account           | 22.45                           | 21                 | 22.642                          | 0                  |
| create ILR             | 5.085                           | 2                  | 5.372                           | 0                  |
| search                 | 1.88                            | 52                 | 3.519                           | 12                 |
| create order           | 99.937                          | 0                  | 97.97                           | 0                  |
| add order line         | 185.765                         | 415                | 186.44                          | 415                |
| approve order          | 146.352                         | 0                  | 143.998                         | 0                  |
| RTAC                   | 33.15                           | 0                  | 32.68                           | 0                  |
| single record create   | 30.395                          | 44                 | 29.336                          | 19                 |
| single record update   | 96.046                          | 3                  | 94.419                          | 1                  |
| receiving order line   | 307.271                         | 0                  | 301.103                         | 0                  |
| serial receiving       | 282.336                         | 0                  | 279.132                         | 0                  |
| unreceiving a piece    | 106.148                         | 0                  | 103.033                         | 0                  |
| renew                  | 8.023                           | 6                  | 8.341                           | 0                  |

Note: With the 8xLarge DB, response times are higher than with the 4xLarge DB; however, the error count is lower.

Note: The add order line workflow has a high error rate because one call fails every time: [GET] erm/entitlements returns 503.
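The comparison between the two runs can be made explicit by computing, per flow, the change in response time and in error count. The sketch below does this for three flows taken from the results table; the `FlowResult` type and `compare` helper are hypothetical names, and the response times are read as decimal numbers in whatever unit the table uses.

```python
from dataclasses import dataclass


@dataclass
class FlowResult:
    flow: str
    rt_4xl: float   # response time, Test 1 (4xLarge DB)
    err_4xl: int    # error count, Test 1
    rt_8xl: float   # response time, Test 2 (8xLarge DB)
    err_8xl: int    # error count, Test 2


# A few rows copied from the results table (not the full data set).
ROWS = [
    FlowResult("CI", 6.42, 1, 6.707, 0),
    FlowResult("search", 1.88, 52, 3.519, 12),
    FlowResult("create order", 99.937, 0, 97.97, 0),
]


def compare(row: FlowResult):
    """Return (absolute change, percent change, error-count change) from Test 1 to Test 2."""
    delta = row.rt_8xl - row.rt_4xl
    percent = 100.0 * delta / row.rt_4xl
    return round(delta, 3), round(percent, 1), row.err_8xl - row.err_4xl


for row in ROWS:
    delta, percent, err_delta = compare(row)
    print(f"{row.flow:<14} response time {delta:+.3f} ({percent:+.1f}%), errors {err_delta:+d}")
```

A positive response-time change means the flow was slower with the 8xLarge DB; a negative error change means fewer errors. Running this over all rows shows which flows drive the overall trend rather than relying on a visual scan of the table.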

Memory Utilization

image-20240328-114715.png
image-20240328-114801.png

Note: For most modules, memory utilisation looks healthy, with no growth trend. The exceptions are a few modules shown on the next chart.

Note: Here, the behaviour of a few modules is suspicious. The most concerning are mod-inventory and mod-search.

 

CPU Utilization 

 

Note: The most stressed modules in both tests were mod-authtoken (110%) and okapi (70%).

 

RDS Metrics

 

Note: As expected, the 8xLarge DB has lower CPU utilisation than the 4xLarge DB.

For the 4xLarge DB, CPU utilisation reached about 90% only once, at the beginning of the test. Average CPU utilisation is 50% for the 4xLarge and 30% for the 8xLarge.

Note: The number of RDS connections is roughly the same for the 4xLarge and the 8xLarge: approximately 3-3.5K concurrent connections.

 

Note: With the 4xLarge DB we are running low on available memory. Before the test it had only 30 GB available (the DB instance itself has 120 GB), and during the test it dropped to >1 GB.
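To put these figures in proportion, the short sketch below expresses available memory as a fraction of the instance total. The `free_fraction` helper is a hypothetical name, and the value of 1 GB for the in-test reading is only an approximation of the ">1" figure quoted above.

```python
def free_fraction(free_gb: float, total_gb: float) -> float:
    """Return available memory as a fraction of total instance memory."""
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    return free_gb / total_gb


INSTANCE_MEMORY_GB = 120  # 4xLarge DB instance memory, as stated in the note

idle = free_fraction(30, INSTANCE_MEMORY_GB)       # before the test
under_load = free_fraction(1, INSTANCE_MEMORY_GB)  # assumed ~1 GB during the test

print(f"idle:       {idle:.1%} of instance memory available")
print(f"under load: {under_load:.1%} of instance memory available")
```

In other words, roughly three quarters of the instance memory is already in use before any load is applied, which leaves little room for the test itself.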

Additional information from module and database logs

 

 

 

Appendix

Infrastructure

PTF environment: mcpt

  • 11 m6g.2xlarge EC2 instances located in US East (N. Virginia), us-east-1

  • db.r6.4xlarge (db.r6.8xlarge) database instance, one writer

  • ptf-mobius-testing2

    • 2 m5.2xlarge brokers in 2 zones

    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true

    • log.retention.minutes=480

    • default.replication.factor=2
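The module table that follows pairs each container's memory hard limit with its JVM settings (Xmx, MaxMetaspaceSize). One quick sanity check is how much of the hard limit remains once heap and metaspace are at their maximum. The sketch below assumes all three values are in MiB, which the table does not state; `ModuleLimits` and `headroom` are hypothetical names, and the rows are copied from the table.

```python
from dataclasses import dataclass


@dataclass
class ModuleLimits:
    name: str
    mem_hard_limit: int      # container hard limit (assumed MiB)
    xmx: int                 # JVM max heap (assumed MiB)
    max_metaspace: int       # JVM MaxMetaspaceSize (assumed MiB)


def headroom(mem_hard_limit: int, xmx: int, max_metaspace: int) -> int:
    """Memory left under the hard limit when heap and metaspace are both full."""
    return mem_hard_limit - (xmx + max_metaspace)


# Three rows taken from the module table.
MODULES = [
    ModuleLimits("mod-remote-storage", 4920, 3960, 512),
    ModuleLimits("mod-search", 2592, 1440, 1024),
    ModuleLimits("mod-authtoken", 1440, 922, 128),
]

for m in MODULES:
    spare = headroom(m.mem_hard_limit, m.xmx, m.max_metaspace)
    share = 100.0 * spare / m.mem_hard_limit
    print(f"{m.name:<20} headroom {spare} ({share:.1f}% of hard limit)")
```

A JVM also uses native memory outside heap and metaspace (thread stacks, direct buffers, code cache), so a small headroom, such as the 128 computed for mod-search, is worth checking against that module's memory utilisation chart.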

 

 

| Module (mcpt-pvt)       | Task Def. Revision | Module Version                | Task Count | Mem Hard Limit | Mem Soft limit | CPU units | Xmx  | MetaspaceSize | MaxMetaspaceSize |
|-------------------------|--------------------|-------------------------------|------------|----------------|----------------|-----------|------|---------------|------------------|
| mod-remote-storage      | 10                 | mod-remote-storage:3.0.1      | 2          | 4920           | 4472           | 1024      | 3960 | 512           | 512              |
| mod-ncip                | 6                  | mod-ncip:1.14.4               | 2          | 1024           | 896            | 128       | 768  | 88            | 128              |
| mod-finance-storage     | 6                  | mod-finance-storage:8.5.0     | 2          | 1024           | 896            | 1024      | 700  | 88            | 128              |
| mod-agreements          | 11                 | mod-agreements:6.0.2          | 0          | 1592           | 1488           | 1024      | 0    | 0             | 0                |
| mod-ebsconet            | 6                  | mod-ebsconet:2.1.1            | 2          | 1248           | 1024           | 128       | 700  | 128           | 256              |
| edge-sip2               | 6                  | edge-sip2:3.1.1               | 2          | 1024           | 896            | 128       | 768  | 88            | 128              |
| mod-consortia           | 11                 | mod-consortia:1.0.3           | 2          | 2048           | 1802           | 128       | 768  | 88            | 128              |
| mod-organizations       | 6                  | mod-organizations:1.8.0       | 2          | 1024           | 896            | 128       | 700  | 88            | 128              |
| mod-settings            | 11                 | mod-settings:1.0.2            | 2          | 1024           | 896            | 200       | 768  | 88            | 128              |
| edge-dematic            | 9                  | edge-dematic:2.1.1            | 1          | 1024           | 896            | 128       | 768  | 88            | 128              |
| mod-data-import         | 7                  | mod-data-import:3.0.7         | 1          | 2048           | 1844           | 256       | 1292 | 384           | 512              |
| mod-search              | 12                 | mod-search:3.0.5              | 2          | 2592           | 2480           | 2048      | 1440 | 512           | 1024             |
| mod-tags                | 6                  | mod-tags:2.1.0                | 2          | 1024           | 896            | 128       | 768  | 88            | 128              |
| mod-authtoken           | 7                  | mod-authtoken:2.14.1          | 2          | 1440           | 1152           | 1024      | 922  | 88            | 128              |
| edge-courses            | 6                  | edge-courses:1.3.0            | 2          | 1024           | 896            | 128       | 768  | 88            | 128              |
| mod-inventory-update    | 6                  | mod-inventory-update:3.2.1    | 2          | 1024           | 896            | 128       | 768  | 88            | 128              |
| mod-notify              | 6                  | mod-notify:3.1.0              | 2          | 1024           | 896            | 128       | 768  | 88            | 128              |
| mod-configuration       | 6                  | mod-configuration:5.9.2       | 2          | 1024           | 896            | 128       | 768  | 88            | 128              |
| mod-orders-storage      | 7                  | mod-orders-storage:13.6.0     | 2          | 1024           | 896            | 512       | 700  | 88            | 128              |
| edge-caiasoft           | 6                  | edge-caiasoft:2.1.0           | 2          | 1024           | 896            | 128       | 768  | 88            | 128              |
| mod-login-saml          | 9                  | mod-login-saml:2.7.2          | 2          | 1024           | 896            | 128       | 768  | 88            | 128              |
| mod-erm-usage-harvester | 7                  | mod-erm-usage-harvester:4.4.1 | 2          | 1024           | 896            | 128       | 768  | 88            | 128              |
| mod-password-validator  | 6                  | mod-password-validator:3.1.0  | 2          | 1440           | 1298           | 128       | 768  | 384           | 512              |
| mod-gobi                | 6                  | mod-gobi:2.7.1                | 2          | 1024           | 896            | 128       | 700  | 88            | 128              |
| mod-licenses            | 9                  | mod-licenses:5.0.2            | 2          | 2480           | 2312           | 128       | 1792 | 384           | 512              |

mod-fqm-manager

6

mod-fqm-manager:1.0.3

2

3000

2600

128

2048

384