Test New Relic impact on FOLIO

Overview


This task covers performance testing of Check-in/Check-out operations under three conditions:

  • Test Run 1: Running without New Relic.

  • Test Run 2: Running with the New Relic monitoring tool enabled.

  • Test Run 3: Running with the New Relic monitoring tool enabled and the mod-oa JVM parameter changed from -XX:MaxMetaspaceSize=128m to -XX:MaxMetaspaceSize=256m.

These tests were performed using 20 virtual users to simulate load on the system. The tests were run in the QCP1 environment.

The primary goal of these tests is to assess whether the New Relic monitoring tool impacts the system’s performance.
Ticket: https://folio-org.atlassian.net/browse/PERF-920

Summary

Test Run 1 (without New Relic). Generated 3159 Check-out transactions and 2420 Check-in transactions. The error rate during the test was 0.04 %. Median response time was 959 ms for CO and 640 ms for CI.

Test Run 2 (with New Relic). Generated 3180 Check-out transactions and 2381 Check-in transactions. The error rate during the test was 0.04 %. Median response time was 968 ms for CO and 594 ms for CI.

Test Run 3 (with New Relic + mod-oa(256m)). Generated 3212 Check-out transactions and 2402 Check-in transactions. The error rate during the test was 0.04 %. Median response time was 947 ms for CO and 588 ms for CI.

Conclusion. Three tests were conducted to determine the impact of enabling the New Relic monitoring tool on the Check-in/Check-out process. The aim was to identify any significant variations in key performance metrics that could be attributed to New Relic.

Comparative analysis of the key performance metrics from the test runs yielded the following insights:

  1. Response Times: The median and 95th percentile response times were similar across the test runs, indicating that New Relic did not add noticeable overhead.

  2. Error Rate: The error rates remained consistent across the test runs. This suggests that integrating New Relic did not compromise the stability of the Check-in/Check-out process.

  3. CPU Usage: The variation in CPU usage between the test runs stayed within the 5% tolerance limit and poses no significant concern.

  4. Memory Usage: No changes in memory usage patterns were observed after enabling New Relic, implying the tool has minimal, if any, impact on the system's memory resources.

In conclusion, the tests show that the New Relic monitoring tool does not adversely affect the Check-in/Check-out process, and overall system performance remains unaffected. However, after deploying New Relic we observed high CPU consumption and OOM errors on the mod-oa service.

  1. After increasing the mod-oa parameter from -XX:MaxMetaspaceSize=128m to -XX:MaxMetaspaceSize=256m, CPU utilization decreased to 80%.
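The parameter change described above can be sketched as a small helper that rewrites a JVM options string. This is only an illustration: the function name and the idea that the options live in a single space-separated string are assumptions, and the sample value combines the mod-oa MetaspaceSize, MaxMetaspaceSize, and Xmx figures from the QCP1 modules table rather than the actual task definition.

```python
import re


def set_max_metaspace(java_opts: str, size: str) -> str:
    """Return java_opts with -XX:MaxMetaspaceSize set to size (for example "256m")."""
    flag = f"-XX:MaxMetaspaceSize={size}"
    pattern = r"-XX:MaxMetaspaceSize=\S+"
    if re.search(pattern, java_opts):
        # Replace the existing value and leave every other option untouched.
        return re.sub(pattern, flag, java_opts)
    # The flag was not present, so append it.
    return f"{java_opts} {flag}".strip()


# Hypothetical options string assembled from the mod-oa row of the modules table.
current = "-XX:MetaspaceSize=88m -XX:MaxMetaspaceSize=128m -Xmx768m"
updated = set_max_metaspace(current, "256m")
print(updated)
```

Only the MaxMetaspaceSize flag is touched, so the initial MetaspaceSize and the heap size stay as they were for the comparison between Test Run 2 and Test Run 3.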

Recommendations & Jiras (Optional)

  • While the current tests show how New Relic impacts the CICO process, future analysis should cover a wider range of operations, such as data-import, data-export, or other workflows, to fully evaluate the impact of New Relic on system performance.

  • Investigate the errors described in the Errors section: high CPU consumption and OutOfMemoryError on the mod-oa module.

Test Runs 

  • Test Run 1: Running without New Relic.

    • Test setup: 20 virtual users, 4000 seconds, 200-second ramp-up period.

  • Test Run 2: Running with the New Relic monitoring tool enabled.

    • Test setup: 20 virtual users, 4000 seconds, 200-second ramp-up period.

  • Test Run 3: Running with the New Relic monitoring tool enabled and mod-oa(256m).

    • Test setup: 20 virtual users, 4000 seconds, 200-second ramp-up period.
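To make the ramp-up setting concrete, the sketch below computes when each virtual user would start if the 20 users are spread evenly over the 200-second ramp-up. Even spacing is an assumption about how the load tool schedules users, and the function name is illustrative.

```python
def ramp_up_schedule(users: int, ramp_up_s: float) -> list[float]:
    """Start offsets in seconds, assuming users start at evenly spaced intervals."""
    interval = ramp_up_s / users
    return [i * interval for i in range(users)]


offsets = ramp_up_schedule(users=20, ramp_up_s=200)
print("interval between user starts, s:", offsets[1] - offsets[0])
print("last user starts at, s:", offsets[-1])
```

Under this assumption a new user joins every 10 seconds, so the full load of 20 users is reached a little before the 200-second mark.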

Results

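Test 1: without New Relic. Test 2: with New Relic. Test 3: with New Relic + mod-oa(256m).

| Metric | Test 1 CO | Test 1 CI | Test 2 CO | Test 2 CI | Test 3 CO | Test 3 CI |
|---|---|---|---|---|---|---|
| Total requests, count | 3159 | 2420 | 3180 | 2381 | 3212 | 2402 |
| Throughput, req/sec | 1.585 | 1.214 | 1.594 | 1.193 | 1.638 | 1.239 |
| Errors, count | 36 | 23 | 43 | 16 | 36 | 24 |
| Min, ms | 24 | 39 | 25 | 46 | 28 | 35 |
| Median, ms | 959 | 640 | 968 | 594 | 947 | 588 |
| 95th percentile, ms | 1,226 | 759 | 1,247 | 740 | 1,150 | 682 |
| Max, ms | 6,504 | 20,712 | 14,834 | 12,165 | 12,694 | 17,582 |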
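One way to read the response-time rows is as a relative change against Test 1, the baseline without New Relic. The sketch below does that for the median and 95th percentile values copied from the results table. The dictionary layout and function name are illustrative choices, not part of the original test tooling.

```python
# Response times in ms, copied from the results table: (Test 1, Test 2, Test 3).
RESULTS_MS = {
    ("CO", "median"): (959, 968, 947),
    ("CI", "median"): (640, 594, 588),
    ("CO", "pct95"): (1226, 1247, 1150),
    ("CI", "pct95"): (759, 740, 682),
}


def pct_change(baseline: float, value: float) -> float:
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


for (operation, metric), (test1, test2, test3) in RESULTS_MS.items():
    print(
        f"{operation} {metric}: "
        f"Test 2 {pct_change(test1, test2):+.2f}%, "
        f"Test 3 {pct_change(test1, test3):+.2f}%"
    )
```

A positive value means the run was slower than the baseline and a negative value means it was faster.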

Resource utilization

CPU utilization

The slight variation in CPU usage within the tolerance limit of 5% between the test runs poses no significant concern and is well within acceptable boundaries.

Test Run 1:

image-20240605-081912.png

Test Run 2:

image-20240605-082645.png


Test Run 3:


Top 16 CPU-consuming services

| Module | Avg CPU utilization (Test 1), % | Avg CPU utilization (Test 2), % | Avg CPU utilization (Test 3), % |
|---|---|---|---|
| mod-users-b | 46.04 | 45.43 | 47 |
| nginx-okapi | 33.77 | 27.58 | 32 |
| mod-configuration-b | 29.59 | 27.53 | 31 |
| mod-feesfines-b | 27.13 | 27.68 | 27 |
| okapi-b | 25.76 | 30.62 | 28 |
| mod-authtoken-b | 22.81 | 19.87 | 17 |
| mod-permissions-b | 18.43 | 19.49 | 21 |
| mod-consortia-b | 20.87 | 22.13 | 20 |
| mod-dcb-b | 17.56 | 17.94 | 17.5 |
| mod-audit-b | 15.98 | 11.50 | 13.2 |
| mod-inventory-b | 14.23 | 15.14 | 14.4 |
| mod-agreements-b | 11.36 | 12.43 | 11.7 |
| mod-pubsub-b | 10.64 | 11.49 | 12.1 |
| mod-notes-b | 8 | 10.19 | 11 |
| mod-calendar-b | 9.69 | 9.31 | 9 |
| pub-okapi | 9.02 | 10.12 | 10 |
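The per-module figures can be screened against a tolerance with a few lines of code. The report does not say how the 5% tolerance is measured, so this sketch assumes it means an absolute difference in percentage points from the Test 1 average. The function name is illustrative.

```python
# Average CPU utilization in %, copied from the table: (Test 1, Test 2, Test 3).
CPU_AVG = {
    "mod-users-b": (46.04, 45.43, 47),
    "nginx-okapi": (33.77, 27.58, 32),
    "mod-configuration-b": (29.59, 27.53, 31),
    "mod-feesfines-b": (27.13, 27.68, 27),
    "okapi-b": (25.76, 30.62, 28),
    "mod-authtoken-b": (22.81, 19.87, 17),
    "mod-permissions-b": (18.43, 19.49, 21),
    "mod-consortia-b": (20.87, 22.13, 20),
    "mod-dcb-b": (17.56, 17.94, 17.5),
    "mod-audit-b": (15.98, 11.50, 13.2),
    "mod-inventory-b": (14.23, 15.14, 14.4),
    "mod-agreements-b": (11.36, 12.43, 11.7),
    "mod-pubsub-b": (10.64, 11.49, 12.1),
    "mod-notes-b": (8, 10.19, 11),
    "mod-calendar-b": (9.69, 9.31, 9),
    "pub-okapi": (9.02, 10.12, 10),
}


def deltas_over(threshold: float, table: dict) -> list:
    """List (module, run, delta) where a run differs from Test 1 by more than threshold points."""
    flagged = []
    for module, (baseline, *others) in table.items():
        # others holds the Test 2 and Test 3 averages, in that order.
        for run, value in enumerate(others, start=2):
            delta = value - baseline
            if abs(delta) > threshold:
                flagged.append((module, run, round(delta, 2)))
    return flagged


for module, run, delta in deltas_over(5.0, CPU_AVG):
    print(f"{module}: Test {run} differs from Test 1 by {delta:+.2f} points")
```

Per-module averages can move more than the aggregate does, so any rows this prints are candidates for a closer look at the graphs rather than failures in themselves.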

 

Memory utilization

Memory Usage: No changes in memory usage patterns were observed upon enabling New Relic. No memory leaks in any module.

Test Run 1:

Test Run 2:

Test 3:


Top Memory-consuming services

| Module | Avg memory utilization (Test 1), % | Avg memory utilization (Test 2), % |
|---|---|---|
| mod-data-export-b | - | 104.04 |
| mod-users-b | 46.04 | 45.43 |
| nginx-okapi | 33.77 | 27.58 |
| mod-configuration-b | 29.59 | 27.53 |
| mod-feesfines-b | 27.13 | 27.68 |
| okapi-b | 25.76 | 30.62 |
| mod-authtoken-b | 22.81 | 19.87 |
| mod-permissions-b | 18.23 | 19.49 |
| mod-consortia-b | 20.87 | - |
| mod-dcb-b | 17.56 | 17.94 |
| mod-audit-b | 15.98 | 18.568 |
| mod-inventory-b | 14.23 | 15.14 |

Errors

After New Relic was deployed on QCP1, two services showed high CPU consumption: mod-oa-b and mod-data-export. Both services had new task definitions containing New Relic variables.


mod-oa was throwing “OutOfMemoryError” error messages.



Infrastructure


PTF environment: qcp1

  • 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1

  • 2 database instances, writer/reader

    | Name | Memory | vCPUs | max_connections |
    |---|---|---|---|
    | db.r6g.xlarge | 32 GiB | 4 | 2731 |

  • MSK tenant

    • 4 m5.2xlarge brokers in 2 zones

    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true

    • log.retention.minutes=480

    • default.replication.factor=3

QCP1 modules

| Module | Task Def. Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft Limit | CPU Units | Xmx | MetaspaceSize | MaxMetaspaceSize |
|---|---|---|---|---|---|---|---|---|---|
| qcp1-pvt | | | | | | | | | |
| mod-remote-storage | 4 | mod-remote-storage:3.2.0 | 2 | 4920 | 4472 | 1024 | 3960 | 512 | 512 |
| mod-ncip | 4 | mod-ncip:1.14.4 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-finance-storage | 4 | mod-finance-storage:8.6.0 | 2 | 1024 | 896 | 1024 | 700 | 88 | 128 |
| mod-agreements | 4 | mod-agreements:7.0.0 | 2 | 1592 | 1488 | 128 | 0 | 0 | 0 |
| mod-ebsconet | 4 | mod-ebsconet:2.2.0 | 2 | 1248 | 1024 | 128 | 700 | 128 | 256 |
| mod-organizations | 4 | mod-organizations:1.9.0 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 |
| mod-consortia | 2 | mod-consortia:1.1.0 | 2 | 3072 | 2048 | 128 | 2048 | 512 | 1024 |
| edge-sip2 | 2 | edge-sip2:3.2.0-SNAPSHOT.209 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-serials-management | 4 | mod-serials-management:1.0.0 | 2 | 2480 | 2312 | 128 | 1792 | 384 | 512 |
| mod-settings | 4 | mod-settings:1.0.3 | 2 | 1024 | 896 | 200 | 768 | 88 | 128 |
| mod-data-import | 7 | mod-data-import:3.1.0 | 1 | 2048 | 1844 | 256 | 1292 | 384 | 512 |
| edge-dematic | 4 | edge-dematic:2.2.0 | 1 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-search | 4 | mod-search:3.2.0 | 2 | 2592 | 2480 | 2048 | 1440 | 512 | 1024 |
| mod-inn-reach | 2 | mod-inn-reach:3.2.0-SNAPSHOT.86 | 2 | 3600 | 3240 | 1024 | 2880 | 512 | 1024 |
| mod-tags | 4 | mod-tags:2.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-courses | 4 | edge-courses:1.4.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-authtoken | 5 | mod-authtoken:2.15.1 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
| mod-inventory-update | 4 | mod-inventory-update:3.3.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-notify | 4 | mod-notify:3.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-configuration | 4 | mod-configuration:5.10.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-orders-storage | 4 | mod-orders-storage:13.7.0 | 2 | 1024 | 896 | 512 | 700 | 88 | 128 |
| edge-caiasoft | 4 | edge-caiasoft:2.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-login-saml | 4 | mod-login-saml:2.8.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-erm-usage-harvester | 4 | mod-erm-usage-harvester:4.5.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-licenses | 4 | mod-licenses:6.0.0 | 2 | 2480 | 2312 | 128 | 1792 | 384 | 512 |
| mod-gobi | 4 | mod-gobi:2.8.0 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 |
| mod-password-validator | 4 | mod-password-validator:3.2.0 | 2 | 1440 | 1298 | 128 | 768 | 384 | 512 |
| mod-bulk-operations | 4 | mod-bulk-operations:2.0.0 | 2 | 3072 | 2600 | 1024 | 1536 | 384 | 512 |
| mod-fqm-manager | 4 | mod-fqm-manager:2.0.1 | 2 | 3000 | 2600 | 128 | 2048 | 384 | 512 |
| edge-dcb | 4 | edge-dcb:1.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-graphql | 5 | mod-graphql:1.12.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-finance | 4 | mod-finance:4.9.0 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 |
| mod-erm-usage | 4 | mod-erm-usage:4.7.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-batch-print | 5 | mod-batch-print:1.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-copycat | 4 | mod-copycat:1.6.0 | 2 | 1024 | 512 | 128 | 768 | 88 | 128 |
| mod-lists | 4 | mod-lists:2.0.0 | 2 | 3000 | 2600 | 128 | 2048 | 384 | 512 |
| mod-entities-links | 5 | mod-entities-links:3.0.0 | 2 | 2592 | 2480 | 400 | 1440 | 0 | 1024 |
| mod-permissions | 8 | mod-permissions:6.5.0 | 2 | 1684 | 1544 | 512 | 1024 | 384 | 512 |
| pub-edge | 3 | pub-edge:2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 |
| mod-orders | 4 | mod-orders:12.8.0 | 2 | 2048 | 1440 | 1024 | 1024 | 384 | 512 |
| edge-patron | 4 | edge-patron:5.1.0 | 2 | 1024 | 896 | 256 | 768 | 88 | 128 |
| edge-ncip | 4 | edge-ncip:1.9.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-inn-reach | 2 | edge-inn-reach:3.1.1-SNAPSHOT.45 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-users-bl | 4 | mod-users-bl:7.7.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
| mod-oa | 2 | mod-oa:2.1.0-SNAPSHOT.62 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-inventory-storage | 4 | mod-inventory-storage:27.1.0 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 |
| mod-invoice | 5 | mod-invoice:5.8.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
| mod-user-import | 4 | mod-user-import:3.8.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-sender | 5 | mod-sender:1.12.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-oai-pmh | 4 | edge-oai-pmh:2.9.0 | 2 | 1512 | 1360 | 1024 | 1440 | 384 | 512 |
| mod-data-export-worker | 4 | mod-data-export-worker:3.2.1 | 2 | 3072 | 2048 | 1024 | 2048 | 384 | 512 |
| mod-rtac | 4 | mod-rtac:3.6.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-circulation-storage | 4 | mod-circulation-storage:17.2.0 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 |
| mod-calendar | 4 | mod-calendar:3.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-source-record-storage | 4 | mod-source-record-storage:5.8.0 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 |
| mod-event-config | 4 | mod-event-config:2.7.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-courses | 4 | mod-courses:1.4.10 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-circulation-item | 4 | mod-circulation-item:1.0.0 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 |
| mod-inventory | 4 | mod-inventory:20.2.0 | 2 | 2880 | 2592 | 1024 | 1814 | 384 | 512 |
| mod-email | 4 | mod-email:1.17.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-pubsub | 4 | mod-pubsub:2.13.0 | 2 | 1536 | 1440 | 1024 | 922 | 384 | 512 |
| mod-circulation | 4 | mod-circulation:24.2.0 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 |
| mod-di-converter-storage | 4 | mod-di-converter-storage:2.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-rtac | 4 | edge-rtac:2.7.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-orders | 4 | edge-orders:3.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-users | 5 | mod-users:19.3.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-template-engine | 4 | mod-template-engine:1.20.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-patron-blocks | 4 | mod-patron-blocks:1.10.0 | 2 | 1024 | 896 | 1024 | 768 | 88 | 128 |
| mod-audit | 4 | mod-audit:2.9.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| edge-fqm | 4 | edge-fqm:2.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-source-record-manager | 5 | mod-source-record-manager:3.9.0-SNAPSHOT.330 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 |
| nginx-edge | 3 | nginx-edge:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 |
| mod-quick-marc | 4 | mod-quick-marc:5.1.0 | 1 | 2288 | 2176 | 128 | 1664 | 384 | 512 |
| nginx-okapi | 3 | nginx-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 |
| okapi-b | 4 | okapi:5.3.0 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512 |
| mod-feesfines | 4 | mod-feesfines:19.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-invoice-storage | 4 | mod-invoice-storage:5.8.0 | 2 | 1872 | 1536 | 1024 | 1024 | 384 | 512 |
| mod-dcb | 5 | mod-dcb:1.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-service-interaction | 4 | mod-service-interaction:4.0.1 | 2 | 2048 | 1844 | 256 | 1290 | 384 | 512 |
| mod-data-export | 11 | mod-data-export:5.0.4 | 1 | 2048 | 1524 | 1024 | 0 | 0 | 0 |
| mod-patron | 4 | mod-patron:6.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-oai-pmh | 4 | mod-oai-pmh:3.13.0 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 |
| edge-connexion | 4 | edge-connexion:1.2.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-kb-ebsco-java | 4 | mod-kb-ebsco-java:4.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-notes | 4 | mod-notes:5.2.0 | 2 | 1024 | 896 | 128 | 952 | 384 | 512 |
| mod-data-export-spring | 4 | mod-data-export-spring:3.2.0 | 1 | 2048 | 1844 | 256 | 1536 | 384 | 512 |
| mod-organizations-storage | 4 | mod-organizations-storage:4.7.0 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 |
| mod-login | 4 | mod-login:7.11.0 | 2 | 1440 | 1298 | 1024 | 768 | 384 | 512 |
| pub-okapi | 3 | pub-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 |
| mod-eusage-reports | 4 | mod-eusage-reports:2.1.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
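The mod-oa row gives a rough way to reason about the Metaspace change: heap plus Metaspace ceilings can be compared with the container's hard memory limit. The sketch below does that arithmetic. It assumes all three values share one unit (the table does not state it) and that the hard limit stayed at 1024 for Test Run 3, which the report does not confirm. The class and method names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class JvmLimits:
    hard_limit: int      # container memory hard limit, from the modules table
    xmx: int             # maximum heap size (-Xmx)
    max_metaspace: int   # -XX:MaxMetaspaceSize

    def headroom(self) -> int:
        """Memory left under the hard limit after the heap and Metaspace ceilings."""
        return self.hard_limit - (self.xmx + self.max_metaspace)


# mod-oa as listed in the table, then with the Test Run 3 Metaspace setting.
mod_oa_before = JvmLimits(hard_limit=1024, xmx=768, max_metaspace=128)
mod_oa_after = JvmLimits(hard_limit=1024, xmx=768, max_metaspace=256)

print("headroom at 128m:", mod_oa_before.headroom())
print("headroom at 256m:", mod_oa_after.headroom())
```

Heap and Metaspace are only part of a JVM's footprint: thread stacks, the code cache, direct buffers, and a monitoring agent's own allocations also count against the container limit, so the headroom figure is an upper bound on what is left for them.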

Methodology/Approach

Step 1. Execute Test Run 1 from the carrier box with the parameters described in the Test Runs section.
Step 2. Deploy New Relic and verify that it works.
Step 3. Execute Test Run 2 from the carrier box with the same parameters as in Step 1.
Step 4. Change the mod-oa JVM parameter from -XX:MaxMetaspaceSize=128m to -XX:MaxMetaspaceSize=256m and execute Test Run 3 with the same parameters as in Step 1.