Check-in items API Report (Goldenrod)

PERF-90: POST /circulation/check-in-by-barcode (Check-in)

Overview

In this testing effort the performance of POST /circulation/check-in-by-barcode in the Goldenrod release (PERF-90) was tested with 1, 5, 8, and 20 virtual users for 30 minutes.
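
For reference, the API under test takes an item barcode, a check-in date, and a service point. Below is a minimal sketch of one request in Python; the Okapi URL, tenant, token, barcode, and service point ID are placeholders, not values from this test:

    import datetime
    import uuid

    import requests

    OKAPI_URL = "https://okapi.example.org"   # placeholder Okapi gateway URL
    HEADERS = {
        "x-okapi-tenant": "diku",             # placeholder tenant
        "x-okapi-token": "<token from POST /authn/login>",  # placeholder token
        "Content-Type": "application/json",
    }

    def check_in(item_barcode, service_point_id):
        """One check-in-by-barcode call, as exercised by the test."""
        body = {
            "itemBarcode": item_barcode,
            "checkInDate": datetime.datetime.utcnow().isoformat() + "Z",
            "servicePointId": service_point_id,
        }
        return requests.post(OKAPI_URL + "/circulation/check-in-by-barcode",
                             json=body, headers=HEADERS, timeout=30)

    resp = check_in("1234567890", str(uuid.uuid4()))
    print(resp.status_code, round(resp.elapsed.total_seconds() * 1000), "ms")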

 

Backend:

  • mod-inventory-storage-19.2.4

  • mod-inventory-16.0.1

  • mod-circulation-19.0.9

  • mod-circulation-storage-12.0.1

  • okapi-3.1.1

  • mod-authtoken-2.5.1

  • mod-permissions-5.11.2

Frontend:

  • folio_inventory-4.0.3

  • folio_circulation-3.0.1

  • Item Check-in (folio_checkin-3.0.1)

Environment:

  • 61 back-end modules deployed in 110 ECS services

  • 3 okapi ECS services

  • 8 m5.large EC2 instances

  • 2 db.r5.xlarge AWS RDS instances (1 reader, 1 writer)

  • INFO Okapi logging level

High Level Summary

  1. POST /circulation/check-in-by-barcode takes on average 460-500 ms for up to 8 users, and is much slower (roughly 975-990 ms on average) for 20 users.

  2. Compared to Fameflower, this Goldenrod version is about 11%-20% slower, but a couple of tests needed to be redone to confirm the results.

  3. There were no slow queries and no single slow API call; rather, the total time of all the API calls adds up to the observed slowness.

  4. mod-authtoken calls account for at least 20% of the overall time (see the trace-header sketch after this list).

  5. mod-circulation appears to have memory leaks.
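
One way to attribute time to mod-authtoken is Okapi's X-Okapi-Trace response headers, which record per-module timings. A hedged sketch follows; the exact header format can vary between Okapi versions, and the URL and credentials are placeholders:

    import requests

    OKAPI_URL = "https://okapi.example.org"   # placeholder
    HEADERS = {"x-okapi-tenant": "diku",      # placeholder tenant/token
               "x-okapi-token": "<token>"}

    resp = requests.get(OKAPI_URL + "/circulation/loans?limit=1",
                        headers=HEADERS)

    # Okapi appends one X-Okapi-Trace response header per module it invoked,
    # e.g. "GET mod-authtoken-2.5.1 http://... : 202 1843us".  requests folds
    # duplicate headers into one value, so read them from the raw urllib3 view.
    traces = resp.raw.headers.getlist("X-Okapi-Trace")
    auth_us = sum(int(t.rsplit(" ", 1)[-1].rstrip("su"))
                  for t in traces if "mod-authtoken" in t)
    total_us = sum(int(t.rsplit(" ", 1)[-1].rstrip("su")) for t in traces)
    print("mod-authtoken share: %.1f%%" % (100.0 * auth_us / max(total_us, 1)))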

Test Runs and Results

Test | Virtual Users | Duration | OKAPI log level | Profiled | Ramp up (secs) | Average (ms) | 75th %tile (ms)
-----|---------------|----------|-----------------|----------|----------------|--------------|----------------
1    | 1             | 30 min   | INFO            | No       | 1              | 459          | 549
2    | 5             | 30 min   | INFO            | No       | 50             | 460          | 483
3    | 8             | 30 min   | INFO            | No       | 80             | 502          | 529
4    | 20            | 30 min   | INFO            | No       | 200            | 975          | 1093
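
The report does not reproduce the load scripts, but the run shape (N virtual users ramped up over roughly 10 seconds per user, then 30 minutes of steady load) is straightforward to express. Here is a sketch using Locust purely as an illustration; it is not the tool or script actually used in these tests:

    # locustfile.py -- reproduces the run shape: N virtual users, roughly
    # 10 seconds of ramp-up per user, 30-minute duration.  Run, for example:
    #   locust -f locustfile.py --headless -u 20 -r 0.1 -t 30m \
    #          --host https://okapi.example.org
    import datetime

    from locust import HttpUser, constant, task

    class CheckInUser(HttpUser):
        wait_time = constant(0)  # assumption: no think time between iterations

        @task
        def check_in(self):
            # Placeholder barcode and service point; a real test would draw
            # item barcodes from a prepared dataset.
            self.client.post(
                "/circulation/check-in-by-barcode",
                json={
                    "itemBarcode": "1234567890",
                    "checkInDate": datetime.datetime.utcnow().isoformat() + "Z",
                    "servicePointId": "<service-point-uuid>",
                },
                headers={"x-okapi-tenant": "diku",   # placeholders
                         "x-okapi-token": "<token>"},
            )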

Running this test on Fameflower gives the following comparison:

Virtual Users | FF Average (ms) | GR Average (ms) | Delta % | FF 75th %tile (ms) | GR 75th %tile (ms) | Delta %
--------------|-----------------|-----------------|---------|--------------------|--------------------|--------
1             | 414             | 459             | 10.87   | 461                | 549                | 19.09
5             | 435             | 460             | 5.75    | 458                | 483                | 5.46
8             | 505             | 502             | -0.59   | 531                | 529                | -0.38
20            | 851             | 975             | 14.57   | 914                | 1093               | 19.58
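
For clarity, the Delta % columns in this and the following tables are the relative change of Goldenrod over Fameflower:

    def delta_pct(ff_ms, gr_ms):
        """Relative change of Goldenrod (GR) over Fameflower (FF), in percent."""
        return (gr_ms - ff_ms) / ff_ms * 100.0

    print(round(delta_pct(414, 459), 2))   # 10.87, the 1-user average row
    print(round(delta_pct(851, 975), 2))   # 14.57, the 20-user average row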

It looks like Goldenrod is slower than Fameflower by about 10%-19%. However, the tests for 5 and 8 users, and for 1 and 20 users, gave inconclusive results, so a new set of tests was rerun against both the Goldenrod and Fameflower builds:

Test | Virtual Users | Duration | OKAPI log level | Profiled | Ramp up (secs) | Average (ms) | 75th %tile (ms)
-----|---------------|----------|-----------------|----------|----------------|--------------|----------------
5    | 1             | 30 min   | WARNING         | No       | 1              | 463          | 559
6    | 5             | 30 min   | WARNING         | No       | 50             | 466          | 486
7    | 8             | 30 min   | WARNING         | No       | 80             | 523          | 550
8    | 20            | 30 min   | WARNING         | No       | 200            | 988          | 1072

When comparing to a new set of Fameflower response times:

Virtual Users | FF Average (ms) | GR Average (ms) | Delta % | FF 75th %tile (ms) | GR 75th %tile (ms) | Delta %
--------------|-----------------|-----------------|---------|--------------------|--------------------|--------
1             | 410             | 463             | 12.93   | 455                | 559                | 22.86
5             | 420             | 466             | 10.95   | 442                | 486                | 9.95
8             | 471             | 523             | 11.04   | 499                | 550                | 10.22
20            | 851             | 988             | 16.10   | 909                | 1072               | 17.93

A couple of things to note about the methodology of the second round of test runs:

  1. mod-circulation and mod-feesfines were restarted before each test run because there are indications that these two modules have memory leaks. Over time the leaks degrade performance and the modules take longer to respond, so it was important to have them in the same state before each run for consistent timings (see the restart sketch after this list).

  2. The OKAPI log level on both builds, FF and GR, was confirmed to be the same. This time it was set to WARNING, but the particular level is not relevant to this comparison as long as both builds use the same one.
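
Because the modules run as ECS services, the pre-test restarts can be scripted. A hedged sketch using boto3 follows; the region, cluster, and service names are placeholders:

    import boto3

    ecs = boto3.client("ecs", region_name="us-east-1")   # placeholder region
    CLUSTER = "folio-goldenrod"                          # placeholder cluster name

    # forceNewDeployment recycles the service's tasks, giving mod-circulation
    # and mod-feesfines a fresh heap before each run.
    for service in ("mod-circulation", "mod-feesfines"):
        ecs.update_service(cluster=CLUSTER, service=service,
                           forceNewDeployment=True)
        ecs.get_waiter("services_stable").wait(cluster=CLUSTER,
                                               services=[service])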

Here we see a more consistent and tighter band of deltas between the test runs. With the second set of results, and also considering the first set, we can more confidently conclude that the Goldenrod version is about 10%-17% slower than Fameflower.

JVM Profiling

Database

Database CPU usage increases with the number of users: single digits for 1 user, rising to about 65% for the high case of 20 users.

Virtual Users | CPU %
--------------|------
1             | 8
5             | 34
8             | 50
20            | 65

Database Slow Queries

There were no slow queries for this API call. Slow queries are defined here as those taking longer than 20 ms.
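
One way to check for statements above the 20 ms threshold is to query pg_stat_statements on the RDS instance. A hedged sketch follows; the connection details are placeholders, and whether the extension is enabled depends on the RDS parameter group:

    import psycopg2

    # Placeholder connection details for the RDS writer instance.
    conn = psycopg2.connect(host="folio-db.example.amazonaws.com",
                            dbname="folio", user="folio_admin",
                            password="<secret>")
    with conn, conn.cursor() as cur:
        # Requires the pg_stat_statements extension; on PostgreSQL 13+ the
        # column is mean_exec_time instead of mean_time.
        cur.execute("""
            SELECT query, calls, mean_time
              FROM pg_stat_statements
             WHERE mean_time > 20      -- "slow" per this report: > 20 ms
             ORDER BY mean_time DESC
             LIMIT 20
        """)
        for query, calls, mean_ms in cur.fetchall():
            print("%8.1f ms  x%-8d %s" % (mean_ms, calls, query[:80]))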

 

CPU Utilization

The following shows the CPU utilization percentage of the ECS services running Okapi and the relevant modules.

 

Service                 | 1 user: Avg % (Range %) | 5 users: Avg % (Range %) | 8 users: Avg % (Range %) | 20 users: Avg % (Range %)
------------------------|-------------------------|--------------------------|--------------------------|--------------------------
Okapi                   | 47 (45-48)              | 167 (156-177)            | 228 (218-238)            | 240 (154-282)
mod-circulation         | 146 (146-152)           | 155 (145-155)            | 135 (103-159)            | 237 (166-800)
mod-circulation-storage | 12 (10.31-14.77)        | 30 (27-41)               | 40 (39-61)               | 48 (47-52)
mod-inventory           | 3.0 (3.03-3.26)         | 3.28 (3.23-3.49)         | 3.16 (3.0-3.31)          | 3.11 (3.10-3.22)
mod-inventory-storage   | 24 (17.83-26.11)        | 87 (81-90)               | 123 (114-152)            | 142 (111-159)

mod-circulation appears to perform erratically at times and should be the focus of a longevity test.

The graph below shows mod-circulation CPU usage. The written numbers indicate the number of virtual users that participated in each test, and the arrows point to when the tests were running. Observations: CPU usage increased over time during each test run, and immediately after each run it came down, but not back to the starting point, which indicates some kind of leak. For the 20-user run there was a huge spike about 5 minutes into the test; at that time the container was running out of memory (see the memory graph in the next section), so things became unstable.

Memory

Memory usage for most of the relevant modules, except for mod-circulation, was stable throughout the runs, with only an occasional spike; over a 30-minute run it was consistent.

 

Service                 | 1 user: Avg | 5 users: Avg | 8 users: Avg | 20 users: Avg
------------------------|-------------|--------------|--------------|---------------
Okapi                   | 67%         | 67%          | 67%          | 67%
mod-circulation         | 128%        | 120%         | 103%         | 140%; 47%-97%
mod-circulation-storage | 45.7%       | 45.7%        | 45.67%       | 45.7%
mod-inventory           | 55%         | 55%          | 55%          | 55%
mod-inventory-storage   | 56%         | 56%          | 56%          | 56%

Memory was stable for OKAPI and all relevant modules except mod-circulation, which may have a memory leak. Testing proceeded in this sequence: 8 users, 5 users, 1 user, 20 users, and mod-circulation's average memory utilization climbed from 103% to 140% over the four tests. mod-circulation crashed and was restarted during the 20-user run, so its memory consumption range for that run is lower than for the others, but it started out at 140%.

The graph of mod-circulation memory usage shows when each test was run; the written numbers indicate the number of virtual users that participated. It shows an increase in memory usage during each test run that was not released afterward. The 20-user run started out at a very high level and peaked at 140%, which is when the container crashed and restarted, as seen by the vertical drop-off.

Heap dumps taken for consecutive test runs (not specifically the ones in the diagrams above, but at a later date) show an increase in the number of HTTP client objects.

Here is the leak suspects report for a test run, generated by Eclipse's Memory Analyzer tool.

Here are the suspected leaks for the next consecutive run.

Note the increase in the number of HttpClientImpl objects from one run (10,88,480) to the next (19,666,000). Connection manager objects increased as well, from 4,001,128 to 7,280,328.
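
Growth in HttpClientImpl and connection-manager objects between runs is consistent with HTTP clients being created per request and never closed. The sketch below illustrates the pattern only; it is in Python rather than the module's Java, and is not mod-circulation's actual code:

    import requests

    # Leaky pattern: a new client (with its own connection pool) per call.
    # Python would eventually garbage-collect these, but the analogous Java
    # pattern -- creating a Vert.x HttpClient per request and never closing
    # it -- keeps the clients referenced, matching the growing HttpClientImpl
    # and connection-manager counts seen in the heap dumps.
    def check_in_leaky(okapi_url, body, headers):
        session = requests.Session()          # fresh pool every call
        return session.post(okapi_url + "/circulation/check-in-by-barcode",
                            json=body, headers=headers)

    # Non-leaky pattern: create one client up front and reuse it everywhere.
    SESSION = requests.Session()

    def check_in_shared(okapi_url, body, headers):
        return SESSION.post(okapi_url + "/circulation/check-in-by-barcode",
                            json=body, headers=headers)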

Giraffe Diagrams