Data Import MARC BIB test report Ramsons Eureka [non-ECS]


Overview

This document contains the results of testing Data Import (DI) for MARC Bibliographic records on the Ramsons Eureka [non-ECS] environment.

Results of the 1k, 5k, 10k, 25k, 50k, and 100k data imports with create and update profiles are used to compare Eureka's performance with Okapi's.
Current ticket: PERF-1085: [Ramsons] [Eureka] [Data import] Update and Create MARC BIB Records (Closed)

Summary

  1. Data imports for the 1k, 5k, 10k, 25k, and 50k create are slower on the Eureka environment than on Okapi.

    1. The 50k import finished with the UI status "COMPLETED with errors", although 100% of the records were created; details are in the ticket.

    2. The 100k import failed to complete; the cause may be the problem described in a previously opened ticket.

  2. Data imports for the 1k, 5k, 10k, 25k, and 50k update completed successfully but are about 2 times slower on the Eureka environment than on Okapi. The problem is in mod-search queries.
    Additional tests with mod-search disabled (last column of the test results table) showed that the duration of DI create and update is about the same on Eureka as on Okapi.

  3. The DI-related modules, including mod-data-import, mod-srs, mod-srm, mod-inventory, and mod-inventory-storage, demonstrated stable and consistent CPU utilization throughout both test scenarios. Memory usage also remained smooth and uniform, with no notable issues or fluctuations observed.

  4. Eureka-related modules, including the kong and keycloak modules, had low CPU utilization (<5%) during both tests. Memory utilization for folio-keycloak was high in both tests (~90%); the other modules consumed about 50%.

  5. Sidecar CPU utilization reached a maximum of 50% and was stable and consistent. Sidecar memory showed a growing trend and needs to be investigated.

 

Recommendations & Jiras

Results

| Test # | Data-import test | Profile | Duration, Ramsons (rcp1), mod-search disabled | Duration, Ramsons (rcp1), mod-search enabled | Duration, Ramsons (rcon) | Duration, Ramsons Eureka RECP1, mod-search enabled | Duration, Ramsons Eureka RECP1, mod-search disabled (*1) |
|---|---|---|---|---|---|---|---|
| 1 | 5k MARC BIB Create | PTF - Create 2 | 1 min | 3 min 7 s | - | 3 min 10 s | 2 min 3 s |
|   | 10k MARC BIB Create | PTF - Create 2 | 4 min 30 s | 6 min 15 s | 5 min 10 s | 7 min 35 s | 3 min 51 s |
| 2 | 25k MARC BIB Create | PTF - Create 2 | 11 min | 17 min | 10 min 30 s | 19 min 17 s | 9 min 38 s |
| 3 | 50k MARC BIB Create | PTF - Create 2 | 22 min | 41 min 25 s | 15 min 43 s | 36 min 50 s (Completed with errors) | 19 min 59 s |
| 4 | 100k MARC BIB Create | PTF - Create 2 | 46 min | 1 hr 19 min | 31 min 51 s | Failed to complete | 40 min 10 s |
|   | 5k MARC BIB Update | PTF - Updates Success - 6 | 3 min 33 s | 6 min 33 s | - | 9 min 10 s | 3 min 42 s |
| 6 | 10k MARC BIB Update | PTF - Updates Success - 6 | 6 min 46 s | 11 min 14 s | 7 min 10 s | 19 min 06 s | 6 min 43 s |
| 7 | 25k MARC BIB Update | PTF - Updates Success - 6 | 16 min 40 s | 28 min 43 s | 19 min 3 s | 48 min 41 s | 17 min 12 s |
| 8 | 50k MARC BIB Update | PTF - Updates Success - 6 | 33 min 45 s | 58 min 30 s | 38 min 53 s | 1 hr 40 min | 34 min 14 s |
| 9 | 100k MARC BIB Update | PTF - Updates Success - 6 | 1 hr 8 min | 2 hr 14 min | 1 hr 23 min | not tested | not tested |

*1 These DI tests were run in the scope of ticket PERF-1096: Investigate memory growing on sidecars during DI (Closed); resource utilization for them is not documented in this report.
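The "about 2 times slower" comparison can be checked with a little arithmetic on the durations in the results table. The sketch below is illustrative only: the helper names `parse_duration` and `slowdown` are hypothetical, and it assumes the Ramsons (rcon) column is the Okapi baseline for the update runs.

```python
import re

# Seconds per unit for the spellings used in the results table.
UNIT_SECONDS = {"hr": 3600, "min": 60, "sec": 1, "s": 1}


def parse_duration(text: str) -> int:
    """Convert a duration such as '1 hr 40 min' or '38 min 53 sec' to seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s*(hr|min|sec|s)\b", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total


def slowdown(baseline: str, candidate: str) -> float:
    """Ratio of candidate duration to baseline duration, rounded to 2 places."""
    return round(parse_duration(candidate) / parse_duration(baseline), 2)


# Update durations copied from the results table:
# (Ramsons rcon, Eureka mod-search enabled, Eureka mod-search disabled).
# The 25k "disabled" cell has no unit on its last number; seconds are assumed.
UPDATE_RUNS = {
    "10k": ("7 min 10 s", "19 min 06 sec", "6 min 43 s"),
    "25k": ("19 min 3 s", "48 min 41 s", "17 min 12 s"),
    "50k": ("38 min 53 sec", "1 hr 40 min", "34 min 14 s"),
}

if __name__ == "__main__":
    for size, (okapi, eureka_on, eureka_off) in UPDATE_RUNS.items():
        print(size, slowdown(okapi, eureka_on), slowdown(okapi, eureka_off))
```

Under that assumption, the update ratios with mod-search enabled come out at roughly 2.6, and the ratios with mod-search disabled are slightly below 1, which agrees with the summary.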




Cluster resource utilization graphs


Test 1. Data-import Create

Modules CPU Utilization graphs

CPU usage is stable for all modules involved in the MARC BIB create process.

image-20250228-132427.png

Eureka components were stable during the DI create and consumed only up to 5% of CPU.

image-20250228-133020.png

Modules memory utilization graphs

Memory usage for the Create test sets shows a stable trend. No suspected memory leaks were observed.

image-20250228-132914.png

Eureka components showed a stable memory trend during DI Create. However, folio-keycloak used about 90% of its memory, so the PTF recommendation is to add more memory for this module.

image-20250228-133113.png

Sidecars resource utilization for DI Create


CPU Usage is stable for all sidecar modules.

image-20250228-133551.png

The memory usage graph indicates that the sidecars of the DI-related modules show steadily increasing memory consumption. This pattern warrants further investigation to rule out a memory leak.

image-20250228-133858.png
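One way to quantify a "steadily increasing trend" is to fit a least-squares line to memory samples and look at its slope. The sketch below is illustrative only: the sample values and the 0.05 threshold are made up, and the function names are hypothetical. It is not the method used to produce the graphs in this report.

```python
def memory_slope(samples):
    """Least-squares slope of memory usage over time.

    samples: list of (minute, memory_percent) pairs.
    Returns percentage points per minute.
    """
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    cov = sum((t - mean_t) * (u - mean_u) for t, u in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    return cov / var


def is_growing(samples, threshold=0.05):
    """Flag a series whose slope exceeds the (assumed) threshold."""
    return memory_slope(samples) > threshold


# Hypothetical samples, not taken from the report's graphs.
growing_sidecar = [(0, 40.0), (10, 42.0), (20, 44.0), (30, 46.0)]
flat_sidecar = [(0, 50.0), (10, 50.5), (20, 49.5), (30, 50.0)]

if __name__ == "__main__":
    print(memory_slope(growing_sidecar), is_growing(growing_sidecar))
    print(memory_slope(flat_sidecar), is_growing(flat_sidecar))
```

A slope alone does not prove a leak; a series that keeps rising across repeated imports without dropping back is the stronger signal.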



RDS Metrics

The database remains stable and displays a CPU utilization pattern consistent with previous reports.

image-20250228-134345.png

RDS Database Connections

image-20250228-134422.png

Database load (Performance insights metrics)

image-20250228-134842.png
image-20250228-135059.png

The slow queries beginning with WITH cte AS…… were investigated in the previous ECS and non-ECS Data Import reports: Data Import test report Ramsons [ECS] and Data Import test report Ramsons [non-ECS].

Test 2. Data-import Update

Modules CPU Utilization graphs

CPU usage is stable for all modules involved in MARC BIB updates.

image-20250303-125118.png


Eureka components were stable during the DI update and consumed only up to 5% of CPU, which is about the same behaviour as during DI create.

image-20250303-125350.png

Modules memory utilization graphs

Memory usage for the Update test sets shows a stable trend. No suspected memory leaks were observed.

image-20250303-125022.png

Eureka components showed a stable memory trend during DI Update. However, folio-keycloak used about 88% of its memory (similar to DI Create), so the PTF recommendation is to add more memory for this module.

image-20250303-125247.png

Sidecars resource utilization for DI Update

CPU Usage is stable for all sidecar modules.

 

image-20250303-125526.png

The memory usage graph indicates that the sidecars of the DI-related modules show steadily increasing memory consumption. The same pattern was observed during the DI create process.

image-20250303-125659.png

This graph shows the sidecars of the DI-related modules that have a growing memory utilization trend.

image-20250303-130253.png

RDS Metrics

The database remains stable and displays a CPU utilization pattern consistent with previous reports.

image-20250303-130449.png

RDS Database Connections

image-20250303-130529.png

Database load (Performance insights metrics)

 

image-20250303-130700.png
image-20250303-130743.png

MSK Resource utilization

CPU (User) usage by broker during DI Create

image-20250303-132516.png

CPU (User) usage by broker during DI Update

 

image-20250303-132652.png

OpenSearch resource utilization

Maximum CPU utilization percentage for all data nodes during Test 1. DI Create

image-20250303-133245.png

CPU utilization percentage for the master node.

image-20250304-194132.png

 

Maximum CPU utilization percentage for all data nodes during Test 2. DI Update

image-20250304-193818.png
image-20250304-194253.png

 

Appendix

Infrastructure

PTF environment RECP1

  • 10 m6g.2xlarge EC2 instances located in US East (N. Virginia), us-east-1

  • db.r6.xlarge database instances, writer

  • MSK fse-test

    • 4 kafka.m7g.xlarge brokers in 2 zones

    • Apache Kafka version 3.7.x (KRaft mode)

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true

    • log.retention.minutes=480

    • default.replication.factor=3

  • OpenSearch 2.13 ptf-test cluster

    • r6g.2xlarge.search 4 data nodes

    • r6g.large.search 3 dedicated master nodes

RECP1 Dataset

image-20250304-141622.png



 

| Module | Task Definition Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft Limit | CPU Units | Xmx | Metaspace Size | Max Metaspace Size |
|---|---|---|---|---|---|---|---|---|---|
| mod-remote-storage | 5 | mod-remote-storage:3.3.5 | 2 | 4920 | 4472 | 0 | 3960 | 512 | 512 |
| mod-remote-storage - Sidecar 1 | N/A | folio-module-sidecar:2.0.6 | N/A | 1024 | 512 | 128 | 256 | 0 | 96 |
| mod-finance-storage | 5 | mod-finance-storage:8.7.3 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 |
| mod-finance-storage - Sidecar 1 | N/A | folio-module-sidecar:2.0.6 | N/A | 1024 | 512 | 128 | 256 | 0 | 96 |
| mod-ebsconet | 5 | mod-ebsconet:2.3.1 | 2 | 1248 | 1024 | 0 | 700 | 128 | 256 |
| mod-ebsconet - Sidecar 1 | N/A | folio-module-sidecar:2.0.6 | N/A | 1024 | 512 | 128 | 256 | 0 | 96 |
| mod-consortia-keycloak | 1 | mod-consortia-keycloak:1.6.6 | 2 | 5136 | 4776 | 512 | 4416 | 384 | 512 |
| mod-consortia-keycloak - Sidecar 1 | N/A | folio-module-sidecar:2.0.6 | N/A | 1024 | 512 | 128 | 256 | 0 | 96 |
| mod-tags | 5 | mod-tags:2.3.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |
| mod-tags - Sidecar 1 | N/A | folio-module-sidecar:2.0.6 | N/A | 1024 | 512 | 128 | 256 | 0 | 96 |
| edge-courses | 4 | edge-courses:1.5.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 |

mod-notify

5

mod-notify:3.3.0

2

1024