Data Import MARC BIB Sunflower [non-ECS] - CSP1 & CSP2

Overview

This document contains the results of testing Data Import for MARC Bibliographic records on the Sunflower release [non-ECS], CSP1 and CSP2.
In scope: data import create and update tests with 1K, 5K, 10K, 25K, 50K, and 100K records.
Ticket CSP1: PERF-1194: [Sunflower] [non-ECS] [Data import] Update and Create MARC BIB Records (Closed)

Ticket CSP2: PERF-1218: [Sunflower-CSP2] [non-ECS] [Data import] Update and Create MARC BIB Records (Closed)

 

Summary

  • All tests passed in both CSP1 and CSP2.

  • CSP1:

    • Durations for 5k, 10k, and 25k MARC BIB Create are about the same;

    • Durations for 50k and 100k MARC BIB Create are 14% and ~29% slower than Sunflower GA, respectively;

    • Durations for all PTF - Updates Success - 1 profile runs are faster;

    • With mod-search disabled, durations for 50k and 100k MARC BIB Create are faster than on Ramsons with mod-search disabled;

    • 50k and 100k MARC BIB Create runs on another tenant (fs07000001) were much faster due to a smaller dataset.

  • CSP2:

    • Durations are the same as or better than CSP1, and a little slower than Sunflower GA.

    • CSP2 Updates are faster than CSP1, likely because a mod-search query that ran in CSP1 (and created lock waits in the DB) did not run in CSP2.

    • MSK CPU utilization during CSP2 Create imports is much lower than during CSP1 Create imports.

Results

| Test # | Data-import test | Profile | Duration, Sunflower CSP2 (secp1) | Duration, Sunflower CSP1 (secp1) | Duration, Sunflower (secp1) | Duration, Ramsons (rcp1) |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 1k MARC BIB Create | PTF - Create 2 | 45s | 34s | - | - |
| 2 | 5k MARC BIB Create | PTF - Create 2 | 2min 35s | 3min 4s | 2min 4s | 3min 7s |
| 3 | 10k MARC BIB Create | PTF - Create 2 | 4min 44s | 5min 25s | 4min 43s | 6min 15s |
| 4 | 25k MARC BIB Create | PTF - Create 2 | 11min 33s | 11min 27s | 10min | 17min |
| 5 | 50k MARC BIB Create | PTF - Create 2 | 22min 03s | 24min 33s | 21min | 41min 25s |
| 6 | 100k MARC BIB Create | PTF - Create 2 | 46min 54s | 54min 31s | 42min 46s | 1hr 19min |
| 7 | 1k MARC BIB Update | PTF - Updates Success - 6 | 1min 1s | 50s | - | - |
| 8 | 5k MARC BIB Update | PTF - Updates Success - 6 | 3min 42s | 5min 7s | 6min 18s | 6min 33s |
| 9 | 10k MARC BIB Update | PTF - Updates Success - 6 | 6min 59s | 8min 13s | 6min 4s | 11min 14s |
| 10 | 25k MARC BIB Update | PTF - Updates Success - 6 | 17min 13s | 19min 27s | 31min | 28min 43s |
| 11 | 50k MARC BIB Update | PTF - Updates Success - 6 | 35min 01s | 37min 11s | 1hr 8min | 58min 30s |
| 12 | 100k MARC BIB Update | PTF - Updates Success - 6 | 1hr 12min | 1hr 29min | 2hr 5min | 2hr 14min |
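To compare releases, the durations in the table above can be converted to seconds and expressed as a relative change. The following is a minimal sketch, not part of the test tooling: the helper names `parse_duration` and `pct_change` are hypothetical, and the sample values are the CSP2 and CSP1 update durations copied from the table. Because the table values are rounded, the computed percentages are approximate.

```python
import re

# Seconds per unit for the duration notation used in the results table.
_UNITS = {"hr": 3600, "min": 60, "s": 1}


def parse_duration(text: str) -> int:
    """Convert a duration such as '1hr 12min' or '22min 03s' to seconds."""
    parts = re.findall(r"(\d+)\s*(hr|min|s)", text)
    if not parts:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(int(value) * _UNITS[unit] for value, unit in parts)


def pct_change(new: str, old: str) -> float:
    """Relative change of `new` versus `old` in percent (negative = faster)."""
    new_s, old_s = parse_duration(new), parse_duration(old)
    return (new_s - old_s) / old_s * 100.0


# (CSP2, CSP1) durations for the MARC BIB Update tests, copied from the table.
updates = {
    "5k": ("3min 42s", "5min 7s"),
    "10k": ("6min 59s", "8min 13s"),
    "25k": ("17min 13s", "19min 27s"),
    "50k": ("35min 01s", "37min 11s"),
    "100k": ("1hr 12min", "1hr 29min"),
}

for size, (csp2, csp1) in updates.items():
    print(f"{size} Update, CSP2 vs CSP1: {pct_change(csp2, csp1):+.1f}%")
```

The same two helpers work for any pair of columns, for example CSP1 against Sunflower GA for the Create tests.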

Results of additional DI PTF - Create 2 profile test runs for 50k and 100k:

| Test # | Data-import test | Profile | Duration, Sunflower CSP1 (secp1) | Duration, Sunflower CSP1 (secp1) #2, mod-search disabled | Duration, Sunflower CSP1 (secp1) #3 | Duration, Sunflower CSP1 (secp1) #4, fs07000001 | Duration, Sunflower (secp1) | Duration, Ramsons (rcp1) | Duration, Ramsons (rcp1), mod-search disabled |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 50k MARC BIB Create | PTF - Create 2 | 24min 33s | 18min 50s | 24min 05s | 14min 17s | 21min | 41min 25s | 22min |
| 2 | 100k MARC BIB Create | PTF - Create 2 | 54min 31s | 38min 13s | 47min 18s | 28min 37s | 42min 46s | 1hr 19min | 46min |
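Another way to read the additional runs is as throughput in records per second, which makes the 50k and 100k rows directly comparable. This is an illustrative sketch only: `records_per_second` is a hypothetical helper, the run labels are shorthand for the table columns, and the first CSP1 value is taken as 54min 31s, matching the main results table.

```python
def records_per_second(records: int, hours: int = 0, minutes: int = 0,
                       seconds: int = 0) -> float:
    """Average import throughput for a job of `records` records."""
    elapsed = hours * 3600 + minutes * 60 + seconds
    if elapsed <= 0:
        raise ValueError("duration must be positive")
    return records / elapsed


# 100k MARC BIB Create durations (minutes, seconds) for the four CSP1 runs.
runs_100k = {
    "CSP1 (secp1)": (54, 31),
    "CSP1 (secp1) #2, mod-search disabled": (38, 13),
    "CSP1 (secp1) #3": (47, 18),
    "CSP1 (secp1) #4, fs07000001": (28, 37),
}

for label, (m, s) in runs_100k.items():
    rate = records_per_second(100_000, minutes=m, seconds=s)
    print(f"{label}: {rate:.1f} records/s")
```

Throughput is an average over the whole job, so it hides any ramp-up or slowdown within a single import.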

Memory Utilization

Memory utilization was stable during the DI create and update tests in both CSP1 and CSP2.

 

| Module | CSP1 Memory Utilization % | CSP2 Memory Utilization % |
| --- | --- | --- |
| mgr-applications | 88% | 80% |
| mgr-tenant-entitlements | 76% | 67% |
| mod-oa | 76% | Was not stable, disabled during tests |
| mod-scheduler | 74% | 48% |
| mod-roles-keycloak | 72% | 47% |
| mod-data-import | 72% | 40% |
| mod-finance | 69% | 52% |
| mod-source-record-storage |  | 25% |
| mod-source-record-manager |  | 25% |
| mod-di-converter-storage |  | 34% |
| mod-inventory |  | 35% |
| mod-inventory-storage |  | 20% |

| CSP 1 | CSP 2 |
| --- | --- |
| image-20250904-121635.png | Creates: image-20251030-105336.png |
|  | Updates: image-20251030-105946.png |

Service CPU Utilizations

CPU utilization is stable for all modules during all tests in both CSP 1 and CSP 2. CPU utilization of all modules is slightly higher in CSP 2 than in CSP 1. Notably, mod-tlr and mod-requests-mediated were active during DI jobs (see the graphs and table below). In the case of mod-tlr, the activity was mostly Kafka-related (setting offsets). More investigation is needed to understand why mod-tlr and mod-requests-mediated ran during Data Imports.

 

| Module | CSP 1 CPU % | CSP 2 Create CPU % | CSP 2 Update CPU % |
| --- | --- | --- | --- |
| mod-inventory | 163 | 173 | 197 |
| mod-inventory-storage-b | 82 | 80 | 131 |
| mod-tlr | 28 | 35 | 44 |
| mod-requests-mediated | 11 | 13 | 14 |
| mod-linked-data | 9 |  |  |
| mod-consortia-keycloak | 7 | 10 | 12 |
| mod-roles-keycloak | 7 | 7.75 | 9 |
| mod-source-record-manager |  | 2.7 | 2.23 |
| mod-source-record-storage |  | 6.8 | 9.5 |
| mod-di-converter-storage |  | 1.7 | 2.0 |
| mod-data-import |  | spikes up to 14% (50K files) | spikes up to 10% (50K files) |

| CSP 1 | CSP 2 |
| --- | --- |
| image-20250904-121512.png | Creates: image-20251029-210317.png |
|  | Updates: image-20251030-104239.png |

RDS Metrics 

DB CPU usage is high, as is usual during data import.

DB CPU utilization for DI creates and updates

| CSP 1 (Creates and Updates) | CSP 2 (Creates top, Updates bottom) |
| --- | --- |
| 22ef8343-a583-46f0-b4ad-a7b2a4b2ed62.png | image-20251028-100543.png |
|  | image-20251028-100825.png |

RDS Database Connections

Without test load, there are about 1380 database connections. The count rises to 1520 during DI Create and 1500 during DI Update.

| CSP1 Creates and Updates | CSP2 Creates and Updates |
| --- | --- |
| image-20250909-095127.png | 441d3ebc-95c6-445d-8c17-9461ba69849f.png |

DB load for DI creates and updates

| CSP 1 (Creates and Updates) | CSP 2 (Creates top, Updates bottom) |
| --- | --- |
| image-20250904-122553.png | image-20251028-095544.png |
|  | image-20251028-100038.png |

At first glance, the DB load graphs for CSP1 and CSP2 have about the same amplitude, meaning that DI create and update imports in both versions incurred about the same average number of active sessions. On closer inspection, however, CSP1 updates show lock waits, whereas CSP 2 shows none. The Top SQL queries below make it clearer that the locking in CSP 1 was due to a mod-search query (with cte…), which was not running in CSP 2. Together, these two facts explain why update imports in CSP 2 were a little faster.

Top SQL

| CSP 1 (Creates and Updates) | CSP 2 (Creates top, Updates bottom) |
| --- | --- |
| image-20250904-123338.png | image-20251028-095628.png |
|  | image-20251028-100119.png |

Deadlocks for CSP 1 and CSP 2

Further evidence that deadlocking occurred in CSP 1 updates but not in CSP 2:

| CSP 1 (Updates) | CSP 2 (Creates and Updates) |
| --- | --- |
| image-20250904-124211.png | ad744ad0-ac6f-477d-9c4e-26c9025443a3.png |

No deadlocks were detected in any of the DI jobs, Creates or Updates.

MSK CPU usage

Notably, CSP 2’s create imports used much less MSK CPU than CSP1’s (the left group of spikes in the graph below). For updates, both versions used the same level of CPU.

| CSP1 | CSP2 |
| --- | --- |
| image-20250904-101307.png | image-20251029-115239.png |

Appendix

Infrastructure

PTF environment: secp1