[Sunflower] [non-ECS] [Data import] Create MARC authority Records + CSP1

Overview

This document presents performance testing results for Data Import of MARC Authority records using a create job profile in the Sunflower Eureka non-ECS environments (secp1).

Tests were run on a single tenant with files of 1K, 5K, 10K, 25K, and 50K records. In addition, we ran Data Import in parallel with a Check-In/Check-Out (CICO) workload of 5 virtual users to assess system behavior under concurrent operations, and ran Data Import on 3 tenants simultaneously.
Ticket: https://folio-org.atlassian.net/browse/PERF-1110

The report also includes repeated test results executed on the Sunflower Critical Service Patch 1 (CSP1) environment for comparison purposes.
Ticket: https://folio-org.atlassian.net/browse/PERF-1196

Summary

  • Data import tests finished successfully in Test 1 and Test 2. In Test 3, the import finished successfully only on the fs09000000 tenant; on the member tenants it failed.

  • In Test 1, Data Import of MARC Authority records using a create job profile was noticeably faster in the Sunflower release than in the Ramsons release. Our analysis indicates an average reduction in MARC Authority DI duration of approximately 16%.

  • The test results indicate that five virtual users (5 VU) running Check-In/Check-Out (CICO) operations do not affect the performance of the data import process.
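
The roughly 16% figure can be reproduced from the Ramsons and Sunflower durations reported in Table 1 of Test №1. The following is a minimal sketch, assuming the percentage is computed per file size as (Ramsons - Sunflower) / Ramsons and then averaged with equal weight; the helper names `to_seconds` and `improvement_pct` are illustrative and not part of any test tooling.

```python
import re


def to_seconds(text: str) -> int:
    """Convert a duration such as '1 min 04 sec' or '21 sec' to seconds."""
    minutes = re.search(r"(\d+)\s*min", text)
    seconds = re.search(r"(\d+)\s*s", text)
    total = int(minutes.group(1)) * 60 if minutes else 0
    total += int(seconds.group(1)) if seconds else 0
    return total


def improvement_pct(old_seconds: int, new_seconds: int) -> float:
    """Percentage reduction in duration relative to the old value."""
    return (old_seconds - new_seconds) / old_seconds * 100


# (Ramsons, Sunflower) DI durations per number of records, from Table 1.
durations = {
    1_000: ("21 sec", "16 sec"),
    5_000: ("1 min 04 sec", "1 min 03 sec"),
    10_000: ("2 min 36 sec", "2 min 03 sec"),
    25_000: ("5 min 55 sec", "4 min 39 sec"),
    50_000: ("10 min 31 sec", "9 min 09 sec"),
}

improvements = {
    records: improvement_pct(to_seconds(ramsons), to_seconds(sunflower))
    for records, (ramsons, sunflower) in durations.items()
}
# Unweighted mean over the five file sizes (an assumption about how 16% was derived).
average_improvement = sum(improvements.values()) / len(improvements)

for records, pct in improvements.items():
    print(f"{records:>6} records: {pct:.2f}%")
print(f"average: {average_improvement:.1f}%")
```

With these inputs the per-file improvements range from about 1.6% (5K) to about 23.8% (1K), so the average is sensitive to the 5K result.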

Sunflower vs Sunflower CSP1 Comparison summary


Single-Tenant Data Import (DI)

  • Small files (1K–10K): CSP1 showed slightly slower performance (8–12% longer DI durations).

  • Medium files (25K): CSP1 was about 8.6% faster (4 min 15 sec vs. 4 min 39 sec).

  • Large files (50K): CSP1 was noticeably slower, with DI duration about 17% longer.

Data Import with Concurrent Check-In/Check-Out (CICO)

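  • DI durations: CSP1 was generally longer than Sunflower across file sizes; the 25K file was the exception (4 min 01 sec vs. 4 min 59 sec).

  • Check-In/Check-Out (avg times) on CSP1 during DI:

    • CI avg increased slightly from 1.08 s (1K file) to 1.23 s (50K file).

    • CO avg increased from 1.20 s (1K file) to 1.43 s (50K file).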

  • Baseline CICO (no DI, Sunflower CSP1):

    • CI: 532 ms (avg), 611 ms (p95).

    • CO: 912 ms (avg), 1297 ms (p95).

Parallel Multi-Tenant Imports

  • Sunflower: ~26 min (50K) and ~13 min (25K) across all three tenants.

  • CSP1: Strong regression — ~43 min (50K) and ~42 min (25K), consistent across tenants.

CSP1 exhibits major performance degradation in multi-tenant imports, with durations nearly doubling compared to baseline.

Recommendations & Jiras

  • During Data Import Test 1, some job runs failed because 1–4 records (out of 1,000) threw a “UUID conflict” error.
    To resolve this, we upgraded mod-inventory and its dependent modules to the latest release, which unblocked the issue for the central tenant.

    However, the same issue persists for the member tenant, so further investigation is needed.

  • In the Ramsons release, there were issues with mod-permissions on the member tenant fs07-2 during Test 3, while testing Data Import concurrently across three tenants.
    With Sunflower we did not encounter this blocker. https://folio-org.atlassian.net/browse/PERF-1074

 

Test Runs 

| Test | Test conditions and short description | Status |
|---|---|---|
| Test 1 | Tenant: fs0900000. Job profile: KG - Create SRS MARC Authority on nonmatches to 010 $a DUPLICATE for Q. Files: 1k, 5k, 10k, 25k, 50k with 5-minute pauses between each DI. | Completed |
| Test 2 | Tenant: fs0900000. Job profile: KG - Create SRS MARC Authority on nonmatches to 010 $a DUPLICATE for Q. Files: 1k, 5k, 10k, 25k, 50k with 5-minute pauses between each DI. CheckIn-CheckOut with 5 virtual users. | Completed |
| Test 3 | Parallel, multi-tenant Data Import. Job profile: KG - Create SRS MARC Authority on nonmatches to 010 $a DUPLICATE for Q, run in parallel. Tenant fs0900000: 50k and 25k without pause. Tenant fs0700001: 50k and 25k without pause. Tenant fs0700002: 50k and 25k without pause. Comment: While testing concurrently, everything passed successfully on the central tenant. However, the member tenants experienced UUID conflicts, with 102 out of 50,000 records failing. Because the issue is almost unreproducible and has minimal impact on the overall results, we decided to proceed with reporting. | Completed |

CSP1. The same set of tests was repeated to validate system performance after applying Critical Service Patch 1.

| Test | Test conditions and short description | Status |
|---|---|---|
| CSP1-Test 1 | Tenant: fs09. Job profile: KG - Create SRS MARC Authority on nonmatches to 010 $a DUPLICATE for Q. Files: 1k, 5k, 10k, 25k, 50k. | Completed |
| CSP1-Test 2 | Tenant: fs09. Job profile: KG - Create SRS MARC Authority on nonmatches to 010 $a DUPLICATE for Q. Files: 1k, 5k, 10k, 25k, 50k. CheckIn-CheckOut with 5 virtual users. | Completed |
| CSP1-Test 3 | Parallel, multi-tenant Data Import. Job profile: KG - Create SRS MARC Authority on nonmatches to 010 $a DUPLICATE for Q, run in parallel. Tenant fs0900000: 50k and 25k without pause. Tenant fs0700001: 50k and 25k without pause. Tenant fs0700002: 50k and 25k without pause. | Completed |

Test Results and Comparison

Test №1

Table 1. - DI durations for 1k, 5k, 10k, 25k, and 50k record files started on one tenant, fs09000000-(secp1-00), and comparative results between Sunflower and earlier releases (Poppy, Quesnelia, Ramsons).

| Number of records | DI duration, Poppy | DI duration, Quesnelia | DI duration, Ramsons | DI duration, Sunflower | Time diff and % improvement, Ramsons vs Sunflower |
|---|---|---|---|---|---|
| 1,000 | 29 sec | 22 sec | 21 sec | 16 sec | 5 sec, 23.81% |
| 5,000 | 1 min 38 sec | 1 min 19 sec | 1 min 04 sec | 1 min 03 sec | 1 sec, 1.56% |
| 10,000 | 2 min 53 sec | 2 min 36 sec | 2 min 36 sec | 2 min 03 sec | 33 sec, 21.15% |
| 25,000 | 5 min 42 sec | 6 min 24 sec | 5 min 55 sec | 4 min 39 sec | 76 sec, 21.41% |
| 50,000 | 13 min 48 sec | 11 min 59 sec | 10 min 31 sec | 9 min 09 sec | 82 sec, 13.00% |
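
Durations for different file sizes are easier to compare as throughput. The following is a minimal sketch, assuming throughput is simply the number of records divided by the wall-clock DI duration (Sunflower column of Table 1, converted to seconds); the variable names are illustrative.

```python
# Sunflower DI durations from Table 1, converted to seconds.
sunflower_seconds = {
    1_000: 16,
    5_000: 63,
    10_000: 123,
    25_000: 279,
    50_000: 549,
}

# Throughput in records per second for each file size.
throughput = {
    records: records / seconds
    for records, seconds in sunflower_seconds.items()
}

for records, rate in throughput.items():
    print(f"{records:>6} records: {rate:.1f} records/sec")
```

On these numbers throughput rises from 62.5 records/sec for the 1K file to about 91 records/sec for the 50K file, which suggests a fixed per-job overhead that is amortized on larger files.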

Test №2

Test with CICO 5 concurrent users and DI 1K, 5K, 10K, 25K, and 50K started on one tenant fs09000000-(secp1-00).

Table 2. - Comparative Baseline Check-In/Check-Out results without Data Import between Ramsons and Sunflower.

| | CICO avg time without DI (Quesnelia) | CICO 95% time without DI (Quesnelia) | CICO avg time without DI (Ramsons) | CICO 95% time without DI (Ramsons) |
|---|---|---|---|---|
| Check-In | 511 ms | 593 ms | 835 ms | 934 ms |
| Check-Out | 876 ms | 1117 ms | 1115 ms | 1323 ms |

Table 3. - Comparative Check-In/Check-Out results between Baseline (Sunflower) and Check-In/Check-Out plus Data Import (Sunflower).

| Number of records | DI duration with CICO (Ramsons) | DI duration with CICO (Sunflower) | CI time, avg (sec) | CO time, avg (sec) | CI avg delta vs baseline | CO avg delta vs baseline |
|---|---|---|---|---|---|---|
| 1,000 | 28 sec | 17 sec | 1.064 | 1.238 | +27% | +11% |
| 5,000 | 1 min 09 sec | 1 min 02 sec | 0.906 | 1.242 | +8.5% | +11% |
| 10,000 | 2 min 27 sec | 1 min 59 sec | 0.974 | 1.239 | +16.6% | +11% |
| 25,000 | 6 min 54 sec | 4 min 59 sec | 1.105 | 1.368 | +32.3% | +23% |
| 50,000 | 15 min 47 sec | 10 min 03 sec | 1.331 | 1.548 | +59% | +39% |
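
The baseline delta columns can be reproduced as (average with DI - baseline average) / baseline average. The following is a minimal sketch, assuming the baselines are the 835 ms Check-In and 1115 ms Check-Out averages from Table 2 (these are the values that reproduce the reported deltas) and that the 5K and 10K Check-In averages are 0.906 s and 0.974 s; the constant and function names are illustrative.

```python
# Assumed baseline averages without DI, in milliseconds (from Table 2).
BASELINE_CI_MS = 835
BASELINE_CO_MS = 1115

# (CI avg, CO avg) in seconds while DI was running, from Table 3.
cico_with_di = {
    1_000: (1.064, 1.238),
    5_000: (0.906, 1.242),
    10_000: (0.974, 1.239),
    25_000: (1.105, 1.368),
    50_000: (1.331, 1.548),
}


def delta_pct(measured_sec: float, baseline_ms: float) -> float:
    """Percentage increase of a measured average over its baseline."""
    return (measured_sec * 1000 - baseline_ms) / baseline_ms * 100


deltas = {
    records: (delta_pct(ci, BASELINE_CI_MS), delta_pct(co, BASELINE_CO_MS))
    for records, (ci, co) in cico_with_di.items()
}

for records, (ci_delta, co_delta) in deltas.items():
    print(f"{records:>6} records: CI {ci_delta:+.1f}%, CO {co_delta:+.1f}%")
```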

Test №3
Table 4. - Duration of parallel multi-tenant data import on tenants fs09000000-(secp1-00), fs07000001-(secp1-01) and fs07000002-(secp1-02)

| Tenant | 50K DI | 25K DI |
|---|---|---|
| fs0900000 | 26 min 41 sec | 13 min 27 sec |
| fs0700001 | 26 min 32 sec | 13 min 26 sec |
| fs0700002 | 26 min 29 sec | 13 min 21 sec |
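
Two quick checks help when reading Table 4: how far apart the tenants finished, and how much longer a parallel import took than the single-tenant Sunflower run in Test №1. The following is a minimal sketch under those assumptions; the single-tenant durations are taken from Table 1, and the variable names are illustrative.

```python
from statistics import mean

# (minutes, seconds) per tenant, from Table 4.
parallel = {
    "50K": {"fs0900000": (26, 41), "fs0700001": (26, 32), "fs0700002": (26, 29)},
    "25K": {"fs0900000": (13, 27), "fs0700001": (13, 26), "fs0700002": (13, 21)},
}

# Single-tenant Sunflower DI durations from Table 1, in seconds.
single_tenant = {"50K": 9 * 60 + 9, "25K": 4 * 60 + 39}

spread = {}    # slowest minus fastest tenant, in seconds
slowdown = {}  # mean parallel duration divided by single-tenant duration
for size, tenants in parallel.items():
    seconds = [m * 60 + s for m, s in tenants.values()]
    spread[size] = max(seconds) - min(seconds)
    slowdown[size] = mean(seconds) / single_tenant[size]
    print(f"{size}: spread {spread[size]} sec, "
          f"{slowdown[size]:.1f}x the single-tenant duration")
```

On these numbers the three tenants finish within 12 seconds of each other, and each parallel import takes roughly 2.9 times as long as the same file imported on a single tenant.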

Sunflower vs Sunflower CSP1

Table 1. DI durations for 1k, 5k, 10k, 25k, and 50k record files started on one tenant, fs09000000, and comparative results between Sunflower and Sunflower CSP1.

| Number of records | DI duration, Sunflower | DI duration, Sunflower CSP1 | Time diff (min:sec) | Duration change, % (positive = CSP1 slower) |
|---|---|---|---|---|
| 1,000 | 16 sec | 18 sec | 0:02 | 12.5 |
| 5,000 | 1 min 03 sec | 1 min 08 sec | 0:05 | 8 |
| 10,000 | 2 min 03 sec | 2 min 14 sec | 0:11 | 9 |
| 25,000 | 4 min 39 sec | 4 min 15 sec | -0:24 | -8.6 |
| 50,000 | 9 min 09 sec | 10 min 44 sec | 1:35 | 17.3 |
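
The time difference and percentage columns can be rechecked directly from the two duration columns. The following is a minimal sketch, assuming the percentage is (CSP1 - Sunflower) / Sunflower, so that positive values mean CSP1 was slower; `fmt_mmss` is an illustrative helper.

```python
def fmt_mmss(delta_seconds: int) -> str:
    """Format a signed number of seconds as m:ss."""
    sign = "-" if delta_seconds < 0 else ""
    minutes, seconds = divmod(abs(delta_seconds), 60)
    return f"{sign}{minutes}:{seconds:02d}"


# (Sunflower, Sunflower CSP1) DI durations in seconds per number of records.
runs = {
    1_000: (16, 18),
    5_000: (63, 68),
    10_000: (123, 134),
    25_000: (279, 255),
    50_000: (549, 644),
}

changes = {
    records: (
        fmt_mmss(csp1 - sunflower),
        round((csp1 - sunflower) / sunflower * 100, 1),
    )
    for records, (sunflower, csp1) in runs.items()
}

for records, (diff, pct) in changes.items():
    print(f"{records:>6} records: {diff} ({pct:+.1f}%)")
```

For the 25K file this gives a difference of -0:24 (CSP1 was 24 seconds faster), in line with the -8.6% change.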

Table 2. - Comparative Data Import with Check-In/Check-Out results between Sunflower and Sunflower CSP1.

| Number of records | DI duration with CICO, Sunflower | DI duration with CICO, Sunflower CSP1 | CI time, avg (sec) | CO time, avg (sec) |
|---|---|---|---|---|
| 1,000 | 17 sec | 18 sec | 1.083 | 1.198 |
| 5,000 | 1 min 02 sec | 1 min 09 sec | 1.026 | 1.231 |
| 10,000 | 1 min 59 sec | 2 min 24 sec | 0.992 | 1.224 |
| 25,000 | 4 min 59 sec | 4 min 01 sec | 1.123 | 1.279 |
| 50,000 | 10 min 03 sec | 11 min 29 sec | 1.231 | 1.434 |

 

CICO, Avg time without
DI
Sunflower CSP1

CICO, 95% time without
DI
Sunflower CSP1

 

Check-In

532 ms