[Ramsons] [ECS] [Data import] Create MARC authority Records


Overview

This document presents performance testing results for Data Import of MARC Authority records using a Create job profile in the Ramsons release on Okapi-based ECS environments (RCON). The tests were conducted with Kafka consolidated topics and file-splitting features enabled.

The performance evaluation covered a range of file sizes on a single tenant: 1K, 5K, 10K, 25K, and 50K records. In addition, we ran Data Import in parallel with a Check-In/Check-Out test simulating 5 virtual users to assess system behavior under concurrent operations, and a parallel data import on 3 tenants.
Current ticket: PERF-979: [Ramsons] [ECS] [Data import] Create MARC authority Records (Closed)
Previous report: [Quesnelia] [ECS] [Data import] Create MARC authority Records

Summary

  • All Data Import jobs in Test 1 - Test 3 finished successfully.

  • In Test 1, Data Import of MARC authority records using a Create job profile was about 16-17% faster in Ramsons than in Quesnelia for the 25K and 50K files. For the 1K, 5K, and 10K files the durations differed by only a few seconds (Table 1).

  • In the Data Import test with parallel Check-In/Check-Out (five virtual users), Ramsons performed better than Quesnelia.

    • Five virtual users (5 VU) running Check-In/Check-Out (CICO) operations did not slow down the data import. On the contrary, DI durations were slightly shorter than in Test 1.

    • CI and CO response times increased as the number of imported records grew (Table 2).

  • The parallel 50K data import on 3 tenants was successful, but each import took 1.5-3 times longer than a single DI on one tenant (Table 3).

  • mod-source-record-manager has a new approach for inserting data into the records journal: it now uses a function on the DB side. Compared to previous results, this version shows about 50 more AAS. According to the test results, however, this did not degrade the DI process.

Recommendations & Jiras

 

Test Runs 

| Test | Test conditions and short description | Status |
|---|---|---|
| Test 1 | Tenant: cs00000int. Job profile: KG - Create SRS MARC Authority on nonmatches to 010 $a DUPLICATE for Q. Files: 1k - 5k - 10k - 25k - 50k with 5-minute pauses between each DI | Completed |
| Test 2 | Tenant: cs00000int_0001. Job profile: KG - Create SRS MARC Authority on nonmatches to 010 $a DUPLICATE for Q. Files: 1k - 5k - 10k - 25k - 50k with 5-minute pauses between each DI. CheckIn-CheckOut with 5 virtual users | Completed |
| Test 3 | Parallel, multi-tenant Data import. Job profile: KG - Create SRS MARC Authority on nonmatches to 010 $a DUPLICATE for Q. In parallel: tenant cs00000int, 50k; tenant cs00000int_0001, 50k; tenant cs00000int_0002, 50k | Completed |

Test Results and Comparison

Test №1

Table 1. - Test with 1k, 5k, 10k, 25k, and 50k record files, DI started on one tenant (cs00000int), with comparative results between Quesnelia and Ramsons.

| Number of records | % creates | DI duration, Morning Glory | DI duration, Nolana | DI duration, Orchid | DI duration, Poppy | DI duration, Quesnelia [ECS], QCON | DI duration, Ramsons [ECS], RCON | Time diff and % improvement, R vs Q |
|---|---|---|---|---|---|---|---|---|
| 1,000 | 100 | 24 sec | 27 sec | 41 sec | 29 sec | 25 sec | 27 sec | 2 sec, 8% (slower) |
| 5,000 | 100 | 1 min 21 sec | 1 min 15 sec | 1 min 21 sec | 1 min 38 sec | 1 min 23 sec | 1 min 24 sec | 1 sec, 1.2% (slower) |
| 10,000 | 100 | 2 min 32 sec | 2 min 31 sec | 2 min 53 sec | 2 min 53 sec | 2 min 43 sec | 2 min 38 sec | 5 sec, 3.1% |
| 25,000 | 100 | 11 min 14 sec | 7 min 7 sec | 5 min 42 sec | 6 min 24 sec | 6 min 27 sec | 5 min 24 sec | 1 min 03 sec, 16.3% |
| 50,000 | 100 | 22 min | 11 min 24 sec | 11 min 11 sec | 13 min 48 sec | 11 min 45 sec | 9 min 42 sec | 2 min 03 sec, 17.4% |
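The "Time diff and % improvement" column can be recomputed from the Quesnelia and Ramsons durations. The following is a minimal sketch, not part of the original test tooling; the helper names `to_seconds` and `improvement` are hypothetical, and a positive result means Ramsons was faster.

```python
import re


def to_seconds(text: str) -> int:
    """Convert a duration such as '6 min 27 sec' or '25 sec' to seconds."""
    minutes = re.search(r"(\d+)\s*min", text)
    seconds = re.search(r"(\d+)\s*s", text)
    total = int(minutes.group(1)) * 60 if minutes else 0
    total += int(seconds.group(1)) if seconds else 0
    return total


def improvement(quesnelia: str, ramsons: str) -> tuple[int, float]:
    """Return (seconds saved, percent improvement) of Ramsons over Quesnelia."""
    q, r = to_seconds(quesnelia), to_seconds(ramsons)
    diff = q - r
    return diff, round(100 * diff / q, 1)


# Quesnelia and Ramsons durations copied from Table 1.
TABLE_1 = {
    1_000: ("25 sec", "27 sec"),
    5_000: ("1 min 23 sec", "1 min 24 sec"),
    10_000: ("2 min 43 sec", "2 min 38 sec"),
    25_000: ("6 min 27 sec", "5 min 24 sec"),
    50_000: ("11 min 45 sec", "9 min 42 sec"),
}

for records, (q, r) in TABLE_1.items():
    diff, pct = improvement(q, r)
    print(f"{records:>6,}: {diff:+d} sec, {pct:+.1f}%")
```

With these inputs the 25,000-record row works out to 63 seconds (16.3%) and the 50,000-record row to 123 seconds (17.4%); the 1,000 and 5,000 rows come out negative, meaning Ramsons was marginally slower there.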

Test 2. DI Central tenant 1K-5K-10K-25K-50K + CI/CO 5VU.
Table 2. - Comparative results for Data Import with parallel Check-In/Check-Out between Quesnelia and Ramsons, with baseline Check-In/Check-Out results without Data Import.

| Number of records | DI duration with CICO, Poppy | DI duration with CICO, Quesnelia ECS | DI duration with CICO, Ramsons ECS | CI Avg time (Quesnelia) | CI Avg time (Ramsons) | CO Avg time (Quesnelia) | CO Avg time (Ramsons) |
|---|---|---|---|---|---|---|---|
| 1,000 | 35 sec | 21 sec | 17 sec | 0.870 sec | 0.642 sec | 1.361 sec | 1.231 sec |
| 5,000 | 1 min 41 sec | 1 min 09 sec | 57 sec | 0.878 sec | 0.655 sec | 1.772 sec | 1.243 sec |
| 10,000 | 3 min 4 sec | 2 min 17 sec | 1 min 47 sec | 0.955 sec | 0.671 sec | 1.905 sec | 1.261 sec |
| 25,000 | 6 min 32 sec | 6 min 20 sec | 4 min 01 sec | 0.970 sec | 0.691 sec | 1.920 sec | 1.339 sec |
| 50,000 | 13 min 48 sec | 13 min 49 sec | 9 min 13 sec | 1.040 sec | 0.796 sec | 1.907 sec | 1.585 sec |

Baseline without DI (Ramsons, ECS): CI Avg time 0.616 sec; CO Avg time 1.187 sec.
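To put the Check-In response times in context, each Ramsons value measured during DI can be compared with the Ramsons baseline measured without DI (0.616 sec). The sketch below is illustrative only; `overhead_pct` is a hypothetical helper, and the inputs are the CI Avg time (Ramsons) values from Table 2.

```python
# Ramsons Check-In average response time without Data Import (Table 2), in seconds.
BASELINE_CI = 0.616

# Ramsons Check-In average response time during DI, keyed by number of records.
CI_WITH_DI = {
    1_000: 0.642,
    5_000: 0.655,
    10_000: 0.671,
    25_000: 0.691,
    50_000: 0.796,
}


def overhead_pct(with_di: float, baseline: float) -> float:
    """Percent increase of a response time over its no-DI baseline."""
    return round(100 * (with_di - baseline) / baseline, 1)


for records, seconds in CI_WITH_DI.items():
    print(f"{records:>6,} records: +{overhead_pct(seconds, BASELINE_CI)}% vs baseline")
```

Under these inputs the Check-In overhead grows from roughly 4% for the 1K file to roughly 29% for the 50K file.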

Test №3
Table 3. - Duration of parallel multi-tenant data import on tenants cs00000int, cs00000int_0001, and cs00000int_0002

| Tenant | 50K DI |
|---|---|
| Central - cs00000int | 27 min 03 sec |
| College - cs00000int_0001 | 27 min 18 sec |
| Professional - cs00000int_0002 | 15 min 02 sec |
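The slowdown of each parallel import relative to a single-tenant run can be derived from Table 3 and the Ramsons 50K duration in Table 1 (9 min 42 sec). This is a small sketch under that assumption; `slowdown` is a hypothetical helper name.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a 'min sec' duration to seconds."""
    return minutes * 60 + seconds


# Single-tenant 50K import on Ramsons (Table 1).
SINGLE_TENANT_50K = to_seconds(9, 42)

# Parallel 50K imports (Table 3).
PARALLEL_50K = {
    "cs00000int": to_seconds(27, 3),
    "cs00000int_0001": to_seconds(27, 18),
    "cs00000int_0002": to_seconds(15, 2),
}


def slowdown(parallel_seconds: int, baseline_seconds: int) -> float:
    """Ratio of the parallel duration to the single-tenant duration."""
    return round(parallel_seconds / baseline_seconds, 2)


for tenant, seconds in PARALLEL_50K.items():
    print(f"{tenant}: {slowdown(seconds, SINGLE_TENANT_50K)}x")
```

The ratios fall between about 1.5x and 2.8x, which matches the 1.5-3 times range stated in the Summary.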


Cluster resource utilization for Test 1

Service CPU Utilization

The image shows CPU consumption during Test 1.

image-20250221-142320.png

Service memory utilization

Service memory utilization remains consistent across all modules.

image-20250221-143739.png

DB CPU Utilization

Here are the conclusions drawn from the database CPU usage graph:

  • For 1k records, the maximum CPU usage was approximately 35%.

  • For 5k records, the maximum CPU usage reached around 72%.

  • For 10k records, the maximum CPU usage climbed to about 92%.

  • For both 25k and 50k records, the maximum CPU usage was around 93%.

image-20250221-151834.png

DB Connections

image-20250224-100847.png

Database load

Sliced by SQL

image-20250224-101306.png

Top SQL queries during test 1

image-20250224-101326.png

| Load by sqls (AAS) | SQL statements | Calls/sec | Rows/sec | Avg latency (ms)/call |
|---|---|---|---|---|
| 3.64 | COMMIT | 0.00 | 0.00 | - |
| 0.59 | insert into "marc_records_lb" ("id", "content") values (cast($1 as uuid), cast($... | 30.00 | 30.00 | 19.20 |
| 0.38 | WITH input_rows(record_id, authority_id) AS ( VALUES ($1::uuid,$2::uuid) ) , ... | 30.00 | 30.00 | 0.35 |
| 0.32 | INSERT INTO cs00000int_mod_source_record_manager.events_processed (handler_id, e... | 30.00 | 30.00 | 0.96 |
| 0.19 | SELECT insert_journal_records($1::jsonb[]) | 0.91 | 0.91 | 175.38 |
| 0.16 | select a1_0.id,a1_0.source_file_id,a1_0.created_by_user_id,a1_0.created_date,a1_... | 0.02 | 0.00 | 0.00 |
| 0.06 | insert into authority (source_file_id,created_by_user_id,created_date,deleted,he... | 30.00 | 30.00 | 2.55 |
| 0.06 | with "cte" as (select count(*) from "records_lb" where ("records_lb"."snapshot_i... | - | - | - |
| 0.05 | insert into authority (source_file_id,created_by_user_id,created_date,deleted,he... | 28.09 | 28.09 | 1.15 |
| 0.05 | insert into authority (source_file_id,created_by_user_id,created_date,deleted,he... | 30.00 | 30.00 | 1.13 |
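As a rough cross-check of the table above, the load a statement contributes can be approximated from its call rate and average latency (Little's law: average concurrency is roughly arrival rate times time in system). The sketch below is an approximation only, not how the monitoring tool derives AAS; `estimate_aas` is a hypothetical helper.

```python
def estimate_aas(calls_per_sec: float, avg_latency_ms: float) -> float:
    """Approximate average active sessions for one SQL statement.

    Little's law: sessions ~= calls per second * seconds per call.
    """
    return round(calls_per_sec * avg_latency_ms / 1000, 2)


# (calls/sec, avg latency in ms, reported AAS) taken from the Test 1 table.
ROWS = {
    'insert into "marc_records_lb"': (30.00, 19.20, 0.59),
    "SELECT insert_journal_records": (0.91, 175.38, 0.19),
}

for statement, (calls, latency_ms, reported) in ROWS.items():
    print(f"{statement}: estimated {estimate_aas(calls, latency_ms)}, reported {reported}")
```

The estimates land close to the reported values for these two statements. Larger gaps can occur because the sampled AAS also includes time that per-call latency statistics do not capture.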

image-20250224-101502.png

 

 

Cluster resource utilization for Test 2

The Check-In/Check-Out test started at about 15:30 and finished at 16:25.

CICO Response time graph

Response time and throughput were stable during the 1-hour CICO test with 5 VU. Error rate ~0.02%

image-20250221-143014.png

Service CPU Utilization

The image shows CPU consumption during Test 2

image-20250221-142916.png

Service memory utilization

Service memory utilization remains consistent across all modules.

image-20250221-144043.png

DB CPU Utilization

Here are the conclusions drawn from the database CPU usage graph:

  • For 1k records, the maximum CPU usage was approximately 28%.

  • For 5k records, the maximum CPU usage reached around 76%.

  • For 10k records, the maximum CPU usage climbed to about 86%.

  • For both 25k and 50k records, the maximum CPU usage was around 86%.

image-20250221-152018.png

DB Connections

In the idle state, the number of connections is ~1100; during CICO 5VU + 50K DI it is ~1520.

image-20250224-101011.png

Database load

Sliced by SQL

image-20250224-103915.png

Top SQL queries during test 2

image-20250224-104049.png

| Load by sqls (AAS) | SQL statements | Calls/sec | Rows/sec | Avg latency (ms)/call |
|---|---|---|---|---|
| 0.71 | SELECT insert_journal_records($1::jsonb[]) | 0.73 | 0.73 | 938.90 |
| 0.49 | COMMIT | 0.00 | 0.00 | - |
| 0.28 | WITH input_rows(record_id, authority_id) AS ( VALUES ($1::uuid,$2::uuid) ) , ... | 24.19 | 24.19 | 0.18 |
| 0.28 | insert into "marc_records_lb" ("id", "content") values (cast($1 as uuid), cast($... | 24.19 | 24.19 | 10.23 |
| 0.22 | INSERT INTO cs00000int_0001_mod_source_record_manager.events_processed (handler_... | 24.19 | 24.19 | 1.23 |
| 0.05 | INSERT INTO cs00000int_0001_mod_pubsub.audit_message (id, event_id, event_type, ... | 9.93 | 9.93 | 1.41 |
| 0.04 | with "cte" as (select count(*) from "records_lb" where ("records_lb"."snapshot_i... | - | - | - |
| 0.03 | insert into authority (source_file_id,created_by_user_id,created_date,deleted,he... | 24.19 | 24.19 | 0.89 |
| 0.03 | insert into "records_lb" ("id", "snapshot_id", "matched_id", "generation", "reco... | 24.19 | 24.19 | 1.08 |
| 0.03 | select max("records_lb"."generation") as "generation" from "records_lb" join "sn... | 24.19 | 24.19 | 0.52 |

image-20250224-104224.png

 

Cluster resource utilization for Test 3

Service CPU Utilization

The image shows CPU consumption during Test 3

image-20250221-144745.png

Service memory utilization


Service memory utilization remains consistent across all modules.

image-20250221-144604.png

DB CPU Utilization

The maximum CPU usage was approximately 83%.

image-20250221-152147.png

DB Connections

In the idle state, the number of connections is ~1200; during Test 3 it is ~1587.

image-20250224-101106.png

Database load

Sliced by SQL

image-20250224-104315.png

Top SQL queries during test 3

image-20250224-104352.png

| Load by sqls (AAS) | SQL statements | Calls/sec | Rows/sec | Avg latency (ms)/call |
|---|---|---|---|---|
| 2.60 | SELECT insert_journal_records($1::jsonb[]) | 1.03 | 1.03 | 2657.80 |
| 2.51 | SELECT insert_journal_records($1::jsonb[]) | 0.65 | 0.65 | 4113.74 |
| 2.10 | SELECT insert_journal_records($1::jsonb[]) | 0.96 | 0.96 | 1135.52 |
| 1.02 | COMMIT | 0.00 | 0.00 | 0.00 |
| 0.82 | SELECT * FROM get_job_log_entries(?, ?, ?, ?, ?, ?, ?) | 0.00 | 0.03 | 784912.01 |
| 0.28 | WITH input_rows(record_id, authority_id) AS ( VALUES ($1::uuid,$2::uuid) ) , ... | 16.67 | 16.67 | 0.30 |
| 0.27 | insert into "marc_records_lb" ("id", "content") values (cast($1 as uuid), cast($... | 16.55 | 16.55 | 18.19 |
| 0.22 | INSERT INTO cs00000int_0002_mod_source_record_manager.events_processed (handler_... | 16.67 | 16.67 | 0.57 |
| 0.17 | insert into "marc_records_lb" ("id", "content") values (cast($1 as uuid), cast($... | 16.67 | 16.67 | 8.08 |
| 0.14 | WITH input_rows(record_id, authority_id) AS ( VALUES ($1::uuid,$2::uuid) ) , ... | 16.57 | 16.57 | 0.23 |

image-20250224-104514.png

 

MSK Cluster

MSK Cluster resource utilization for Test 1

CPU (User) usage by broker reaches a maximum of 64% during the 50k DI.

image-20250224-105724.png


Disk usage by broker

image-20250224-110023.png

MSK Cluster resource utilization for Test 2


CPU (User) usage by broker reaches a maximum of 63% during the 25k and 50k DI with CICO.

image-20250224-105806.png


Disk usage by broker

image-20250224-110058.png

MSK Cluster resource utilization for Test 3

CPU (User) usage by broker reaches a maximum of 65% during 50k DI and CICO.

image-20250224-105856.png

Disk usage by broker

image-20250224-110154.png



OpenSearch Service


Maximum CPU utilization percentage for all data nodes, Test 1.

image-20250224-124453.png

CPU utilization percentage for the master node Test 1.

image-20250224-124749.png

Maximum CPU utilization percentage for all data nodes Test 2.

image-20250224-124533.png

CPU utilization percentage for the master node Test 2.

image-20250224-124819.png



Maximum CPU utilization percentage for all data nodes Test 3.

image-20250224-124625.png



CPU utilization percentage for the master node Test 3

 

image-20250224-124846.png

 

Appendix

Infrastructure

PTF environment: RCON

  • 11 m6g.2xlarge EC2 instances located in US East (N. Virginia), us-east-1

  • db.r6.xlarge database instances, writer

  • MSK fse-test

    • 4 kafka.m7g.xlarge brokers in 2 zones

    • Apache Kafka version 3.7.x (KRaft mode)

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true

    • log.retention.minutes=480

    • default.replication.factor=3

  • OpenSearch 2.13 ptf-test cluster

    • r6g.2xlarge.search 4 data nodes

    • r6g.large.search 3 dedicated master nodes

Cluster Resources - rcon-pvt

| Module | Task Definition Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft Limit | CPU Units | Xmx | Metaspace Size | Max Metaspace Size |
|---|---|---|---|---|---|---|---|---|---|
| mod-remote-storage | 8 | mod-remote-storage:3.3.3 | 2 | 4920 | 4472 | 0 | 3960 | 512 | 512 |
| mod-finance-storage | 7 | mod-finance-storage:8.7.3 | 2 | 1024 | 896 | 0 | 700 | 88 | 128 |
| mod-ncip | 7 | mod-ncip:1.15.6 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-agreements | 8 | mod-agreements:7.1.4 | 2 | 1592 | 1488 | 0 | 0 | 0 | 0 |
| mod-ebsconet | 8 | mod-ebsconet:2.3.1 | 2 | 1248 | 1024 | 0 | 700 | 128 | 256 |
| mod-organizations | 7 | mod-organizations:2.0.0 | 2 | 1024 | 896 | 0 | 700 | 88 | 128 |
| mod-consortia | 10 | mod-consortia:1.2.2 | 2 | 5136 | 4776 | 0 | 2048 | 512 | 1024 |
| edge-sip2 | 7 | edge-sip2:3.3.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-serials-management | 8 | mod-serials-management:1.1.2 | 2 | 2480 | 2312 | 0 | 1792 | 384 | 512 |
| mod-settings | 7 | mod-settings:1.1.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-data-import | 10 | mod-data-import:3.2.4 | 1 | 2048 | 1844 | 0 | 1292 | 384 | 512 |
| mod-search | 18 | mod-search:4.0.7 | 2 | 2592 | 2480 | 0 | 1440 | 512 | 1024 |
| edge-dematic | 7 | edge-dematic:2.3.1 | 1 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-inn-reach | 4 | mod-inn-reach:3.2.1-SNAPSHOT.102 | 2 | 3600 | 3240 | 0 | 2880 | 512 | 1024 |
| mod-record-specifications | 7 | mod-record-specifications:1.0.2 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-tags | 7 | mod-tags:2.3.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-authtoken | 9 | mod-authtoken:2.16.1 | 2 | 1440 | 1152 | 0 | 922 | 88 | 128 |
| edge-courses | 8 | edge-courses:1.5.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-notify | 7 | mod-notify:3.3.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-inventory-update | 7 | mod-inventory-update:4.0.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-configuration | 7 | mod-configuration:5.11.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-orders-storage | 7 | mod-orders-storage:13.8.3 | 2 | 1024 | 896 | 0 | 700 | 88 | 128 |
| edge-caiasoft | 7 | edge-caiasoft:2.3.2 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-login-saml | 7 | mod-login-saml:2.9.3 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-erm-usage-harvester | 7 | mod-erm-usage-harvester:5.0.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-gobi | 7 | mod-gobi:2.9.0 | 2 | 1024 | 896 | 0 | 700 | 88 | 128 |
| mod-licenses | 7 | mod-licenses:6.1.2 | 2 | 2480 | 2312 | 0 | 1792 | 384 | 512 |
| mod-password-validator | 7 | mod-password-validator:3.3.0 | 2 | 1440 | 1298 | 0 | 768 | 384 | 512 |
| edge-dcb | 8 | edge-dcb:1.2.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-bulk-operations | 8 | mod-bulk-operations:2.1.8 | 2 | 3072 | 2600 | 0 | 1536 | 384 | 512 |
| mod-fqm-manager | 10 | mod-fqm-manager:3.0.7 | 2 | 3000 | 2600 | 0 | 768 | 88 | 128 |
| mod-graphql | 9 | mod-graphql:1.13.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-finance | 8 | mod-finance:5.0.1 | 2 | 1024 | 896 | 0 | 700 | 88 | 128 |
| mod-erm-usage | 7 | mod-erm-usage:5.0.0 | 2 | 2800 | 2550 | 0 | 1800 | 384 | 512 |
| mod-batch-print | 7 | mod-batch-print:1.2.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-tlr | 4 | mod-tlr:1.0.0-SNAPSHOT.8 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-lists | 12 | mod-lists:3.0.5 | 2 | 6000 | 2600 | 0 | 768 | 88 | 128 |
| mod-copycat | 7 | mod-copycat:1.7.0 | 2 | 1024 | 512 | 0 | 768 | 88 | 128 |
| mod-entities-links | 11 | mod-entities-links:3.1.3 | 2 | 2592 | 2480 | 0 | 1440 | 0 | 1024 |
| mod-permissions | 13 | mod-permissions:6.6.1 | 2 | 1684 | 1544 | 512 | 1024 | 384 | 512 |
| pub-edge | 7 | pub-edge:2023.06.14 | 2 | 1024 | 896 | 0 | 768 | 0 | 0 |
| mod-orders | 9 | mod-orders:12.9.9 | 2 | 2048 | 1740 | 0 | 1024 | 384 | 512 |
| edge-patron | 8 | edge-patron:5.2.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-marc-migrations | 26 | mod-marc-migrations:1.0.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| edge-ncip | 8 | edge-ncip:1.10.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| edge-inn-reach | 5 | edge-inn-reach:3.3.0-SNAPSHOT.69 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-users-bl | 7 | mod-users-bl:7.9.3 | 2 | 1440 | 1152 | 0 | 922 | 88 | 128 |
| mod-oa | 4 | mod-oa:2.1.0-SNAPSHOT.66 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-inventory-storage | 12 | mod-inventory-storage:28.0.4 | 2 | 4096 | 3690 | 0 | 3076 | 512 | 1024 |
| mod-invoice | 8 | mod-invoice:5.9.2 | 2 | 1440 | 1152 | 0 | 922 | 88 | 128 |
| mod-user-import | 7 | mod-user-import:3.9.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-sender | 7 | mod-sender:1.13.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| edge-oai-pmh | 7 | edge-oai-pmh:2.10.0 | 2 | 1512 | 1360 | 0 | 1440 | 384 | 512 |
| mod-data-export-worker | 10 | mod-data-export-worker:3.3.6 | 2 | 3072 | 2048 | 0 | 2048 | 384 | 512 |
| mod-rtac | 7 | mod-rtac:3.7.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-circulation-storage | 8 | mod-circulation-storage:17.3.3 | 2 | 2880 | 2592 | 0 | 1814 | 384 | 512 |
| mod-source-record-storage | 13 | mod-source-record-storage:5.9.5 | 2 | 5600 | 5000 | 0 | 3500 | 384 | 512 |
| mod-calendar | 7 | mod-calendar:3.2.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-event-config | 7 | mod-event-config:2.8.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-courses | 8 | mod-courses:1.4.11 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-circulation-item | 7 | mod-circulation-item:1.1.0 | 2 | 1024 | 896 | 0 | 0 | 0 | 0 |
| mod-inventory | 9 | mod-inventory:21.0.5 | 2 | 2880 | 2592 | 0 | 1814 | 384 | 512 |
| mod-email | 8 | mod-email:1.18.1 | 2 | 2800 | 2550 | 0 | 1800 | 384 | 512 |
| mod-requests-mediated | 4 | mod-requests-mediated:1.0.0-SNAPSHOT.4 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-circulation | 8 | mod-circulation:24.3.8 | 2 | 2880 | 2592 | 0 | 1814 | 384 | 512 |
| mod-pubsub | 8 | mod-pubsub:2.15.3 | 2 | 1536 | 1440 | 0 | 922 | 384 | 512 |
| mod-di-converter-storage | 9 | mod-di-converter-storage:2.3.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| edge-rtac | 7 | edge-rtac:2.8.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| edge-orders | 7 | edge-orders:3.1.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-users | 8 | mod-users:19.4.5 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-template-engine | 7 | mod-template-engine:1.21.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-patron-blocks | 7 | mod-patron-blocks:1.11.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-audit | 8 | mod-audit:2.10.2 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| edge-fqm | 9 | edge-fqm:3.0.2 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-source-record-manager | 8 | mod-source-record-manager:3.9.5 | 2 | 5600 | 5000 | 0 | 3500 | 384 | 512 |
| nginx-edge | 7 | nginx-edge:2023.06.14 | 2 | 1024 | 896 | 0 | 0 | 0 | 0 |
| mod-quick-marc | 7 | mod-quick-marc:6.0.0 | 1 | 2288 | 2176 | 0 | 1664 | 384 | 512 |
| nginx-okapi | 7 | nginx-okapi:2023.06.14 | 2 | 1024 | 896 | 0 | 0 | 0 | 0 |
| okapi-b | 8 | okapi:6.1.1 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512 |
| mod-feesfines | 7 | mod-feesfines:19.2.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-invoice-storage | 7 | mod-invoice-storage:5.9.1 | 2 | 1872 | 1536 | 0 | 1024 | 384 | 512 |
| mod-reading-room | 7 | mod-reading-room:1.0.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-dcb | 8 | mod-dcb:1.2.4 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-service-interaction | 7 | mod-service-interaction:4.1.1 | 2 | 2048 | 1844 | 0 | 1290 | 384 | 512 |
| mod-patron | 8 | mod-patron:6.2.5 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-data-export | 13 | mod-data-export:5.1.5 | 1 | 2048 | 1844 | 0 | 0 | 0 | 0 |
| mod-oai-pmh | 7 | mod-oai-pmh:3.14.3 | 2 | 4096 | 3690 | 0 | 3076 | 384 | 512 |
| edge-connexion | 7 | edge-connexion:1.3.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-notes | 7 | mod-notes:6.0.0 | 2 | 1024 | 896 | 0 | 952 | 384 | 512 |
| mod-kb-ebsco-java | 8 | mod-kb-ebsco-java:5.0.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-organizations-storage | 7 | mod-organizations-storage:4.8.1 | 2 | 1024 | 896 | 0 | 700 | 88 | 128 |
| mod-data-export-spring | 8 | mod-data-export-spring:3.4.3 | 1 | 2048 | 1844 | 0 | 1536 | 384 | 512 |
| mod-login | 7 | mod-login:7.12.1 | 2 | 1440 | 1298 | 0 | 768 | 384 | 512 |
| pub-okapi | 7 | pub-okapi:2023.06.14 | 2 | 1024 | 896 | 0 | 768 | 0 | 0 |
| edge-erm | 5 | edge-erm:1.3.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-eusage-reports | 7 | mod-eusage-reports:3.0.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |

Inventory size

-- Record counts per tenant: instances, holdings, items, and authorities
SELECT 'cs00000int' AS cs00000int,
    (SELECT COUNT(id) FROM cs00000int_mod_inventory_storage.instance) AS instanceCount,
    (SELECT COUNT(id) FROM cs00000int_mod_inventory_storage.holdings_record) AS holdingsCount,
    (SELECT COUNT(id) FROM cs00000int_mod_inventory_storage.item) AS itemCount,
    (SELECT COUNT(id) FROM cs00000int_mod_entities_links.authority) AS authorityCount
UNION ALL
SELECT 'cs00000int_0001' AS cs00000int_0001,
    (SELECT COUNT(id) FROM cs00000int_0001_mod_inventory_storage.instance) AS instanceCount,
    (SELECT COUNT(id) FROM cs00000int_0001_mod_inventory_storage.holdings_record) AS holdingsCount,
    (SELECT COUNT(id) FROM cs00000int_0001_mod_inventory_storage.item) AS itemCount,
    (SELECT COUNT(id) FROM cs00000int_0001_mod_entities_links.authority) AS authorityCount
UNION ALL
SELECT 'cs00000int_0002' AS cs00000int_0002,
    (SELECT COUNT(id) FROM cs00000int_0002_mod_inventory_storage.instance) AS instanceCount,
    (SELECT COUNT(id) FROM cs00000int_0002_mod_inventory_storage.holdings_record) AS holdingsCount,
    (SELECT COUNT(id) FROM cs00000int_0002_mod_inventory_storage.item) AS itemCount,
    (SELECT COUNT(id) FROM cs00000int_0002_mod_entities_links.authority) AS authorityCount
UNION ALL
SELECT 'cs00000int_0003' AS cs00000int_0003,
    (SELECT COUNT(id) FROM cs00000int_0003_mod_inventory_storage.instance) AS instanceCount,
    (SELECT COUNT(id) FROM cs00000int_0003_mod_inventory_storage.holdings_record) AS holdingsCount,
    (SELECT COUNT(id) FROM cs00000int_0003_mod_inventory_storage.item) AS itemCount,
    (SELECT COUNT(id) FROM cs00000int_0003_mod_entities_links.authority) AS authorityCount
UNION ALL
SELECT 'cs00000int_0004' AS cs00000int_0004,
    (SELECT COUNT(id) FROM cs00000int_0004_mod_inventory_storage.instance) AS instanceCount,
    (SELECT COUNT(id) FROM cs00000int_0004_mod_inventory_storage.holdings_record) AS holdingsCount,
    (SELECT COUNT(id) FROM cs00000int_0004_mod_inventory_storage.item) AS itemCount,
    (SELECT COUNT(id) FROM cs00000int_0004_mod_entities_links.authority) AS authorityCount
UNION ALL
SELECT 'cs00000int_0005' AS cs00000int_0005,
    (SELECT COUNT(id) FROM cs00000int_0005_mod_inventory_storage.instance) AS instanceCount,
    (SELECT COUNT(id) FROM cs00000int_0005_mod_inventory_storage.holdings_record) AS holdingsCount,
    (SELECT COUNT(id) FROM cs00000int_0005_mod_inventory_storage.item) AS itemCount,
    (SELECT COUNT(id) FROM cs00000int_0005_mod_entities_links.authority) AS authorityCount;
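Because the per-tenant SELECT blocks above differ only in the tenant ID, the query can also be generated rather than written by hand, which avoids copy-paste slips. The following is a minimal sketch and not part of the original tooling; `inventory_size_query` is a hypothetical helper, and it labels the first column `tenant` for readability.

```python
import re

# (schema-qualified table suffix, column alias) pairs used for each tenant.
COUNT_COLUMNS = [
    ("mod_inventory_storage.instance", "instanceCount"),
    ("mod_inventory_storage.holdings_record", "holdingsCount"),
    ("mod_inventory_storage.item", "itemCount"),
    ("mod_entities_links.authority", "authorityCount"),
]


def inventory_size_query(tenants: list[str]) -> str:
    """Build one UNION ALL query that counts records for each tenant."""
    blocks = []
    for tenant in tenants:
        # Tenant IDs are interpolated into identifiers, so accept only safe characters.
        if not re.fullmatch(r"[a-z0-9_]+", tenant):
            raise ValueError(f"unexpected tenant id: {tenant!r}")
        counts = ",\n".join(
            f"    (SELECT COUNT(id) FROM {tenant}_{table}) AS {alias}"
            for table, alias in COUNT_COLUMNS
        )
        blocks.append(f"SELECT '{tenant}' AS tenant,\n{counts}")
    return "\nUNION ALL\n".join(blocks) + ";"


print(inventory_size_query(["cs00000int", "cs00000int_0001", "cs00000int_0002"]))
```

In a UNION ALL the result column names come from the first SELECT, so a single shared alias for the tenant column is sufficient.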
 

 

image-20250224-135718.png

Methodology/Approach

The DI test scenario used a data import job profile that creates new MARC authority records for non-matches (Job Profile: KG - Create SRS MARC Authority on nonmatches to 010 $a DUPLICATE for Q). Jobs were started from the UI on the Ramsons (RCON) ECS environment with the file-splitting feature enabled.

  • Action for non-matches: Create MARC authority record

Test set

  • Test 1: Manually tested 1k, 5k, 10k, 25k, and 50k record files; DI started on one tenant (cs00000int) only.

  • Test 2: Manually tested 1k, 5k, 10k, 25k, and 50k record files; DI started on one tenant (cs00000int_0001) only, plus Check-In and Check-Out (CICO) for 5 concurrent users.

  • Test 3: Manually tested 50k record files; DI started on 3 tenants concurrently.

To get data import durations, the following SQL query was used:

SELECT (completed_date - started_date) AS duration, *
FROM {tenant}_mod_source_record_manager.job_execution
WHERE subordination_type = 'COMPOSITE_PARENT'
  AND job_profile_name = 'KG - Create SRS MARC Authority on nonmatches to 010 $a DUPLICATE for Q'
ORDER BY started_date DESC
LIMIT 15;
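The `{tenant}` placeholder in the query above has to be filled in for each tenant before the query is run. The sketch below shows one way to do that and to render the resulting duration in the "min sec" style used in the tables. It is illustrative only: the helper names are hypothetical, the `%s` parameter marker assumes a psycopg-style driver, and `format_duration` assumes the driver returns the PostgreSQL interval as a `datetime.timedelta`.

```python
import re
from datetime import timedelta

# Assumption: the job profile name is passed as a bound parameter (%s).
QUERY_TEMPLATE = (
    "SELECT (completed_date - started_date) AS duration, * "
    "FROM {tenant}_mod_source_record_manager.job_execution "
    "WHERE subordination_type = 'COMPOSITE_PARENT' "
    "AND job_profile_name = %s "
    "ORDER BY started_date DESC LIMIT 15"
)


def build_duration_query(tenant: str) -> str:
    """Fill in the tenant schema prefix, rejecting unexpected characters."""
    if not re.fullmatch(r"[a-z0-9_]+", tenant):
        raise ValueError(f"unexpected tenant id: {tenant!r}")
    return QUERY_TEMPLATE.format(tenant=tenant)


def format_duration(delta: timedelta) -> str:
    """Render a duration as 'M min SS sec', or 'S sec' when under a minute."""
    minutes, seconds = divmod(int(delta.total_seconds()), 60)
    return f"{minutes} min {seconds:02d} sec" if minutes else f"{seconds} sec"


print(build_duration_query("cs00000int"))
print(format_duration(timedelta(minutes=9, seconds=42)))
```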
