Data Import MARC BIB + CI/CO (Ramsons) [NON-ECS]

Overview

  • This document contains the results of testing Check-in/Check-out (CI/CO) and Data Import (DI) for MARC Bibliographic records on the Ramsons [NON-ECS] release environment.

Ticket: PERF-969

Summary

  • Data Import with Check-in/Check-out tests finished successfully with the PTF - Create 2 and PTF - Updates Success - 2 job profiles on files of 5K, 10K, 25K, 50K, and 100K records.
  • The instance type was changed from m6i.2xlarge to r7g.2xlarge.
  • Comparing the Ramsons results with the previous Quesnelia results:
    • Without Data Import, average Check-in duration degraded by 20% and average Check-out duration degraded by 40%.
    • Data Import Create job durations with CI/CO degraded by 11%-40%, depending on file size.
    • Data Import Update job durations with CI/CO roughly doubled (+98% to +128%).
    • During Data Import Create, Check-in duration degraded by about 20% and Check-out duration by about 40% on average.
    • During Data Import Update, Check-in duration degraded by about 40% and Check-out duration by about 45% on average.

Test Runs 

| Test № | Scenario | Test Conditions | Results |
|---|---|---|---|
| 1 | DI MARC Bib Create | 5K, 10K, 25K, 50K, 100K consecutively | Completed |
|   | CICO | 8 users | |
| 2 | DI MARC Bib Update | 5K, 10K, 25K, 50K, 100K consecutively | Completed |
|   | CICO | 8 users | |

Test Results

This table contains durations for Data Import with Check-in/Check-out

| Profile | MARC File | DI Duration, Ramsons (hh:mm:ss) | CI Average, ms (Ramsons, 8 users) | CO Average, ms (Ramsons, 8 users) |
|---|---|---|---|---|
| DI MARC Bib Create (PTF - Create 2) | 5K.mrc | 0:02:53 | 837 | 1628 |
| | 10K.mrc | 0:05:35 | 809 | 1460 |
| | 25K.mrc | 0:15:13 | 895 | 1545 |
| | 50K.mrc | 0:27:17 | 901 | 1495 |
| | 100K.mrc | 1:04:31 | 896 | 1459 |
| DI MARC Bib Update (PTF - Updates Success - 2) | 5K.mrc | 0:07:25 | 992 | 1767 |
| | 10K.mrc | 0:12:52 | 1100 | 1903 |
| | 25K.mrc | 0:37:02 | 1225 | 1957 |
| | 50K.mrc | 1:13:16 | 1299 | 1957 |
| | 100K.mrc | 2:32:25 | 1297 | 2027 |
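To compare runs across file sizes, the durations above can be converted to throughput. The following is a minimal sketch, assuming each file holds exactly the record count in its name (for example, 5K.mrc = 5,000 records); the helper names `to_seconds` and `throughput` are illustrative and not part of any test tooling.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'h:mm:ss' duration string to whole seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def throughput(records: int, duration: str) -> float:
    """Records processed per second for one Data Import job."""
    return records / to_seconds(duration)


# Ramsons DI MARC Bib Create durations with CI/CO, taken from the table above.
# Record counts are an assumption based on the file names.
create_runs = {
    5_000: "0:02:53",
    10_000: "0:05:35",
    25_000: "0:15:13",
    50_000: "0:27:17",
    100_000: "1:04:31",
}

for records, duration in create_runs.items():
    print(f"{records:>7} records: {throughput(records, duration):.1f} records/s")
```

The same helper can be applied to the Update durations to see how throughput changes as file size grows.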

Check-in/Check-out without Data Import

| Scenario | Load level | Request | Response time, ms (Ramsons), 95 perc | Response time, ms (Ramsons), average |
|---|---|---|---|---|
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-in | 767 | 625 |
| | | Check-out | 1347 | 1138 |

Comparison

This table compares DI durations with CI/CO between the Quesnelia and Ramsons releases.

DI durations are in hh:mm:ss. CI/CO response times are averages in seconds at 8 users.

| Profile | MARC File | DI without CI/CO, Quesnelia | DI without CI/CO, Ramsons | DI with CI/CO, Quesnelia | DI with CI/CO, Ramsons | DI Delta with CI/CO, Quesnelia/Ramsons | DI Delta, % | CI Average sec, Quesnelia | CO Average sec, Quesnelia | CI Average sec, Ramsons | CO Average sec, Ramsons | CI Delta, % | CO Delta, % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DI MARC Bib Create (PTF - Create 2) | 5K.mrc | 00:03:20 | | 00:02:32 | 00:02:53 | 00:00:21 | +14% | 0.645 | 0.901 | 0.837 | 1.628 | +23% | +45% |
| | 10K.mrc | 00:06:00 | 00:05:33 | 00:05:03 | 00:05:35 | 00:00:32 | +11% | 0.628 | 0.922 | 0.809 | 1.460 | +22% | +37% |
| | 25K.mrc | 00:13:41 | 00:14:16 | 00:11:58 | 00:15:13 | 00:03:15 | +27% | 0.639 | 0.96 | 0.895 | 1.545 | +29% | +38% |
| | 50K.mrc | 00:21:59 | 00:28:23 | 00:23:29 | 00:27:17 | 00:03:48 | +16% | 0.678 | 1.003 | 0.901 | 1.495 | +25% | +33% |
| | 100K.mrc | 00:40:16 | 00:49:03 | 00:46:07 | 01:04:31 | 00:18:24 | +40% | 0.686 | 0.998 | 0.896 | 1.459 | +23% | +32% |
| DI MARC Bib Update (PTF - Updates Success - 2) | 5K.mrc | 00:07:10 | | 00:03:24 | 00:07:25 | 00:04:01 | +118% | 0.628 | 0.975 | 0.992 | 1.767 | +37% | +45% |
| | 10K.mrc | 00:10:27 | 00:15:00 | 00:06:29 | 00:12:52 | 00:06:23 | +98% | 0.664 | 1.018 | 1.100 | 1.903 | +40% | +47% |
| | 25K.mrc | 00:23:16 | 00:34:41 | 00:16:15 | 00:37:02 | 00:20:47 | +128% | 0.717 | 1.062 | 1.225 | 1.957 | +41% | +46% |
| | 50K.mrc | 00:40:52 | 01:10:01 | 00:33:33 | 01:13:16 | 00:39:43 | +118% | 0.721 | 1.071 | 1.299 | 1.957 | +44% | +45% |
| | 100K.mrc | 01:02:00 | 02:27:05 | 01:10:14 | 02:32:25 | 01:22:11 | +117% | 0.739 | 1.081 | 1.297 | 2.027 | +43% | +47% |
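The DI delta columns can be reproduced from the two "with CI/CO" durations. The sketch below assumes the delta is the Ramsons duration minus the Quesnelia duration and that the percentage is taken relative to the Quesnelia duration, which matches the rows checked here; the function names are illustrative only.

```python
def to_seconds(hms: str) -> int:
    """Convert 'h:mm:ss' or 'hh:mm:ss' to whole seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total: int) -> str:
    """Format whole seconds as 'h:mm:ss', keeping a minus sign for speed-ups."""
    sign = "-" if total < 0 else ""
    hours, rest = divmod(abs(total), 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{sign}{hours}:{minutes:02d}:{seconds:02d}"


def di_delta(quesnelia: str, ramsons: str) -> tuple[str, int]:
    """Return (absolute delta, percent delta relative to Quesnelia)."""
    base = to_seconds(quesnelia)
    diff = to_seconds(ramsons) - base
    return to_hms(diff), round(100 * diff / base)


print(di_delta("00:46:07", "1:04:31"))  # Create, 100K.mrc
print(di_delta("00:16:15", "0:37:02"))  # Update, 25K.mrc
```

For these two rows the sketch yields 0:18:24 (+40%) and 0:20:47 (+128%), in line with the table.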

Detailed CICO response time comparison without DI

| Scenario | Load level | Request | Ramsons 95 perc, ms | Ramsons average, ms | Quesnelia 95 perc, ms | Quesnelia average, ms | Delta, % (Ramsons/Quesnelia, average) |
|---|---|---|---|---|---|---|---|
| Circulation Check-in/Check-out (without Data import) | 8 users | Check-in | 767 | 625 | 609 | 521 | +19.96% |
| | | Check-out | 1347 | 1138 | 1070 | 803 | +41.72% |
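The delta column in this table corresponds to the percent change of the Ramsons average relative to the Quesnelia average. A minimal sketch of that calculation, with `percent_change` as an illustrative helper name:

```python
def percent_change(baseline: float, current: float) -> float:
    """Percent change of `current` relative to `baseline`."""
    return (current - baseline) / baseline * 100


# Average response times in ms without Data Import: (Quesnelia, Ramsons).
averages = {
    "Check-in": (521, 625),
    "Check-out": (803, 1138),
}

for request, (quesnelia, ramsons) in averages.items():
    print(f"{request}: {percent_change(quesnelia, ramsons):+.2f}%")
```

A positive value means the Ramsons response time is slower than the Quesnelia baseline.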

Test №1-2

Resource utilization

Resource utilization table

| Service (CPU) | CPU | Service (RAM) | RAM |
|---|---|---|---|
| okapi-b | 36% | mod-oa-b | 127% |
| mod-inventory-b | 5.74% | mod-dcb-b | 64% |
| mod-source-record-storage-b | 4.21% | mod-inventory-b | 63% |
| mod-inventory-storage-b | 3.67% | mod-source-record-storage-b | 46% |
| mod-pubsub-b | 1.20% | okapi-b | 44% |
| mod-source-record-manager-b | 1.20% | mod-quick-marc-b | 41% |
| nginx-okapi | 0.85% | mod-pubsub-b | 41% |
| mod-circulation-storage-b | 0.82% | mod-data-import-b | 41% |
| mod-users-bl-b | 0.59% | mod-users-b | 40% |
| mod-di-converter-storage-b | 0.57% | mod-users-bl-b | 35% |
| mod-search-b | 0.48% | mod-entities-links-b | 35% |
| mod-quick-marc-b | 0.43% | mod-feesfines-b | 35% |
| mod-authtoken-b | 0.29% | mod-di-converter-storage-b | 33% |
| mod-circulation-b | 0.26% | mod-source-record-manager-b | 33% |
| mod-configuration-b | 0.16% | mod-patron-blocks-b | 30% |
| mod-dcb-b | 0.14% | mod-inventory-storage-b | 30% |
| mod-feesfines-b | 0.11% | mod-circulation-storage-b | 30% |
| mod-entities-links-b | 0.11% | mod-configuration-b | 29% |
| mod-patron-blocks-b | 0.05% | mod-patron-b | 22% |
| pub-okapi | 0.05% | mod-circulation-b | 22% |
| mod-oa-b | 0.04% | mod-authtoken-b | 21% |
| mod-data-import-b | 0.04% | edge-patron-b | 17% |
| edge-patron-b | 0.03% | nginx-okapi | 4% |
| mod-patron-b | 0.03% | pub-okapi | 4% |
| pub-edge | 0.00% | pub-edge | 4% |

Response times

Service CPU Utilization

Service Memory Utilization

DB CPU Utilization

DB Connections

Kafka metrics

OpenSearch Data Nodes metrics


DB load

Top SQL-queries

Top applications


TOP SQL statement №1:
insert into "marc_records_lb" ("id", "content") values (cast($1 as uuid), cast($2 as jsonb)) on conflict ("id") do update set "content" = cast($3 as jsonb)
TOP SLOW SQL statement №1:
WITH cte AS (SELECT id,
                    name,
                    name_type_id,
                    authority_id,
                    last_updated_date
             FROM fs09000000_mod_search.contributor
             WHERE last_updated_date > $1
             ORDER BY last_updated_date
             )
SELECT c.id,
       c.name,
       c.name_type_id,
       c.authority_id,
       c.last_updated_date,
       json_agg(
               CASE
                   WHEN sub.instance_count IS NULL THEN NULL
                   ELSE json_build_object(
                           'count', sub.instance_count,
                           'typeId', sub.type_ids,
                           'shared', sub.shared,
                           'tenantId', sub.tenant_id
                        )
                   END
       ) AS instances
FROM cte c
         LEFT JOIN
     (SELECT cte.id,
             ins.tenant_id,
             ins.shared,
             array_agg(DISTINCT ins.type_id) FILTER (WHERE ins.type_id <> '') AS type_ids,
             count(DISTINCT ins.instance_id)                                  AS instance_count
      FROM fs09000000_mod_search.instance_contributor ins
               INNER JOIN cte
                          ON ins.contributor_id = cte.id
      GROUP BY cte.id,
               ins.tenant_id,
               ins.shared) sub ON c.id = sub.id
GROUP BY c.id,
         c.name,
         c.name_type_id,
         c.authority_id,
         c.last_updated_date
ORDER BY last_updated_date ASC


Appendix

Infrastructure

PTF environment rcp1
  • 5 r7g.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
  • 1 db.r6g.xlarge database instance (writer)
  • MSK fse-test
    • 4 kafka.m7g.xlarge brokers in 2 zones (2 brokers per zone)
    • Apache Kafka version 3.7.x, metadata mode - KRaft
    • EBS storage volume per broker: 300 GiB
    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=2
    • revision - 26
  • OpenSearch 2.13 ptf-test cluster
    • r6g.2xlarge.search 4 data nodes
    • r6g.large.search 3 dedicated master nodes

DB tables records size:

| Tenant | instance count | holdings count | item count | authority count |
|---|---|---|---|---|
| fs09000000 | 30,587,260 | 30,646,075 | 31,734,818 | 16,535,572 |

Methodology/Approach

The DI test scenarios (DI MARC Bib Create and Update) were run on the Ramsons (rcp1) NON-ECS environment with the file splitting feature enabled.

  • An Ubuntu AWS instance was used as the load generator for CI/CO.
  • DI tests were started from the UI.

Test runs:

  • Test 1: Files of 5K, 10K, 25K, 50K, and 100K records were imported manually, one after another, with DI MARC Bib Create on the main tenant (fs09000000) while CICO ran with 8 users in the background.
  • Test 2: Files of 5K, 10K, 25K, 50K, and 100K records were imported manually, one after another, with DI MARC Bib Update on the main tenant (fs09000000) while CICO ran with 8 users in the background.


 All RCP1 modules

Cluster Resources - rcp1-pvt (Wed Jan 22 12:28:18 UTC 2025)

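| Module | Task Definition Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft Limit | CPU Units | Xmx | Metaspace Size | Max Metaspace Size |
|---|---|---|---|---|---|---|---|---|---|
| mod-remote-storage | 9 | mod-remote-storage:3.3.3 | 2 | 4920 | 4472 | 0 | 3960 | 512 | 512 |
| mod-ncip | 7 | mod-ncip:1.15.6 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-finance-storage | 8 | mod-finance-storage:8.7.3 | 2 | 1024 | 896 | 0 | 700 | 88 | 128 |
| mod-agreements | 9 | mod-agreements:7.1.4 | 2 | 3184 | 2976 | 0 | 0 | 0 | 0 |
| mod-ebsconet | 9 | mod-ebsconet:2.3.1 | 2 | 1248 | 1024 | 0 | 700 | 128 | 256 |
| mod-organizations | 7 | mod-organizations:2.0.0 | 2 | 1024 | 896 | 0 | 700 | 88 | 128 |
| mod-consortia | 3 | mod-consortia:1.2.2 | 2 | 5136 | 4776 | 0 | 4416 | 512 | 1024 |
| edge-sip2 | 7 | edge-sip2:3.3.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-settings | 8 | mod-settings:1.1.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-serials-management | 9 | mod-serials-management:1.1.2 | 2 | 2480 | 2312 | 0 | 1792 | 384 | 512 |
| edge-dematic | 7 | edge-dematic:2.3.1 | 1 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-data-import | 8 | mod-data-import:3.2.4 | 1 | 2048 | 1844 | 0 | 1292 | 384 | 512 |
| mod-search | 21 | mod-search:4.0.7 | 2 | 2592 | 2480 | 0 | 1440 | 512 | 1024 |
| mod-inn-reach | 4 | mod-inn-reach:3.2.1-SNAPSHOT.102 | 2 | 3600 | 3240 | 0 | 2880 | 512 | 1024 |
| mod-record-specifications | 16 | mod-record-specifications:1.0.2 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-tags | 7 | mod-tags:2.3.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-authtoken | 8 | mod-authtoken:2.16.1 | 2 | 1440 | 1152 | 0 | 922 | 88 | 128 |
| edge-courses | 9 | edge-courses:1.5.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-notify | 7 | mod-notify:3.3.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-inventory-update | 7 | mod-inventory-update:4.0.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-configuration | 7 | mod-configuration:5.11.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-orders-storage | 8 | mod-orders-storage:13.8.3 | 2 | 1024 | 896 | 0 | 700 | 88 | 128 |
| edge-caiasoft | 7 | edge-caiasoft:2.3.2 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-login-saml | 7 | mod-login-saml:2.9.3 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-erm-usage-harvester | 7 | mod-erm-usage-harvester:5.0.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-password-validator | 7 | mod-password-validator:3.3.0 | 2 | 1440 | 1298 | 0 | 768 | 384 | 512 |
| mod-gobi | 7 | mod-gobi:2.9.0 | 2 | 1024 | 896 | 0 | 700 | 88 | 128 |
| mod-licenses | 7 | mod-licenses:6.1.2 | 2 | 2480 | 2312 | 0 | 1792 | 384 | 512 |
| edge-dcb | 7 | edge-dcb:1.2.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-bulk-operations | 9 | mod-bulk-operations:2.1.8 | 2 | 3072 | 2600 | 0 | 1536 | 384 | 512 |
| mod-fqm-manager | 21 | mod-fqm-manager:3.0.7 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-graphql | 9 | mod-graphql:1.13.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-finance | 9 | mod-finance:5.0.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-erm-usage | 7 | mod-erm-usage:5.0.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-batch-print | 7 | mod-batch-print:1.2.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-tlr | 4 | mod-tlr:1.0.0-SNAPSHOT.8 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-lists | 16 | mod-lists:3.0.5 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-copycat | 7 | mod-copycat:1.7.0 | 2 | 1024 | 512 | 0 | 768 | 88 | 128 |
| mod-entities-links | 13 | mod-entities-links:3.1.3 | 2 | 2592 | 2480 | 0 | 1440 | 0 | 1024 |
| mod-permissions | 17 | mod-permissions:6.6.1 | 2 | 1684 | 1544 | 0 | 1024 | 384 | 512 |
| pub-edge | 5 | pub-edge:2023.06.14 | 2 | 1024 | 896 | 0 | 768 | 0 | 0 |
| mod-orders | 9 | mod-orders:12.9.9 | 2 | 2048 | 1740 | 0 | 1024 | 384 | 512 |
| edge-patron | 9 | edge-patron:5.2.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| edge-ncip | 7 | edge-ncip:1.10.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-marc-migrations | 30 | mod-marc-migrations:1.0.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| edge-inn-reach | 3 | edge-inn-reach:3.3.0-SNAPSHOT.69 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-users-bl | 7 | mod-users-bl:7.9.3 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 |
| mod-oa | 5 | mod-oa:2.1.0-SNAPSHOT.66 | 2 | 2048 | 896 | 0 | 1800 | 88 | 512 |
| mod-invoice | 9 | mod-invoice:5.9.2 | 2 | 1440 | 1152 | 0 | 922 | 88 | 128 |
| mod-inventory-storage | 13 | mod-inventory-storage:28.0.4 | 2 | 4096 | 3690 | 0 | 3076 | 384 | 512 |
| mod-user-import | 7 | mod-user-import:3.9.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-sender | 7 | mod-sender:1.13.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| edge-oai-pmh | 8 | edge-oai-pmh:2.10.0 | 2 | 1512 | 1360 | 0 | 1440 | 384 | 512 |
| mod-data-export-worker | 9 | mod-data-export-worker:3.3.6 | 2 | 3072 | 2800 | 0 | 2048 | 384 | 512 |
| mod-rtac | 7 | mod-rtac:3.7.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-circulation-storage | 9 | mod-circulation-storage:17.3.3 | 2 | 2880 | 2592 | 0 | 1814 | 384 | 512 |
| mod-source-record-storage | 13 | mod-source-record-storage:5.9.5 | 2 | 5600 | 5000 | 0 | 3500 | 384 | 512 |
| mod-calendar | 8 | mod-calendar:3.2.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-event-config | 7 | mod-event-config:2.8.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-courses | 7 | mod-courses:1.4.11 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-circulation-item | 7 | mod-circulation-item:1.1.0 | 2 | 1024 | 896 | 0 | 0 | 0 | 0 |
| mod-inventory | 8 | mod-inventory:21.0.5 | 2 | 2880 | 2592 | 0 | 1814 | 384 | 512 |
| mod-email | 9 | mod-email:1.18.1 | 2 | 2800 | 2550 | 0 | 1800 | 384 | 512 |
| mod-requests-mediated | 12 | mod-requests-mediated:1.0.0-SNAPSHOT.4 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-di-converter-storage | 8 | mod-di-converter-storage:2.3.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-pubsub | 9 | mod-pubsub:2.15.3 | 2 | 1536 | 1440 | 0 | 922 | 384 | 512 |
| mod-circulation | 9 | mod-circulation:24.3.8 | 2 | 2880 | 2592 | 0 | 1814 | 384 | 512 |
| edge-orders | 7 | edge-orders:3.1.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| edge-rtac | 7 | edge-rtac:2.8.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-template-engine | 7 | mod-template-engine:1.21.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-users | 9 | mod-users:19.4.5 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-patron-blocks | 8 | mod-patron-blocks:1.11.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-audit | 9 | mod-audit:2.10.2 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| edge-fqm | 9 | edge-fqm:3.0.2 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-source-record-manager | 13 | mod-source-record-manager:3.9.5 | 2 | 5600 | 5000 | 0 | 3500 | 384 | 512 |
| nginx-edge | 5 | nginx-edge:2023.06.14 | 2 | 1024 | 896 | 0 | 0 | 0 | 0 |
| mod-quick-marc | 7 | mod-quick-marc:6.0.0 | 1 | 2288 | 2176 | 0 | 1664 | 384 | 512 |
| nginx-okapi | 5 | nginx-okapi:2023.06.14 | 2 | 1024 | 896 | 0 | 0 | 0 | 0 |
| okapi-b | 7 | okapi:6.1.0 | 3 | 1684 | 1440 | 0 | 922 | 384 | 512 |
| mod-feesfines | 7 | mod-feesfines:19.2.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-invoice-storage | 8 | mod-invoice-storage:5.9.1 | 2 | 1872 | 1536 | 0 | 1024 | 384 | 512 |
| mod-reading-room | 8 | mod-reading-room:1.0.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-service-interaction | 8 | mod-service-interaction:4.1.1 | 2 | 2048 | 1844 | 0 | 1290 | 384 | 512 |
| mod-dcb | 9 | mod-dcb:1.2.4 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-data-export | 10 | mod-data-export:5.1.5 | 1 | 2592 | 2480 | 0 | 1440 | 88 | 1024 |
| mod-patron | 9 | mod-patron:6.2.5 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-oai-pmh | 7 | mod-oai-pmh:3.14.3 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 |
| edge-connexion | 7 | edge-connexion:1.3.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-notes | 7 | mod-notes:6.0.0 | 2 | 1024 | 896 | 0 | 952 | 384 | 512 |
| mod-kb-ebsco-java | 7 | mod-kb-ebsco-java:5.0.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-login | 8 | mod-login:7.12.1 | 2 | 1440 | 1298 | 0 | 768 | 384 | 512 |
| mod-organizations-storage | 7 | mod-organizations-storage:4.8.1 | 2 | 1024 | 896 | 0 | 700 | 88 | 128 |
| mod-data-export-spring | 9 | mod-data-export-spring:3.4.3 | 1 | 2048 | 1844 | 0 | 1536 | 384 | 512 |
| pub-okapi | 5 | pub-okapi:2023.06.14 | 2 | 1024 | 896 | 0 | 768 | 0 | 0 |
| edge-erm | 6 | edge-erm:1.3.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-eusage-reports | 7 | mod-eusage-reports:3.0.0 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |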

Test artifacts: