[Sunflower] [non-ECS] Multi-tenants DI

[Sunflower] [non-ECS] Multi-tenants DI

Overview

  • This document contains the results of testing Data Import for MARC Bibliographic records with an update job in the Sunflower release on secp1 environments with Kafka consolidated topics and file splitting features enabled on a non-ecs environment

https://folio-org.atlassian.net/browse/PERF-1113

  • During the first run, tests #2 and #3 failed due to authentication issues for tenants fs07000001 and fs07000002. These authentication blockers have since been resolved. Detailed steps and actions taken to fix the issue can be found here documentations

Summary

Three DI update job tests with 25,000 records were performed on the secp1 cluster.

We compared the Sunflower and Quesnelia releases and observed both performance gains and losses in the DI update job durations:

  • Single tenant: On average, jobs ran 9 minutes and 40 seconds slower in Sunflower.

  • Two tenants: Jobs were 10 minutes slower in Sunflower.

  • Three tenants: Jobs ran 45 minutes faster in Sunflower compared to Quesnelia.

Resource Utilization (Sunflower):

  • mod-di-converter-storage CPU usage: 203%

  • mod-inventory CPU usage: 128%

  • Database CPU usage: 99%

  • Average DB connections: 1,542

 

Test Results

Profile

Test #

Tenant

MARC File

DI Duration Sunflower (hh:mm:ss)

Results

Profile

Test #

Tenant

MARC File

DI Duration Sunflower (hh:mm:ss)

Results

DI MARC Bib Update (PTF - Updates Success - 6)

1

fs09000000

25K.mrc

00:25:35

Completed

2

fs09000000

25K.mrc

00:30:15

Completed

fs07000001

25K.mrc

00:25:09

Completed

3

fs09000000

25K.mrc

00:26:35

Completed

fs07000001

25K.mrc

00:25:36

Completed

fs07000002

25K.mrc

00:15:15

Completed

 

Comparison

Test #

Sunflower

Quesnelia

DI Delta
per tenant
(hh:mm:ss) Sunflower/Quesnelia

DI Delta
per test
(hh:mm:ss) Sunflower/Quesnelia

Profile

Tenant

DI Duration Sunflower

per tenant

(hh:mm:ss)

DI Duration Sunflower

per test

(hh:mm:ss)

Profile

Tenant

DI Duration Quesnelia (hh:mm:ss)

DI Duration Quesnelia

per test

(hh:mm:ss)

№1

25K.mrc

DI MARC Bib Update

(PTF - Updates Success - 6)

fs09000000

00:25:35

00:25:35

DI MARC Bib Update

(PTF - Updates Success - 1)

fso9000000

00:15:55

00:15:55

+00:09:40

+00:09:40

№2

25K.mrc

fs09000000

00:30:15

00:55:24

fso9000000

00:17:06

00:45:23

+00:13:09

+0:10:01

fs07000001

00:25:09

fso7000001

00:28:17

-00:03:08

№3

25K.mrc

fs09000000

00:26:35

01:06:59

fso9000000

00:33:57

1:51:59

-00:07:22

-00:45:00

fs07000001

00:25:36

fso7000001

00:36:07

-00:10:31

fs07000002

00:15:15

fso7000002

00:41:55

-00:26:40

Resource utilization for Test #1 and Test #2

Service CPU Utilization

Here we can see that mod-di-converter-storage-b module used 145% CPU, mod-source-record-storage-b used 88% CPU and mod-inventory-b used 81% CPU

image-20250508-100408.png

Service Memory Utilization

Here we can see that all modules show a stable trend.

image-20250507-113908.png

DB CPU Utilization

DB CPU in the average was 99%.

image-20250508-104949.png

DB Connections

DB connections was 1496.

image-20250508-105110.png

DB load

image-20250508-105428.png

Top SQL-queries

image-20250508-105601.png

 

#

TOP 5 SQL statements

#

TOP 5 SQL statements

1

insert into "marc_records_lb" ("id", "content") values (cast($1 as uuid), cast($2 as jsonb)) on conflict ("id") do update set "content" = cast($3 as jsonb)

2

INSERT INTO fs09000000_mod_source_record_manager.journal_records (id, job_execution_id, source_id, source_record_order, entity_type, entity_id, entity_hrid, action_type, action_status, error, action_date, title, instance_id, holdings_id, order_id, permanent_location_id, tenant_id) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17) ON CONFLICT DO NOTHING

3

insert into "marc_records_lb" ("id", "content") values (cast($1 as uuid), cast($2 as jsonb)) on conflict ("id") do update set "content" = cast($3 as jsonb)

4

WITH cte AS (SELECT id, name, name_type_id, authority_id, last_updated_date FROM fs09000000_mod_search.contributor WHERE last_updated_date > $1 ORDER BY last_updated_date ) SELECT c.id, c.name, c.name_type_id, c.authority_id, c.last_updated_date, json_agg( CASE WHEN sub.instance_count

5

UPDATE fs09000000_mod_inventory_storage.holdings_record SET jsonb = $1::jsonb WHERE id=?

Resource utilization for Test #3

Service CPU Utilization

Here we can see that mod-di-converter-storage-b module used 203% CPU in average, mod-inventory-b used 129% CPU and mod-source-record-storage-b used 112% CPU

Untitled-20250508-110603.jpg

Service Memory Utilization

Here we can see that all modules show a stable trend.

image-20250507-131230.png

DB CPU Utilization

DB CPU was 97%.

image-20250507-131438.png

DB Connections

DB connections was 1588.

image-20250507-131609.png

DB load

image-20250508-110028.png

Top SQL-queries

image-20250508-110053.png

 

#

TOP 5 SQL statements

#

TOP 5 SQL statements

1

insert into "marc_records_lb" ("id", "content") values (cast($1 as uuid), cast($2 as jsonb)) on conflict ("id") do update set "content" = cast($3 as jsonb)

2

INSERT INTO fs07000002_mod_search.contributor (id, name, name_type_id, authority_id) VALUES ($1, $2, $3, $4) ON CONFLICT (id) DO UPDATE SET last_updated_date = CURRENT_TIMESTAMP

3

insert into "marc_records_lb" ("id", "content") values (cast($1 as uuid), cast($2 as jsonb)) on conflict ("id") do update set "content" = cast($3 as jsonb)

4

INSERT INTO fs07000002_mod_source_record_manager.journal_records (id, job_execution_id, source_id, source_record_order, entity_type, entity_id, entity_hrid, action_type, action_status, error, action_date, title, instance_id, holdings_id, order_id, permanent_location_id, tenant_id) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17) ON CONFLICT DO NOTHING

5

INSERT INTO fs09000000_mod_search.item (id, tenant_id, instance_id, holding_id, json) VALUES ($1::uuid, $2, $3::uuid, $4::uuid, $5::jsonb) ON CONFLICT (id, tenant_id) DO UPDATE SET instance_id = EXCLUDED.instance_id, holding_id = EXCLUDED.holding_id, tenant_id = EXCLUDED.tenant_id, json = EXCLUDED.json

Appendix

Infrastructure

PTF - environment Sunflower (secp1)

  • Name

Memory GIB

vCPUs

  • Name

Memory GIB

vCPUs

  • db.r7g.xlarge

32 GiB4

4 vCPUs

  • MSK fse-test

    • 4 kafka.m7g.xlarge brokers in 2 zones

    • Apache Kafka version 3.7.x

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true

    • log.retention.minutes=480

    • default.replication.factor=3