Data Import test report Ramsons [non-ECS]

Overview

This document contains the results of Data Import testing for MARC Bibliographic records in the Ramsons Okapi release [non-ECS].

The scope covers Data Import create and update jobs with 5K, 10K, 25K, 50K, and 100K records. Two rounds of tests were run: one with mod-search enabled and a second with mod-search disabled (task count = 0).
Ticket: https://folio-org.atlassian.net/browse/PERF-966

 

Summary

  • All tests passed successfully.

  • With mod-search disabled, DI durations for creates and updates are overall the same as in the Quesnelia release.

  • With mod-search enabled, creates and updates show a significant performance degradation (50-100%). The cause is a long-running query from mod-search that puts significant load on the DB and slows down the overall process. This query is part of the runtime reindexing process for contributors.

  • Deadlocks were observed on the mod-search schema during data import creates and updates; however, the deadlocks do not affect completion of DI.

Recommendations & Jiras

Results

| Test # | Data-import test | Profile | Duration, Ramsons (rcp1), mod-search disabled | Duration, Ramsons (rcp1), mod-search enabled | Duration, Ramsons (rcon) | Duration, Quesnelia (qcp1) | Duration, Quesnelia (qcon) | Status |
|---|---|---|---|---|---|---|---|---|
| 1 | 5k MARC BIB Create | PTF - Create 2 | 1 min | 3 min 7 s | - | - | - | - |
|  | 10k MARC BIB Create | PTF - Create 2 | 4 min 30 s | 6 min 15 s | 5 min 10 s | 6 min | 4 min 14 s | Completed |
| 2 | 25k MARC BIB Create | PTF - Create 2 | 11 min | 17 min | 10 min 30 s | 13 min 41 s | 9 min 41 s | Completed |
| 3 | 50k MARC BIB Create | PTF - Create 2 | 22 min | 41 min 25 s | 15 min 43 s | 21 min 59 s | 18 min 18 s | Completed |
| 4 | 100k MARC BIB Create | PTF - Create 2 | 46 min | 1 hr 19 min | 31 min 51 s | 40 min 16 s | 38 min 36 s | Completed |
|  | 5k MARC BIB Update | PTF - Updates Success - 6 | 3 min 33 s | 6 min 33 s | - | - | - | - |
| 6 | 10k MARC BIB Update | PTF - Updates Success - 6 | 6 min 46 s | 11 min 14 s | 7 min 10 s | 10 min 27 s | 5 min 59 s | Completed |
| 7 | 25k MARC BIB Update | PTF - Updates Success - 6 | 16 min 40 s | 28 min 43 s | 19 min 3 s | 23 min 16 s | 19 min 52 s | Completed |
| 8 | 50k MARC BIB Update | PTF - Updates Success - 6 | 33 min 45 s | 58 min 30 s | 38 min 53 s | 40 min 52 s | 37 min 53 s | Completed |
| 9 | 100k MARC BIB Update | PTF - Updates Success - 6 | 1 hr 8 min | 2 hr 14 min | 1 hr 23 min | 1 hr 2 min | 1 hr 14 min | Completed |
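
The relative slowdown between the two rcp1 rounds can be derived directly from the durations in the results table. The following is a minimal sketch, not part of the original test tooling: the helper names (`to_seconds`, `slowdown_pct`) are hypothetical, and the slowdown is assumed to be measured relative to the mod-search disabled duration.

```python
import re

# Durations copied from the results table (rcp1 environment):
# (test name, mod-search disabled, mod-search enabled)
RUNS = [
    ("5k MARC BIB Create", "1 min", "3 min 7 s"),
    ("10k MARC BIB Create", "4 min 30 s", "6 min 15 s"),
    ("25k MARC BIB Create", "11 min", "17 min"),
    ("50k MARC BIB Create", "22 min", "41 min 25 s"),
    ("100k MARC BIB Create", "46 min", "1 hr 19 min"),
    ("5k MARC BIB Update", "3 min 33 s", "6 min 33 s"),
    ("10k MARC BIB Update", "6 min 46 s", "11 min 14 s"),
    ("25k MARC BIB Update", "16 min 40 s", "28 min 43 s"),
    ("50k MARC BIB Update", "33 min 45 s", "58 min 30 s"),
    ("100k MARC BIB Update", "1 hr 8 min", "2 hr 14 min"),
]

UNIT_SECONDS = {"hr": 3600, "min": 60, "s": 1}


def to_seconds(text):
    """Convert a duration such as '1 hr 19 min' or '3 min 7 s' to seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s*(hr|min|s)", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total


def slowdown_pct(disabled, enabled):
    """Extra time with mod-search enabled, as a percentage of the disabled run."""
    base = to_seconds(disabled)
    return 100.0 * (to_seconds(enabled) - base) / base


if __name__ == "__main__":
    for name, disabled, enabled in RUNS:
        print(f"{name}: +{slowdown_pct(disabled, enabled):.0f}%")
```

Under this definition the 100k create run is about 72% slower with mod-search enabled. Most runs fall within the 50-100% range; the 5k and 10k create runs appear to be the exceptions.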

Memory Utilization

Memory utilization showed a stable trend during the DI create and update tests. No sudden crashes or unexpected growth in memory usage were observed.

No service exceeded 80% memory usage. The highest memory consumer is mod-permissions, whose memory usage grows up to 80% during tests; however, it returns to its normal level after each test.

Service memory usage for DI creates and updates with mod-search enabled

image-20250212-122616.png

Service memory usage for DI creates and updates with mod-search disabled

image-20250212-123852.png

Service memory usage for DI creates and updates (combined)

image-20250212-121100.png

 

 

 

CPU Utilization

CPU utilization is stable and predictable for all modules during all tests. The modules with the highest CPU utilization are:

  • mod-inventory 10% max

  • mod-source-record-storage 7.5% max

  • mod-inventory-storage 5.5% max

Service CPU utilization for DI creates and updates with mod-search enabled

image-20250212-122312.png

Service CPU utilization for DI creates and updates with mod-search disabled

During the tests with mod-search disabled (task count set to 0), CPU usage is higher for most modules. With mod-search disabled, the DB has more free resources and processes requests faster, which is reflected in higher CPU usage of the modules.

image-20250212-123326.png

 

Service CPU utilization for DI creates and updates (combined)

The chart clearly shows higher CPU usage for modules with mod-search disabled. It also shows that DI is much faster without mod-search.

image-20250212-120712.png

 

RDS Metrics 

As expected, DB CPU usage is high, as is usual during the data import process. However, it is visibly lower with mod-search disabled: close to 100% with mod-search enabled versus about 85% with mod-search disabled.

DB CPU utilization for DI creates and updates with mod-search enabled

image-20250212-124310.png

 

DB CPU utilization for DI creates and updates with mod-search disabled

image-20250212-124121.png

DB CPU utilization for DI creates and updates (combined)

image-20250212-121251.png

 

DB load for DI creates and updates with mod-search enabled

image-20250212-133417.png

DB load for DI creates and updates with mod-search disabled

image-20250212-133256.png

DB load for DI creates and updates (combined)

This chart clearly shows the impact of mod-search on overall DB load.

image-20250212-141001.png

A slow query from mod-search was detected that significantly affects performance:

Data volume:

search.instance - 4 109 321

search.instance_contributor - 8 327 231

Slow query found in mod-search

WITH cte AS (
    SELECT id, name, name_type_id, authority_id, last_updated_date
    FROM fs09000000_mod_search.contributor
    WHERE last_updated_date > $1
    ORDER BY last_updated_date
)
SELECT c.id, c.name, c.name_type_id, c.authority_id, c.last_updated_date,
       json_agg(
           CASE
               WHEN sub.instance_count IS NULL THEN NULL
               ELSE json_build_object(
                   'count', sub.instance_count,
                   'typeId', sub.type_ids,
                   'shared', sub.shared,
                   'tenantId', sub.tenant_id
               )
           END
       ) AS instances
FROM cte c
LEFT JOIN (
    SELECT cte.id, ins.tenant_id, ins.shared,
           array_agg(DISTINCT ins.type_id) FILTER (WHERE ins.type_id <> '') AS type_ids,
           count(DISTINCT ins.instance_id) AS instance_count
    FROM fs09000000_mod_search.instance_contributor ins
    INNER JOIN cte ON ins.contributor_id = cte.id
    GROUP BY cte.id, ins.tenant_id, ins.shared
) sub ON c.id = sub.id
GROUP BY c.id, c.name, c.name_type_id, c.authority_id, c.last_updated_date
ORDER BY last_updated_date ASC

 

 

MSK CPU usage

During all tests, CPU usage did not exceed 55% on any broker.

image-20250214-105034.png

Appendix

Infrastructure

PTF environment rcp1

  • 11 m6g.2xlarge EC2 instances located in US East (N. Virginia), us-east-1

  • db.r6.xlarge database instances, writer

  • MSK fse-test

    • 4 kafka.m7g.xlarge brokers in 2 zones

    • Apache Kafka version 3.7.x (KRaft mode)

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true

    • log.retention.minutes=480

    • default.replication.factor=3

  • OpenSearch 2.13 ptf-test cluster

    • r6g.2xlarge.search 4 data nodes

    • r6g.large.search 3 dedicated master nodes

Cluster Resources - rcp1-pvt

| Module | Task Definition Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft Limit | CPU Units | Xmx | Metaspace Size | Max Metaspace Size |
|---|---|---|---|---|---|---|---|---|---|
| mod-remote-storage | 9 | mod-remote-storage:3.3.3 | 2 | 4920 | 4472 | 0 | 3960 | 512 | 512 |
| mod-ncip | 7 | mod-ncip:1.15.6 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-finance-storage | 7 | mod-finance-storage:8.7.3 | 2 | 1024 | 896 | 1024 | 700 | 88 | 128 |
| mod-agreements | 9 | mod-agreements:7.1.4 | 2 | 3184 | 2976 | 0 | 0 | 0 | 0 |
| mod-ebsconet | 9 | mod-ebsconet:2.3.1 | 2 | 1248 | 1024 | 0 | 700 | 128 | 256 |
| mod-organizations | 7 | mod-organizations:2.0.0 | 2 | 1024 | 896 | 0 | 700 | 88 | 128 |
| mod-consortia | 3 | mod-consortia:1.2.2 | 2 | 5136 | 4776 | 0 | 4416 | 512 | 1024 |
| edge-sip2 | 7 | edge-sip2:3.3.1 | 2 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-settings | 7 | mod-settings:1.1.0 | 2 | 1024 | 896 | 200 | 768 | 88 | 128 |
| mod-serials-management | 9 | mod-serials-management:1.1.2 | 2 | 2480 | 2312 | 0 | 1792 | 384 | 512 |
| edge-dematic | 7 | edge-dematic:2.3.1 | 1 | 1024 | 896 | 0 | 768 | 88 | 128 |
| mod-data-import | 8 | mod-data-import:3.2.4 | 1 | 2048 | 1844 | 0 | 1292 | 384 | 512 |
| mod-search | 21 | mod-search:4.0.7 | 2 | 2592 | 2480 | 0 | 1440 | 512 | 1024 |
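
When reading the memory settings above, it can help to see how much of each container's hard limit is assigned to the JVM heap. The sketch below is illustrative only: it assumes that Mem Hard Limit and Xmx are expressed in the same unit and that an Xmx of 0 means no explicit heap size is set. The helper name `heap_share` is hypothetical, and only a subset of modules is listed.

```python
# (module, Mem Hard Limit, Xmx) copied from the Cluster Resources table.
# Assumption: both values use the same unit.
MODULES = [
    ("mod-remote-storage", 4920, 3960),
    ("mod-agreements", 3184, 0),
    ("mod-consortia", 5136, 4416),
    ("mod-data-import", 2048, 1292),
    ("mod-search", 2592, 1440),
]


def heap_share(hard_limit, xmx):
    """Return Xmx as a percentage of the hard limit, or None if either value is 0."""
    if hard_limit == 0 or xmx == 0:
        return None  # Assumption: 0 means the value is not explicitly configured.
    return round(100.0 * xmx / hard_limit, 1)


if __name__ == "__main__":
    for module, hard_limit, xmx in MODULES:
        share = heap_share(hard_limit, xmx)
        label = "not set" if share is None else f"{share}%"
        print(f"{module}: heap share of hard limit = {label}")
```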