
Data Import test report (Quesnelia)[non-ECS]


Overview

This document contains the results of Data Import testing for MARC Bibliographic records on the Quesnelia release [non-ECS]. Ticket: PERF-836 (https://folio-org.atlassian.net/jira/people/712020:c7153665-e98d-4df6-a9f4-fe368ae2480f/boards/224?selectedIssue=PERF-836)

Summary

Recommendations and Jiras

Results

| Test # | Data-import test | Profile | Duration, Poppy with R/W split enabled | Duration, Quesnelia with R/W split enabled | Difference, % / sec | Results |
|---|---|---|---|---|---|---|
| 1 | 1k MARC BIB Create | PTF - Create 2 | 39 sec | 54 sec | ↓ 15 sec | Completed |
| 2 | 5k MARC BIB Create | PTF - Create 2 | 2 min 22 sec | 3 min 20 sec | ↓ 1 min 8 sec | Completed |
| 3 | 10k MARC BIB Create | PTF - Create 2 | 4 min 29 sec | 6 min | ↓ 1 min 31 sec | Completed |
| 4 | 25k MARC BIB Create | PTF - Create 2 | 10 min 38 sec | 13 min 41 sec | ↓ 3 min 3 sec | Completed |
| 5 | 50k MARC BIB Create | PTF - Create 2 | 20 min 26 sec | 21 min 59 sec | ↓ 1 min 33 sec | Completed |
| 6 | 100k MARC BIB Create | PTF - Create 2 | 2 hours 46 min (Cancelled) | 40 min 16 sec | | Completed |
| 7 | 500k MARC BIB Create | PTF - Create 2 | Not tested | 3 hours 27 min | | Completed |

| Test # | Data-import test | Profile | Duration | Difference, % / sec |
|---|---|---|---|---|
| 8 | 1k MARC BIB Update | PTF - Updates Success - 1 | 34 sec | |
| 9 | 2k MARC BIB Update | PTF - Updates Success - 1 | 1 min 09 sec | |
| 10 | 5k MARC BIB Update | PTF - Updates Success - 1 | 2 min 31 sec | ↓ 6.66% / 17 sec |
| 11 | 10k MARC BIB Update | PTF - Updates Success - 1 | 5 min 13 sec | ↓ 1.84% / 10 sec |
| 12 | 25k MARC BIB Update | PTF - Updates Success - 1 | 12 min 27 sec | ↓ 14% / 105 sec |
| 13 | 25k MARC BIB Update | PTF - Updates Success - 1 | 2 min 15 sec | |
| 14 | 25k MARC BIB Update | PTF - Updates Success - 1 | 12 min | |
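The "Difference, % / sec" column in the results above compares two durations for the same test. A minimal Python sketch of that arithmetic follows; the helper names `parse_duration` and `compare` are hypothetical, and the sketch assumes durations are written in the "N hours N min N sec" style used in this report, with the percentage taken relative to the Poppy duration.

```python
import re
from typing import Tuple

# Seconds per unit, keyed by the unit spellings used in this report.
UNIT_SECONDS = {"hour": 3600, "hours": 3600, "min": 60, "minutes": 60, "sec": 1}


def parse_duration(text: str) -> int:
    """Convert a string such as '2 min 22 sec' or '6 minutes' to whole seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s*(hours?|minutes|min|sec)", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total


def compare(poppy: str, quesnelia: str) -> Tuple[int, float]:
    """Return (difference in seconds, difference in % of the Poppy duration).

    Positive values mean the Quesnelia run took longer.
    """
    base = parse_duration(poppy)
    new = parse_duration(quesnelia)
    delta = new - base
    return delta, round(100 * delta / base, 1)


# Three Create rows taken from the results table (Poppy, Quesnelia).
ROWS = [
    ("1k MARC BIB Create", "39 sec", "54 sec"),
    ("10k MARC BIB Create", "4 min 29 sec", "6 minutes"),
    ("25k MARC BIB Create", "10 min 38 sec", "13 min 41 sec"),
]

for name, poppy, quesnelia in ROWS:
    delta_sec, delta_pct = compare(poppy, quesnelia)
    print(f"{name}: {delta_sec:+d} sec ({delta_pct:+.1f}%)")
```

Reporting both the absolute and the relative difference keeps small jobs, where a few seconds is a large percentage, comparable with large ones.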

Test Runs 

MARC BIB CREATE

Tests #1-7 1k, 5k, 10k, 25k, 50k, 100k, 500k records

| # | Data-import | start time | end time |
|---|---|---|---|
| 1 | 500k_bib_Create.mrc | 2024-04-01 09:56:59.095+00 | 2024-04-01 13:26:19.429+00 |
| 2 | 100k_bib_Create.mrc | 2024-04-01 09:03:56.04+00 | 2024-04-01 09:44:12.654+00 |
| 3 | 50k_bib_Create.mrc | 2024-04-01 08:18:58.078+00 | 2024-04-01 08:40:56.215+00 |
| 4 | 25k_bib_Create.mrc | 2024-04-01 07:58:48.679+00 | 2024-04-01 08:12:30.555+00 |
| 5 | 10k_bib_Create.mrc | 2024-04-01 07:47:09.388+00 | 2024-04-01 07:53:08.405+00 |
| 6 | 5k_bib_Create.mrc | 2024-04-01 07:40:32.282+00 | 2024-04-01 07:43:52.674+00 |
| 7 | 1k_bib_Create.mrc | 2024-04-01 07:38:30.511+00 | 2024-04-01 07:39:24.804+00 |
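Job durations can be derived from the start and end times listed above. The following Python sketch shows one way to do that and to estimate throughput; the function names are hypothetical, the record counts are assumed from the file names, and the timestamps are assumed to always carry a `+00` (UTC) offset as they do in this report.

```python
from datetime import datetime, timezone


def parse_ts(text: str) -> datetime:
    """Parse a timestamp such as '2024-04-01 09:03:56.04+00' as UTC."""
    if not text.endswith("+00"):
        raise ValueError(f"expected a '+00' UTC offset: {text!r}")
    # %f accepts one to six fractional digits, so '.04' parses as 40 ms.
    naive = datetime.strptime(text[:-3], "%Y-%m-%d %H:%M:%S.%f")
    return naive.replace(tzinfo=timezone.utc)


def job_seconds(start: str, end: str) -> float:
    """Elapsed time between two timestamps, in seconds."""
    return (parse_ts(end) - parse_ts(start)).total_seconds()


def format_duration(seconds: float) -> str:
    """Render seconds as 'H hours M min S sec', dropping fractional seconds."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    parts = []
    if hours:
        parts.append(f"{hours} hours")
    if hours or minutes:
        parts.append(f"{minutes} min")
    parts.append(f"{secs} sec")
    return " ".join(parts)


# (file, record count assumed from the file name, start time, end time)
RUNS = [
    ("1k_bib_Create.mrc", 1_000,
     "2024-04-01 07:38:30.511+00", "2024-04-01 07:39:24.804+00"),
    ("10k_bib_Create.mrc", 10_000,
     "2024-04-01 07:47:09.388+00", "2024-04-01 07:53:08.405+00"),
    ("100k_bib_Create.mrc", 100_000,
     "2024-04-01 09:03:56.04+00", "2024-04-01 09:44:12.654+00"),
]

for name, records, start, end in RUNS:
    elapsed = job_seconds(start, end)
    print(f"{name}: {format_duration(elapsed)}, "
          f"{records / elapsed:.1f} records/sec")
```

Throughput in records per second makes runs of different sizes directly comparable, which the raw durations do not.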

Service CPU Utilization 

MARC BIB CREATE

Tests #1-7

1k, 5k, 10k, 25k, 50k, 100k, 500k records

CPU utilization for all modules returned to baseline levels after all tests. Average utilization: mod-inventory-b 130%, mod-inventory-storage-b 25%, mod-source-record-storage-b 40%, mod-source-record-manager-b 35%, mod-di-converter-storage-b 70%. mod-data-import spiked to 350% during the 500k job (the same behaviour was observed on Poppy).

image-20240402-125027.png

MARC BIB UPDATE

Tests #8-14

1k, 2k, 5k, 10k, 25k, 25k, 25k records

Average utilization: mod-inventory-b 220%, mod-inventory-storage-b 25%, mod-source-record-storage-b 50%, mod-source-record-manager-b 45%, mod-di-converter-storage-b 90%. mod-data-import spiked to 96% during the 25k job.

Memory Utilization

No memory leak is suspected for DI modules.

MARC BIB CREATE

Tests #1-7

1k, 5k, 10k, 25k, 50k, 100k, 500k records

image-20240402-123807.png

MARC BIB UPDATE

Tests #8-14

1k, 2k, 5k, 10k, 25k, 25k, 25k records

RDS CPU Utilization 

MARC BIB CREATE

Average 95% for DI jobs with more than 10k records

image-20240402-125315.png

MARC BIB UPDATE

RDS Database Connections

MARC BIB CREATE
Maximum of 275 connections for DI Create jobs and 260 for DI Update jobs.

image-20240402-125422.png
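To put the connection peaks in context, they can be compared with the `max_connections` value of 2731 listed for the database instance in the Appendix. The sketch below is illustrative only: the helper name `utilization` is hypothetical, and it assumes the reported peaks and the limit refer to the same instance.

```python
# Limit from the Appendix (db.r6g.xlarge) and peaks from this section.
MAX_CONNECTIONS = 2731
PEAK_CONNECTIONS = {"Create": 275, "Update": 260}


def utilization(peak: int, limit: int) -> float:
    """Peak connections as a percentage of the configured limit."""
    return round(100 * peak / limit, 1)


for job_type, peak in PEAK_CONNECTIONS.items():
    share = utilization(peak, MAX_CONNECTIONS)
    print(f"DI {job_type}: {peak} of {MAX_CONNECTIONS} connections ({share}%)")
```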

Average active sessions (AAS)

MARC BIB CREATE

image-20240402-125550.png

Top SQL

image-20240402-125636.png

MARC BIB UPDATE

Top SQL

INSERT INTO fs09000000_mod_source_record_manager.events_processed 

INSERT INTO fs09000000_mod_source_record_manager.journal_records 

MSK CPU utilization (Percent) OpenSearch

Average CPU utilization is about 9%.

image-20240402-130225.png

CPU (User) usage by broker

image-20240402-130345.png

Errors

Appendix

Infrastructure

PTF environment pcp1

  • 10 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1

  • 2 database instances, writer/reader

    | Name | Memory GiB | vCPUs | max_connections |
    |---|---|---|---|
    | db.r6g.xlarge | 32 GiB | 4 vCPUs | 2731 |

  • MSK tenant

    • 4 m5.2xlarge brokers in 2 zones

    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true

    • log.retention.minutes=480

    • default.replication.factor=3

Methodology

  1. Prepare files for DI Create job

    • 1K, 2K, 5K, 10K, 25K, 50K, 100K files.

  2. Run DI Create jobs on a single tenant one at a time, with a delay between jobs, using the PTF - Create 2 profile.

  3. Prepare files for DI Update with the Data export app.

  4. Run DI Update jobs on a single tenant one at a time, with a delay between jobs, using the prepared files and the PTF - Updates Success - 1 profile.

-- Duration of the 10 most recently started parent Data Import jobs
SELECT (completed_date - started_date) AS duration, *
FROM fs09000000_mod_source_record_manager.job_execution
WHERE subordination_type = 'COMPOSITE_PARENT'
ORDER BY started_date DESC
LIMIT 10;