Data Import test report - NLA

Overview

This document contains the results of testing Data Import for NLA. https://folio-org.atlassian.net/browse/PERF-599

Summary

  • Job duration varies by up to about 1.5 minutes between test runs when importing 5k-10k records (for example, from 8 min 48 sec to 10 min 19 sec for 10k records).

  • Memory utilization was stable; no memory leak is suspected in any of the modules.

  • Average CPU usage did not exceed 130% for all modules.

  • DB CPU usage reached approximately 97%.

Recommendations and Jiras

 

Investigate the cases where instances return status code 404 during data import: https://folio-org.atlassian.net/browse/PERF-609

Results

| Files | Records number | Job profile | Test 1 duration | Test 2 duration | Test 3 duration |
|---|---|---|---|---|---|
| For_LA_edeposit_updates_1k.mrc | 1000 | LMS LA edeposit records update | 53 sec | 56 sec | 1 min 13 sec |
| For_LA_edeposit_updates_5k.mrc | 5000 | LMS LA edeposit records update | 5 min 44 sec | 4 min 57 sec | 6 min 9 sec |
| For_LA_edeposit_updates_10k.mrc | 10000 | LMS LA edeposit records update | 9 min 7 sec | 8 min 48 sec | 10 min 19 sec |
| For_DISC_HRID_match_1.mrc | 1 | DISC HRID match | 2 sec | 2 sec | 2 sec |
| For_DISC_HRID_match_12.mrc | 12 | DISC HRID match | 2 sec | 3 sec | 3 sec |
| For_DISC_HRID_match_251.mrc | 251 | DISC HRID match | 12 sec | 15 sec | 12 sec |
| For_DISC_HRID_match_1k.mrc | 1000 | DISC HRID match | 43 sec | 1 min | 47 sec |
| For_DISC_NewNonEdepositRecords_5.mrc | 5 | DISC New NON edeposit records | 3 sec | 3 sec | 2 sec |
| NewEDepositRecords_13.mrc | 13 | DISC New edeposit records | 3 sec | 3 sec | 3 sec |
| NewEDepositRecords_54.mrc | 54 | DISC New edeposit records | 5 sec | 7 sec | 4 sec |
| NewEDepositRecords_74.mrc | 74 | DISC New edeposit records | 5 sec | 7 sec | 4 sec |
| NewEDepositRecords_77.mrc | 77 | DISC New edeposit records | 6 sec | 9 sec | 4 sec |
| NewEDepositRecords_100.mrc | 100 | DISC New edeposit records | 13 sec | 11 sec | 5 sec |
| NewEDepositRecords_200.mrc | 200 | DISC New edeposit records | 13 sec | 17 sec | 8 sec |

 * The order of jobs in the table corresponds to the order of jobs on the graphs, where each job is marked by its record count.
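The run-to-run variation quoted in the Summary can be rechecked from the table with a short script. The following is a minimal sketch, not part of the original test tooling: the helper names `to_seconds` and `summarize` are hypothetical, and only the three largest update jobs are copied in as sample data.

```python
import re

def to_seconds(text):
    """Convert a duration such as '5 min 44 sec', '1 min' or '53 sec' to seconds."""
    minutes = re.search(r"(\d+)\s*min", text)
    seconds = re.search(r"(\d+)\s*sec", text)
    total = int(minutes.group(1)) * 60 if minutes else 0
    total += int(seconds.group(1)) if seconds else 0
    return total

def summarize(records, durations):
    """Return fastest/slowest run, their difference, and the best throughput."""
    secs = [to_seconds(d) for d in durations]
    fastest, slowest = min(secs), max(secs)
    return {
        "fastest_s": fastest,
        "slowest_s": slowest,
        "spread_s": slowest - fastest,
        "best_records_per_s": round(records / fastest, 1),
    }

# Durations copied from the results table (Test 1, Test 2, Test 3).
runs = {
    "For_LA_edeposit_updates_1k.mrc": (1000, ["53 sec", "56 sec", "1 min 13 sec"]),
    "For_LA_edeposit_updates_5k.mrc": (5000, ["5 min 44 sec", "4 min 57 sec", "6 min 9 sec"]),
    "For_LA_edeposit_updates_10k.mrc": (10000, ["9 min 7 sec", "8 min 48 sec", "10 min 19 sec"]),
}

for name, (records, durations) in runs.items():
    print(name, summarize(records, durations))
```

For the 10k file this gives a spread of 91 seconds (528 s to 619 s), which matches the roughly 1.5-minute difference reported in the Summary.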

Memory Utilization

Memory utilization for mod-source-record-storage increased by 1% (from 48% to 49%) at the beginning of the 10,000-record update. Memory utilization of all other modules was stable.

Service CPU Utilization 

Average CPU usage did not exceed 91% for all modules.

Instance CPU Utilization

Average CPU usage did not exceed 24%.

RDS CPU Utilization 

DB CPU usage reached approximately 97%.

 

RDS Database Connections

The maximum number of database connections was 490.

Appendix

Infrastructure

Records count:

  • mod_source_record_storage.marc_records_lb = 7300919

  • mod_source_record_storage.raw_records_lb = 7300919

  • mod_source_record_storage.records_lb = 7300919

  • mod_source_record_storage.marc_indexers = 245032159 (all records)

  • mod_source_record_storage.marc_indexers with field_no 010 = 1008129

  • mod_source_record_storage.marc_indexers with field_no 035 = 8968420

  • mod_inventory_storage.authority = 852215

  • mod_inventory_storage.holdings_record = 6091403

  • mod_inventory_storage.instance = 5581816

  • mod_inventory_storage.item = 5705915

PTF environment: nptf

  • 8 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1

  • 1 database instance, writer

  • MSK ptf-kakfa-3

    • 4 m5.2xlarge brokers in 2 zones

    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true

    • log.retention.minutes=480

    • default.replication.factor=3

  • Kafka topics partitioning: 2 partitions for DI topics

 

Modules memory and CPU parameters

| Modules | Version | Task Definition | Running Tasks | CPU | Memory | MemoryReservation | MaxMetaspaceSize | Xmx |
|---|---|---|---|---|---|---|---|---|
| mod-inventory-storage | 26.0.0 | 1 | 2 | 1024 | 2208 | 1952 | 384 | 1440 |
| mod-inventory | 20.0.4 | 3 | 2 | 1024 | 2880 | 2592 | 512 | 1814 |
| mod-source-record-storage | 5.6.6 | 5 | 2 | 1024 | 4096 | 3688 | 512 | 3076 |
| mod-quick-marc | 3.0.0 | 1 | 1 | 128 | 2288 | 2176 | 512 | 1664 |
| mod-source-record-manager | 3.6.3 | 5 | 2 | 1024 | 4096 | 3688 | 512 | 3076 |
| mod-di-converter-storage | 2.0.2 | 2 | 2 | 128 | 1024 | 896 | 128 | 768 |
| mod-data-import | 2.7.1 | 1 | 1 | 256 | 2048 | 1844 | 512 | 1292 |
| okapi | 5.0.1 | 2 | 3 | 1024 | 1684 | 1440 | 512 | 922 |
| nginx-okapi | 2022.03.02 | 1 | 2 | 128 | 1024 | 896 | - | - |
| pub-okapi | 2022.03.02 | 1 | 2 | 128 | 1024 | 896 | - | 768 |
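When reviewing these parameters it can help to see how much of each container's memory limit is given to the JVM heap. The sketch below is illustrative only: it assumes the Memory and Xmx columns are expressed in the same unit (the report does not state the unit), and the helper name `heap_share` is hypothetical. Rows with "-" for Xmx are omitted.

```python
# (module, Memory, Xmx) copied from the table above.
# Assumption: both columns use the same unit.
modules = [
    ("mod-inventory-storage", 2208, 1440),
    ("mod-inventory", 2880, 1814),
    ("mod-source-record-storage", 4096, 3076),
    ("mod-quick-marc", 2288, 1664),
    ("mod-source-record-manager", 4096, 3076),
    ("mod-di-converter-storage", 1024, 768),
    ("mod-data-import", 2048, 1292),
    ("okapi", 1684, 922),
    ("pub-okapi", 1024, 768),
]

def heap_share(memory, xmx):
    """Return Xmx as a percentage of the container memory limit."""
    return round(100 * xmx / memory, 1)

for name, memory, xmx in modules:
    print(f"{name}: Xmx is {heap_share(memory, xmx)}% of Memory")
```

The remainder of each limit has to cover metaspace, thread stacks and other non-heap memory, so a very high share may be worth a closer look if a module's memory utilization grows during a run.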

Methodology/Approach

JMeter scripts were used to run the DI baseline tests.

  • 5 min pauses between the tests

Each test was repeated 3 times.
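The repeat-and-pause pattern described above can be sketched as a small driver loop. This is a hypothetical illustration, not the JMeter setup actually used: `run_series` and the `run_job` callable are assumed names, and `run_job` stands in for whatever triggers one import and returns its duration in seconds.

```python
import time

def run_series(run_job, repetitions=3, pause_s=300, sleep=time.sleep):
    """Run a job several times with a fixed pause between consecutive runs.

    run_job: hypothetical callable that performs one import and returns
             its duration in seconds.
    pause_s: pause between runs; 300 s corresponds to the 5-minute pause.
    sleep:   injectable so the loop can be exercised without waiting.
    """
    durations = []
    for attempt in range(repetitions):
        durations.append(run_job())
        # Pause only between runs, not after the last one.
        if attempt < repetitions - 1:
            sleep(pause_s)
    return durations

# Dry run with a stub job and a recorded (non-blocking) sleep.
pauses = []
result = run_series(lambda: 1.0, sleep=pauses.append)
print(result, pauses)
```

Passing `sleep` as a parameter keeps the sketch testable: the dry run records the requested pauses instead of actually waiting five minutes between runs.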