Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Table of Contents

...

This document contains the results of testing Data Import Splitting Feature for MARC Bibliographic records in the Orchid release to detect the baseline for ocp3.

Jira Legacy
serverFOLIO Issue TrackerSystem Jira
serverId6ccf3fe401505d01-3301b853-368a3c2e-983e90f1-20c466b11a49ee9b165564fc
keyPERF-644
Jira Legacy
serverFOLIO Issue TrackerSystem Jira
serverId6ccf3fe401505d01-3301b853-368a3c2e-983e90f1-20c466b11a49ee9b165564fc
keyPERF-645
Jira Legacy
serverFOLIO Issue TrackerSystem Jira
serverId6ccf3fe401505d01-3301b853-368a3c2e-983e90f1-20c466b11a49ee9b165564fc
keyPERF-647
Jira Legacy
serverFOLIO Issue TrackerSystem Jira
serverId6ccf3fe401505d01-3301b853-368a3c2e-983e90f1-20c466b11a49ee9b165564fc
keyPERF-646
Jira Legacy
serverFOLIO Issue TrackerSystem Jira
serverId6ccf3fe401505d01-3301b853-368a3c2e-983e90f1-20c466b11a49ee9b165564fc
keyPERF-671

Splitting feature documentation Detailed Release Notes for Data Import Splitting Feature

...

Recommendations and Jiras

Jira Legacy
serverFOLIO Issue TrackerSystem Jira
serverId6ccf3fe401505d01-3301b853-368a3c2e-983e90f1-20c466b11a49ee9b165564fc
keyMODDATAIMP-924

Results

Test #

Profile

Splitting Feature EnabledResults

Splitting Feature Disabled

ResultsBefore Splitting Feature releasedResults
1

100K MARC Create

PTF - Create 237 min -39 minCompleted40 minCompleted32-33 minutesCompleted
1

250K MARC Create 

PTF - Create 21 hour 32 minCompleted1 hour 41 minCompleted1 hour 33 min - 1 hour 57 minCompleted
1500K MARC CreatePTF - Create 23 hours 29 minCompleted*3 hours 55 minCompleted3 hours 33 minCompleted
2Multitenant MARC Create (100k, 50k, and 1 record)PTF - Create 22 hours 40 minCompleted*

3 hours 1 minCompleted
3CI/CO + DI MARC Create (20 users CI/CO, 25k records DI on 3 tenants)PTF - Create 2



24 minCompleted *
4

100K MARC Update (Create new file)

PTF - Updates Success - 1

58 min 25 sec

57 min 19 sec

Completed1 hour 3 minCompleted--
4

250K MARC Update

PTF - Updates Success - 1

2 hours 2 min **


2 hours 12 min

Completed with errors **

Completed

1 hour 53 minCompleted--
4500K MARC UpdatePTF - Updates Success - 1

4 hours 43 min

4 hours 38 minutes

Completed

Completed

5 hour 59 minCompleted--

 * - One record on one tenant could be discarded with error: io.netty.channel.StacklessClosedChannelException

Jira Legacy
serverFOLIO Issue TrackerSystem Jira
serverId6ccf3fe401505d01-3301b853-368a3c2e-983e90f1-20c466b11a49ee9b165564fc
keyMODDATAIMP-748
?


 ** -  up to 10 items were discarded with the error: io.vertx.core.impl.NoStackTraceThrowable: Cannot get actual Item by id: org.folio.inventory.exceptions.InternalServerErrorException: Access for user 'data-import-system-user' (f3486d35-f7f7-4a69-bcd0-d8e5a35cb292) requires permission: inventory-storage.items.item.get

...

Data Import Robustness Enhancement 

Jira Legacy
serverFOLIO Issue TrackerSystem Jira
serverId6ccf3fe401505d01-3301b853-368a3c2e-983e90f1-20c466b11a49ee9b165564fc
keyPERF-646

25K records RECORDS_PER_SPLIT_FILE
Number of concurent tenantsJob profile 500
1K
5K
10K
1 TenantPTF - Create 212 minutes 55 secondsCompleted11 minutes 48 secondsCompleted





10 minutes 31 secondsCompleted09 minutes 32 secondsCompleted



2 TenantsPTF - Create 219 minutes 29 secondsCompleted15 minutes 47 secondsCompleted





18 minutes 19 secondsCompleted15 minutes 47 secondsCompleted



3 TenantsPTF - Create 2

24 minutes

15 seconds

Completed

25 minutes

47 seconds

Completed





24 minutes 
38 seconds
Completed

23 minutes

28 seconds








Test #3 With CI/CO 20 users and DI 25k records on each of the 3 tenants

...

Test#2 Multitenant  DI (9 concurrent jobs)Image Removed

Maximal DB CPU usage is about 95%

Image Added


Test#3 With CI/CO

...

  • tenant0_mod_source_record_storage.marc_records_lb = 9674629
  • tenant2_mod_source_record_storage.marc_records_lb = 0
  • tenant3_mod_source_record_storage.marc_records_lb = 0
  • tenant0_mod_source_record_storage.raw_records_lb = 9604805
  • tenant2_mod_source_record_storage.raw_records_lb = 0
  • tenant3_mod_source_record_storage.raw_records_lb = 0
  • tenant0_mod_source_record_storage.records_lb = 9674677
  • tenant2_mod_source_record_storage.records_lb = 0
  • tenant3_mod_source_record_storage.records_lb = 0
  • tenant0_mod_source_record_storage.marc_indexers =  620042011
  • tenant2_mod_source_record_storage.marc_indexers =  0
  • tenant3_mod_source_record_storage.marc_indexers =  0
  • tenant0_mod_source_record_storage.marc_indexers with field_no 010 = 3285833
  • tenant2_mod_source_record_storage.marc_indexers with field_no 010 = 0
  • tenant3_mod_source_record_storage.marc_indexers with field_no 010 = 0
  • tenant0_mod_source_record_storage.marc_indexers with field_no 035 = 19241844
  • tenant2_mod_source_record_storage.marc_indexers with field_no 035 = 0
  • tenant3_mod_source_record_storage.marc_indexers with field_no 035 = 0
  • tenant0_mod_inventory_storage.authority = 4
  • tenant2_mod_inventory_storage.authority = 0
  • tenant3_mod_inventory_storage.authority = 0
  • tenant0_mod_inventory_storage.holdings_record = 9592559
  • tenant2_mod_inventory_storage.holdings_record = 16
  • tenant3_mod_inventory_storage.holdings_record = 16
  • tenant0_mod_inventory_storage.instance = 9976519
  • tenant2_mod_inventory_storage.instance = 32
  • tenant3_mod_inventory_storage.instance = 32 
  • tenant0_mod_inventory_storage.item = 10787893
  • tenant2_mod_inventory_storage.item = 19
  • tenant3_mod_inventory_storage.item = 19

PTF -environment ocp3 

  • 10 m6i.2xlarge EC2 instances located in US East (N. Virginia)us-east-1
  • 2 database  instances, one reader, and one writer

    NameAPI NameMemory GIBvCPUsmax_connections
    R6G Extra Largedb.r6g.xlarge32 GiB4 vCPUs2731


  • MSK ptf-kakfa-3
    • 4 m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3
  • Kafka topics partitioning: - 2 partitions for DI topics

...