Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

In Progress(in review) + retesting results will be add to these report in scope of the
Jira Legacy
serverSystem Jira
serverId01505d01-b853-3c2e-90f1-ee9b165564fc
keyPERF-681

...

  1. One record on one tenant could be discarded with error: io.netty.channel.StacklessClosedChannelException.
    Jira Legacy
    serverSystem Jira
    serverId01505d01-b853-3c2e-90f1-ee9b165564fc
    keyMODDATAIMP-748
    Reproduces in both cases with and without splitting feature enabled in at least 30% of test runs with 500k record files and multitenant testing.
  2. During the new Data Import splitting feature testing, items for update were discarded with the error: io.vertx.core.impl.NoStackTraceThrowable: Cannot get actual Item by id: org.folio.inventory.exceptions.InternalServerErrorException: Access for user 'data-import-system-user' (f3486d35-f7f7-4a69-bcd0-d8e5a35cb292) requires permission: inventory-storage.items.item.get. Less than 1% of records could be discarded due to missing permission for  'data-import-system-user'. Permission was not added automatically during the service deployment. I added permission manually to the database and the error does not occur anymore.
    Jira Legacy
    serverSystem Jira
    serverId01505d01-b853-3c2e-90f1-ee9b165564fc
    keyMODDATAIMP-930
  3. UI issue, when canceled or completed with error Job progress bar cannot be deleted from the screen.
    Jira Legacy
    serverSystem Jira
    serverId01505d01-b853-3c2e-90f1-ee9b165564fc
    keyMODDATAIMP-929
  4. Usage:
    • Should not use less than 1000 for RECORDS_PER_SPLIT_FILE. The system is stable enough to ingest 1000 records consistently and smaller amounts will incur more overheads, resulting in longer jobs' durations.  CPU utilization for mod-di-converter-storage for 500 RECORDS_PER_SPLIT_FILE(RPSF) = 160%, for 1000RPSF =180%, for 5K RPSF =380% and for 10K RPSF =433%, so in the case of selecting configurations 5K or 10K we recommend to add more CPU to mod-di-converter-storage service.
    • When toggling the file-splitting feature, mod-source-record-storage, mod-source-record-manager's tasks need to be restarted.
    • Keep in mind about the Kafka broker's disk size (as bigger jobs - up to 500K - can be run now), consecutive jobs may use up the disk quickly because the messages' retention time currently is set at 8 hours. For example with 300GB disk size, consecutive jobs of 250K, 500K, 500K sizes will exhaust the disk. 
  5. More CPU could be allocated to mod-inventory and mod-di-converter-storage

...

 ** -  up to 10 items were discarded with the error: io.vertx.core.impl.NoStackTraceThrowable: Cannot get actual Item by id: org.folio.inventory.exceptions.InternalServerErrorException: Access for user 'data-import-system-user' (f3486d35-f7f7-4a69-bcd0-d8e5a35cb292) requires permission: inventory-storage.items.item.get. Less than 1% of records could be discarded due to missing permission for  'data-import-system-user'. Permission was not added automatically during the service deployment. I added permission manually to the database and the error does not occur anymore.

Jira Legacy
serverSystem Jira
serverId01505d01-b853-3c2e-90f1-ee9b165564fc
keyMODDATAIMP-930

...

With CI/CO 20 users and DI 25k records on each of the 3 tenants Splitting Feature Disabled

ocp3-mod-data-import:12

Image Modified

Data Import Robustness Enhancement

...

Memory utilization rich maximal value for mod-source-record-storage-b 88%  and for mod-source-record-manager-b 85%.

Test 2. Test with 1, 2, and 3 tenants' concurrent jobs with configuration RECORDS_PER_SPLIT_FILE = 10K, 2 runs for each test.

...

Test 2. Test with 1, 2, and 3 tenants' concurrent jobs with configuration RECORDS_PER_SPLIT_FILE = 10K, 2 runs for each test.

CPU utilization of  mod-di-converter-storage-b

 

RDS CPU Utilization 

Test 1. Test with 1, 2, and 3 tenants' concurrent jobs with configuration RECORDS_PER_SPLIT_FILE = 500, 2 runs for each test. Maximal  CPU Utilization = 95%

...

Service CPU Utilization 

Memory Utilization

Image Modified


RDS CPU Utilization  

RDS Database Connections


Image Modified

Appendix

Infrastructure ocp3  with the "Bugfest" Dataset

Records count :

  • tenant0_mod_source_record_storage.marc_records_lb = 9674629
  • tenant2_mod_source_record_storage.marc_records_lb = 0
  • tenant3_mod_source_record_storage.marc_records_lb = 0
  • tenant0_mod_source_record_storage.raw_records_lb = 9604805
  • tenant2_mod_source_record_storage.raw_records_lb = 0
  • tenant3_mod_source_record_storage.raw_records_lb = 0
  • tenant0_mod_source_record_storage.records_lb = 9674677
  • tenant2_mod_source_record_storage.records_lb = 0
  • tenant3_mod_source_record_storage.records_lb = 0
  • tenant0_mod_source_record_storage.marc_indexers =  620042011
  • tenant2_mod_source_record_storage.marc_indexers =  0
  • tenant3_mod_source_record_storage.marc_indexers =  0
  • tenant0_mod_source_record_storage.marc_indexers with field_no 010 = 3285833
  • tenant2_mod_source_record_storage.marc_indexers with field_no 010 = 0
  • tenant3_mod_source_record_storage.marc_indexers with field_no 010 = 0
  • tenant0_mod_source_record_storage.marc_indexers with field_no 035 = 19241844
  • tenant2_mod_source_record_storage.marc_indexers with field_no 035 = 0
  • tenant3_mod_source_record_storage.marc_indexers with field_no 035 = 0
  • tenant0_mod_inventory_storage.authority = 4
  • tenant2_mod_inventory_storage.authority = 0
  • tenant3_mod_inventory_storage.authority = 0
  • tenant0_mod_inventory_storage.holdings_record = 9592559
  • tenant2_mod_inventory_storage.holdings_record = 16
  • tenant3_mod_inventory_storage.holdings_record = 16
  • tenant0_mod_inventory_storage.instance = 9976519
  • tenant2_mod_inventory_storage.instance = 32
  • tenant3_mod_inventory_storage.instance = 32 
  • tenant0_mod_inventory_storage.item = 10787893
  • tenant2_mod_inventory_storage.item = 19
  • tenant3_mod_inventory_storage.item = 19

PTF -environment ocp3 

...

2 database  instances, one reader, and one writer

...

  • 4 m5.2xlarge brokers in 2 zones
  • Apache Kafka version 2.8.0

  • EBS storage volume per broker 300 GiB

  • auto.create.topics.enable=true
  • log.retention.minutes=480
  • default.replication.factor=3

...

Before Splitting Feature released

...

Test 1.2. Data-import of 250K records file with "PTF - Update" job profile

Service CPU Utilization 
Image Added

Memory Utilization
Image Added

RDS CPU Utilization 

Image Added

RDS Database Connections

Image Added

Test 1.3. Multitenant MARC Create (100k, 50k, and 1 record)

Service CPU Utilization 
Image Added

Memory Utilization
Image Added

RDS CPU Utilization 

Image Added

RDS Database Connections

Image Added


Test 1.4. Data-import of 250K records file with "PTF - Update" job profile

Service CPU Utilization 
Image Added

Memory Utilization 

Image Added

RDS CPU Utilization 

Image Added

RDS Database Connections

Image Added

CICO responce time graph
Image Added

Appendix

Infrastructure ocp3  with the "Bugfest" Dataset

Records count :

  • tenant0_mod_source_record_storage.marc_records_lb = 9674629
  • tenant2_mod_source_record_storage.marc_records_lb = 0
  • tenant3_mod_source_record_storage.marc_records_lb = 0
  • tenant0_mod_source_record_storage.raw_records_lb = 9604805
  • tenant2_mod_source_record_storage.raw_records_lb = 0
  • tenant3_mod_source_record_storage.raw_records_lb = 0
  • tenant0_mod_source_record_storage.records_lb = 9674677
  • tenant2_mod_source_record_storage.records_lb = 0
  • tenant3_mod_source_record_storage.records_lb = 0
  • tenant0_mod_source_record_storage.marc_indexers =  620042011
  • tenant2_mod_source_record_storage.marc_indexers =  0
  • tenant3_mod_source_record_storage.marc_indexers =  0
  • tenant0_mod_source_record_storage.marc_indexers with field_no 010 = 3285833
  • tenant2_mod_source_record_storage.marc_indexers with field_no 010 = 0
  • tenant3_mod_source_record_storage.marc_indexers with field_no 010 = 0
  • tenant0_mod_source_record_storage.marc_indexers with field_no 035 = 19241844
  • tenant2_mod_source_record_storage.marc_indexers with field_no 035 = 0
  • tenant3_mod_source_record_storage.marc_indexers with field_no 035 = 0
  • tenant0_mod_inventory_storage.authority = 4
  • tenant2_mod_inventory_storage.authority = 0
  • tenant3_mod_inventory_storage.authority = 0
  • tenant0_mod_inventory_storage.holdings_record = 9592559
  • tenant2_mod_inventory_storage.holdings_record = 16
  • tenant3_mod_inventory_storage.holdings_record = 16
  • tenant0_mod_inventory_storage.instance = 9976519
  • tenant2_mod_inventory_storage.instance = 32
  • tenant3_mod_inventory_storage.instance = 32 
  • tenant0_mod_inventory_storage.item = 10787893
  • tenant2_mod_inventory_storage.item = 19
  • tenant3_mod_inventory_storage.item = 19

PTF -environment ocp3 

  • 10 m6i.2xlarge EC2 instances located in US East (N. Virginia)us-east-1
  • 2 database  instances, one reader, and one writer

    NameAPI NameMemory GIBvCPUsmax_connections
    R6G Extra Largedb.r6g.xlarge32 GiB4 vCPUs2731


  • MSK ptf-kakfa-3
    • 4 m5.2xlarge brokers in 2 zones
    • Apache Kafka version 2.8.0

    • EBS storage volume per broker 300 GiB

    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3
  • Kafka topics partitioning: - 2 partitions for DI topics

Before Splitting Feature released

Module
ocp3-pvt
Mon Sep 11 09:33:28 UTC 2023
Task Def. RevisionModule VersionTask CountMem Hard LimitMem Soft limitCPU unitsXmxMetaspaceSizeMaxMetaspaceSizeR/W split enabled
mod-remote-storage13mod-remote-storage:2.0.324920447210243960512512false
mod-agreements8mod-agreements:5.5.2215921488128968384512false
mod-data-import7mod-data-import:2.7.11204818442561292384512false
mod-search30mod-search:2.0.1225922480204814405121024false
mod-authtoken7mod-authtoken:2.13.021440115251292288128false
mod-configuration7mod-configuration:5.9.12102489612876888128false
mod-inventory-storage1mod-inventory-storage:26.1.0-SNAPSHOT.66502208195210241440384512false
mod-circulation-storage15mod-circulation-storage:16.0.122880259215361814384512false
mod-source-record-storage11mod-source-record-storage:5.6.725600500020483500384512false
mod-calendar7mod-calendar:2.4.22102489612876888128false
mod-inventory12mod-inventory:20.0.622880259210241814384512false
mod-circulation9mod-circulation:23.5.622880259215361814384512false
mod-di-converter-storage8mod-di-converter-storage:2.0.52102489612876888128false
mod-pubsub8mod-pubsub:2.9.12153614401024922384512false
mod-users8mod-users:19.1.12102489612876888128false
mod-patron-blocks8mod-patron-blocks:1.8.021024896102476888128false
mod-source-record-manager9mod-source-record-manager:3.6.425600500020483500384512false
nginx-edge7nginx-edge:2023.06.1421024896128000false
mod-quick-marc7mod-quick-marc:3.0.01228821761281664384512false
nginx-okapi7nginx-okapi:2023.06.1421024896128000false
okapi-b8okapi:5.0.13168414401024922384512false
mod-feesfines7mod-feesfines:18.2.12102489612876888128false
mod-patron7mod-patron:5.5.22102489612876888128false
mod-notes7mod-notes:5.0.121024896128952384512false
pub-okapi7pub-okapi:2023.06.142102489612876800false


Service versions for Splitting Feature test

Module
ocp3-pvt
Mon Sep 25 12:43:06 UTC 2023
Task Def. RevisionModule VersionTask CountMem Hard LimitMem Soft limitCPU unitsXmxMetaspaceSizeMaxMetaspaceSizeR/W
split enabledmod-remote-storage13mod-remote-storage:2.0.324920447210243960512512falsemod-agreements8mod-agreements:5.5.2215921488128968384512false
split enabled
mod-data-import
7
10mod-data-import:2.7.2-SNAPSHOT.
1
1371204818442561292384512false
mod-search30mod-search:2.0.1225922480204814405121024false
mod-
authtoken
configuration
7
8mod-
authtoken
configuration:
2
5.
13
9.
0
12
1440
1024
1152
896
512
128
922
76888128false
mod-bulk-
configuration
operations7mod-
configuration
bulk-operations:
5
1.
9
0.
1
62
1024
3072
896
2600
128
1024
768
1536
88
384
128
512false
mod-inventory-storage1mod-inventory-storage:26.1.0-SNAPSHOT.66502208195210241440384512false
mod-circulation-storage15mod-circulation-storage:16.0.122880259215361814384512false
mod-source-record-storage
11
12mod-source-record-storage:5.6.725600500020483500384512false
mod-calendar7mod-calendar:2.4.22102489612876888128false
mod-inventory12mod-inventory:20.0.622880259210241814384512false
mod-circulation9mod-circulation:23.5.622880259215361814384512false
mod-di-converter-storage8mod-di-converter-storage:2.0.52102489612876888128false
mod-pubsub
8
9mod-pubsub:2.9.12153614401024922384512false
mod-users
8
9mod-users:19.1.12102489612876888128false
mod-patron-blocks
8
9mod-patron-blocks:1.8.021024896102476888128false
mod-source-record-manager
9
12mod-source-record-manager:3.6.
4
5-SNAPSHOT.2452560050002048
3500384512falsenginx-edge7nginx-edge:2023.06.1421024896128000
3500384512false
mod-quick-marc7mod-quick-marc:3.0.01228821761281664384512false
nginx-okapi7nginx-okapi:2023.06.1421024896128000false
okapi-b8okapi:5.0.13168414401024922384512false
mod-feesfines
7
8mod-feesfines:18.2.1
2102489612876888128falsemod-patron7mod-patron:5.5.2
2102489612876888128false
mod-notes7mod-notes:5.0.121024896128952384512false
pub-okapi7pub-okapi:2023.06.142102489612876800false

Service versions for retesting Splitting Feature test on Poppy release. 

Module
ocp3-pvt
Mon Sep 25 12:43:06 UTC 2023
Task Def. RevisionModule VersionTask CountMem Hard LimitMem Soft limitCPU unitsXmxMetaspaceSizeMaxMetaspaceSizeR/W split enabled
mod-data-import
10
36579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-data-import:
2
3.
7.2-SNAPSHOT.137
0.31204818442561292384512
falsemod-search30
FALSE
mod-search31579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-search:
2
3.0.
1
0225922480204814405121024
false
FALSE
mod-configuration
8
9mod-configuration:5.9.
1
22102489612876888128
false
FALSE
mod-bulk-operations
7
8mod-bulk-operations:1.1.0
.6
23072260010241536384512
false
FALSE
edge-ncip8edge-ncip:1.9.02102489612876888128FALSE
mod-inventory-storage
1
12mod-inventory-storage:
26
27.
1
0.0
-SNAPSHOT.665
0
2
2208
4096
1952
3690
1024
2048
1440
3076384512
false
FALSE
mod-circulation-storage
15
16mod-circulation-storage:
16
17.1.0
.1
22880259215361814384512
false
FALSE
mod-source-record-storage
12
13mod-source-record-storage:5.
6
7.
7
025600500020483500384512
false
FALSE
mod-calendar
7
8mod-calendar:2.
4
5.
2
02102489612876888128
false
FALSE
mod-inventory
12
13mod-inventory:20.1.0
.6
22880259210241814384512
false
FALSE
mod-circulation
9
10mod-circulation:
23
24.
5
0.
6
022880259215361814384512
false
FALSE
mod-di-converter-storage
8
9mod-di-converter-storage:2.1.0
.5
2102489612876888128
false
FALSE
mod-pubsub
9
10mod-pubsub:2.
9
11.
1
02153614401024922384512
false
FALSE
mod-users
9
10mod-users:19.
1
2.
1
02102489612876888128
false
FALSE
mod-patron-blocks
9
10mod-patron-blocks:1.
8
9.021024896102476888128
false
FALSE
mod-source-record-manager
12
15mod-source-record-manager:3.
6.5-SNAPSHOT.245
7.025600500020483500384512
false
FALSE
mod-quick-marc
7
8mod-quick-marc:
3
5.0.01228821761281664384512
false
FALSE
nginx-okapi
7
8nginx-okapi:2023.06.1421024896128000
false
FALSE
okapi-b
8
9okapi:5.
0
1.13168414401024922384512
false
FALSE
mod-feesfines
8
9mod-feesfines:
18
19.
2
0.
1
02102489612876888128
false
FALSE
mod-notes
7
8mod-notes:5.1.0
.1
21024896128952384512
false
FALSE
pub-okapi
7
8pub-okapi:2023.06.142102489612876800
false
FALSE


Methodology/Approach

To set splitting feature: Detailed Release Notes for Data Import Splitting Feature

...