OAI-PMH data harvesting [Concurrent Incremental] (Poppy)

Overview

  • The purpose of the OAI-PMH Concurrent Incremental Harvesting tests is to measure the performance of the Poppy release and to find possible issues and bottlenecks on the OCP3 environment (PERF-786).
  • Previous results of Incremental OAI-PMH harvesting: PERF-660.

Summary

  • OAI-PMH - Incremental Harvesting:

    • Three tests were executed with a JMeter script to check the performance of harvesting 10K, 25K, 50K, 500K and 1 MLN records with different OAI-PMH behaviors:

      • Test 1. Record source set to Source record storage ;

      • Test 2. Record source set to Inventory* (data set limit in OCP3 - 250k) ;

      • Test 3.  Record source set to Source record storage and inventory.

    • Number of multiple concurrent harvests:
      • 2 harvests;
      • 4 harvests;
      • 6 harvests.
  • CPU utilization in all tests grew with the number of concurrent harvests.
    • Test #1 mod-oai-pmh-b: 2 harvests -   5%, 4 harvests - 10%,  6 harvests - 15%
    • Test #2 mod-oai-pmh-b: 2 harvests -   1%, 4 harvests - 3.7%, 6 harvests - 5.5%
    • Test #3 mod-oai-pmh-b: 2 harvests - 10%, 4 harvests - 15%,  6 harvests - 25%
  • Memory consumption was stable, except for mod-inventory, which grew slowly, and mod-oai-pmh, which grew from 46% to 56%. By test:
    • Tests #1 and #3 mod-oai-pmh-b didn't exceed 40%
    • Test #2 mod-oai-pmh-b achieved 55%
  • RDS CPU utilization:
    • Average CPU usage for 2 harvests: 15%
    • Average CPU usage for 4 harvests: 20%
    • Average CPU usage for 6 harvests: 25%
  • Harvest durations differed significantly between tests #1 and #3 (SRS) and test #2 (Inventory) because of how record creation dates are distributed across the range set by the fromDate and untilDate parameters.
  • Durations did not degrade as the number of concurrent harvests increased.
  • Response times for each test can be found in the expandable results in the corresponding "Test #. Record source" section.

Improvements that can be noted in the Poppy release:
1) A non-ECS environment with the Poppy release can handle concurrent OAI-PMH harvests.

Recommendations & Jiras

  • Before running tests, populate the complete_updated_date column in {tenant}_mod_inventory_storage.instance using the migration. More information is in the Appendix.
  • To avoid degraded OAI-PMH response times, check that the top DB queries do not include DELETE and INSERT statements for marc_id values after a cluster restart.
  • To get the same starting conditions before each test with a different Record source setting, the edge-oai-pmh service was restarted. This returned the service memory usage to its initial (after deployment) value.

Test Runs & Results

Incremental harvesting


Duration (hh:mm:ss) by number of concurrent Incremental OAI-PMH harvests. Test 1: Record source = Source record storage; Test 2: Record source = Inventory; Test 3: Record source = Source record storage and inventory.

Number of harvested records | 2 concurrent, Test 1 | 2 concurrent, Test 2 | 2 concurrent, Test 3 | 4 concurrent, Test 1 | 4 concurrent, Test 2 | 4 concurrent, Test 3 | 6 concurrent, Test 1 | 6 concurrent, Test 2 | 6 concurrent, Test 3
10000 records (10K) | 00:02:08 | 00:08:55 | 00:01:39 | 00:01:05 | 00:01:46 | 00:01:31 | 00:01:07 | 00:01:32 | 00:01:14
25000 records (25K) | 00:04:09 | 00:16:25 | 00:04:27 | 00:02:38 | 00:21:00 | 00:04:34 | 00:02:52 | 00:20:32 | 00:02:57
50000 records (50K) | 00:07:40 | 00:33:25 | 00:08:10 | 00:05:17 | 00:32:46 | 00:07:44 | 00:05:34 | 00:32:47 | 00:13:25
500000 records (500K) / 250000 records (250K) in test #2 | 01:56:40 | 02:33:30 | 01:51:24 | 01:58:34 | 02:35:29 | 01:48:48 | 01:34:29 | 02:37:45 | 01:44:42
1000000 records (1MLN) | 02:50:17 | not enough data | 02:39:09 | 02:59:09 | not enough data | 02:50:29 | 03:04:30 | not enough data | 02:58:50

Incremental harvesting

Test 1.  Record source = Source record storage

Results for Test 1. Record source = Source record storage

Test Label | Number of harvested records | Average Response Times, ms | Duration
SRS 2 concurrent 10k | 10000 | 0.982 | 00:02:08
SRS 4 concurrent 10k | 10000 | 0.356 | 00:01:05
SRS 6 concurrent 10k | 10000 | 0.37 | 00:01:07
SRS 2 concurrent 25k | 25000 | 0.689 | 00:04:09
SRS 4 concurrent 25k | 25000 | 0.331 | 00:02:38
SRS 6 concurrent 25k | 25000 | 0.385 | 00:02:52
SRS 2 concurrent 50k | 50000 | 0.616 | 00:07:40
SRS 4 concurrent 50k | 50000 | 0.334 | 00:05:17
SRS 6 concurrent 50k | 50000 | 0.364 | 00:05:34
SRS 2 concurrent 500k | 500000 | 0.903 | 01:56:40
SRS 4 concurrent 500k | 500000 | 1.12 | 01:58:34
SRS 6 concurrent 500k | 500000 | 0.829 | 01:34:29
SRS 2 concurrent 1Mln | 1000000 | 0.718 | 02:50:17
SRS 4 concurrent 1Mln | 1000000 | 0.77 | 02:59:09
SRS 6 concurrent 1Mln | 1000000 | 0.802 | 03:04:30

This graph shows response times for the GET requests that retrieve data. For 4 and 6 concurrent harvests of 10k, 25k and 50k records, response times decrease significantly, for reasons not yet identified, which shortens the harvest duration.
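To compare runs of different sizes, the durations in the Test 1 results can be converted to throughput. The following is a minimal sketch; the helper names `duration_seconds` and `records_per_second` are illustrative and not part of the test tooling, and the record count is taken as listed in the results row.

```python
def duration_seconds(hms: str) -> int:
    """Convert an 'hh:mm:ss' duration from the results table to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def records_per_second(records: int, hms: str) -> float:
    """Throughput for one results row: harvested records divided by duration."""
    return records / duration_seconds(hms)


# Rows copied from the Test 1 results (label, records, duration).
rows = [
    ("SRS 2 concurrent 10k", 10000, "00:02:08"),
    ("SRS 4 concurrent 10k", 10000, "00:01:05"),
    ("SRS 6 concurrent 10k", 10000, "00:01:07"),
]

for label, records, duration in rows:
    print(f"{label}: {records_per_second(records, duration):.1f} records/s")
```

The same helpers can be applied to the Test 3 rows to compare the two record sources on equal terms.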

Service CPU Utilization

During five harvesting tests with 10K, 25k, 50K, 500K and 1MLN records CPU utilization remained steady for the same number of concurrent harvests.

Average CPU usage for 2 harvests: mod-oai-pmh-b = 5%, edge-oai-pmh-b = 3.5%, mod-source-record-storage-b = 2%, okapi-b = 1.5%, mod-inventory-storage-b = 1.5%

Average CPU usage for 4 harvests: mod-oai-pmh-b = 9%, edge-oai-pmh-b = 5.4%, mod-source-record-storage-b = 1.5%, okapi-b = 1.7%, mod-inventory-storage-b = 0.7%

Average CPU usage for 6 harvests: mod-oai-pmh-b = 15.5%, edge-oai-pmh-b = 9%, mod-source-record-storage-b = 1.5%, okapi-b = 2.4%, mod-inventory-storage-b = 1%

There were a few minor fluctuations at the beginning of each test.

Service Memory Consumption

Memory consumption was stable. 

Average memory consumption did not exceed: mod-oai-pmh-b = 40%, edge-oai-pmh-b = 31%, mod-source-record-storage-b = 37%, okapi-b = 37%, mod-inventory-storage-b = 14%

This graph for 10k, 25k, 50k records

This graph for 500k and 1 MLN records

This graph for 1 MLN records only

RDS CPU Utilization

Average CPU utilization was stable for the same number of concurrent harvests. 

Average CPU usage for 2 harvests: 15%

Average CPU usage for 4 harvests: 20%

Average CPU usage for 6 harvests: 25-30%


RDS Database Connections

The number of database connections was about 440.

Database load

This graph shows top sql queries for OAI-PMH 10k, 25k, 50k

This graph shows top sql queries for OAI-PMH 500k, 1 MLN

The marked query ran from cluster start until 16:30 UTC. The same query was also found in the pcp1 cluster.

This graph for 1 MLN only. 4 and 6 concurrent harvests

Test 2.  Record source = Inventory

Service CPU Utilization

Average CPU usage for 2 harvests: mod-oai-pmh-b = 1%, edge-oai-pmh-b = 0.5%, mod-source-record-storage-b = 1.5%, okapi-b = 0.8%, mod-inventory-storage-b = 0.3%

Average CPU usage for 4 harvests: mod-oai-pmh-b = 3.7%, edge-oai-pmh-b = 1.5%, mod-source-record-storage-b = 1.6%, okapi-b = 1.2%, mod-inventory-storage-b = 0.4%

Average CPU usage for 6 harvests: mod-oai-pmh-b = 5.5%, edge-oai-pmh-b = 2%, mod-source-record-storage-b = 1.4%, okapi-b = 1.2%, mod-inventory-storage-b = 0.5%

This graph for 10k, 25k, 50k.

This graph for 250k

Service Memory Consumption

For the 10k, 25k and 50k tests, mod-oai-pmh memory consumption was 28% at the beginning and grew to 46%.

For the 250k tests, mod-oai-pmh memory consumption was 55% at the beginning and stayed at this level.

This graph for 10k, 25k, 50k.

This graph for 250k

RDS CPU Utilization

RDS for 10k, 25k, 50k

The fluctuations on the graph are explained by DELETE and INSERT queries with marc_id values that are connected to the daily cluster restart. This process finished after 14:30, and the rest of the graph shows the tests.

RDS for 250k

RDS Database Connections

Connections are the same as for other tests - 440.


Test 3.  Record source = Source record storage and inventory

Results for Test 3. Record source = Source record storage and inventory

Test Label | Average Response Times, ms | Duration
SRS+INV 2 concurrent 10k | 0.71 | 00:01:39
SRS+INV 4 concurrent 10k | 0.617 | 00:01:31
SRS+INV 6 concurrent 10k | 0.439 | 00:01:14
SRS+INV 2 concurrent 25k | 0.773 | 00:04:27
SRS+INV 4 concurrent 25k | 0.802 | 00:04:34
SRS+INV 6 concurrent 25k | 0.407 | 00:02:57
SRS+INV 2 concurrent 50k | 0.684 | 00:08:10
SRS+INV 4 concurrent 50k | 0.629 | 00:07:44
SRS+INV 6 concurrent 50k | 1.31 | 00:13:25
SRS+INV 2 concurrent 500k | 1.03 | 01:51:24
SRS+INV 4 concurrent 500k | 1 | 01:48:48
SRS+INV 6 concurrent 500k | 0.953 | 01:44:42
SRS+INV 2 concurrent 1Mln | 0.652 | 02:39:09
SRS+INV 4 concurrent 1Mln | 0.721 | 02:50:29
SRS+INV 6 concurrent 1Mln | 0.768 | 02:58:50

Service CPU Utilization

Average CPU usage for 2 harvests: mod-oai-pmh-b = 10%, edge-oai-pmh-b = 7%, mod-source-record-storage-b = 1.7%, okapi-b = 1.5%, mod-inventory-storage-b = 0.6%

Average CPU usage for 4 harvests: mod-oai-pmh-b = 15%, edge-oai-pmh-b = 10%, mod-source-record-storage-b = 1.5%, okapi-b = 2%, mod-inventory-storage-b = 0.8%

Average CPU usage for 6 harvests: mod-oai-pmh-b = 25%, edge-oai-pmh-b = 15%, mod-source-record-storage-b = 1.4%, okapi-b = 2.4%, mod-inventory-storage-b = 1%

The graph shows 10k, 25k, 50k, and 2 harvests of 500k

The graph shows the 500k and 1 MLN harvests

The graph shows the 1 MLN harvests only

Service Memory Utilization

Memory consumption of the OAI-PMH related modules was stable. mod-inventory didn't exceed 72%.

Average memory consumption did not exceed: mod-oai-pmh-b = 40%, edge-oai-pmh-b = 29%, mod-source-record-storage-b = 37%, okapi-b = 37%, mod-inventory-storage-b = 15%, mod-inventory = 72%

RDS CPU Utilization

Average CPU utilization was stable for the same number of concurrent harvests and close to the results of test #1.

The fluctuations on the DB graphs are explained by DELETE queries against the marc_indexers table, with specific conditions, that run after every daily cluster start. They produce high load, which affects OAI-PMH response times. This happens after each cluster restart.

The query does the following:

  • It deletes rows from the marc_indexers table based on conditions defined in two separate subqueries.
  • It captures the marc_id values of the deleted rows.
  • It inserts the distinct marc_id values from both subqueries into the marc_indexers_deleted_ids table to keep track of the deleted marc_id values.

Average CPU usage for 2 harvests: 15%

Average CPU usage for 4 harvests: 20%

Average CPU usage for 6 harvests: 25-30%

RDS Database Connections

The number of database connections was about 440 in all tests.

Database load

This graph shows 10k, 25k, 50k 

Top query:

  • WITH deleted_rows AS (
        delete from marc_indexers mi
        where exists(
            select 1 from marc_records_tracking mrt
            where mrt.is_dirty = true and mrt.marc_id = mi.marc_id and mrt.version > mi.version )
        returning mi.marc_id),
    deleted_rows2 AS (
        delete from marc_indexers mi
        where exists(
            select 1 from records_lb
            where records_lb.id = mi.marc_id and records_lb.state = 'OLD' )
        returning mi.marc_id)
    INSERT INTO marc_indexers_deleted_ids
    SELECT DISTINCT marc_id FROM deleted_rows
    UNION
    SELECT marc_id FROM deleted_rows2



Appendix

Methodology/Approach

OAI-PMH (incremental harvesting) was run by a JMeter script from Carrier with 2 main requests:

  • /oai/records?verb=ListRecords&metadataPrefix=marc21_withholdings&apikey=[APIKey]
  • /oai/records?verb=ListRecords&apikey=[APIKey]&resumptionToken=[resumptionToken]
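The two requests above form the usual OAI-PMH paging pattern: the first ListRecords call starts the harvest, and each later call passes the resumptionToken returned by the previous response. The sketch below only builds the URLs and extracts the token; it does not reproduce the JMeter script. The function names and the example base URL are hypothetical, and the XML namespace is the standard OAI-PMH 2.0 one.

```python
import xml.etree.ElementTree as ET
from typing import Optional
from urllib.parse import urlencode

# Standard OAI-PMH 2.0 XML namespace, in ElementTree's "{uri}tag" notation.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def first_request_url(base_url: str, api_key: str) -> str:
    """URL of the initial ListRecords request (first request in the list above)."""
    query = urlencode({
        "verb": "ListRecords",
        "metadataPrefix": "marc21_withholdings",
        "apikey": api_key,
    })
    return f"{base_url}/oai/records?{query}"


def next_request_url(base_url: str, api_key: str, token: str) -> str:
    """URL of a follow-up request that continues the harvest with a token."""
    query = urlencode({
        "verb": "ListRecords",
        "apikey": api_key,
        "resumptionToken": token,
    })
    return f"{base_url}/oai/records?{query}"


def extract_resumption_token(response_xml: str) -> Optional[str]:
    """Return the resumptionToken from a ListRecords response, or None if the
    element is missing or empty (which marks the end of the harvest)."""
    root = ET.fromstring(response_xml)
    element = root.find(f"{OAI_NS}ListRecords/{OAI_NS}resumptionToken")
    if element is None or not (element.text or "").strip():
        return None
    return element.text.strip()
```

A harvesting client would call `first_request_url` once and then loop on `next_request_url` until `extract_resumption_token` returns None or the required number of records has been read.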

A loop counter with the following configuration was used to extract the required number of records:

  • 98 loop counts for 10K records;
  • 248 loop counts for 25K records;
  • 498 loop counts for 50K records;
  • 2498 loop counts for 250k records* 
  • 4998 loop counts for 500K records;
  • 9998 loop counts for 1MLN records

* - Test #2 data set limit
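The loop counts above are consistent with 100 records per response and two requests made outside the loop. Both values are assumptions inferred from the numbers; the report does not state them. Under those assumptions the counts can be reproduced as follows (`loop_count` is a hypothetical helper name):

```python
def loop_count(total_records: int,
               records_per_response: int = 100,   # assumption, not stated in the report
               requests_outside_loop: int = 2) -> int:  # assumption, not stated in the report
    """Number of loop iterations needed to harvest total_records."""
    # Ceiling division: total number of requests needed for all records.
    total_requests = -(-total_records // records_per_response)
    return total_requests - requests_outside_loop


for label, records in [("10K", 10_000), ("25K", 25_000), ("50K", 50_000),
                       ("250K", 250_000), ("500K", 500_000), ("1MLN", 1_000_000)]:
    print(f"{label}: {loop_count(records)} loop counts")
```

If the response size configured in the environment differs from 100, the loop counts would need to be recalculated accordingly.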

The following time ranges for the incremental harvesting tests were determined experimentally. The time range for Test 2* was extended because the defined number of records could not be harvested otherwise; the subsequent tests were run after adding 800K instances to the database.


Test | Start date | Until date
Test 1. | 2022-12-21 | 2023-10-16
Test 2*. | 1962-12-21 | 2023-10-23*
Test 3. | 2022-12-21 | 2023-10-16


OAI-PMH

Before testing OAI-PMH, the following database commands were executed to optimize the tables (from https://folio-org.atlassian.net/wiki/display/FOLIOtips/OAI-PMH+Best+Practices#OAIPMHBestPractices-SlowPerformance):

REINDEX index <tenant>_mod_inventory_storage.audit_item_pmh_createddate_idx ;
REINDEX index <tenant>_mod_inventory_storage.audit_holdings_record_pmh_createddate_idx;
REINDEX index <tenant>_mod_inventory_storage.holdings_record_pmh_metadata_updateddate_idx;
REINDEX index <tenant>_mod_inventory_storage.item_pmh_metadata_updateddate_idx;
REINDEX index <tenant>_mod_inventory_storage.instance_pmh_metadata_updateddate_idx;
analyze verbose <tenant>_mod_inventory_storage.instance;
analyze verbose <tenant>_mod_inventory_storage.item;
analyze verbose <tenant>_mod_inventory_storage.holdings_record;
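Since the commands above differ only in the `<tenant>` prefix, they can be generated for a given tenant. This is a minimal sketch; `optimization_statements` is a hypothetical helper, and the tenant id in the example is the one that appears in the TRUNCATE command later in this section.

```python
from typing import List

# Index and table names copied from the commands above.
INDEXES = [
    "audit_item_pmh_createddate_idx",
    "audit_holdings_record_pmh_createddate_idx",
    "holdings_record_pmh_metadata_updateddate_idx",
    "item_pmh_metadata_updateddate_idx",
    "instance_pmh_metadata_updateddate_idx",
]
TABLES = ["instance", "item", "holdings_record"]


def optimization_statements(tenant: str) -> List[str]:
    """Render the REINDEX and ANALYZE statements for one tenant schema."""
    schema = f"{tenant}_mod_inventory_storage"
    statements = [f"REINDEX index {schema}.{index};" for index in INDEXES]
    statements += [f"analyze verbose {schema}.{table};" for table in TABLES]
    return statements


for statement in optimization_statements("fs09000000"):
    print(statement)
```

The output is plain SQL text that can be reviewed and then run with any PostgreSQL client.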


Execute the following query in the related database to remove the existing 'instances' created by a previous harvesting request, together with the request itself:

TRUNCATE TABLE fs09000000_mod_oai_pmh.request_metadata_lb cascade

Execute the migration for the complete_updated_date column as described in Migration scripts for OAI-PMH. Note that in step 2 the update command set search_path = "{tenant}_mod_inventory_storage", "public"; may not work. It is OK to skip that command in the scope of OAI-PMH.

Infrastructure

Environment: OCP3
Release: Poppy (2023 R2)

  • 9 m6i.2xlarge EC2 instances located in US East (N. Virginia)
  • 2 db.r6.xlarge database instances, one reader and one writer
  • MSK tenant
    • 4 brokers
    • Apache Kafka version 2.8.0
    • EBS storage volume per broker 300 GiB
    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3

Modules

 All modules

OAI-PMH related modules:

ocp3-pvt, Fri Feb 16 10:23:52 UTC 2024

Module | Task Def. Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft limit | CPU units | Xmx | MetaspaceSize | MaxMetaspaceSize | R/W split enabled
mod-remote-storage | 15 | mod-remote-storage:3.0.1 | 2 | 4920 | 4472 | 1024 | 3960 | 512 | 512 | FALSE
mod-ncip | 9 | mod-ncip:1.14.4 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-finance-storage | 10 | mod-finance-storage:8.5.0 | 2 | 1024 | 896 | 1024 | 700 | 88 | 128 | FALSE
mod-agreements | 10 | mod-agreements:6.0.2 | 2 | 1592 | 1488 | 128 | 0 | 0 | 0 | FALSE
mod-ebsconet | 9 | mod-ebsconet:2.1.1 | 2 | 1248 | 1024 | 128 | 700 | 128 | 256 | FALSE
edge-sip2 | 9 | edge-sip2:3.1.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-organizations | 10 | mod-organizations:1.8.0 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 | FALSE
mod-settings | 9 | mod-settings:1.0.2 | 2 | 1024 | 896 | 200 | 768 | 88 | 128 | FALSE
edge-dematic | 9 | edge-dematic:2.1.1 | 1 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-data-import | 43 | mod-data-import:3.0.7 | 1 | 2048 | 1844 | 256 | 1292 | 384 | 512 | FALSE
mod-search | 36 | mod-search:3.0.5 | 2 | 2592 | 2480 | 2048 | 1440 | 512 | 1024 | FALSE
mod-tags | 10 | mod-tags:2.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-authtoken | 15 | mod-authtoken:2.14.1 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 | FALSE
edge-courses | 2 | edge-courses:1.3.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-notify | 9 | mod-notify:3.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-inventory-update | 10 | mod-inventory-update:3.2.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-configuration | 10 | mod-configuration:5.9.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-orders-storage | 10 | mod-orders-storage:13.6.0 | 2 | 1024 | 896 | 512 | 700 | 88 | 128 | FALSE
edge-caiasoft | 9 | edge-caiasoft:2.1.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-login-saml | 9 | mod-login-saml:2.7.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-erm-usage-harvester | 10 | mod-erm-usage-harvester:4.4.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-licenses | 10 | mod-licenses:5.0.2 | 2 | 2480 | 2312 | 128 | 1792 | 384 | 512 | FALSE
mod-password-validator | 10 | mod-password-validator:3.1.0 | 2 | 1440 | 1298 | 128 | 768 | 384 | 512 | FALSE
mod-gobi | 9 | mod-gobi:2.7.1 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 | FALSE
mod-bulk-operations | 9 | mod-bulk-operations:1.1.9 | 2 | 3072 | 2600 | 1024 | 1536 | 384 | 512 | FALSE
mod-fqm-manager | 12 | mod-fqm-manager:1.0.3 | 2 | 3000 | 2600 | 128 | 2048 | 384 | 512 | FALSE
mod-graphql | 11 | mod-graphql:1.12.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-finance | 9 | mod-finance:4.8.0 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 | FALSE
mod-erm-usage | 10 | mod-erm-usage:4.6.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-lists | 13 | mod-lists:1.0.5 | 2 | 3000 | 2600 | 128 | 2048 | 384 | 512 | FALSE
mod-copycat | 9 | mod-copycat:1.5.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-entities-links | 10 | mod-entities-links:2.0.4 | 2 | 2592 | 2480 | 400 | 1440 | 0 | 1024 | FALSE
mod-permissions | 23 | mod-permissions:6.4.0 | 2 | 1684 | 1544 | 512 | 1024 | 384 | 512 | FALSE
pub-edge | 8 | pub-edge:2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 | FALSE
mod-orders | 9 | mod-orders:12.7.1 | 2 | 2048 | 1440 | 1024 | 1024 | 384 | 512 | FALSE
edge-patron | 9 | edge-patron:5.0.0 | 2 | 1024 | 896 | 256 | 768 | 88 | 128 | FALSE
edge-ncip | 9 | edge-ncip:1.9.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-users-bl | 9 | mod-users-bl:7.6.0 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 | FALSE
mod-invoice | 9 | mod-invoice:5.7.2 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 | FALSE
mod-inventory-storage | 14 | mod-inventory-storage:27.0.4 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 | FALSE
mod-user-import | 10 | mod-user-import:3.8.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-sender | 9 | mod-sender:1.11.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
edge-oai-pmh | 9 | edge-oai-pmh:2.7.2 | 2 | 1512 | 1360 | 1024 | 1440 | 384 | 512 | FALSE
mod-data-export-worker | 9 | mod-data-export-worker:3.1.2 | 2 | 3072 | 2800 | 1024 | 2048 | 384 | 512 | FALSE
mod-rtac | 20 | mod-rtac:3.5.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-circulation-storage | 18 | mod-circulation-storage:17.1.7 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 | FALSE
mod-source-record-storage | 17 | mod-source-record-storage:5.7.5 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | FALSE
mod-calendar | 9 | mod-calendar:2.5.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-event-config | 10 | mod-event-config:2.6.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-courses | 9 | mod-courses:1.4.8 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-inventory | 18 | mod-inventory:20.1.7 | 2 | 2880 | 2592 | 1024 | 1814 | 384 | 512 | FALSE
mod-email | 9 | mod-email:1.16.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-circulation | 12 | mod-circulation:24.0.11 | 2 | 2880 | 2592 | 1536 | 1814 | 384 | 512 | FALSE
mod-di-converter-storage | 11 | mod-di-converter-storage:2.1.5 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-pubsub | 13 | mod-pubsub:2.11.3 | 2 | 1536 | 1440 | 1024 | 922 | 384 | 512 | FALSE
edge-orders | 9 | edge-orders:2.9.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
edge-rtac | 10 | edge-rtac:2.6.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-template-engine | 9 | mod-template-engine:1.19.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-users | 12 | mod-users:19.2.2 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-patron-blocks | 11 | mod-patron-blocks:1.9.0 | 2 | 1024 | 896 | 1024 | 768 | 88 | 128 | FALSE
edge-fqm | 12 | edge-fqm:1.0.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-audit | 9 | mod-audit:2.8.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-source-record-manager | 19 | mod-source-record-manager:3.7.8 | 2 | 5600 | 5000 | 2048 | 3500 | 384 | 512 | FALSE
nginx-edge | 8 | nginx-edge:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 | FALSE
mod-quick-marc | 9 | mod-quick-marc:5.0.1 | 1 | 2288 | 2176 | 128 | 1664 | 384 | 512 | FALSE
nginx-okapi | 8 | nginx-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 0 | 0 | 0 | FALSE
okapi-b | 9 | okapi:5.1.1 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512 | FALSE
mod-feesfines | 10 | mod-feesfines:19.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-invoice-storage | 10 | mod-invoice-storage:5.7.0 | 2 | 1872 | 1536 | 1024 | 1024 | 384 | 512 | FALSE
mod-service-interaction | 10 | mod-service-interaction:3.0.2 | 2 | 2048 | 1844 | 256 | 1290 | 384 | 512 | FALSE
mod-patron | 9 | mod-patron:6.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-data-export | 11 | mod-data-export:4.8.7 | 1 | 1024 | 896 | 1024 | 768 | 88 | 128 | FALSE
mod-oai-pmh | 11 | mod-oai-pmh:3.12.8 | 2 | 4096 | 3690 | 2048 | 3076 | 384 | 512 | FALSE
edge-connexion | 9 | edge-connexion:1.1.1 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-notes | 9 | mod-notes:5.1.0 | 2 | 1024 | 896 | 128 | 952 | 384 | 512 | FALSE
mod-kb-ebsco-java | 9 | mod-kb-ebsco-java:4.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-data-export-spring | 14 | mod-data-export-spring:3.0.2 | 1 | 2048 | 1844 | 256 | 1536 | 384 | 512 | FALSE
mod-login | 10 | mod-login:7.10.1 | 2 | 1440 | 1298 | 1024 | 768 | 384 | 512 | FALSE
mod-organizations-storage | 10 | mod-organizations-storage:4.6.0 | 2 | 1024 | 896 | 128 | 700 | 88 | 128 | FALSE
pub-okapi | 8 | pub-okapi:2023.06.14 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 | FALSE
mod-eusage-reports | 9 | mod-eusage-reports:2.0.0 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE