<TBD>OAI-PMH performance dependencies between CI/CO and data import


Overview

  • The purpose of the concurrent OAI-PMH, data import (DI), and check-in/check-out (CI/CO) tests is to determine which areas may be affected by an increase in harvest frequency.

Summary

  • At the start of test execution, Service Memory Usage grew for all services. This is connected to the daily cluster start. For most services, memory usage did not exceed 60%. The highest levels were registered for mod-source-record-manager (107%) and mod-inventory-b (98%). After the Scenario 1 tests, memory usage reached a stable level and did not change.
  • Running OAI-PMH, DI, and CI/CO simultaneously showed that the environment can handle such a load.
  • CI/CO response times during DI and OAI-PMH did not degrade after a series of DI jobs (create and update job profiles).
  • After 90 minutes of full harvest, CPU utilization for mod-oai-pmh-b grew to 188% for about 10 minutes and then returned to a steady state (5-7%).
  • At the beginning of DI, service CPU was used mostly by mod-di-converter-storage-b (253%), mod-inventory-b (172%), and mod-quick-marc-b (108%). For the rest of the modules it stayed under 70%. At peak, the values were mod-di-converter-storage-b (453%), mod-inventory-b (190%), and mod-quick-marc-b (121%).
  • RDS CPU utilization during incremental harvesting did not exceed 60% for all DI job profiles (1,000 records). Data export took 40%. For full harvesting with the DI Create job profile (100,000 records), it instantly reached 96% and stayed at this level for most of the process. DI Update used up to 90%.
  • All OAI-PMH tests were executed with the EBSCO Harvester on the AWS ptf-windows instance.
  • During full harvesting, a 504 Gateway Timeout occurred after all DI create and update jobs were done, so it did not affect the results. It happened in both full harvesting runs. The returned instance counts were 1764989 records (during the first 5 hours of CI/CO) and 1166089 records (the other run), out of a total of 10433728.

Recommendations & Jiras

  • During testing, unhealthy behaviour was observed from the mod-remote-storage-b service (reason: Health checks failed with these codes: [404]); see PERF-618. The same unhealthy behaviour was observed from mod-licenses-b and mod-service-interaction-b (reason: Health checks failed with these codes: [502]).


Test Runs & Results


Data import duration and CI/CO response times with DI & OAI-PMH results

Test # | CI/CO | Scenario | Job profile | Duration | CI average | CO average | Load level | Comments
--- | --- | --- | --- | --- | --- | --- | --- | ---
Scenario 1, OAI-PMH incremental | 5 hours | DI MARC Bib Create | PTF - Create 2 | 00:00:48 | 0.961 | 1.398 | For scenario 1: 1K (with pause ~5 min) |
 | | DI MARC Bib Update | PTF - Updates Success - 1 | 00:00:56 | 0.706 | 1.125 | |
 | | DI MARC Bib Create | PTF - Create 2 | 00:00:43 | 0.843 | 1.402 | |
 | | DI MARC Bib Update | PTF - Updates Success - 1 | 00:00:44 | 0.848 | 1.335 | |
Scenario 2, OAI-PMH full mode | 5 hours | DI MARC Bib Create | PTF - Create 2 | 00:53:30 | 1.078 | 1.545 | For scenario 2: 100K (with pause ~5 min) |
 | | DI MARC Bib Update | PTF - Updates Success - 1 | 01:04:38 | 0.725 | 1.231 | |
 | | DI MARC Bib Update | PTF - Updates Success - 1 | 01:05:48 | 0.69 | 1.249 | |
 | 5 hours | DI MARC Bib Update | PTF - Updates Success - 1 | 01:17:58 | 0.903 | 1.333 | |
 | | DI MARC Bib Update | PTF - Updates Success - 1 | 01:18:08 | 0.737 | 1.221 | |
 | | DI MARC Bib Update | PTF - Updates Success - 1 | 01:21:21 | 0.62 | 1.106 | | Last 30 minutes without OAI-PMH

Comparisons

This table contains CI/CO response times without DI & OAI-PMH

Requests | 50th pct | 75th pct | 95th pct | Average
--- | --- | --- | --- | ---
Check-Out Controller | 0.862 | 0.935 | 1.133 | 0.904
Check-In Controller | 0.581 | 0.633 | 0.827 | 0.629

Comparison table for CI/CO response times


Requests | CI/CO: Average | DI Create 1k + oai-pmh: Average | delta, % | DI Update 1k + oai-pmh: Average | delta, % | DI Create 100k + oai-pmh: Average | delta, % | DI Update 100k + oai-pmh: Average | delta, %
--- | --- | --- | --- | --- | --- | --- | --- | --- | ---
Check-Out Controller | 0.904 | 1.398 | 35.34 | 1.125 | 19.64 | 1.545 | 41.49 | 1.231 | 26.56
Check-In Controller | 0.629 | 0.961 | 34.55 | 0.706 | 10.91 | 1.078 | 41.65 | 0.725 | 13.24
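
The report does not state how the "delta, %" values were derived. The following is a minimal sketch, assuming the delta is the difference between the loaded average and the CI/CO-only average, divided by the loaded average; this assumption is inferred from the published numbers, which it reproduces. The function name delta_pct and the dictionary layout are illustrative only.

```python
# Sketch: reproduce the "delta, %" column of the comparison table.
# Assumption (inferred from the published values, not stated in the report):
#   delta = (loaded_average - baseline_average) / loaded_average * 100

def delta_pct(baseline: float, loaded: float) -> float:
    """Increase of `loaded` over `baseline`, as a percentage of `loaded`."""
    return round((loaded - baseline) / loaded * 100, 2)

# CI/CO averages without DI and OAI-PMH (baseline table).
BASELINE = {"Check-Out Controller": 0.904, "Check-In Controller": 0.629}

# CI/CO averages measured while DI and OAI-PMH were running.
LOADED = {
    "DI Create 1k + oai-pmh": {"Check-Out Controller": 1.398, "Check-In Controller": 0.961},
    "DI Update 1k + oai-pmh": {"Check-Out Controller": 1.125, "Check-In Controller": 0.706},
    "DI Create 100k + oai-pmh": {"Check-Out Controller": 1.545, "Check-In Controller": 1.078},
    "DI Update 100k + oai-pmh": {"Check-Out Controller": 1.231, "Check-In Controller": 0.725},
}

if __name__ == "__main__":
    for scenario, averages in LOADED.items():
        for request, loaded in averages.items():
            print(f"{scenario:26} {request:22} {delta_pct(BASELINE[request], loaded):6.2f}")
```

Because the denominator is the loaded value, the percentages are smaller than an increase measured against the baseline would be. For example, 1.398 against 0.904 is roughly a 54.6% increase relative to the baseline, but is reported as 35.34.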

Scenario 1

Response time

This table shows 40 minutes of CI/CO.

Service CPU Utilization

TBD

Service Memory Utilization

TBD

RDS CPU Utilization

TBD

Scenario 2

Response time

The table shows the first 5 hours of CI/CO (it contains Create and 2 Updates with a 100,000-record file).

The table shows the second 5 hours of CI/CO (it contains 3 Updates with a 100,000-record file).

Service CPU Utilization


Service Memory Utilization


RDS CPU Utilization



Errors

Scenario 1 - no errors

Scenario 2

All errors are connected to 

Check-Out Controller

Request name | Number
--- | ---
POST_circulation/check-out-by-barcode (Submit_barcode_checkout)_POST_422 | 4
GET_inventory/items (Submit_barcode_checkout)_GET_200 | 4
GET_groups_ID (Submit_patron_barcode)_GET_400 | 1


Appendix

Methodology/Approach

Circulation rules should be modified in the Circulation rules editor before the CI/CO test so that it runs without issues from POST_circulation/check-out-by-barcode (Submit_barcode_checkout).

The number of partitions should be equal to 2 in all DI-related topics.

Before running OAI-PMH with full harvest, the following database commands should be executed to optimize the tables (from https://wiki.folio.org/display/FOLIOtips/OAI-PMH+Best+Practices#OAIPMHBestPractices-SlowPerformance):

REINDEX INDEX <tenant>_mod_inventory_storage.audit_item_pmh_createddate_idx;
REINDEX INDEX <tenant>_mod_inventory_storage.audit_holdings_record_pmh_createddate_idx;
REINDEX INDEX <tenant>_mod_inventory_storage.holdings_record_pmh_metadata_updateddate_idx;
REINDEX INDEX <tenant>_mod_inventory_storage.item_pmh_metadata_updateddate_idx;
REINDEX INDEX <tenant>_mod_inventory_storage.instance_pmh_metadata_updateddate_idx;
ANALYZE VERBOSE <tenant>_mod_inventory_storage.instance;
ANALYZE VERBOSE <tenant>_mod_inventory_storage.item;
ANALYZE VERBOSE <tenant>_mod_inventory_storage.holdings_record;

  1. Execute the following query in the related database to remove the existing 'instances' created by the previous harvesting request, as well as the request itself:

TRUNCATE TABLE fs09000000_mod_oai_pmh.request_metadata_lb CASCADE;
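
The preparation statements above can be generated per tenant instead of being edited by hand. The following is a minimal sketch, not part of the test tooling: the function name build_pre_harvest_sql is hypothetical, and it assumes that the mod_oai_pmh schema uses the same tenant prefix as mod_inventory_storage (as the fs09000000 example suggests). It only builds the SQL text and does not connect to a database.

```python
# Sketch: render the pre-harvest maintenance SQL for one tenant.
# The statements mirror the ones listed in this section; the helper name
# and the shared-tenant-prefix assumption are illustrative.
from __future__ import annotations

import re

REINDEX_TARGETS = [
    "audit_item_pmh_createddate_idx",
    "audit_holdings_record_pmh_createddate_idx",
    "holdings_record_pmh_metadata_updateddate_idx",
    "item_pmh_metadata_updateddate_idx",
    "instance_pmh_metadata_updateddate_idx",
]
ANALYZE_TARGETS = ["instance", "item", "holdings_record"]


def build_pre_harvest_sql(tenant: str) -> list[str]:
    """Return the REINDEX, ANALYZE and TRUNCATE statements for `tenant`."""
    # Reject anything that is not a plain lowercase identifier, because the
    # tenant id is interpolated directly into the schema name.
    if not re.fullmatch(r"[a-z][a-z0-9_]*", tenant):
        raise ValueError(f"unexpected tenant id: {tenant!r}")
    schema = f"{tenant}_mod_inventory_storage"
    statements = [f"REINDEX INDEX {schema}.{index};" for index in REINDEX_TARGETS]
    statements += [f"ANALYZE VERBOSE {schema}.{table};" for table in ANALYZE_TARGETS]
    statements.append(
        f"TRUNCATE TABLE {tenant}_mod_oai_pmh.request_metadata_lb CASCADE;"
    )
    return statements


if __name__ == "__main__":
    print("\n".join(build_pre_harvest_sql("fs09000000")))
```

The output can be reviewed and then run with any PostgreSQL client against the target database.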

Infrastructure

  • 8 m6i.2xlarge EC2 instances located in US East (N. Virginia)
  • 2 db.r6.xlarge database instances: one reader and one writer
  • MSK ptf-kakfa-3
    • 4 brokers
    • Apache Kafka version 2.8.0
    • EBS storage volume per broker 300 GiB
    • auto.create.topics.enable=true
    • log.retention.minutes=480
    • default.replication.factor=3
  • Front End:

    • Item Check-in (folio_checkin-8.0.100000491)
    • Item Check-out (folio_checkout-9.0.100000595)

Modules

ocp2-pvt, Mon Jul 03 14:54:13 UTC 2023

Module | Task Def. Revision | Module Version | Task Count | Mem Hard Limit | Mem Soft limit | CPU units | Xmx | MetaspaceSize | MaxMetaspaceSize | R/W split enabled
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
mod-inventory-storage | 4 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-inventory-storage:26.0.0 | 2 | 2208 | 1952 | 1024 | 1440 | 384 | 512 | FALSE
mod-inventory | 3 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-inventory:20.0.0-SNAPSHOT.392 | 2 | 2880 | 2592 | 1024 | 1814 | 384 | 512 | FALSE
mod-source-record-storage | 5 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-source-record-storage:5.6.5 | 2 | 5600 | 5000 | 2048 | 3600 | 384 | 512 | FALSE
mod-source-record-manager | 3 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-source-record-manager:3.6.0-SNAPSHOT.197 | 2 | 4096 | 3688 | 1024 | 2048 | 384 | 512 | FALSE
mod-data-import | 3 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-data-import:2.7.0-SNAPSHOT.101 | 1 | 2048 | 1844 | 256 | 1292 | 384 | 512 | FALSE
mod-di-converter-storage | 1 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-di-converter-storage:2.1.0-SNAPSHOT.32 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-data-import-converter-storage | 3 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-data-import-converter-storage:1.16.0-SNAPSHOT.132 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-remote-storage | 3 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-remote-storage:2.0.0-SNAPSHOT.83 | 2 | 4920 | 4472 | 1024 | 3960 | 512 | 512 | FALSE
mod-users | 3 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-users:19.2.0-SNAPSHOT.584 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-configuration | 3 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-configuration:5.9.2-SNAPSHOT.291 | 2 | 1024 | 896 | 128 | 768 | 88 | 128 | FALSE
mod-circulation-storage | 3 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-circulation-storage:16.1.0-SNAPSHOT.305 | 2 | 1536 | 1440 | 1024 | 896 | 384 | 512 | FALSE
mod-circulation | 3 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-circulation:23.5.0-SNAPSHOT.556 | 2 | 1024 | 896 | 1024 | 768 | 88 | 128 | FALSE
mod-authtoken | 3 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-authtoken:2.14.0-SNAPSHOT.238 | 2 | 1440 | 1152 | 512 | 922 | 88 | 128 | FALSE
mod-pubsub | 3 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/mod-pubsub:2.10.0-SNAPSHOT.124 | 2 | 1536 | 1440 | 1024 | 922 | 384 | 512 | FALSE
pub-okapi | 2 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/pub-okapi:2022.03.02 | 2 | 1024 | 896 | 128 | 768 | 0 | 0 | FALSE
okapi-b | 2 | 579891902283.dkr.ecr.us-east-1.amazonaws.com/folio/okapi:5.1.0-SNAPSHOT.1352 | 3 | 1684 | 1440 | 1024 | 922 | 384 | 512 | FALSE

Partitions
