Overview

The purpose of the OAI-PMH incremental harvesting tests is to measure the performance of the Poppy release and to find possible issues and bottlenecks, per PERF-660, on the OCP2 environment.
The purpose of the OAI-PMH full harvesting tests is to measure the performance of the Poppy release using the recommended EBSCO Harvester tool and to find possible issues and bottlenecks, per PERF-659, on the OCP2 environment.

Summary

OAI-PMH - Incremental Harvesting:
- Three tests were executed with a JMeter script to check the performance of harvesting 10K, 25K, 50K, 500K and 1 MLN records with different OAI-PMH behaviors:
  - Test 1. Record source set to Source record storage;
  - Test 2. Record source set to Inventory;
  - Test 3. Record source set to Source record storage and inventory;
- Harvesting time is similar in Test 1 and Test 3, but Test 2 (10K, 25K, 50K) takes about 50% more time because of the distribution of instance creation dates: about 250K instances were created between 1962 and 2023, and about 800K instances were created on 2023-10-23;
- CPU usage was consistent throughout all of the tests and did not exceed 5% for any service; at the beginning of each test there was a spike in CPU usage that lasted a few seconds;
- Memory utilization was stable, except for the edge-oai-pmh service;
- Database CPU utilization reached a maximum of 15%; the number of DB connections was 140.
OAI-PMH - Full Harvesting:
- Three tests were executed using EBSCO Harvester to check performance with different OAI-PMH behaviors:
  - Test 4. Record source set to Source record storage. Test duration was about 16 hours 42 min; 10403507 instances were returned; 76.4 GB of data were stored on disk.
  - Test 5. Record source set to Inventory. Test duration was about 1 hour 46 min; 1122521 instances were returned; 1.96 GB of data were stored on disk.
  - Test 6. Record source set to Source record storage and inventory. Test duration was about 16 hours 50 min; 11526892 instances were returned; 78.4 GB of data were stored on disk.
- The average CPU utilization during Test 4 and Test 6 was about: mod-oai-pmh-b = 7%, edge-oai-pmh-b = 3%, mod-source-record-storage-b = 1%, okapi-b = 1.2%, mod-inventory-storage-b = 0.5%; for Test 5, these values were 1-2% lower;
- Memory utilization showed no problems. Unlike in previous runs, the edge-oai-pmh-b service was not restarted after each test, in order to check for memory leaks. After Test 4, memory utilization reached about 53%, and during Test 5 and Test 6 it fluctuated in the range of 45-55%;
- Average database CPU utilization for Tests 4-6 was about 16%; the number of DB connections was 140.

Comparison results
Based on analysis of the OAI-PMH harvester logs for incremental harvesting, a wait time was added in JMeter after each /oai/records?verb=ListRecords&apikey=[APIKey]&resumptionToken=[resumptionToken] request; it emulates the time the harvester program takes to save the response to a file. In addition, 800K instances were generated for Test 2. It would therefore be incorrect to compare processing time and resource usage directly, since the system load and the number of requests per second (RPS) have changed.

Nevertheless, in comparison to the Orchid reports (OAI-PMH data harvesting (Orchid) and OAI-PMH data harvesting (Orchid) by EBSCO Harvester), several important points can be distinguished:
Incremental Harvesting
1) The duration of the harvesting is similar.
2) After stabilization, service CPU utilization in both tests does not exceed 5%, and RDS CPU utilization was about 15%.
3) At the beginning of every test there is a sharp increase in CPU usage, but in the Poppy release the maximum value is much lower than in Orchid, and CPU usage stabilizes within a few minutes in Poppy, compared to 30 minutes in Orchid.
4) Memory usage: in Poppy, the mod-oai-pmh service does not use 100% of the memory. The edge-oai-pmh-b service has a similar memory usage profile in both releases.
5) RDS CPU in Orchid has no spikes at the beginning of each test.

Full harvest

1) Memory usage: same as for incremental harvesting. In Poppy, mod-oai-pmh does not use 100% of the memory. The edge-oai-pmh-b service has a similar memory usage profile in both releases.
2) DB CPU usage is more even. In both releases there are still spikes on ocp2-db-01 and ocp2-db-02, but they may be caused by the OAIPMHHarvester program.

Improvements that can be noted in Poppy release:
1) There is no degradation in request processing time, as duration is approximately the same;

2) Fixed high memory consumption by mod-oai-pmh service;

3) At the beginning of the tests, there are no sharp spikes in service CPU usage or in database CPU usage;

4) Service CPU utilization is very low (~7%), and RDS CPU utilization is also very low (~15%), so enough resources remain to perform other actions in the system.

Recommendations & Jiras

To have the same starting conditions before running tests with different record source settings, the edge-oai-pmh service was restarted; this returned the service memory usage to its starting (post-deployment) value;
Run the incremental harvesting tests with different Max records per response values, for example 200, 500, etc.;
Conduct a more detailed analysis of why the edge-oai-pmh service consumes a lot of memory and does not release it after the tests are finished;
Generate 1 million instances with a uniform distribution over the period 2022-12-21 to 2023-10-16.

Test Runs & Results

Incremental harvesting

Number of harvested records	Test 1. Record source = Source record storage (duration)	Test 2. Record source = Inventory (duration)	Test 3. Record source = Source record storage and inventory (duration)	Orchid, source = Source record storage (duration)	Orchid, source = Source record storage and inventory (duration)
10000 records (10K)	2 min 8 sec	3 min 44 sec	2 min 4 sec	not tested	not tested
25000 records (25K)	4 min 43 sec	6 min 50 sec	4 min 13 sec	3 min 50 sec	4 min 32 sec
50000 records (50K)	9 min 12 sec	12 min 48 sec	8 min 12 sec	not tested	not tested
500000 records (500K)	1 hour 18 min	1 hour 19 min	1 hour 15 min	1 hour 14 min	1 hour 7 min
1000000 records (1MLN)	2 hours 29 min	2 hours 29 min	2 hours 24 min	2 hours 1 min	2 hours 21 min

Full harvesting using EBSCO Harvester

Record source	Duration	Number of returned instances	Volume of returned data (GB)	Number of files	Orchid duration
Source record storage	16 hours 42 min	10403507	76.4	104,737	~ 17 h
Inventory	1 hour 46 min	1122521	1.96	11,227	not tested
Source record storage and inventory	16 hours 50 min	11526892	78.4	115,971	~ 18 h
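
The full harvesting figures above can be turned into rough rates with simple arithmetic. The sketch below is illustrative only: the helper names (parse_duration, summarize) are hypothetical and not part of any test tooling, and the input values are copied from the table. It derives instances per second and instances per file for each record source.

```python
import re

# Seconds per unit, keyed by the unit's first letter (hours/hr, min, sec/s).
_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_duration(text: str) -> int:
    """Convert a duration such as '16 hours 42 min' to seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s*([A-Za-z]+)", text):
        total += int(value) * _UNIT_SECONDS[unit[0].lower()]
    return total


# (record source, duration, returned instances, number of files),
# copied from the full harvesting table.
FULL_HARVEST = [
    ("Source record storage", "16 hours 42 min", 10_403_507, 104_737),
    ("Inventory", "1 hour 46 min", 1_122_521, 11_227),
    ("Source record storage and inventory", "16 hours 50 min", 11_526_892, 115_971),
]


def summarize(rows):
    """Return derived rates for each (source, duration, instances, files) row."""
    summary = []
    for source, duration, instances, files in rows:
        seconds = parse_duration(duration)
        summary.append({
            "source": source,
            "instances_per_second": instances / seconds,
            "instances_per_file": instances / files,
        })
    return summary


if __name__ == "__main__":
    for row in summarize(FULL_HARVEST):
        print(f"{row['source']}: "
              f"{row['instances_per_second']:.1f} instances/s, "
              f"{row['instances_per_file']:.1f} instances/file")
```

With these inputs, all three runs come out at roughly 170-190 instances per second and just under 100 instances per file. Reading the second figure as records per response assumes that the harvester writes one file per response, which this report does not confirm.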

Incremental harvesting resources utilization

Test 1. Record source = Source record storage

Service CPU Utilization

During the four harvesting tests with 10K, 50K, 500K and 1MLN records, CPU usage remained steady, with a few minor fluctuations at the beginning of each test. The average CPU usage was: mod-oai-pmh-b = 3%, edge-oai-pmh-b = 2.5%, mod-source-record-storage-b = 1.5%, okapi-b = 1.2%, mod-inventory-storage-b = 0.5%. After the middle of the fourth test (1MLN records), an unidentified JMeter script was launched, which caused a significant increase in CPU consumption but did not affect processing time.

Service Memory Utilization

Memory utilization showed no problems, except for the edge-oai-pmh-b service. At the beginning of testing it consumed approximately 20% of memory, but 30 minutes after the test finished it was consuming around 45%.

RDS CPU Utilization

Average CPU utilization during the 4 tests was about 13%.

RDS Database Connections

The number of database connections was about 140.

Test 2. Record source = Inventory

Service CPU Utilization

Service Memory Utilization

Memory utilization showed no problems, except for the edge-oai-pmh-b service. At the beginning of testing it consumed approximately 18% of memory, but 30 minutes after the test finished it was consuming around 34%.

RDS CPU Utilization

Average CPU utilization during the 4 tests was about 15%.

RDS Database Connections

The number of database connections was about 140.

Test 3. Record source = Source record storage and inventory

Service CPU Utilization

Service Memory Utilization

Memory utilization showed no problems, except for the edge-oai-pmh-b service. During the third test, memory consumption was the same as in the previous tests. Between the 500K and 1MLN record tests there was a two-hour period during which no tests were running and the system was not loaded at all, yet memory consumption by edge-oai-pmh-b did not decrease during this period.

RDS CPU Utilization

Average CPU utilization during the 4 tests was about 12%.

RDS Database Connections

The number of database connections was about 140.

Full harvesting resources utilization

Test 4. Record source = Source record storage

Service CPU Utilization

During the harvesting test the average CPU usage was: mod-oai-pmh-b = 7%, edge-oai-pmh-b = 3%, mod-source-record-storage-b = 1%, okapi-b = 1.2%, mod-inventory-storage-b = 0.5%. After the test, CPU utilization returned to its pre-test level.

Service Memory Utilization

Memory utilization showed no problems, except for the edge-oai-pmh-b service. During the test its memory consumption increased, and 1 hour after the test finished it had not decreased.

RDS CPU Utilization

Average CPU utilization during the test was about 16%.

RDS Database Connections

The number of database connections was about 140.

Test 5. Record source = Inventory

Service CPU Utilization

During the harvesting test the average CPU usage was: mod-oai-pmh-b = 4%, edge-oai-pmh-b = 2%, mod-source-record-storage-b = 1%, okapi-b = 1.3%, mod-inventory-storage-b = 0.5%. After the test, CPU utilization returned to its pre-test level.

Service Memory Utilization

Full harvesting tests were run one after another without restarting the edge-oai-pmh-b service; memory consumption was stable and did not increase.

RDS CPU Utilization

Average CPU utilization during the test was about 16%.

RDS Database Connections

The number of database connections was about 140.

Test 6. Record source = Source record storage and inventory

Service CPU Utilization

During the harvesting test the average CPU usage was: mod-oai-pmh-b = 7%, edge-oai-pmh-b = 4%, mod-source-record-storage-b = 1%, okapi-b = 1.2%, mod-inventory-storage-b = 0.5%. After the test, CPU utilization returned to its pre-test level.

Service Memory Utilization

Memory consumption was stable for all services. The edge-oai-pmh-b service was not restarted before the test; its memory utilization varied from 45% to 55%.

RDS CPU Utilization

Average CPU utilization during the test was about 17%.

The spike at 18:10-18:20 was caused by a sharp increase in the number of requests to ocp2-db-01.

DB PerfInsights graph

RDS Database Connections

The number of database connections was about 140.

Appendix

Methodology/Approach

OAI-PMH incremental harvesting was carried out by a JMeter script run from carrier, with 2 main requests:

/oai/records?verb=ListRecords&metadataPrefix=marc21_withholdings&apikey=[APIKey]
/oai/records?verb=ListRecords&apikey=[APIKey]&resumptionToken=[resumptionToken]

To extract the required number of records, a loop counter was used with the following configuration:

98 loop counts for 10K records;
248 loop counts for 25K records;
499 loop counts for 50K records;
5000 loop counts for 500K records;
10000 loop counts for 1MLN records;

The time ranges below for the incremental harvesting tests were determined experimentally. The time range for Test 2* was extended because the defined number of records could not be harvested from the original range; the subsequent tests were run after adding 800K instances to the database.

	Start date	Until date
Test 1.	2022-12-21	2023-10-16
Test 2*.	1962-12-21	2023-10-23*
Test 3.	2022-12-21	2023-10-16
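
The request flow described above (an initial ListRecords request for a date range, followed by repeated requests carrying the resumption token) can be sketched as follows. This is a minimal illustration and not the JMeter script itself: the function names build_url and harvest are hypothetical, the HTTP call is passed in as a fetch callable so the logic can be exercised without a live environment, and error handling and the wait time between requests are omitted. The from and until arguments are the standard OAI-PMH date-range parameters, and the apikey query parameter follows the request examples above.

```python
import urllib.parse
import xml.etree.ElementTree as ET
from typing import Callable, Iterator, Optional

# XML namespace of OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def build_url(base_url: str, api_key: str, *,
              resumption_token: Optional[str] = None,
              metadata_prefix: str = "marc21_withholdings",
              from_date: Optional[str] = None,
              until_date: Optional[str] = None) -> str:
    """Build the initial ListRecords URL, or a follow-up URL when a token is given."""
    params = {"verb": "ListRecords", "apikey": api_key}
    if resumption_token:
        # Follow-up requests carry only the token, as in the second request above.
        params["resumptionToken"] = resumption_token
    else:
        params["metadataPrefix"] = metadata_prefix
        if from_date:
            params["from"] = from_date
        if until_date:
            params["until"] = until_date
    return f"{base_url}/oai/records?{urllib.parse.urlencode(params)}"


def harvest(fetch: Callable[[str], str], base_url: str, api_key: str,
            max_requests: int, **first_request_args) -> Iterator[ET.Element]:
    """Yield <record> elements until the token runs out or max_requests is reached."""
    token = None
    for _ in range(max_requests):
        url = build_url(base_url, api_key, resumption_token=token,
                        **first_request_args)
        root = ET.fromstring(fetch(url))
        yield from root.iter(f"{OAI_NS}record")
        token_el = root.find(f".//{OAI_NS}resumptionToken")
        has_token = token_el is not None and token_el.text
        token = token_el.text.strip() if has_token else None
        if token is None:
            break  # An absent or empty token marks the end of the list.
```

In this sketch max_requests bounds the number of requests in the same way the loop counter bounds the JMeter run. For a real run, fetch could be something like `lambda url: urllib.request.urlopen(url).read()`, and the date range for Test 1 would be passed as `from_date="2022-12-21", until_date="2023-10-16"`.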

OAI-PMH (full harvesting)

Before running the OAI-PMH full harvest, the following database commands were executed to optimize the tables (from https://folio-org.atlassian.net/wiki/display/FOLIOtips/OAI-PMH+Best+Practices#OAIPMHBestPractices-SlowPerformance):

REINDEX INDEX <tenant>_mod_inventory_storage.audit_item_pmh_createddate_idx;
REINDEX INDEX <tenant>_mod_inventory_storage.audit_holdings_record_pmh_createddate_idx;
REINDEX INDEX <tenant>_mod_inventory_storage.holdings_record_pmh_metadata_updateddate_idx;
REINDEX INDEX <tenant>_mod_inventory_storage.item_pmh_metadata_updateddate_idx;
REINDEX INDEX <tenant>_mod_inventory_storage.instance_pmh_metadata_updateddate_idx;
ANALYZE VERBOSE <tenant>_mod_inventory_storage.instance;
ANALYZE VERBOSE <tenant>_mod_inventory_storage.item;
ANALYZE VERBOSE <tenant>_mod_inventory_storage.holdings_record;

Execute the following query in the related database to remove the existing 'instances' created by the previous harvesting request, along with the request itself:

TRUNCATE TABLE fs09000000_mod_oai_pmh.request_metadata_lb CASCADE;
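
The preparation statements above differ between environments only in the tenant prefix of the schema names. As a convenience, they could be rendered for a concrete tenant with a small helper such as the one below. This is a sketch under stated assumptions: the function name render_prep_statements is hypothetical, the tenant check is a simple safeguard against pasting an invalid identifier, and the script only builds the SQL text; it does not connect to a database.

```python
import re

# Statements for the <tenant>_mod_inventory_storage schema, as listed above.
INVENTORY_TEMPLATES = [
    "REINDEX INDEX {schema}.audit_item_pmh_createddate_idx;",
    "REINDEX INDEX {schema}.audit_holdings_record_pmh_createddate_idx;",
    "REINDEX INDEX {schema}.holdings_record_pmh_metadata_updateddate_idx;",
    "REINDEX INDEX {schema}.item_pmh_metadata_updateddate_idx;",
    "REINDEX INDEX {schema}.instance_pmh_metadata_updateddate_idx;",
    "ANALYZE VERBOSE {schema}.instance;",
    "ANALYZE VERBOSE {schema}.item;",
    "ANALYZE VERBOSE {schema}.holdings_record;",
]


def render_prep_statements(tenant: str) -> list:
    """Return the pre-harvest SQL statements for the given tenant id."""
    # Only allow plain identifiers, since the tenant becomes part of a schema name.
    if not re.fullmatch(r"[a-z][a-z0-9_]*", tenant):
        raise ValueError(f"unexpected tenant id: {tenant!r}")
    inventory_schema = f"{tenant}_mod_inventory_storage"
    oai_schema = f"{tenant}_mod_oai_pmh"
    statements = [t.format(schema=inventory_schema) for t in INVENTORY_TEMPLATES]
    # Clears the data left by the previous harvesting request.
    statements.append(f"TRUNCATE TABLE {oai_schema}.request_metadata_lb CASCADE;")
    return statements


if __name__ == "__main__":
    print("\n".join(render_prep_statements("fs09000000")))
```

The tenant id fs09000000 is taken from the TRUNCATE statement above; any other value is an input the operator would have to supply.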

Full harvesting tests were run from the ptf-windows machine using EBSCO Harvester. The following cmd command (cmd should be run in the same directory as EBSCO Harvester) starts EBSCO Harvester:

OAIPMHHarvester.exe -HarvestMode=full -DefinitionId=poppy-marc21-with-holdings -HarvesterWebClientTimeout_Seconds=0s=0

With the following harvest definition:

<?xml version="1.0" encoding="UTF-8"?>
<id>Orchid</id>
<Description>Orchid</Description>
<Urls>
<Url>https://edge-ptf-ocp2-00.int.aws.folio.org/oai/eyJzIjoiVDNUSzAzR2QyViIsInQiOiJmczA5MDAwMDAwIiwidSI6ImZzMDkwMDAwMDAifQ=?verb=ListRecords&metadataPrefix=marc21_withholdings&set=all&from=2018-10-18T00:00:00Z&until=2018-10-19T00:00:00Z</Url>
</Urls>
<MetadataFormat>marc21_withholdings</MetadataFormat>
</Sets>
<!--<setSpec>NameOfSetSpecGoesHere</setSpec>
</Sets> -->
</HarvestDefinition>

Infrastructure

Environment: OCP2
Release: Poppy (2023 R2)

9 m6i.2xlarge EC2 instances located in US East (N. Virginia)
2 db.r6.xlarge database instances: one reader and one writer
MSK tenant
- 4 brokers
- Apache Kafka version 2.8.0
- EBS storage volume per broker 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3

Modules

Module ocp2-pvt Mon Oct 23 15:48:03 UTC 2023	Task Def. Revision	Module Version	Task Count	Mem Hard Limit	Mem Soft limit	CPU units	Xmx	MetaspaceSize	MaxMetaspaceSize	R/W split enabled
pub-edge	8	pub-edge:2022.03.02	2	1024	896	128	768	0	0	false
mod-inventory-storage	1	/mod-inventory-storage:26.1.0-SNAPSHOT.696	2	2208	1952	1024	1440	384	512	false
edge-oai-pmh	8	edge-oai-pmh:2.7.0-SNAPSHOT.141	2	1512	1360	1024	1440	384	512	false
mod-source-record-storage	13	mod-source-record-storage:5.7.0-SNAPSHOT.247	2	5600	5000	2048	3500	384	512	false
mod-inventory	13	mod-inventory:20.1.0-SNAPSHOT.446	2	2880	2592	1024	1814	384	512	false
mod-circulation	10	mod-circulation:24.0.0-SNAPSHOT.601	2	2880	2592	1536	1814	384	512	false
mod-source-record-manager	15	/mod-source-record-manager:3.7.0-SNAPSHOT.240	2	5600	5000	2048	3500	384	512	false
mod-quick-marc	8	mod-quick-marc:5.0.0-SNAPSHOT.114	1	2288	2176	128	1664	384	512	false
nginx-okapi	8	nginx-okapi:2023.09.21	2	1024	896	128	0	0	0	false
okapi-b	9	okapi:5.0.1	3	1684	1440	1024	922	384	512	false
mod-oai-pmh	10	mod-oai-pmh:3.12.0-SNAPSHOT.362	2	4096	3690	2048	3076	384	512	false
