OAI-PMH data harvesting (Nolana)
Overview
The purpose of the OAI-PMH tests is to measure the performance of the Nolana release and to find possible issues and bottlenecks, per PERF-332.
Environment
Module | Version | CPU (units) | Memory (soft limit, MiB) | Memory (hard limit, MiB) | Xmx (MiB) | MaxMetaspaceSize (MiB) | Tasks Count | Task Rev Number
---|---|---|---|---|---|---|---|---
mod-oai-pmh | 3.10.0 | 1024 (Morning Glory: 2048) | 2000 (Morning Glory: 1845) | 2248 (Morning Glory: 2048) | 1440 | 512 | 2 | 1 (2 with R/W split)
edge-oai-pmh | 2.5.1 | 1024 | 1360 | 1512 | 1440 (Morning Glory: 952) | 512 | 2 | 1
mod-inventory-storage | 25.0.1 | 1024 | 1952 (Morning Glory: 1684) | 2208 (Morning Glory: 1872) | 1440 | 512 | 2 | 1 (2 with R/W split)
mod-source-record-storage | 5.5.2 | 1024 | 1440 | 1536 | 908 | 512 | 2 | 1 (2 with R/W split)
okapi | 4.14.7 | 1024 | 1440 | 1684 | 922 | 512 | 3 | 1
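As a quick cross-check of the table above, each container's JVM ceiling (Xmx plus MaxMetaspaceSize) can be compared against its memory limits. The sketch below only restates the table's values and is illustrative; it is not part of the test harness:

```python
# Illustrative sketch: compare each module's JVM ceiling (Xmx + MaxMetaspaceSize)
# against its container memory limits, all in MiB. Values are copied from the
# Environment table above (Nolana figures only).
modules = {
    # name: (soft_limit, hard_limit, xmx, max_metaspace)
    "mod-oai-pmh":               (2000, 2248, 1440, 512),
    "edge-oai-pmh":              (1360, 1512, 1440, 512),
    "mod-inventory-storage":     (1952, 2208, 1440, 512),
    "mod-source-record-storage": (1440, 1536,  908, 512),
    "okapi":                     (1440, 1684,  922, 512),
}

for name, (soft, hard, xmx, metaspace) in modules.items():
    ceiling = xmx + metaspace  # heap + metaspace only; ignores stacks, direct buffers, code cache
    print(f"{name:26s} ceiling={ceiling:4d}  hard limit={hard:4d}  headroom={hard - ceiling:5d}")
```

Keep in mind that the JVM's real footprint also includes thread stacks, direct buffers, and the code cache, so headroom computed this way is only a first-order check.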
Summary
- Average response time per request with resumption token was 646 ms (compared to Morning Glory's 600 ms).
- Incremental calls performed: 81852 (Bugfest data set, 1 user, and 35 DB connections)*. Each incremental call follows the resumptionToken from the previous response; see the sketch after this list.
- Thread-block errors and subsequent OOMs happened about 50% of the time, likely due to the fast rate of incremental calls made by the JMeter test script. We did not test at a 40 requests/min rate (which in Morning Glory proved successful 100% of the time). Since this version of mod-oai-pmh contains no changes other than the upgrade to RMB v35.0.0, the OOMs seen in Morning Glory are also seen here.
- Tested with DB R/W split both enabled and disabled. Harvest durations are about the same, and the failure rate (due to OOMs) is about the same, whether DB R/W split is enabled or disabled.
* Note: The Bugfest dataset was used because it has more SRS records than PTF's dataset.
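For context, each incremental call above is a single ListRecords request that carries the resumptionToken returned by the previous response. A minimal harvesting loop in that style is sketched below, assuming a hypothetical edge-oai-pmh endpoint; the base URL and metadataPrefix are placeholders, and the actual tests were driven by a JMeter script rather than this code:

```python
# Minimal OAI-PMH harvesting loop (sketch): call ListRecords, then keep
# following the resumptionToken until the repository stops returning one.
# BASE_URL and the metadataPrefix are placeholders, not the exact test values.
import xml.etree.ElementTree as ET
import requests

BASE_URL = "https://folio-edge.example.org/oai"  # hypothetical endpoint
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

def harvest(metadata_prefix: str = "marc21") -> int:
    """Run a full harvest; return the number of incremental calls made."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    calls = 0
    while True:
        resp = requests.get(BASE_URL, params=params, timeout=120)
        resp.raise_for_status()
        calls += 1
        token = ET.fromstring(resp.content).find(f".//{OAI_NS}resumptionToken")
        if token is None or not (token.text or "").strip():
            return calls  # no resumptionToken left: harvest complete
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}

if __name__ == "__main__":
    print("incremental calls:", harvest())
```

Throttling this loop to the 40 requests/min rate mentioned above would amount to sleeping about 1.5 s between calls.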
Test Results
Test Number | Duration | R/W Split Enabled | Total Requests | Requests/sec | Average Response Time (s) | Completed
---|---|---|---|---|---|---
1 | 14h 05m | N | 81351 | 1.54 | 0.621 | Y |
2 | 14h 40m | N | 81351 | 1.6 | 0.643 | Y |
3 | 15h 20m | N | 81852 | 1.504 | 0.662 | Y |
4 | 14h 17m | N | 81852 | 1.454 | 0.683 | Y |
5 | 28m | N | 2125 | 1.406 | 0.772 | N |
6 | 14h 4m | N | 81852 | 1.593 | 0.625 | Y |
7 | 29m | N | 2263 | 1.49 | 0.735 | N |
8 | 37m | Y | 2094 | 1.36 | 0.921 | N |
9 | 14h 40m | Y | 81852 | 1.611 | 0.619 | Y |
10 | 30m | Y | 2205 | 1.568 | 0.689 | N |
11 | 14h | Y | 81852 | 1.602 | 0.622 | Y |
12 | 15h | Y | 81852 | 1.469 | 0.677 | Y |
13 | 37m | Y | 2113 | 1.339 | 0.884 | N |
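As a rough cross-check, the Requests/sec column is approximately Total Requests divided by Duration; small deviations from the table are expected because JMeter derives its throughput figure from actual sample timestamps. For example:

```python
# Rough cross-check of the Requests/sec column: total requests / duration.
def throughput(total_requests: int, hours: int, minutes: int = 0) -> float:
    return total_requests / (hours * 3600 + minutes * 60)

print(round(throughput(81852, 14, 4), 3))   # test 6: ~1.616 vs 1.593 reported
print(round(throughput(81852, 15, 20), 3))  # test 3: ~1.483 vs 1.504 reported
```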
Successful Tests
Successful Harvest: Service CPU Utilization
The following is a typical set of graphs from one of the successful harvests (without enabling DB R/W split).
mod-oai-pmh can spike to over 300% CPU initially. The other modules barely rise above their CPU baselines; mod-inventory-storage spiked for a few minutes due to the initial transfer of instances.
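The service CPU graphs referenced here are assumed to be the standard AWS CloudWatch ECS service metrics. A sketch of pulling the same datapoints programmatically is below; the region, cluster, and service names are placeholders for the PTF environment:

```python
# Sketch: fetch average/maximum ECS service CPU utilization from CloudWatch
# over roughly one harvest window. Region, cluster, and service names are
# placeholders, not the actual PTF environment values.
from datetime import datetime, timedelta, timezone
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")  # assumed region
end = datetime.now(timezone.utc)
start = end - timedelta(hours=15)  # roughly one full harvest

stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/ECS",
    MetricName="CPUUtilization",
    Dimensions=[
        {"Name": "ClusterName", "Value": "perf-cluster"},  # placeholder
        {"Name": "ServiceName", "Value": "mod-oai-pmh"},   # placeholder
    ],
    StartTime=start,
    EndTime=end,
    Period=300,  # 5-minute buckets
    Statistics=["Average", "Maximum"],
)
for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], round(point["Average"], 1), round(point["Maximum"], 1))
```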
Successful Harvest: Service Memory Usage
No issues seen with memory on a good run.
Database CPU Utilization
DB CPU utilization spikes in the first 30-60 minutes while instances are being transferred and is relatively even and smooth for the rest of the harvest. Surprisingly, some spikes also showed up on the DB read node, even though this test did not have DB R/W split enabled.
Failed Tests
The following are graphs of a failed harvest. These failed harvests happened within the first 60 minutes.
Failed Harvest: Service CPU Utilization
Unlike in a typical harvest, there is no initial mod-oai-pmh spike right off the bat; instead, CPU drags along for about 15 minutes and then spikes hugely as memory runs low, shortly before one of the two mod-oai-pmh containers crashes.
Failed Harvest: Service Memory Usage
In a failed harvest, mod-oai-pmh's memory usage started out fine but spiked to over 100%, and the container then crashed.
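One way to confirm this crash sequence is to scan the module's container logs for blocked-thread warnings followed by OutOfMemoryError. The sketch below assumes the logs land in CloudWatch Logs under a placeholder log group, and that the thread-block errors are the usual Vert.x "has been blocked" warnings:

```python
# Sketch: count OutOfMemoryError and Vert.x blocked-thread warnings in the
# module's CloudWatch log group around a failed harvest. The log group name
# and region are placeholders; only the first page of results is inspected.
from datetime import datetime, timedelta, timezone
import boto3

logs = boto3.client("logs", region_name="us-east-1")  # assumed region
LOG_GROUP = "/ecs/mod-oai-pmh"                        # placeholder log group

end = datetime.now(timezone.utc)
start = end - timedelta(hours=2)  # failed harvests died within the first hour

for pattern in ('"has been blocked"', "OutOfMemoryError"):
    resp = logs.filter_log_events(
        logGroupName=LOG_GROUP,
        filterPattern=pattern,
        startTime=int(start.timestamp() * 1000),
        endTime=int(end.timestamp() * 1000),
    )
    print(pattern, "->", len(resp["events"]), "matching events")
```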
Failed Harvest: Database CPU Utilization
RDS CPU Utilization in a failed harvest does not show any unusual pattern compared to a successful harvest.
DB R/W Split Enabled
With DB R/W split enabled, the graphs for both successful and failed harvests look the same as when DB R/W split was not enabled.
Failed Harvest (With R/W Split enabled)