OAI-PMH data harvesting (Nolana)

Overview  

The purpose of the OAI-PMH tests is to measure the performance of the Nolana release and to find possible issues and bottlenecks, per PERF-332.

Environment


mod-oai-pmh
  Version: 3.10.0
  CPU: 1024 (Morning Glory: 2048)
  Memory: 2000 | 2248 (Morning Glory: 1845 | 2048)
  Xmx: 1440
  MaxMetaSpaceSize: 512
  Tasks Count: 2
  Task Rev Number: 1 | 2 (R/W split)

edge-oai-pmh
  Version: 2.5.1
  CPU: 1024
  Memory: 1360 | 1512
  Xmx: 1440 (Morning Glory: 952)
  MaxMetaSpaceSize: 512
  Tasks Count: 2
  Task Rev Number: 1

mod-inventory-storage
  Version: 25.0.1
  CPU: 1024
  Memory: 1952 | 2208 (Morning Glory: 1684 | 1872)
  Xmx: 1440
  MaxMetaSpaceSize: 512
  Tasks Count: 2
  Task Rev Number: 1 | 2 (R/W split)

mod-source-record-storage
  Version: 5.5.2
  CPU: 1024
  Memory: 1440 | 1536
  Xmx: 908
  MaxMetaSpaceSize: 512
  Tasks Count: 2
  Task Rev Number: 1 | 2 (R/W split)

okapi
  Version: 4.14.7
  CPU: 1024
  Memory: 1440 | 1684
  Xmx: 922
  MaxMetaSpaceSize: 512
  Tasks Count: 3
  Task Rev Number: 1

Summary

  • Average response time per request with resumption token was 646 ms (compared to Morning Glory's 600 ms). See the harvest-loop sketch after the note below for the call pattern being measured.
  • Incremental calls performed: 81852 (Bugfest data set, 1 user, and 35 DB connections)*
  • Thread block errors and subsequent OOMs happened about 50% of the time, likely due to the fast rate of incremental calls made by the JMeter test script. We did not retest with the 40 requests/min rate (which in Morning Glory succeeded 100% of the time). Since this version of mod-oai-pmh has no changes other than the upgrade to RMB v35.0.0, the OOMs seen in Morning Glory are also seen here.
  • Tested with DB R/W split both enabled and disabled. Harvest durations are about the same, and the failure rate (due to OOMs) is about the same, whether DB R/W split is enabled or disabled.

Note: The Bugfest dataset was used because it has more SRS records than PTF's dataset.
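
For context, the sketch below illustrates the incremental call pattern that the JMeter script drives and that the response times above measure: an initial ListRecords request, then repeated requests that pass along the resumptionToken from the previous response until no token is returned. This is a minimal illustration in Python, not the actual test script; the endpoint URL, apikey parameter, and metadataPrefix are assumptions rather than the exact test configuration.

```python
# Minimal sketch of an incremental OAI-PMH harvest via edge-oai-pmh.
# BASE_URL, API_KEY, and the metadataPrefix are illustrative assumptions.
import time
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
BASE_URL = "https://edge-oai-pmh.example.org/oai"   # hypothetical edge endpoint
API_KEY = "example-api-key"                         # hypothetical API key


def harvest(metadata_prefix="marc21_withholdings", pause_sec=0.0):
    """Run one full harvest; return the number of incremental calls made."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix, "apikey": API_KEY}
    calls = 0
    while True:
        url = BASE_URL + "?" + urllib.parse.urlencode(params)
        with urllib.request.urlopen(url, timeout=120) as resp:
            root = ET.fromstring(resp.read())
        calls += 1
        token_el = root.find(f".//{OAI_NS}resumptionToken")
        token = token_el.text if token_el is not None else None
        if not token:
            return calls  # an absent or empty resumptionToken ends the harvest
        # Per the OAI-PMH protocol, follow-up calls carry only the token.
        params = {"verb": "ListRecords", "resumptionToken": token, "apikey": API_KEY}
        # A longer pause here would approximate the slower ~40 requests/min
        # pacing mentioned above, which was not retested in Nolana.
        time.sleep(pause_sec)


if __name__ == "__main__":
    print("Incremental calls performed:", harvest())
```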

Test Results

Test # | Duration | R/W Split Enabled | Total Requests | Requests/sec | Average Response Time (s) | Completed
1      | 14h 05m  | N | 81351 | 1.54  | 0.621 | Y
2      | 14h 40m  | N | 81351 | 1.6   | 0.643 | Y
3      | 15h 20m  | N | 81852 | 1.504 | 0.662 | Y
4      | 14h 17m  | N | 81852 | 1.454 | 0.683 | Y
5      | 28m      | N | 2125  | 1.406 | 0.772 | N
6      | 14h 4m   | N | 81852 | 1.593 | 0.625 | Y
7      | 29m      | N | 2263  | 1.49  | 0.735 | N
8      | 37m      | Y | 2094  | 1.36  | 0.921 | N
9      | 14h 40m  | Y | 81852 | 1.611 | 0.619 | Y
10     | 30m      | Y | 2205  | 1.568 | 0.689 | N
11     | 14h      | Y | 81852 | 1.602 | 0.622 | Y
12     | 15h      | Y | 81852 | 1.469 | 0.677 | Y
13     | 37m      | Y | 2113  | 1.339 | 0.884 | N
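
Since the harvest is driven by a single user issuing incremental calls serially, the duration of a completed run is roughly the total number of requests multiplied by the average response time. For example, for test 6: 81852 × 0.625 s ≈ 51,160 s ≈ 14.2 h, which is close to the recorded duration of 14h 4m.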

Successful Tests

Successful Harvest: Service CPU Utilization

The following is a typical set of graphs from one of the successful harvests (without enabling DB R/W split).

mod-oai-pmh's CPU utilization can spike to over 300% initially. The other modules barely rise above their baseline CPU utilization; mod-inventory-storage spiked for a few minutes due to the initial instances transfer.

Successful Harvest: Service Memory Usage

No memory issues were seen on a good run.

 

Database CPU Utilization

DB CPU utilization shows spikes in the first 30-60 minutes while instances are being transferred, and is relatively even and smooth for the rest of the harvest. Surprisingly, some spikes also showed up on the DB read node, even though this test did not have DB R/W split enabled.


Failed Tests

The following are graphs of a failed harvest. These failed harvests happened within the first 60 minutes. 

Failed Harvest: Service CPU Utilization

Unlike in a typical harvest, there isn't an initial mod-oai-pmh CPU spike right off the bat; instead, CPU drags along for about 15 minutes and then spikes sharply as memory runs low, before one of the two mod-oai-pmh containers crashed.

Failed Harvest: Service Memory Usage 

In a failed harvest, mod-oai-pmh's memory usage started out fine but spiked to over 100%, and then the container crashed.

Failed Harvest: Database CPU Utilization

RDS CPU Utilization in a failed harvest does not show any unusual pattern compared to a successful harvest.

DB R/W Split Enabled

With DB R/W Split enabled, all the graphs for successful and failed harvest cases are the same as when DB R/W split was not enabled.





Failed Harvest (With R/W Split enabled)
