OAI-PMH data harvesting (Morning Glory)
Overview
The purpose of the OAI-PMH tests is to measure the performance of the Morning Glory release and to find possible issues and bottlenecks, per PERF-263.
Environment
- mod-oai-pmh v3.9.1
- edge-oai-pmh v2.5.0
- mod-source-record-manager v3.4.1
- mod-source-record-storage v5.4.0
- mod-inventory-storage v24.0.3
- okapi v4.14.2
Specifically, the following settings were used:

| Module | CPU | Memory (soft limit, MiB) | Memory (hard limit, MiB) | Xmx (MiB) | MaxMetaspaceSize (MiB) | Tasks Count | Task Rev Number |
| --- | --- | --- | --- | --- | --- | --- | --- |
| mod-oai-pmh | 2048 | 1845 | 2048 | 1440 | 512 | 2 | 4 |
| edge-oai-pmh | 1024 | 1360 | 1512 | 952 | 128 | 2 | 3 |
| mod-inventory-storage | 1024 | 1684 | 1872 | 1440 | 512 | 2 | 8 |
Summary
- Average response time per request with resumption token was 600 ms (compared to Lotus's 850 ms).
- Incremental calls performed: 82,299 (Bugfest data set, 1 user, and 20 DB connections)*.
- OOM happened frequently when following the recommended setting (soft limit < MaxMetaspaceSize + Xmx). Only after changing to soft limit > MaxMetaspaceSize + Xmx did the harvests complete successfully (see the sketch after this list).
- Thread-block errors and subsequent OOMs happened about 50% of the time, likely due to the fast rate of incremental calls made by the JMeter test script. When the rate was reduced to 40 requests/min there were no more errors, but at such a low rate a harvest of 8M records would take over 30 hours to complete.

* Note: The Bugfest dataset was used because it has more SRS records than PTF's dataset.
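To make the soft-limit rule above concrete, here is a minimal arithmetic sketch in Java, using the mod-oai-pmh values from the settings table (all in MiB); the method name is illustrative, not part of any FOLIO API:

```java
public class MemoryHeadroomCheck {
    // The harvests only completed when soft limit > Xmx + MaxMetaspaceSize.
    // Xmx + MaxMetaspaceSize also understates the total JVM footprint
    // (thread stacks, direct buffers, code cache), so extra headroom helps.
    static boolean hasHeadroom(int softLimitMib, int xmxMib, int maxMetaspaceMib) {
        return softLimitMib > xmxMib + maxMetaspaceMib;
    }

    public static void main(String[] args) {
        // mod-oai-pmh, Tests 1-4 settings: 1845 < 1440 + 512 = 1952 -> OOM risk
        // (Tests 2-4 crashed with these settings)
        System.out.println(hasHeadroom(1845, 1440, 512)); // false
        // mod-oai-pmh, Test 5 settings: 2000 > 1952 -> completed successfully
        System.out.println(hasHeadroom(2000, 1440, 512)); // true
    }
}
```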
Test Results
Test 1
This test was done with a database freshly restored from Bugfest (Morning Glory). There was no reindexing in Elasticsearch, no recreating of the database indexes, and no ANALYZE of the inventory-storage tables.
- 8.26M records were transferred and harvested in about 19 hours.
- Each incremental call to harvest took about 811 ms, with a total of 82,300 calls (see the sketch after this list).
- No memory or CPU issues were observed.
- mod-oai-pmh CPU started out spiking up to 50% for about 40 minutes, during the initial transfer of instances.
- No memory issues were observed from the start of the test on 8/24 at 22:00.
- The RDS CPU utilization graph doesn't show any abnormalities.
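For context, each incremental call is one OAI-PMH ListRecords request that returns a page of records plus a resumptionToken pointing at the next page; the harvest ends when the token is empty or absent. Below is a minimal Java sketch of such a loop; the base URL, API key placement, and metadataPrefix are placeholder assumptions, and the regex-based token extraction is a simplification (a real harvester would parse the XML).

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OaiHarvestLoop {
    // Placeholder endpoint and API key -- substitute real edge-oai-pmh values.
    static final String BASE = "https://edge-oai.example.org/oai";
    static final String API_KEY = "REPLACE_ME";
    static final Pattern TOKEN =
            Pattern.compile("<resumptionToken[^>]*>([^<]+)</resumptionToken>");

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        String url = BASE + "?verb=ListRecords&metadataPrefix=marc21&apikey=" + API_KEY;
        long calls = 0, totalMs = 0;
        while (url != null) {
            long t0 = System.nanoTime();
            HttpResponse<String> resp = client.send(
                    HttpRequest.newBuilder(URI.create(url)).GET().build(),
                    HttpResponse.BodyHandlers.ofString());
            totalMs += (System.nanoTime() - t0) / 1_000_000;
            calls++;
            // Follow-up requests carry only the verb and the resumptionToken;
            // an empty or missing token marks the end of the harvest.
            Matcher m = TOKEN.matcher(resp.body());
            url = m.find()
                    ? BASE + "?verb=ListRecords&resumptionToken="
                        + URLEncoder.encode(m.group(1), StandardCharsets.UTF_8)
                        + "&apikey=" + API_KEY
                    : null;
        }
        System.out.printf("calls=%d avg=%dms%n", calls, totalMs / Math.max(calls, 1));
    }
}
```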
Test 2
Test 2 was done after reindexing in Elasticsearch, recreating the relevant database indexes, and analyzing the tables to update the table statistics.
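For reference, updating table statistics is a plain PostgreSQL ANALYZE on the inventory-storage tables. A minimal JDBC sketch follows; the JDBC URL, credentials, tenant schema name, and table selection are placeholders, not the actual environment's values (requires the PostgreSQL driver on the classpath).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class AnalyzeInventoryTables {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details; FOLIO table schemas follow the
        // <tenant>_mod_inventory_storage naming convention.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://db.example.org:5432/folio", "folio", "secret");
             Statement st = conn.createStatement()) {
            // Refresh planner statistics for the main inventory-storage tables.
            st.execute("ANALYZE tenant_mod_inventory_storage.instance");
            st.execute("ANALYZE tenant_mod_inventory_storage.holdings_record");
            st.execute("ANALYZE tenant_mod_inventory_storage.item");
        }
    }
}
```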
The test failed after 26 minutes with a 502 error:
HTTP 502 Service temporarily unavailable.
Please check back in a minute or two.
If the issue persists, please report it to EBSCO Connect.
- Only 3,570,000 instances got transferred.
- One mod-oai-pmh task crashed at a 106% memory level.
- 1,339 incremental API calls to harvest were made, averaging 1,173 ms each.
Tests 3 and 4
Tests 3 and 4 also suffered the same fate, running out of heap space. Shortly after the harvests were launched (during the initial transfer of instances), one of the two OAI-PMH tasks crashed, leading to a timeout on the client side, and the whole harvest came to a complete halt. Below are the memory and CPU graphs of tests 3 and 4.
Test 5
After adjusting the memory's soft limit to be greater than Xmx + MaxMetaspaceSize, the harvest did not crash and completed successfully in about 13 hours and 40 minutes.
| Module | CPU | Memory (soft limit, MiB) | Memory (hard limit, MiB) | Xmx (MiB) | MaxMetaspaceSize (MiB) | Tasks Count | Task Rev Number |
| --- | --- | --- | --- | --- | --- | --- | --- |
| mod-oai-pmh | 2048 | 2000 | 2048 | 1440 | 512 | 2 | 5 |
| edge-oai-pmh | 1024 | 1360 | 1512 | 952 | 128 | 2 | 3 |
| mod-inventory-storage | 1024 | 1684 | 1872 | 1440 | 512 | 2 | 8 |
This time the test was launched from carrier-io, and the overall timing was even better than that of the first test. Response times were much faster as well.
- 8.26M records were transferred and harvested in about 13 hours and 40 minutes.
- Each incremental call to harvest took about 592 ms, with a total of 82,300 calls (cross-checked below).
- No memory or CPU issues were observed.
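As a back-of-the-envelope cross-check (a sketch assuming strictly serial requests), multiplying the call counts by the average latencies closely reproduces the reported wall times for Tests 1, 2, and 5, and also confirms the "over 30 hours" estimate for the throttled 40 requests/min rate from the Summary:

```java
public class HarvestDurationCheck {
    // Hours of pure request time for a serial harvest: calls * avg latency.
    static double hours(long calls, double avgMs) {
        return calls * avgMs / 3_600_000.0; // ms -> hours
    }

    public static void main(String[] args) {
        System.out.printf("Test 1: %.1f h%n", hours(82_300, 811));  // ~18.5 h vs ~19 h reported
        System.out.printf("Test 2: %.2f h%n", hours(1_339, 1_173)); // ~0.44 h, i.e. ~26 min to failure
        System.out.printf("Test 5: %.1f h%n", hours(82_300, 592));  // ~13.5 h vs 13 h 40 m reported
        // Throttled JMeter rate from the Summary: 82,300 calls at 40 per minute
        System.out.printf("Throttled: %.1f h%n", 82_300 / 40.0 / 60.0); // ~34.3 h, "over 30 hours"
    }
}
```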
CPU utilization was typical for an OAI-PMH harvest, with mod-oai-pmh leading the pack: it spiked at 50% for about half an hour during the initial instance transfers, then settled down to around 5% thereafter.
okapi and its variants (nginx-okapi, pub-okapi) also spiked initially for about 10 minutes but subsided afterward.