OAI-PMH data harvesting [LOTUS]

Overview 

The purpose of this set of tests is to measure the performance of the Lotus release and to find possible issues and bottlenecks (PERF-231, PERF-233: OAI-PMH Lotus release performance testing).

Environment

  • mod-oai-pmh:3.7.0  
  • edge-oai-pmh:2.4.2
  • mod-source-record-manager:3.3.0
  • mod-source-record-storage:5.3.0
  • mod-inventory-storage:23.0.0
  • okapi:4.13.0


Summary

  • Average response time per request with resumption token: 0.852 s (compared to Kiwi's 0.874 s, ~2.5% faster).
  • No memory leaks or unexpected CPU spikes
  • The issue of long response times caused by absent underlying records for instances, which may lead to timeouts, still exists (MODOAIPMH-390).
  • Incremental calls performed: 35,667 (PTF data set); the test failed due to a timeout.
  • Incremental calls performed: 99,477 (Bugfest data set, 1 user and 20 DB connections)*
  • Incremental calls performed: 43,968 (Bugfest data set, 1 user and 35 DB connections)*
  • Incremental calls performed: 68,350 (Bugfest data set, 5 users and 20 DB connections)*

Note: The Bugfest data set was used because it has more SRS records than the PTF data set.

Observations

  1. Two identical tests may produce a different number of calls from edge-oai-pmh, because mod-oai-pmh retrieves instances from the database in random order and, when there are no underlying records for an instance, it calls the DB one more time. Since this happens in random order, it can take a different number of calls to harvest the same amount of data.
  2. A risk of client timeouts exists if the data set is missing a lot of underlying SRS records.
  3. There is a large number of timeouts while data is transferred from mod-inventory-storage to mod-oai-pmh, which can lead to data loss (and may also cause a different number of requests from the client side to complete a job).
  4. All tests ended as expected and without errors on the client side. Lotus appears to be more stable than Kiwi.
  5. The test with DB_MAXPOOLSIZE=35 is less stable due to more database timeouts occurring on mod-oai-pmh; as a result, roughly 40% of the records were missed.
  6. For PERF-233 we performed 5 concurrent harvests, which seemed to work as expected (each harvest ran as a separate process with its own request_id). However, due to the high load there were initially many DB connection timeouts on mod-oai-pmh, and as a result fewer instances were transferred than expected. Below is an example of the number of instances transferred from mod-inventory-storage to the oai-pmh database for each of the concurrent harvests (the expected number of instances per transfer is 8,158,821, verified with one user). A sketch of how such concurrent harvests can be driven follows the table.
Harvest Request ID                   | Total Instances Transferred | Percentage of expected
0abdab4c-2efd-484d-99dc-c7d642c7e974 | 6,488,821                   | 79%
a1440c4a-971d-4f03-acf8-f2e1f5108b77 | 6,648,821                   | 79%
7ac9163d-8102-4d92-acf8-56215d192523 | 8,158,821                   | 100%
c2c0b6d2-5f04-428c-8366-5c2e50f65c38 | 8,128,821                   | 100%
eca40f67-bd62-4d1f-a935-42c6239670bc | 6,498,821                   | 79%
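
For illustration only, below is a minimal sketch of how 5 concurrent harvests might be driven from the client side. run_harvest is a hypothetical placeholder for the full harvesting loop described in the Test flow section; the actual tests were driven by a load-testing tool.

from concurrent.futures import ThreadPoolExecutor

def run_harvest(worker_id: int) -> str:
    # Hypothetical placeholder for one full harvesting loop (initial call
    # plus resumptionToken calls). mod-oai-pmh assigns each concurrent
    # harvest its own request_id on the server side.
    return f"harvest {worker_id} finished"

# Five harvests started concurrently, mirroring the PERF-233 test.
with ThreadPoolExecutor(max_workers=5) as pool:
    for result in pool.map(run_harvest, range(5)):
        print(result)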


Test flow

The test consists of a few types of calls:

Initial call, performed only once:

/oai/records?verb=ListRecords&metadataPrefix=marc21_withholdings&apikey=[APIKey]

Subsequent harvesting calls:

/oai/records?verb=ListRecords&apikey=[APIKey]&resumptionToken=[resumptionToken]


These calls were performed repeatedly, harvesting 100 records each time, until there was no more data to harvest in the [tenant]_mod_oai_pmh.instances table.

The batch size behind [resumptionToken] was set to 100 records; the token was returned in the initial call response and in each harvesting response until there were no more records to harvest. Once all data had been harvested, the resumptionToken was no longer returned with the response.
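
To make the flow concrete, here is a minimal sketch of the harvesting loop described above. The host URL is a hypothetical placeholder, [APIKey] is left as in the examples above, and a real client would parse the OAI-PMH XML rather than use a regex.

import re

import requests

EDGE_HOST = "https://edge-oai-pmh.example.org"  # hypothetical placeholder
API_KEY = "[APIKey]"

TOKEN_RE = re.compile(r"<resumptionToken[^>]*>([^<]+)</resumptionToken>")

def harvest() -> int:
    """Run one full harvest and return the number of calls performed."""
    # Initial call, performed only once.
    url = (f"{EDGE_HOST}/oai/records?verb=ListRecords"
           f"&metadataPrefix=marc21_withholdings&apikey={API_KEY}")
    calls = 0
    while True:
        response = requests.get(url, timeout=120)
        response.raise_for_status()
        calls += 1
        match = TOKEN_RE.search(response.text)
        if match is None:
            break  # no resumptionToken in the response: harvest is complete
        # Subsequent harvesting call with the returned resumptionToken.
        url = (f"{EDGE_HOST}/oai/records?verb=ListRecords"
               f"&apikey={API_KEY}&resumptionToken={match.group(1)}")
    return calls

if __name__ == "__main__":
    print(f"Calls performed: {harvest()}")

Each iteration fetches one page of 100 records; when a response no longer contains a resumptionToken, the harvest is complete.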


Test Results

Test 1 (PTF data set)

Duration: 6 hr 24 min

Calls performed: 35,667

Average response time: 0.642 s

Test 2 (Bugfest data set)

Calls performed: 99,477

*The part before the Vusers spike contains 68,791 calls.

Duration: ±20 h

Average response time: 0.852 s