OAI-PMH Support (UXPROD-993)

[UXPROD-3772] Implement Retry-after property for OAI-PMH response Created: 05/Aug/22  Updated: 30/Nov/23

Status: Analysis Complete
Project: UX Product
Components: None
Affects versions: None
Fix versions: None
Parent: OAI-PMH Support

Type: New Feature
Priority: P2
Reporter: Magda Zacharska
Assignee: Magda Zacharska
Resolution: Unresolved
Votes: 0
Labels: loc_dependency, orchid-candidate
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Defines
is defined by EDGOAIPMH-90 Retry-after header support Open
is defined by MODOAIPMH-452 Retry-after header support implementa... Open
is defined by MODOAIPMH-462 Provide endpoint for aborting a harve... Open
Epic Link: OAI-PMH Support
Front End Estimate: Out of scope
Back End Estimate: XL < 15 days
Back End Estimator: Viachaslau Khandramai (Inactive)
Back-End Confidence factor: 90%
Development Team: Firebird
PO Rank: 0
Rank: Cornell (Full Sum 2021): R5

 Description   

Current situation or problem:

Multiple concurrent harvests consume substantial FOLIO resources (especially on the inventory and SRS side) and can make the system unresponsive. Users often trigger multiple harvests without being aware that another harvest is already in progress.

With the harvest monitoring tools implemented in the Lotus release, it is possible to determine how many harvests are in progress and how many records have already been processed. Based on this information, the system should tell the harvesting software that it is busy (503 response code) and specify a Retry-After period.

In scope
When another harvest is already in progress, the response is returned with a 503 status and the Retry-After period is specified in the response header. The Retry-After header should have the format:
Retry-After: Fri, 31 Dec 1999 23:59:59 GMT
The remaining time should be calculated from the time the harvest started, the total number of records to be processed, and the number of records already processed.
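
A minimal sketch of one way the Retry-After value could be derived, assuming the harvest progresses at a roughly constant rate (simple linear extrapolation). The function names and the choice to return None when no estimate is possible are illustrative assumptions, not part of mod-oai-pmh.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from typing import Optional


def estimate_completion(started_at: datetime, total_records: int,
                        processed_records: int, now: datetime) -> Optional[datetime]:
    """Linearly extrapolate when the running harvest should finish.

    Returns None when no estimate is possible yet (nothing processed,
    or no time has elapsed). Hypothetical helper, for illustration only.
    """
    elapsed = now - started_at
    if processed_records <= 0 or elapsed <= timedelta(0):
        return None
    remaining_records = max(total_records - processed_records, 0)
    # Time per processed record so far, scaled to the records still pending.
    remaining_time = elapsed * (remaining_records / processed_records)
    return now + remaining_time


def retry_after_header(completion: datetime) -> str:
    """Format an aware datetime as an HTTP-date, e.g. 'Fri, 31 Dec 1999 23:59:59 GMT'."""
    return format_datetime(completion.astimezone(timezone.utc), usegmt=True)


# Example: harvest started 10 minutes ago and is halfway through 10,000 records.
started = datetime(1999, 12, 31, 23, 39, 59, tzinfo=timezone.utc)
now = datetime(1999, 12, 31, 23, 49, 59, tzinfo=timezone.utc)
completion = estimate_completion(started, 10_000, 5_000, now)
print(retry_after_header(completion))  # Fri, 31 Dec 1999 23:59:59 GMT
```

A real implementation would also need a fallback value for the case where no records have been processed yet, since the extrapolation is undefined at that point.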

The response should also include:

  • RequestId of the harvest currently in progress
  • Time the harvest started
  • Total number of records to be harvested
  • Number of records already harvested
  • Estimated time to completion (based on the parameters mentioned above)
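
The response described above could be assembled roughly as follows. This is a sketch only: the JSON body, the field names, and the HarvestStatus type are assumptions made for illustration, and the ticket does not specify how these values are to be serialized.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from email.utils import format_datetime


@dataclass
class HarvestStatus:
    """Hypothetical snapshot of the harvest currently in progress."""
    request_id: str
    started_at: datetime
    total_records: int
    processed_records: int
    estimated_completion: datetime


def busy_response(status: HarvestStatus):
    """Build a (status code, headers, body) triple for a 'system busy' reply."""
    headers = {
        # HTTP-date form of the estimated completion time.
        "Retry-After": format_datetime(
            status.estimated_completion.astimezone(timezone.utc), usegmt=True),
        "Content-Type": "application/json",  # assumed; not specified in the ticket
    }
    body = json.dumps({
        "requestId": status.request_id,
        "startedAt": status.started_at.isoformat(),
        "totalRecords": status.total_records,
        "processedRecords": status.processed_records,
        "estimatedCompletion": status.estimated_completion.isoformat(),
    })
    return 503, headers, body
```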

Resending requests is the consumer's responsibility.
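
On the consumer side, a harvester could work out how long to wait before resending along these lines. The helper name is hypothetical; the sketch accepts both forms that HTTP allows for Retry-After (an HTTP-date, as proposed in this ticket, or a number of seconds) and does not handle malformed values.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def seconds_to_wait(retry_after: str, now: datetime) -> float:
    """Return how many seconds a client should wait before resending.

    `retry_after` is the raw Retry-After header value; `now` must be an
    aware datetime. Hypothetical helper, for illustration only.
    """
    value = retry_after.strip()
    if value.isdigit():
        # Delta-seconds form, e.g. "120".
        return float(value)
    # HTTP-date form, e.g. "Fri, 31 Dec 1999 23:59:59 GMT".
    retry_at = parsedate_to_datetime(value)
    if retry_at.tzinfo is None:
        # Treat a date without zone information as UTC (assumption).
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    # Never return a negative wait if the date is already in the past.
    return max((retry_at - now).total_seconds(), 0.0)
```

A client would typically sleep for the returned number of seconds and then resend the original request.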

Questions:
Determine the exact number of concurrent harvests that can be supported. Here is the related PTF report: https://folio-org.atlassian.net/wiki/pages/viewpage.action?pageId=1392949

Proposed solution/stories

Links to additional info
OAI-PMH documentation: http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm#HTTPResponseFormat
FOLIO OAI-PMH performance tests for concurrent harvests: https://folio-org.atlassian.net/wiki/pages/viewpage.action?pageId=1392949#OAIPMHdataharvesting[LOTUS]-Observations
FOLIO OAI-PMH harvest monitoring APIs: https://github.com/folio-org/mod-oai-pmh#harvesting-statistics-api


Generated at Fri Feb 09 00:34:41 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.