Hardening of OAI-PMH

Stakeholders

PO - @Magda Zacharska 

SA - @Mikhail Fokanov 

 

FOLIO's implementation of OAI-PMH was designed around the requirement that a full inventory harvest is a rare event, performed once or twice a year. Between full harvests, incremental harvests scheduled daily should ensure that all inventory changes reach the library's catalogs. In production, however, full harvests are triggered multiple times a day, sometimes concurrently and sometimes inadvertently. This page lists the proposed initiatives and stories intended to harden the OAI-PMH implementation and make it more robust.
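To illustrate the difference between a full and an incremental harvest, the sketch below builds a standard OAI-PMH `ListRecords` request restricted by the protocol's `from` and `until` arguments. The base URL and the function name are hypothetical, and the sketch assumes the repository supports seconds granularity for datestamps; `oai_dc` is used only because every OAI-PMH repository must support it, and another metadata prefix can be passed instead.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Hypothetical endpoint: a real deployment exposes its own OAI-PMH base URL.
BASE_URL = "https://folio.example.org/oai"

# Assumes the repository supports seconds granularity (YYYY-MM-DDThh:mm:ssZ).
STAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def incremental_harvest_url(last_harvest: datetime,
                            now: datetime,
                            metadata_prefix: str = "oai_dc") -> str:
    """Build a ListRecords URL limited to records changed in [last_harvest, now]."""
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        # OAI-PMH datestamps are expressed in UTC.
        "from": last_harvest.astimezone(timezone.utc).strftime(STAMP_FORMAT),
        "until": now.astimezone(timezone.utc).strftime(STAMP_FORMAT),
    }
    return f"{BASE_URL}?{urlencode(params)}"
```

Omitting `from` and `until` from the same request turns it into a full harvest, which is the expensive case this page aims to contain.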

 

| # | Date | Type | Details | Plan / Action items | Jira | Status/Timeline |
|---|------|------|---------|---------------------|------|-----------------|
| 1 | Mar 24, 2022 | Story | Noticed that two processes were created to fill the instances table of the mod-oai-pmh module | Verify whether the issue can be recreated or whether it was related to the client requesting the harvest | MODOAIPMH-403: Spike. Investigate creation of two processes to fill instances table of mod OAI PMH module (Closed) | Closed / no longer occurs |
| 2 | Mar 11, 2022 | Initiative | Investigate possible ways to limit the impact of the full harvest on inventory and SRS storage | Review available options, create required stories and prioritize the work | MODOAIPMH-400: Spike: Throttle Requests To mod-inventory-storage (Closed) | |
| 3 | Mar 29, 2022 | Story | Prevent users from inadvertently triggering multiple full harvests | Limit the maximum number of initial concurrent OAI-PMH requests per tenant | MODOAIPMH-404: Limit of max initial concurrent oai-pmh requests for tenant (Closed) | Sprint 138 |
| 4 | Mar 30, 2022 | Initiative | Determine how many full harvests the system can support before it crashes | Review available options, create required stories and prioritize the work | PERF-233: Determine how many concurrent full harvests edge and mod-oai-pmh can support (Closed) | PTF team: Sprint 137, Sprint 138 |
| 5 | Dec 15, 2021 | Initiative | Analyze how to handle client waits while OAI-PMH searches for instances with underlying records | Review available options, create required stories and prioritize the work | MODOAIPMH-383: Spike: handling client waits while oai-pmh searching for instances with underlying records (Closed) | |
| 6 | Feb 21, 2022 | Initiative | Investigate handling of invalid XML characters in the library data | Prioritize the work described in MODOAIPMH-402 | MODOAIPMH-396: Spike: Invalid XML character (Closed) | |
| 7 | Mar 21, 2022 | Story | Implement handling of invalid XML characters in the library data | Prioritize the work | MODOAIPMH-402: Verify if harvest can handle control characters in quoted literals (Closed) | |
| 8 | Mar 24, 2022 | Story | Performance testing for each release | Performance testing for the Lotus release | PERF-231: OAI-PMH Lotus release -performance testing (Closed) | PTF team: Sprint 137, Sprint 138 |
| 9 | Apr 4, 2022 | Initiative | Issues still occur; the logs need careful analysis to gather data for root cause analysis (RCA) | Perform analysis on the logs | MODOAIPMH-405: Spike - Investigate mod-oai-pmh logs when debug mode is enabled (Closed) | Sprint 137 |
| 10 | Apr 5, 2022 | Bug | | | MODOAIPMH-407: bad data in item.statisticalCodeIds stops harvest (Closed) | Sprint 138 |
| 11 | Apr 5, 2022 | Bug | | | MODOAIPMH-406: marc_21withholdings doesn't return "no records" response when expected (Closed) | Sprint 138 |
| 12 | Apr 11, 2022 | Story | Collect mod-oai-pmh instances statistics | | MODOAIPMH-408: mod-oai-pmh instances statistics (Closed) | |
| 13 | Apr 18, 2022 | Story | Build API for harvesting statistics | | MODOAIPMH-412: API for harvesting statistics (Closed) | Sprint 138 |
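As background for the invalid XML character items (MODOAIPMH-396 and MODOAIPMH-402), the sketch below shows one possible way to sanitize text before it is written into an XML response. It is not taken from mod-oai-pmh; the function name is hypothetical, and whether to drop or replace offending characters is a policy decision assumed here for illustration. The character ranges follow the `Char` production of the XML 1.0 specification.

```python
import re

# Anything outside the XML 1.0 "Char" production is not allowed in a document:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str, replacement: str = "") -> str:
    """Return text with every character that XML 1.0 forbids replaced.

    Hypothetical helper for illustration. By default the offending
    characters are removed; pass a replacement to mark them instead.
    """
    return _INVALID_XML_CHARS.sub(replacement, text)
```

For example, a title containing a NUL byte or a vertical tab comes back without those characters, while tabs, newlines, accented letters and characters outside the Basic Multilingual Plane pass through unchanged.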