Verifying the connection to OAI-PMH service

...

When a harvest starts, the system first checks the inventory for updates (new or modified records) and then retrieves the underlying MARC bibliographic records from SRS. If the harvest was triggered with the metadataPrefix set to marc21_withholdings, the holdings and items data is appended as described in MODOAIPMH-102.

Max number of records per response

...

The "from" and "until" parameters can include a timestamp, in which case the format is yyyy-mm-ddThh:mm:ssZ. This is especially helpful during troubleshooting, when the harvest needs to be limited to the specific hours in which the records were updated. The specified time is in UTC (Coordinated Universal Time).
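A minimal sketch of building a time-limited harvest request in Python is shown here. The base URL is a hypothetical placeholder, not taken from this page, and the function names are illustrative only. The sketch converts a timezone-aware datetime to UTC and formats it in the timestamp pattern described above.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Hypothetical base URL (assumption); replace with the OAI-PMH endpoint of your installation.
BASE_URL = "https://folio.example.org/oai"


def oai_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC timestamp (yyyy-mm-ddThh:mm:ssZ)."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def list_records_url(start: datetime, end: datetime,
                     metadata_prefix: str = "marc21_withholdings") -> str:
    """Build a ListRecords URL limited to the window between start and end."""
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "from": oai_timestamp(start),
        "until": oai_timestamp(end),
    }
    # urlencode percent-encodes the colons in the timestamps.
    return f"{BASE_URL}?{urlencode(params)}"
```

For example, passing two UTC datetimes one hour apart limits the harvest to the records updated within that hour.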

  • Consecutive requests (with resumption token):

...

  • Running multiple full harvests concurrently.  Note that sharing the harvesting link with multiple users will most likely lead to multiple harvests starting at the same time.
  • Running a full harvest during large updates to inventory and SRS records (importing data, reloading data).

Time required to complete

...

How long it will take for the harvest to complete depends on:

...

Sending a request with the request identifier, for example GET /oai/request-metadata/{requestId}/failed-to-save-instances, returns the list of UUIDs of the records that failed to save.  The list can be used for troubleshooting existing data issues in FOLIO inventory or SRS.  For more information, see the mod-oai-pmh README.md file or the FOLIO API documentation.
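As a rough illustration, the request could be prepared in Python as follows. The gateway URL is a hypothetical placeholder, and the use of the x-okapi-tenant and x-okapi-token headers is an assumption about a typical Okapi deployment rather than something stated on this page. The sketch only builds the request and does not send it.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical Okapi gateway URL (assumption); replace with your own.
OKAPI_URL = "https://okapi.example.org"


def failed_instances_request(request_id: str, tenant: str, token: str) -> Request:
    """Build (but do not send) the GET request for the failed-to-save list."""
    path = f"/oai/request-metadata/{quote(request_id, safe='')}/failed-to-save-instances"
    headers = {
        # Assumed Okapi headers; adjust to your authentication setup.
        "x-okapi-tenant": tenant,
        "x-okapi-token": token,
    }
    return Request(OKAPI_URL + path, headers=headers, method="GET")
```

The returned object can be passed to urllib.request.urlopen to execute the call. The shape of the response body is described in the mod-oai-pmh documentation and is not assumed here.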

Suppressed records

When the Suppressed records processing setting (Settings → OAI-PMH → Behavior) is set to "Transfer suppressed records with discovery flag value", records marked as suppressed are included in the response with an added subfield t.

...

Deleted record example:

<record>
  <header status="deleted">
    <identifier>oai:edge-bugfest-iris.folio.ebsco.com:fs09000000/ce064ce6-3d9c-4765-a3cf-564289f59b58</identifier>
    <datestamp>2021-10-22T18:50:22Z</datestamp>
    <setSpec>all</setSpec>
  </header>
</record>
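A harvesting client can recognize such records by the status attribute of the header. The following Python sketch parses the fragment shown above. It assumes a namespace-free fragment exactly as printed here. A complete OAI-PMH response declares an XML namespace, so the tag names would need to be qualified accordingly. The function name is illustrative only.

```python
import xml.etree.ElementTree as ET

# The deleted record fragment from the example above.
SAMPLE = """<record>
<header status="deleted">
<identifier>oai:edge-bugfest-iris.folio.ebsco.com:fs09000000/ce064ce6-3d9c-4765-a3cf-564289f59b58</identifier>
<datestamp>2021-10-22T18:50:22Z</datestamp>
<setSpec>all</setSpec>
</header>
</record>"""


def deleted_identifiers(xml_text: str) -> list:
    """Return the identifiers of all headers marked status="deleted"."""
    root = ET.fromstring(xml_text)
    identifiers = []
    for header in root.iter("header"):
        if header.get("status") == "deleted":
            identifiers.append(header.findtext("identifier"))
    return identifiers
```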


If you use API calls to delete Instances, the following steps are required to ensure that the discovery system is updated as well:

  • Set leader position 05 (ldr05) to "d" in the underlying SRS record.  This can be done by editing the record in QuickMarc.
  • Let the incremental harvest pick up the information about the deleted record.
  • Delete the instance record and the corresponding SRS record through an API call.
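To make the first step concrete, the sketch below shows which character of the MARC leader changes. It is an illustration only and is not how QuickMarc or SRS performs the edit. The function name is hypothetical.

```python
def mark_leader_deleted(leader: str) -> str:
    """Return the MARC leader with position 05 (record status) set to 'd'."""
    if len(leader) != 24:
        raise ValueError("a MARC leader is exactly 24 characters long")
    # Positions are zero-based, so position 05 is the sixth character.
    return leader[:5] + "d" + leader[6:]
```

For a leader such as "00714cam a2200205 a 4500", only the sixth character (here "c") is replaced with "d".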

Slow Performance 

Harvesting 5 million records should take no more than 11 hours in Juniper and less than 10 hours in Kiwi.  It is highly recommended to run REINDEX, VACUUM and ANALYZE after major updates to the inventory tables in the PostgreSQL database, and to run ANALYZE on a regular basis.

REINDEX and ANALYZE commands:

-- Replace <tenant> with the tenant identifier.
REINDEX INDEX <tenant>_mod_inventory_storage.audit_item_pmh_createddate_idx;
REINDEX INDEX <tenant>_mod_inventory_storage.audit_holdings_record_pmh_createddate_idx;
REINDEX INDEX <tenant>_mod_inventory_storage.holdings_record_pmh_metadata_updateddate_idx;
REINDEX INDEX <tenant>_mod_inventory_storage.item_pmh_metadata_updateddate_idx;
REINDEX INDEX <tenant>_mod_inventory_storage.instance_pmh_metadata_updateddate_idx;

-- Refresh planner statistics for the inventory tables.
ANALYZE VERBOSE <tenant>_mod_inventory_storage.instance;
ANALYZE VERBOSE <tenant>_mod_inventory_storage.item;
ANALYZE VERBOSE <tenant>_mod_inventory_storage.holdings_record;

Forcing record updates

Staff actions on Instances, Holdings, and Items through the UI automatically trigger updates. However, API-based processes that modify the discovery flag, locations, or other fields through the storage endpoints do not trigger an update. To force an update, issue a GET to one of the following /inventory endpoints and then PUT the retrieved record back unchanged. Although this process is slow, a maximum of 4 threads is recommended:

  • /inventory/instances
  • /inventory/holdings
  • /inventory/items
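The GET-then-PUT cycle with a bounded number of threads can be sketched in Python as follows. The get_record and put_record callables are hypothetical wrappers around the HTTP calls to one of the endpoints listed above. They are not part of any FOLIO client library and must be supplied by the caller.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional

MAX_THREADS = 4  # upper bound recommended above


def touch_records(
    record_ids: Iterable[str],
    get_record: Callable[[str], dict],
    put_record: Callable[[str, dict], None],
) -> List[str]:
    """GET each record and PUT it back unchanged; return the IDs that failed."""

    def touch(record_id: str) -> Optional[str]:
        try:
            put_record(record_id, get_record(record_id))
            return None
        except Exception:
            # Collect the failure instead of aborting the whole run.
            return record_id

    with ThreadPoolExecutor(max_workers=MAX_THREADS) as pool:
        results = list(pool.map(touch, record_ids))
    return [record_id for record_id in results if record_id is not None]
```

Returning the failed identifiers, rather than raising on the first error, lets a long run finish and leaves a list that can be retried afterwards.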