2019-07-01 - Data Migration Subgroup Agenda and Notes

Date

2019-07-01 at 11 EST

Link to meeting: https://zoom.us/j/204980147

Discussion items

Time | Item | Who | Notes
0 | New meeting link | Dale | Please note: we will be meeting using a new Zoom link going forward: https://zoom.us/j/204980147 (the old one expired and is no longer available).
5 | Welcome and assign note taker | Dale | Welcome and request for someone to take notes. Notetaker: Tod Olson
30 | Discussion | various

We will discuss bulk API requirements for migration.

Some Jiras of relevance: FOLIO-1932, FOLIO-2050, UXPROD-1826, MODINVSTOR-295, MODINVSTOR-296

And for some background and discussion from other sites (cited by Anatolii Starkov):

https://evertpot.com/http/207-multi-status
https://apihandyman.io/api-design-tips-and-tricks-getting-creating-updating-or-deleting-multiple-resources-in-one-api-call/
https://medium.com/paypal-engineering/batch-an-api-to-bundle-multiple-paypal-rest-operations-6af6006e002
https://developers.google.com/gmail/api/guides/batch

This is a complex topic, and we will do well to get oriented in this session and discuss what is possible. We can continue with requirements and stories in future sessions.

Issues include:

  • APIs used in migration need to support updates and overlays so that records can be updated post-migration. This helps load the delta if the old production system keeps running while the migration is under way.
    • Might be able to get away without it for bibliographic, but maybe not for circulation or acquisitions.
    • May be simpler to be offline during migration.
    • Need to come to some conclusion about the extent to which we need to support overlays or gap loads.
      • May treat new records and updates differently.
      • PostgreSQL offers support for the UPSERT concept, which is one implementation possibility, enabled with a query parameter.
      • A more RESTful option might be to have separate support for POST, PUT, etc.
    • Wayne Schneider does not think we have stories for a batch overlay; we may need one.
    • Jon Miller would like to see some support for projections, where you state only which parts of the data you wish returned. This simplifies client code.
    • Jeremy Huff points out that these would be useful, but we may need to prioritize. Probably best to prioritize batch POST and PUT, since that provides a path forward even if it does not take complexity out of the client.
    • One complication is the need to keep track of the relationship between the old IDs and the FOLIO UUIDs, so that you can ensure the correct records get updated and relationships between entities are preserved.
    • There's an outstanding question about where we are with batch POST and rolling it out to modules. Please comment on FOLIO-2050.
    • Need to have good error feedback on which records cause failures, and whether a failure applies to the entire batch or to single records. There may be a tradeoff between performance and granular error reporting, likely more of a problem when data is less controlled.
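
To make several of the points above concrete (treating new records and updates differently, tracking old IDs against FOLIO UUIDs, and reporting errors per record rather than per batch), here is a minimal Python sketch. It is not FOLIO code and does not reflect any existing module API: `BatchStore`, `upsert_batch`, the `legacyId` field, and the choice of status codes are all assumptions made for illustration, and an in-memory dictionary stands in for a storage module.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class BatchStore:
    """Hypothetical in-memory stand-in for a storage module."""
    records: dict = field(default_factory=dict)  # UUID -> record body
    id_map: dict = field(default_factory=dict)   # legacy ID -> UUID

    def upsert_batch(self, batch):
        """Create or overlay each record and return one result per record."""
        results = []
        for item in batch:
            legacy_id = item.get("legacyId")
            body = item.get("body")
            if not legacy_id or not isinstance(body, dict):
                # A bad record is reported individually; the rest of the batch proceeds.
                results.append({"legacyId": legacy_id, "status": 422,
                                "error": "missing legacyId or body"})
                continue
            if legacy_id in self.id_map:
                # Overlay: the legacy ID is already mapped, so update that record.
                record_id = self.id_map[legacy_id]
                self.records[record_id].update(body)
                status = 200
            else:
                # New record: mint a UUID and remember which legacy ID it came from.
                record_id = str(uuid.uuid4())
                self.id_map[legacy_id] = record_id
                self.records[record_id] = dict(body)
                status = 201
            results.append({"legacyId": legacy_id, "id": record_id, "status": status})
        return results


# Initial load followed by a gap load that overlays the same legacy record.
store = BatchStore()
initial = store.upsert_batch([
    {"legacyId": "b1001", "body": {"title": "First title"}},
    {"legacyId": None, "body": {"title": "No legacy ID"}},
])
gap = store.upsert_batch([
    {"legacyId": "b1001", "body": {"title": "Corrected title"}},
])
```

The per-record result list is in the spirit of the multi-status responses described in the articles linked above. Whether a real bulk API should reject the whole batch on the first failure instead is exactly the performance versus granularity tradeoff noted in the last bullet.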

Link to Acquisitions Interface Fields
Link to FOLIO Record Data Elements (contains links to specific spreadsheets, but most of them are not up to date)

Action Items

Dale will invite Jeremy Huff to demo the workflow POC in the latter half of July.

The SIG will work on bulk load requirements for the PO to take to the development team.