2018-02-19 - Data Migration Subgroup Agenda and Notes

Date

Zoom Connect Information

Topic: Data Migration Subgroup

Time: Feb 19, 2018 11:00 AM Eastern Time (US and Canada)

    Please download and import the following iCalendar (.ics) files to your calendar system.

    Weekly: https://zoom.us/meeting/276260561/ics?icsToken=7a985191c2234bd670b45e98a048b9e934791fc63f5d73a99b35c234efaae7ba

Join from PC, Mac, Linux, iOS or Android: https://zoom.us/j/276260561

Attendees

Goals

  • One thing that's on our minds at Chicago: we need loading APIs defined for all of the data we have to migrate, far enough in advance for us to work on our data conversion, test workflows, and give feedback on gaps.

    We have a few that we can start working with, but we'll need time with all of them.

  • Define the types of data that need to be migrated in order to work with the system (e.g. bibs, holdings, items, ...)
  • Acceptable load times

Discussion items

Time | Item | Who | Notes
5 min | Welcome and Introductions, call for Note taker

Note taker: Ingolf

1 min | Subgroup convener(s)

Chris Manly will convene next week.

1 min | Reminder: space for document sharing

The space for sharing our documents is here.

10 min | Progress on setting up instances for loading test data | All

The demo of the single-server installation is ongoing in the SysOps SIG. We will discuss it in next Thursday's session. There is also the sys-ops Slack channel for discussion.

15 min | Jon Miller at UChicago will discuss his experience with loading users | Dale and Tod

UChicago (Jon Miller) has loaded 90,000 users in 2 h 30 min. For a user load this is just acceptable, but extrapolated to an inventory load it will be too slow. The Import API calls the regular User API each time a user is created, so it commits one user at a time. This procedure must be turned into a real bulk load procedure.
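To make the extrapolation concrete, here is a rough sketch of the arithmetic. Only the 90,000 users and the 2 h 30 min come from the report above; the inventory size of 5 million records and the batch size of 1,000 are assumptions chosen for illustration, and `chunks` is a generic helper that does not correspond to any FOLIO endpoint.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def load_rate(records: int, seconds: float) -> float:
    """Records loaded per second."""
    return records / seconds


def projected_hours(records: int, rate: float) -> float:
    """Hours needed to load `records` at `rate` records per second."""
    return records / rate / 3600


def chunks(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


# Reported figures: 90,000 users in 2 h 30 min.
user_rate = load_rate(90_000, 2.5 * 3600)  # 10.0 users per second

# ASSUMPTION: an inventory of 5 million records, loaded at the same rate.
hypothetical_inventory_size = 5_000_000
inventory_hours = projected_hours(hypothetical_inventory_size, user_rate)

# ASSUMPTION: batches of 1,000 records per request instead of one per request.
single_requests = 90_000
batched_requests = sum(1 for _ in chunks(range(90_000), 1_000))

print(f"{user_rate:.1f} users/s")
print(f"{inventory_hours:.1f} h for the hypothetical inventory")
print(f"{single_requests} requests one at a time, {batched_requests} in batches")
```

At the reported rate, an inventory of that assumed size would take several days rather than one, which is why a one-commit-per-record procedure does not scale to inventory loads.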

This issue should be put in JIRA so that developers (and we) can track its status.

Inventory loading is more important. We should keep up the "illusion" that we can do data migration within a day (yes? different opinions).

15 min | Theodor Tolstoy and Patty Waninger will discuss mapping III Sierra data into the FOLIO inventory module | Theodor Tolstoy and Patty Waninger

Conversion table
GitHub repo
Inventory Metadata Elements (alpha) in the FOLIO wiki.

JSON objects showing what Items and Holdings would look like in FOLIO. Mapping Sierra Items to FOLIO Items.

Reference data: how do we say what our material types are?
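As a minimal sketch of what such a mapping step could look like: the field names on both sides (`itype`, `barcode`, `location`, `materialType`) and the code table are hypothetical placeholders, not the actual Sierra export layout or the FOLIO item schema. The real mapping is in the conversion table linked above.

```python
import json

# ASSUMPTION: illustrative reference data mapping Sierra item-type codes
# to material type names. Real codes differ per institution.
MATERIAL_TYPES = {"0": "book", "7": "microfilm"}


def map_item(sierra_item: dict) -> dict:
    """Map one Sierra-style item dict to a FOLIO-style item dict.

    Raises ValueError when the item type has no entry in the reference
    data, so unmapped codes surface before loading.
    """
    code = sierra_item["itype"]
    if code not in MATERIAL_TYPES:
        raise ValueError(f"unmapped item type code: {code!r}")
    return {
        "barcode": sierra_item["barcode"],
        "materialType": MATERIAL_TYPES[code],
        "location": sierra_item["location"],
    }


example = map_item({"barcode": "12345", "itype": "7", "location": "stacks"})
print(json.dumps(example, indent=2))
```

Failing loudly on an unmapped code is one possible design choice; it forces the reference data question (what are our material types?) to be answered before any records are loaded.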

Download of Items from Sierra.

Intersection with Resource Access and Metadata Management. Do we need to take something about this to RA & MM? We need a description of these fields. What was the definition of, and the expectation for, this type of data? "Data can go sour".

We need to fix the data before we stick it into FOLIO! For example, if an item is a microfilm, then the loan rule says "the item doesn't circulate".
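One possible reading of the microfilm example is a consistency check that runs before loading. The sketch below assumes hypothetical `materialType` and `circulates` fields and an illustrative set of non-circulating types; none of these names come from Sierra or FOLIO.

```python
from typing import Dict, List

# ASSUMPTION: material types whose loan rule says the item does not circulate.
NON_CIRCULATING_TYPES = {"microfilm"}


def find_conflicts(items: List[Dict]) -> List[Dict]:
    """Return items whose circulation flag contradicts their material type."""
    return [
        item
        for item in items
        if item["materialType"] in NON_CIRCULATING_TYPES
        and item.get("circulates", False)
    ]


sample_items = [
    {"barcode": "1", "materialType": "book", "circulates": True},
    {"barcode": "2", "materialType": "microfilm", "circulates": True},
    {"barcode": "3", "materialType": "microfilm", "circulates": False},
]

conflicts = find_conflicts(sample_items)
print([item["barcode"] for item in conflicts])
```

Records flagged this way would be corrected in the source data or in the conversion step, rather than after they are in FOLIO.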

Sierra holds data which are not included in the item ==> denormalization.

Next Steps: Inventory Management is under RM, Charlotte is PO. Status: Under active development through March.

Concept of Locations in FOLIO is under discussion in the RM SIG, "sore point" at the moment. This group is not part of the discussion.

Campus Identity Management Feed for their users in Chicago (Tod). Currently the users are refreshed every day from Campus Identity Source. Some patrons have to be imported manually, e.g. those with outstanding fees. (User data with) fines and fees have to be kept for 7 years by state law. User data mapping is pretty straightforward, a little confusing with date fields.

Theodor: Load of 15,000 mock users as a task for next week.

20 min | Continue discussion of data coverage and data mapping to the FOLIO APIs | All

Ann Highsmith's spreadsheets are at https://drive.google.com/drive/folders/1aUMIqc4SwRzOGR4yPzrRSRElwM2CQ3uY

Uschi Klute's spreadsheet with fields for patron data in LBS (Library System used in GBV): https://drive.google.com/open?id=1EucKWdISlcG8EpfPY34hrvWqCsZi2v-s

  • 5 fields with library specific meanings. Spreadsheet: Important fields for User Data Migration from GBV. No good support for this level of detail. "User Details App". Institutions have to bring these issues to the Users Management SIG - outside of our purview. But we can force this issue.


We need to talk further about the spreadsheets.





Action items

  • Jon Miller will bring the issue of the long loading times of the User Import API to Index Data. The API should also add permissions to users.
  • Theodor and Charlotte will have a meeting about the Locations / Inventory Management issue.
  • (Chris Manly) Get Charlotte on next week's call
  • Theodor: load of 15,000 mock users