2018-05-14 - Data Migration Subgroup Agenda and Notes
Date
2018-05-14
Attendees
- Patty Wanninger
- Christopher Creswell
- Matthew Harrington
- Michelle Suranofsky
- Wayne Schneider
- Sharon Beltaine
- Dale Arntson
- Guest appearance by Christie Thomas
Goals
Discussion items
Time | Item | Who | Notes |
---|---|---|---|
| | Welcome | | Tod Olson convened the meeting. |
| | MARC Batch Loader Subgroup | | A new subgroup formed: https://discuss.folio.org/t/marc-batch-loads-into-folio-new-subgroup/1792 Discussion: there seems to be enough representation on the subgroup to suit the needs of the migration/sys-ops group. If communication is needed, Wayne will be "keeping an ear out." Any information and questions can be sent to Ann-Marie Breaux. |
| | Migration Tools | | Discussion: this topic was mixed with a review of WOLFcon. Wayne recapped the session he led, which laid out requirements for a useful tool that could load different kinds of data. At that session, the group described a tool that could load many kinds of data into the relevant JSON storage format. In addition, the loader would be able to handle MARCXML and the MARC Format for Holdings Data for inventory records, which include instance, holdings, and item. The tool might also be required to manage storage; for example, it could manage the UUID → legacy ID mapping. Dale would like the functions of MARC loading and managing storage in FOLIO to be optional. We discussed the need for Sys-Ops to have a product owner who could take this requirement and express it as a JIRA ticket so it could be evaluated and given developer time. Tod agreed to write a proposal for a PO to be reviewed at Thursday's meeting; Sharon Beltaine will share the Reporting SIG's request with Tod. Wayne had several ideas about the loader: he thinks it should be a separate FOLIO module, and that a bulk load API might be designed to work for both migration and ongoing loads. Data migration requires very high performance and more error checking, and it does not load into a live system. The current API, which posts JSON one record at a time, is very slow; it took U Chi 16 hours to load 70,000 users. |
| | Test Data | | Discussion: The goal is to provide test data that looks and behaves like real data. Dale said we should be using real data in CI/CD and performance testing. Patty said that EBSCO intends to set up a demo system with a fictional university library with various scenarios and real-looking data. Perhaps other early implementers will want a similar sandbox. The test data needs to include (1) production-type data and (2) engineered edge cases. |
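As a rough illustration of two points from the Migration Tools discussion, the sketch below shows one way a loader might keep a UUID → legacy ID mapping and group records into batches rather than posting them one at a time. It is only a sketch under stated assumptions, not a design the group agreed on: the namespace name, the `legacyId` field, the batch size, and all function names are hypothetical, and no real FOLIO API is called. The last helper simply works out the per-record cost implied by the 16 hours / 70,000 users figure.

```python
import uuid
from itertools import islice

# Hypothetical namespace; any fixed UUID gives repeatable uuid5 values.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "migration.example.org")


def assign_uuids(legacy_ids):
    """Map each legacy ID to a UUID string that is stable across reruns."""
    return {lid: str(uuid.uuid5(NAMESPACE, lid)) for lid in legacy_ids}


def batches(records, size):
    """Yield lists of at most `size` records, in order."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def seconds_per_record(hours, count):
    """Average load time per record for a run of `hours` over `count` records."""
    return hours * 3600 / count


# Build records carrying both IDs ("legacyId" is an assumed field name).
id_map = assign_uuids(["u0001", "u0002", "u0003"])
users = [{"id": new_id, "legacyId": old_id} for old_id, new_id in id_map.items()]

# A bulk loader would send each batch in one request instead of one per record.
batch_sizes = [len(b) for b in batches(users, 2)]

# Per-record cost of the one-at-a-time load reported in the notes.
per_record = seconds_per_record(16, 70_000)
```

Deriving the UUIDs deterministically is one option among several; it lets a migration be rerun without storing the mapping, whereas randomly generated UUIDs would require the loader to persist the mapping table itself.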
Action items
- Michelle Suranofsky will be the convener next week.