AHighsmith demoed the voyager_item_to_folio_item.tsv spreadsheet, asking for comments on format and decision points.
PWanninger pointed out that some fields, such as price, might map to other records, such as acquisitions. PWanninger also asked why item_type.? was identified with a question mark in the source field; AHighsmith responded that the question mark correlated with the comment and indicated, in this case, that the tenant had to decide which of 3 data elements from the item_type table should be used.
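As an illustration of how such a mapping sheet could be scanned for decision points and for source fields with no destination, here is a minimal sketch. The column names (source_field, folio_field, comment) and the sample rows are assumptions made for the example; the actual voyager_item_to_folio_item.tsv layout may differ.

```python
import csv
import io

# Hypothetical sample rows; the real spreadsheet columns and values may differ.
SAMPLE_TSV = (
    "source_field\tfolio_field\tcomment\n"
    "item.item_barcode\tbarcode\t\n"
    "item_type.?\tmaterialType\ttenant must choose one of 3 item_type elements\n"
    "item.price\t\tmay belong on an acquisitions record\n"
)


def review_mapping(tsv_text):
    """Split mapping rows into decision points and unmapped source fields."""
    decisions, unmapped = [], []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        if row["source_field"].endswith("?"):
            # A trailing "?" marks a choice the tenant still has to make.
            decisions.append(row["source_field"])
        elif not row["folio_field"].strip():
            # No destination field recorded for this source element.
            unmapped.append(row["source_field"])
    return decisions, unmapped


decisions, unmapped = review_mapping(SAMPLE_TSV)
print("decision points:", decisions)
print("no destination yet:", unmapped)
```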
Long discussion ensued about how to handle data which exists in source records (regardless of source system) that doesn't yet have an obvious destination in a FOLIO record.
Which SIG would be responsible for commenting on "missing" item data? It would at least be a combination of Metadata Management SIG, which may have already covered these issues in working groups last fall, and Resource Access, depending on the data element.
CManly asked what the best way was to consolidate lists of "missing" elements across various systems. Should this group come up with list(s) of consolidated data elements for each record type across a variety of systems and then take those lists to the appropriate SIG? DArntson said perhaps the mapping work should take place in the wider context of the relevant data models; CWhitt responded that this type of mapping data provides a kind of "sanity check" against the work that has been done in the SIGs on how the system should work. CManly said he'd like to take a list of "missing" data elements to User Management SIG at this point for review, to see whether, and which of, those data elements should be included. PWanninger referred to this process as gap analysis based on the published JSON.
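The gap analysis described here could be sketched roughly as follows: collect the field names each source system carries, subtract the properties a published JSON schema defines, and consolidate what is left across systems. The schema fragment, system names, and field names below are hypothetical placeholders, not actual FOLIO or vendor data.

```python
import json

# Hypothetical, trimmed-down schema standing in for a published JSON schema.
USER_SCHEMA = json.loads("""
{
  "type": "object",
  "properties": {
    "username": {"type": "string"},
    "barcode": {"type": "string"}
  }
}
""")

# Hypothetical field inventories for two source systems.
SOURCE_FIELDS = {
    "system_a": {"username", "barcode", "statistical_category"},
    "system_b": {"username", "proxy_note", "statistical_category"},
}


def gap_analysis(schema, source_fields_by_system):
    """Map each field absent from the schema to the systems that carry it."""
    known = set(schema.get("properties", {}))
    gaps = {}
    for system, fields in source_fields_by_system.items():
        for field in fields - known:
            gaps.setdefault(field, []).append(system)
    # Sort for a stable, reviewable composite list.
    return {field: sorted(systems) for field, systems in sorted(gaps.items())}


for field, systems in gap_analysis(USER_SCHEMA, SOURCE_FIELDS).items():
    print(field, "<-", ", ".join(systems))
```

A composite list of this shape, one per record type, is the kind of artifact that could be handed to a SIG for review.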
IKuss asked which SIG these lists should go to. CManly suggested that Data Migration come up with composite lists by app; MWinkler pointed out the lists should go to Product Council to make sure the review is coordinated with the appropriate SIG or SIG subgroup.
PWanninger asked if anyone in the group had put up a local instance. AHighsmith reported that Texas A&M was close to having an instance up. PWanninger mentioned there was a plan for a test instance and pointed out that as instances proliferate, they will diverge from the demo and more lessons will be learned.
Discussion about documentation
PWanninger asked about an overall data map, i.e. a kind of data dictionary. TOlson pointed out that the JSON schemas have a fair amount of inline documentation and suggested it might be possible to auto-generate a data dictionary from the JSON schemas on GitHub.
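One way the auto-generation idea might work is to walk each schema's "properties" and emit a row per field with its type, whether it is required, and its "description". The sketch below assumes a small hypothetical schema; it is not an actual FOLIO schema, and it does not handle "$ref", arrays of objects, or other constructs real schemas may use.

```python
import json

# Hypothetical schema fragment used only to exercise the function.
ITEM_SCHEMA = json.loads("""
{
  "type": "object",
  "description": "An item record",
  "properties": {
    "barcode": {"type": "string", "description": "Text of the barcode"},
    "status": {
      "type": "object",
      "properties": {
        "name": {"type": "string", "description": "Name of the item status"}
      },
      "required": ["name"]
    }
  },
  "required": ["status"]
}
""")


def data_dictionary(schema, prefix=""):
    """Return (path, type, required, description) rows for a JSON schema."""
    rows = []
    required = set(schema.get("required", []))
    for name, spec in schema.get("properties", {}).items():
        path = prefix + name
        rows.append(
            (path, spec.get("type", ""), name in required, spec.get("description", ""))
        )
        if spec.get("type") == "object":
            # Recurse into nested objects, using dotted paths.
            rows.extend(data_dictionary(spec, prefix=path + "."))
    return rows


for path, type_, required, description in data_dictionary(ITEM_SCHEMA):
    print(f"{path}\t{type_}\t{'required' if required else 'optional'}\t{description}")
```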
AHighsmith asked why the JSON schemas didn't document common constraints, such as maximum string length. TOlson said such constraints more typically come out of database column widths, which the JSON isn't dealing with, and wondered how we could comment or add descriptions to the JSON schemas so that documentation would be in one place. WSchneider posted the link listed above, https://dev.folio.org/doc/api, saying that this is the data dictionary that is currently posted, which is generated from the source code modules. He wondered if this was the data dictionary that needed to be used. When PWanninger pointed out that the JSON schemas on that site didn't contain constraints such as max string length, CManly responded that we would probably find, working with JSON, that the records are much less constrained. WSchneider pointed out that format constraints could be added to the interface if desired and generally agreed upon. WSchneider also said incoming data would be validated against these schemas, which delineate required elements.
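To make the constraint discussion concrete: JSON Schema does define a "required" keyword and a "maxLength" keyword, so a length limit could be expressed in a schema if one were agreed on. The sketch below is a hand-rolled check of just those two keywords against a hypothetical schema; it is not how FOLIO modules validate data, and the maxLength value of 20 is an invented example.

```python
# Hypothetical schema: "barcode" is required; the maxLength is an invented limit.
SCHEMA = {
    "type": "object",
    "properties": {
        "barcode": {"type": "string", "maxLength": 20},
        "note": {"type": "string"},
    },
    "required": ["barcode"],
}


def validate(record, schema):
    """Check only the 'required' and 'maxLength' keywords; return error strings."""
    errors = []
    for name in schema.get("required", []):
        if name not in record:
            errors.append(f"missing required property: {name}")
    for name, spec in schema.get("properties", {}).items():
        value = record.get(name)
        max_len = spec.get("maxLength")
        if isinstance(value, str) and max_len is not None and len(value) > max_len:
            errors.append(f"{name} exceeds maxLength {max_len}")
    return errors


print(validate({"barcode": "31234000123456"}, SCHEMA))
print(validate({"note": "no barcode supplied"}, SCHEMA))
print(validate({"barcode": "9" * 25}, SCHEMA))
```

A property without maxLength, such as "note" here, accepts a string of any length, which is the sense in which the records are less constrained than database columns.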
TOlson pointed out that some of the JSON schemas could benefit from some prose description; WSchneider responded that a user could make a pull request to put such explanation into the RAML repository. IndexData occasionally uses the 'description' field to store such comments.
Action items
Next meeting – review what we have in data maps to see how to begin creating a composite list of "missing" elements