| Data integrity issues | Christie Thomas & Lisa McColl | - Working Document of issues
- May 27, 2021, PC discussion minutes
- Some relevant Jiras
- We don't have clear documentation for the flow of data: how will bulk edit work, how will authorities work? We have issues and no plan to address them.
- We need to think of all of the apps that might touch a bib record throughout the system. Example: bib records that we get at the point of acquisition may not have an SRS record. For several reasons we are not confident that they can get through the entire workflow. Data isn't being handled in the way we expect, there is no consistency across apps, and some edits may preclude future edits.
- Want to make sure we aren't fixing just immediate problems and not looking further down the road.
- Data import, SRS, and Inventory - unless editing all records manually you must have MARC, although FOLIO is supposed to be schema-neutral.
- Cornell and Texas A&M are still planning to implement this summer, hoping the hotfixes get the immediate issues solved, though still concerned about the long term roadmap issues.
- KM - data profile building is quite complicated, hadn't been aware of that. Saw a couple of places where edits to a couple of fields require the SRS record to be updated. If you're in the instance record why would you think you need to protect the SRS record? If you're not updating MARC, you should be able to
- Lisa - expanding functionality of data import, which is increasing complexity. Solutions coming along now look ugly, require more workarounds.
- CT - doing things at scale with data import also brings more risk, need high confidence. Community would like to see these issues addressed before moving on to new functionality.
- CT - would like to see more attention to this in the roadmap, and would like to be less MARC-centric. Feels that some decisions have been made on what is quickest and easiest.
- Walls - two things I see as vital to getting rid of our MARC dependency: 1. make MARC to Inventory mapping complete and roundtrippable and 2. relationalize the Inventory data structure more to allow for bulk editing
- Neslin - I'd add the ability to find MARC records separate from the inventory record so we are not dependent on indexing of inventory record fields for MARC record searching
- Walls - I think if we got Inventory to MARC fully roundtrippable, we could stop storing the MARC entirely
- Jenn - Inventory is currently built purposely to not be roundtrippable - it’s meant to be a simpler curated view of the bib info, not that that couldn’t change
- CT - FOLIO should support our workflows, not dictate them
- Jenn - would like to see a group to look at these data flow issues - don't see the necessary people running for PC or TC
- CT - agree we need to see these issues addressed sooner in the process
- Brooks - MARC dependency is less about FOLIO than about export for discovery layers. Chalmers isn't using MARC, have a translation layer from the external db
- CT - but the data is translated to MARC for use by FOLIO
- CT - we have our canonical bib record in OCLC, but need to store it locally and FOLIO needs to find it.
- KM - support the idea of getting a working group together
- Walls - what kind of data is being left out of the Inventory records when importing from the MARC? What makes the MARC 'richer' for indexing purposes?
- CT - for instance we have a lot of subject data in our MARC that we are not bringing into Inventory because as an institution we have decided that we do not need to know that for the managing of the resource within FOLIO.
- Jenn - part of it is data left out and part of it is flattening - inventory doesn’t have subfields so you flatten into one field and can’t search by subfield for instance
- Peter - are the holdings the issue, or is there other data in the MARC record that you don't find in SRS? What are you not finding in SRS when you look at the MARC record?
- CT - up to the institution to decide what info they need in Inventory.
- Jenn - documentation would help with this. Hosted institutions can only see the API
- Walls - the idea of losing data is scary, so why not just make the MARC to Inventory map complete and round-trippable?
- Peter - the Inventory data model was designed to be not as granular as MARC can provide, so it wouldn't survive a round trip.
- Lisa - the buffer between us and developers is a problem here, need a better venue
- Peter - this sounds like PC responsibility
- KM - think PC Exec was going to get a charge
- KM - re: roundtripping with MARC - the MARC format makes that hard to do
- Jenn - a charged group seems needed, and I think it should have ties to PC and TC. We can’t hand wave the implementation, that’s part of the current problem
- Paula - what action should we take? Endorse the Cornell document above?
- KM - PC Exec will form a small group to take a deep dive into all of these issues and draft a medium and long term vision for changes, documentation, and prioritization. Group should include dev, SME, TC, PC voices.
- Peter - the Cornell doc might be a good start, needs some work to turn it into a charter
- Peter will bring this up at PC Exec tomorrow and give us feedback
|