2017-07-07 FOLIO Data Model/Codex Working Group Meeting notes
Date
Jul 7, 2017
Attendees
@Kristin Martin
@Cate Boerema (Deactivated)
@Peter McCracken
@Sebastian Hammer
Oliver Pesche
@Leah Elzinga
@Felix Hemme
@Kristen Wilson
@Andrea Loigman
@Christie Thomas
@Khalilah Gambrell
@jdolivarez
@Lisa McColl
@Ann-Marie Breaux (Deactivated)
@Kathryn Harnish
@Hkaplanian
@Peter Murray
@VBar
@Filip Jakobsen
Kimie Ouyang
@Michael Winkler
@Jacquie Samples
@Christopher Spalding
@Charlotte Whitt
@Lynn Whittenberger
@Karen Newbery
Agenda
8:30 - 8:45 - Settling in and quick introductions
8:45 - 9:45 - Data model and architecture (Vince Bareau)
9:45 - 10:30 - Metadata analysis part 1 (Kathryn Harnish)
10:30 - 10:40 - Break
10:40 - 11:10 - Metadata analysis part 2 (Kathryn Harnish)
11:10 - 12:10 - Wireframes (Cate Boerema)
12:10 - 12:30 - Buffer for discussion overrun and/or additional topics
Documents
We have a Google Drive folder for this group, which contains today’s deck, the member list, etc. Look for wireframes there.
https://drive.google.com/drive/folders/0B7-0x1EqPZQKN21rWVMwX0pDR1E
Notes
Data model and architecture
Presentation is available here: https://drive.google.com/open?id=1KGlSSa0ajY9XcXb4QWD5_A8TxMVZWjthObRjw6seVnw
In this presentation, Vince reviewed the purpose of the codex and the types of codex data objects included in the current model: instance, holdings/item, coverage, location, and package. He also talked about the microservices approach, which means that each application (or domain) within FOLIO can have its own version of the instance with unique data elements. These instances can be linked across domains. The data stored on the central codex record will be very minimal, with most of the richer data being stored in specific domains.
Questions/Discussion:
Will Codex records reference many external records/foreign keys (in FOLIO or external KBs)?
Sort of — each domain is actually creating codex records — so it’s more like the domain records are referencing the codex
If you create instances in multiple domains, there will actually be multiple codex records that are linked together
This model is distributed — domains reference the same IDs, but can define and store their own specific content
Like BF, FOLIO will use pieces of different sources to aggregate up to the complete view of the thing
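A minimal sketch of the distributed model described in these answers, assuming each domain stores its own record under a shared codex ID. The class, field names, and sample values (`DomainRecord`, `codex_id`, `aggregate_view`) are hypothetical and are not part of FOLIO:

```python
from dataclasses import dataclass, field


@dataclass
class DomainRecord:
    codex_id: str  # shared identifier that links records across domains (assumed)
    domain: str    # e.g. "inventory" or "acquisitions" (illustrative names)
    data: dict = field(default_factory=dict)  # domain-specific content


def aggregate_view(records, codex_id):
    """Collect each domain's data for one codex ID, keyed by domain name."""
    return {r.domain: r.data for r in records if r.codex_id == codex_id}


records = [
    DomainRecord("inst-001", "inventory", {"title": "Moby Dick", "resource_type": "print book"}),
    DomainRecord("inst-001", "acquisitions", {"order_status": "received"}),
    DomainRecord("inst-002", "inventory", {"title": "Walden", "resource_type": "ebook"}),
]

# The complete view of "inst-001" is assembled from pieces held by two domains.
view = aggregate_view(records, "inst-001")
```

Each domain keeps its own content, and the shared ID is the only thing they need to agree on.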
How do you prevent conflicting data across domains?
In some cases you may have differing data across domains
KB may help with this by providing a trusted source of metadata
Have the ability to link up to those metadata values between different domains
How do you do reporting and queries across domains?
Reporting might be a separate domain
Could look across different areas and reconcile
For reporting, it will be hard to rely on microservices
Separate reporting database or data warehouse seems like the most likely solution
How will data be stored? What does denormalized mean?
Codex is really intended to be an abstraction layer
Implementation behind the scenes is up to each domain
Expecting a lot of different interpretations
Local storage may be in a relational database
Codex for KB may be more of a passthrough — in that case database and storage models are up to developers of each domain
Relational databases are normalized and have tables that represent each element (title, author, etc.)
When you want to produce an output or record that represents something like a book, you have to join against multiple tables
In a denormalized environment, you have a separate table that is created for the resource you’re describing
Codex will be using JSON and will be flat
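The contrast between normalized and denormalized storage can be illustrated with a small sketch. The table layout, column names, and sample values below are assumptions made for illustration, not the actual codex schema:

```python
import json
import sqlite3

# Normalized: one table per element, so assembling a "book" requires a join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE title  (resource_id INTEGER, value TEXT);
    CREATE TABLE author (resource_id INTEGER, value TEXT);
    INSERT INTO title  VALUES (1, 'Walden');
    INSERT INTO author VALUES (1, 'Henry David Thoreau');
""")
row = conn.execute("""
    SELECT t.value, a.value
    FROM title t
    JOIN author a ON a.resource_id = t.resource_id
    WHERE t.resource_id = 1
""").fetchone()
normalized_view = {"title": row[0], "author": row[1]}

# Denormalized: the whole resource is stored as one flat JSON document,
# so it can be read back without any join.
stored = '{"id": 1, "title": "Walden", "author": "Henry David Thoreau"}'
flat_record = json.loads(stored)
```

Both approaches yield the same title and author; the difference is whether the record is assembled at read time or stored already assembled.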
For Instance/Item relationships, will an instance always have the same item records — or will it have different items in certain domains?
We think in general it will be the same set of records
Will just have different metadata
Each microservice might keep track of subsets of data that relate to an item
What is the logic behind the BF-like approach?
We like the pragmatic approach to work, instance, item mapping
Fits well with current models
On negative side, BF is still in flux, use within libraries is still in flux
Other problem is that BF won’t be the only model
Our choices are related to the notion that MARC will continue to be around and need to be supported
Goal to allow libraries to stay with MARC, move to BF, or do some of both
How do we determine what details of a resource cause you to differentiate between different instances or items?
Instance
Meant to represent something relatively narrow
Has a resource type
Ebook, print book, audio book would be three different instances
Kind of mashes together FRBR manifestation and expression
Discovery layer might be the place where different instances are rolled up together
Work
May not be used initially
Would be a way for libraries to pull together instances
Gives you a handle to find different related instances
Items
Copies would be represented as different items
You would circulate at the item level
Items in circulation would relate to items in the codex
Whether circ needs to reference the instance that an item is related to is unclear right now
Circ functions at the item level most of the time but you still need access to the instance level data
Package
Not to be confused with a vendor package
Simply a way to cluster things together
Won’t play a critical role in circ
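As a rough illustration of the instance and work distinction discussed above, the sketch below treats each resource type as its own instance and uses a work identifier as the handle for rolling related instances up together. All field names and IDs (`instance_id`, `work_id`, `resource_type`) are hypothetical:

```python
from collections import defaultdict

# Ebook, print book, and audio book of the same title are three separate instances.
instances = [
    {"instance_id": "i1", "work_id": "w1", "resource_type": "print book"},
    {"instance_id": "i2", "work_id": "w1", "resource_type": "ebook"},
    {"instance_id": "i3", "work_id": "w1", "resource_type": "audio book"},
    {"instance_id": "i4", "work_id": "w2", "resource_type": "print book"},
]


def roll_up_by_work(instances):
    """Group instance IDs under their work ID, as a discovery layer might."""
    grouped = defaultdict(list)
    for inst in instances:
        grouped[inst["work_id"]].append(inst["instance_id"])
    return dict(grouped)


rolled_up = roll_up_by_work(instances)
```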
Metadata analysis
@Kathryn Harnish, can you post your slides here or link from Google Drive?
Kathryn presented an overview of the initial approach to defining metadata elements for codex records and mapping that data from another source. In its beginning phases, this work has been limited to the print monograph use case and a MARC environment. E-resources and other metadata schemas will be addressed in separate conversations. Kathryn presented two use cases: searching for a title and creating a purchase order for a print monograph. For each of these, she suggested data elements that would be needed to complete the task. She then walked through a more detailed mapping of proposed codex fields and mappings from MARC to the codex.
Questions/Discussion
Since holdings/item are one record, if you have a run of bound journal volumes, will the holding info be repeated in each record?
Yes, the holding info would be repeated
Will see advantages on technical side — flattened structure
From a functional perspective, can still see things at the appropriate level
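A small sketch of what the repeated holding information might look like for a run of bound journal volumes, assuming a combined holdings/item record. The field names and values are made up for illustration:

```python
# Holding-level information shared by every volume in the run (assumed fields).
holding = {"location": "Main Stacks", "call_number": "PER 050"}

volumes = ["v.1", "v.2", "v.3"]

# Each flattened holdings/item record repeats the holding info alongside
# its own item-level data.
holdings_items = [
    {"item_id": f"item-{n}", "enumeration": vol, **holding}
    for n, vol in enumerate(volumes, start=1)
]
```

The repetition costs some storage, but each record can be read on its own without looking up a separate holding.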
Is there a reason to have a separate holding record?
There may be
Will depend on mapping work and needs for SIGs
Getting data back out is another important point — needed for collaborative collecting
Distinction is that you can still maintain the MARC or other standard structure if that’s appropriate to you
From the Codex perspective, we’re abstracting info up and flattening it out
At a codex level, the data is available and it may not need to be represented in the same way as the source
How data is stored may not be how it’s managed
Efficiencies come in how you manage, not how you store
Recommendation to ID fields and look at management use cases to determine structure
Do I have to maintain previous metadata format? How do we enforce it if we do want to maintain that previous structure?
No, you won’t be required to stay with the source structure
Can map data into something else
Enforcement of previous standard would be more related to metadata management, rather than codex
For print serials, where will you attach your purchase order?
Don’t want to attach to a specific item — PO is for all items
How much complexity do we need?
Container record could be a solution here
Or attach it to instance
In Acq domain, need to model the thing that you’re purchasing
But that can be defined in the acq microservice
Important to know subscribed/unsubscribed
Kathryn will show some different definitions of things like coverage dates, subscription dates, etc.
Will need to do some work on how to define “currently held”
Does it need to be explicit in the codex or can it be inferred?
In the model of Codex as pass through, subscription information could be provided by KB or other external system
But if you need to search or filter on this, then we’ll have to have something in the codex
Also can’t assume that other microservices exist; a library might not be using acquisitions
Role of Codex
It’s there to help you navigate and help you recognize things
Doesn’t give you detailed answers
Basically just tells you - do I hold this or not
If you need to know more detail, that's beyond the scope of codex
If you have an item ID but no holdings ID, then you don’t have a holding; this can be used to filter out items you don’t hold
Essentially, the codex is “things I know about”
The presence of a holding tells you whether those things are part of your library’s collection/inventory
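The "do I hold this or not" check described above can be sketched as a simple filter on the presence of a holdings ID. The record shape and the `is_held` helper are assumptions for illustration only:

```python
# Codex records are "things I know about"; only some of them are held.
codex_items = [
    {"item_id": "item-1", "holdings_id": "hold-1", "title": "Walden"},
    {"item_id": "item-2", "holdings_id": None, "title": "Moby Dick"},
]


def is_held(record):
    """A record counts as held only if it carries a holdings ID."""
    return record.get("holdings_id") is not None


held = [r for r in codex_items if is_held(r)]
```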
How do we get info out of the system — e.g. data delivery for shared ILL?
Could use reporting
Expectation is that data deliveries for consortium or other would probably come out of the codex
Through codex can get at other info like source metadata, inventory etc.
Also have to support live look-ups — e.g., Relay D2D
This would be built as a plug-in or app that would sit on top of the codex
Suppressed fields
Could this be done on the front end (domain?) rather than Codex?
Might be an implementation decision
What causes separate records?
E.g., hardcover and paperback ISBNs — are they two records?
If we want to track format field, they would be different records
Comes down to whether we’d want to distinguish them
How minimal is minimal?
Keep it as small as possible, knowing we can always get details somewhere else
Part of what we’re seeing now is evolution in thinking and understanding
Functional people won’t make this determination — our job is to define what we need to support tasks
Implementation decisions will be driven by architecture
That’s OK as long as use cases can be addressed
Wireframes
Wireframes available here: https://drive.google.com/drive/folders/0B18Bhhmr94zaSlI3M3RnN042UFE
Cate presented a series of wireframes that demonstrated the results of a codex search. She also showed how users would be able to drill down into each search result, displaying additional information and potentially linking into other apps.
Questions/Discussion
What is the primary use case for this search?
Are there separate searches/workflows for other domains — e.g. orders, users, requests?
Yes, searching for domain-specific record types would be done in dedicated apps
The use case for this search is identifying an instance, item, or package
Essentially asking: is this something I hold?
It’s a bit like a KB search
We began referring to this as the “inventory search”
Would you ever want to filter by items?
Maybe not
If searching by barcode, you would see just the instance result that contains that item
We are thinking we’ll remove the item filters from this UI
It’s still important to be able to filter by electronic vs physical results
Restricting inventory search scope to codex might be good for v1
Is there interest in a truly global search?
This would be a search that allows you to search across all apps and see a mix of results — instances, users, orders, etc.
It would be nice to be able to start with general search, so that you don’t have to make a decision about which search to use
This would be the universal search to locate objects in system — if you want to edit or create, you would need to go into something different
Prototype from the beginning has had recommendation for universal search across all domains
Has been hard to engage deeply so far
Universal search function wouldn’t necessarily be an app, but a layer over everything
Universal search is probably something we want long term
Would be very complex in terms of understanding the user and what rights they have to view/edit each record type
Analogy is Mac search/magnifying glass in top corner
Bento box style could be one approach to show results
Might also be able to set personal defaults to return only certain record types
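One way to picture the bento box idea with personal defaults is to group mixed search results by record type and optionally restrict them to the types a user has chosen. This is only a sketch; the `bento_box` function, the result shape, and the sample data are hypothetical:

```python
from collections import defaultdict


def bento_box(results, allowed_types=None):
    """Group results into one box per record type.

    If allowed_types is given (a user's personal default), other types are dropped.
    """
    boxes = defaultdict(list)
    for result in results:
        if allowed_types is None or result["type"] in allowed_types:
            boxes[result["type"]].append(result["label"])
    return dict(boxes)


results = [
    {"type": "instance", "label": "Walden"},
    {"type": "user", "label": "Walden, Jane"},
    {"type": "order", "label": "PO-1001: Walden"},
]

all_boxes = bento_box(results)
my_boxes = bento_box(results, allowed_types={"instance", "order"})
```

A real universal search would also need to check what each user is permitted to view, which this sketch leaves out.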
If you are going to have an inventory search, do you want to edit in the same place too?
This search makes sense as a way to get to instance or item you’re looking for
But it might make more sense to create/edit within the specific domain app
If you have links to other modules, you don’t need to recreate editing/creating screens within universal search
Tabbed view might be helpful to navigate between different apps
Inventory search results are navigational, summary results
When you select something and look at detail, you are essentially going into another domain to see the details
Distinction is very much dependent on when you transition from one domain to another
If linking between smaller apps, would that make the navigation easier?
Doesn’t necessarily solve the problems
May alleviate the onion effect — many layers of records being opened over top of each other
Might be able to let apps deliver the edit view as overlays within the main app
We don’t have to get everything completely right from the get-go
Codex will not be cast in stone
If someone doesn’t like universal search, they don’t have to use it — there will be other searches/apps
It’s also possible someone could build a different codex in future
Next steps
We will use the Metadata Management SIG's meeting time slot for future codex discussions (Thursdays at noon Eastern time)
We will aim to complete an initial codex model that can be implemented by the end of July
Highest priority at this point is to validate the existing work on the codex against use cases
Kathryn will develop an agenda for next week's meeting to work on this task
If additional time is needed to look at new wireframes, we will schedule an ad hoc meeting
The RM SIG will continue to work on development of the acquisitions app concurrently