2018-08-13 Reporting SIG notes
Date
2018-08-13
Attendees
| Present? | Name | Organization | Present? | Name | Organization |
|---|---|---|---|---|---|
| X | Sharon Beltaine | Cornell University | | Peter Murray | Index Data |
| | Elizabeth Berney | Duke University | | Erin Nettifee | Duke University |
| | Joyce Chapman | Duke University | | Karen Newbery | Duke University |
| | Elizabeth Edwards | University of Chicago | X | Tod Olson | University of Chicago |
| X | Claudius Herkt-Januschek | SUB Hamburg | X | Scott Perry | University of Chicago |
| X | Doreen Herold | Lehigh University | | Robert Sass | Qulto |
| X | Anne L. Highsmith | Texas A&M | X | Simona Tabacaru | Texas A&M |
| | Vince Bareau | EBSCO | | Mark Veksler | EBSCO |
| | Harry Kaplanian | EBSCO | X | Kevin Walker | The University of Alabama |
| | Ingolf Kuss | hbz | | Charlotte Whitt | Index Data |
| | Lina Lakhia | SOAS | X | Michael Winkler | OLE |
| X | Joanne Leary | Cornell University | | Uschi Klute | GBV |
| X | Michael Patrick | The University of Alabama | X | Holly Mistlebauer | Cornell University |
| X | Nassib Nassar | Index Data | X | Chris Manly | Cornell University |
Guests: Joseph Zucca and Kate, UPenn Libraries
Discussion items
Item | Who | Notes |
---|---|---|
Assign Notetaker, Take Attendance, Review agenda | Sharon | Today's notetaker: Doreen Herold. Last week's notetaker: Anne Highsmith. |
MetriDoc | Joe Zucca | Joe Zucca and Kate from UPenn Libraries introduced the University of Pennsylvania Libraries' Metridoc Data Farm Project: Open Source Data Warehousing. See https://metridoc.library.upenn.edu/ for more information.<br>Metridoc is being redeveloped to integrate with various reporting tools (e.g. Tableau, Origin, or any similar app that can interact with MySQL); the redevelopment is under the auspices of IvyPlus.<br>It was originally developed with funds from the IMLS.<br>Data can be downloaded from the interface to a spreadsheet; one example of captured data is OCLC Relais data, which is imported nightly.<br>Data can be expensive to capture (a pretty significant problem).<br>Data is hard to get to: it comes from a wide variety of sources, and the goal is to co-locate it for comprehensive analysis.<br>Metridoc is a platform that integrates and normalizes data so that linkages can be drawn between data sets: extract, transform, load, support analysis.<br>Metridoc provides a basic framework but is built on a technology that is difficult to work with; the rewrite will provide greater sustainability and the opportunity to do analysis across institutions.<br>Sharon noted that Metridoc as an OLF project empowers members to set up a local data warehouse facilitated by Metridoc; because Metridoc is currently being rebuilt, this could delay the robust data warehouse environment we are looking for to support testing.<br>Tod: How do we turn FOLIO data into data that lands in a data warehouse? What needs to be anonymized? What needs to be scrubbed?<br>Metridoc provides a framework for ingesting and transforming data, but we would be responsible for FOLIO as a source of data for Metridoc (work that could be done by an OLE developer?).<br>What about JSON? It has not been experimented with yet; Anne says it could potentially be supported.<br>Sharon: The task is figuring out how to transform data after it comes out of FOLIO; we must decide how to define boundaries and how to work with privacy issues.<br>Sharon: Is data integrity a challenge? Joe: It is not an issue that has arisen up until now; we will have to think carefully about where Metridoc sits in relation to FOLIO. Tod: That problem has to be solved in FOLIO.<br>Sharon: Have you looked at open source tools such as BIRT? Joe: No, but there is interest; the range of tools that work with Metridoc should be broad, and we want to integrate as many different types of data as possible (e.g. gate count).<br>Joe: Assembling data sources has required a lot of effort, for example the collection of reference and instructional data (a mapping scheme to acquire data from Insights); seeing that kind of data alongside other data types makes for powerful analysis (Suma (NC State): https://www.lib.ncsu.edu/projects/suma).<br>Metridoc is middleware: a place for posting data and a normalized, federated data source. FOLIO members need to determine the interface and the transformation of data exported from FOLIO for import into Metridoc.<br>Metridoc can only do so much; we need to determine the structure of the data to figure out how this will work. We want a set of logical normalizations, with data flowing to Metridoc with identifiers (microservices design).<br>Where does data integrity break down? When the data comes out in a form that cannot be normalized. An emphasis on testing will start to inform us and help us determine the standards needed for the data warehouse.<br>Many thanks to Joe and Kate for helping us understand the potential for Metridoc with FOLIO! |
Link Your "Yes-In App" Reports | All | (Reminder) Important Notes to Reporting SIG members: |
Topics for Future Meetings | All | Review and update Topics for Future Reporting SIG Meetings |
Other Topics? | All | Any other topics to discuss today? |
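
A minimal sketch of the transform step raised in the MetriDoc discussion above (scrubbing and anonymizing records exported from FOLIO as JSON before they reach a data warehouse). Everything here is an assumption for illustration: the field names (`patronId`, `patronName`, `patronEmail`, `itemBarcode`, `loanDate`), the `PERSONAL_FIELDS` set, and the functions `pseudonymize` and `transform_loan` are hypothetical and are not taken from FOLIO or Metridoc. It uses a keyed hash so the same patron always maps to the same key without exposing the original identifier.

```python
import hashlib
import hmac
import json

# Hypothetical names of fields that should never reach the warehouse.
PERSONAL_FIELDS = {"patronName", "patronEmail"}


def pseudonymize(value: str, secret: bytes) -> str:
    """Replace an identifier with a keyed SHA-256 hash (HMAC).

    The same value and secret always give the same key, so rows can
    still be linked to each other after the real identifier is gone.
    """
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()


def transform_loan(raw_json: str, secret: bytes) -> dict:
    """Parse one exported record, scrub personal fields, and return a flat row."""
    record = json.loads(raw_json)
    # Drop fields listed as personal data.
    row = {k: v for k, v in record.items() if k not in PERSONAL_FIELDS}
    # Swap the patron identifier for a stable pseudonym.
    row["patronKey"] = pseudonymize(row.pop("patronId"), secret)
    return row


if __name__ == "__main__":
    # Made-up sample record, not real FOLIO output.
    sample = json.dumps({
        "patronId": "p-001",
        "patronName": "Example Patron",
        "patronEmail": "patron@example.org",
        "itemBarcode": "31234000000001",
        "loanDate": "2018-08-13",
    })
    print(transform_loan(sample, b"local-secret"))
```

Which fields count as personal data, and whether a keyed hash is sufficient anonymization, are exactly the boundary and privacy questions raised in the discussion; the sketch only shows where such decisions would plug in.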