2018-08-13 Reporting SIG notes

Date

2018-08-13

Attendees

Present?  Name                      Organization
X         Sharon Beltaine           Cornell University
          Peter Murray              Index Data
          Elizabeth Berney          Duke University
          Erin Nettifee             Duke University
          Joyce Chapman             Duke University
          Karen Newbery             Duke University
          Elizabeth Edwards         University of Chicago
X         Tod Olson                 University of Chicago
X         Claudius Herkt-Januschek  SUB Hamburg
X         Scott Perry               University of Chicago
X         Doreen Herold             Lehigh University
          Robert Sass               Qulto
X         Anne L. Highsmith         Texas A&M
X         Simona Tabacaru           Texas A&M
          Vince Bareau              EBSCO
          Mark Veksler              EBSCO
          Harry Kaplanian           EBSCO
X         Kevin Walker              The University of Alabama
          Ingolf Kuss               hbz
          Charlotte Whitt           Index Data
          Lina Lakhia               SOAS
X         Michael Winkler           OLE
X         Joanne Leary              Cornell University
          Uschi Klute               GBV
X         Michael Patrick           The University of Alabama
X         Holly Mistlebauer         Cornell University
X         Nassib Nassar             Index Data
X         Chris Manly               Cornell University


Guests: Joseph Zucca and Kate, UPenn Libraries


Discussion items

Item (Who): Notes

Assign Notetaker, Take Attendance, Review agenda (Sharon)

Today's notetaker: Doreen Herold

Last week's notetaker: Anne Highsmith

MetriDoc (Joe Zucca)

Joe Zucca and Kate from UPenn Libraries provided an introduction to the University of Pennsylvania Libraries Metridoc Data Farm Project: Open Source Data Warehousing. See https://metridoc.library.upenn.edu/ for more information.

Penn is looking to redevelop Metridoc so that it integrates with various reporting tools (e.g., Tableau, Origin, or a similar app; any app that can interact with MySQL); the redevelopment is under the auspices of IvyPlus

Originally developed with funds from the IMLS

Data can be downloaded from the interface to a spreadsheet; example of data captured: data from OCLC Relais, which is imported nightly

Data can be expensive to capture (a pretty significant problem)

Data is hard to get to: it comes from a wide variety of sources, and there is a desire to co-locate it for comprehensive analysis

Metridoc is a platform that integrates and normalizes data so that linkages can be drawn between data sets; it extracts, transforms, and loads data, and supports analysis
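
To illustrate the extract, transform, load flow described in this note, here is a minimal sketch. It is not Metridoc code: the two CSV exports, the column names, and the helper functions are hypothetical, and an in-memory SQLite database stands in for a MySQL-style warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical exports from two unrelated systems (not real Metridoc feeds).
BORROWING_CSV = "library,requests\nMain Library,12\nlaw library ,3\n"
GATE_CSV = "location,visits\nMAIN LIBRARY,450\nLaw Library,80\n"


def extract(text):
    """Extract: parse one CSV export into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def normalize_name(name):
    """Transform: give each location one canonical spelling."""
    return " ".join(name.split()).title()


def load(conn, borrowing_rows, gate_rows):
    """Load: write normalized rows into two warehouse tables."""
    conn.execute("CREATE TABLE borrowing (location TEXT, requests INTEGER)")
    conn.execute("CREATE TABLE gate_count (location TEXT, visits INTEGER)")
    conn.executemany(
        "INSERT INTO borrowing VALUES (?, ?)",
        [(normalize_name(r["library"]), int(r["requests"])) for r in borrowing_rows],
    )
    conn.executemany(
        "INSERT INTO gate_count VALUES (?, ?)",
        [(normalize_name(r["location"]), int(r["visits"])) for r in gate_rows],
    )


def run_etl():
    conn = sqlite3.connect(":memory:")
    load(conn, extract(BORROWING_CSV), extract(GATE_CSV))
    # Analysis: the shared, normalized location name links the two sources.
    return conn.execute(
        "SELECT b.location, b.requests, g.visits "
        "FROM borrowing AS b JOIN gate_count AS g ON b.location = g.location "
        "ORDER BY b.location"
    ).fetchall()


print(run_etl())
```

The join only works because both sources were normalized to the same location spelling before loading, which is the kind of linkage the platform is meant to make possible.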

Metridoc provides a basic framework but is built on a technology that is difficult to work with; the rewrite will make it more sustainable and create an opportunity to do analysis across institutions

Sharon noted that Metridoc, as an OLF project, empowers members to set up a local data warehouse facilitated by Metridoc; because Metridoc is currently being rebuilt, the robust data warehouse environment we are looking for to support testing could be delayed

Tod: How do we turn FOLIO data into data that lands in a data warehouse? What needs to be anonymized? What needs to be scrubbed? Metridoc provides a framework for ingesting/transforming, but we would be responsible for FOLIO as a source of data for Metridoc (work that could be done by an OLE developer?)

What about JSON? It has not been experimented with yet; Anne says it could potentially be supported. Sharon: the task is figuring out how to transform data after it comes out of FOLIO; we must decide how to define boundaries and work through privacy issues
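
As one possible shape for that transformation step, the sketch below flattens a nested JSON record into a single row, drops one field, and replaces a patron identifier with a keyed hash. Everything here is an assumption for illustration: the record layout is not FOLIO's actual schema, the field lists and secret key are hypothetical, and a keyed hash is pseudonymization rather than full anonymization, so it would not settle the privacy questions on its own.

```python
import hashlib
import hmac
import json

# Hypothetical loan record; field names are illustrative, not FOLIO's schema.
RAW = (
    '{"id": "loan-1", '
    '"user": {"id": "patron-42", "name": "Pat Reader"}, '
    '"item": {"barcode": "b1", "location": "Main"}}'
)

SECRET_KEY = b"replace-with-a-locally-held-secret"  # assumption: kept outside the warehouse
DROP_FIELDS = {"user.name"}        # scrubbed entirely
PSEUDONYMIZE_FIELDS = {"user.id"}  # replaced by a keyed hash


def flatten(obj, prefix=""):
    """Turn nested JSON objects into one flat dict with dotted keys."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat


def pseudonymize(value):
    """Keyed hash: the same input always yields the same token."""
    return hmac.new(SECRET_KEY, str(value).encode("utf-8"), hashlib.sha256).hexdigest()


def to_warehouse_row(raw_json):
    """Flatten one JSON record, then scrub or pseudonymize sensitive fields."""
    row = {}
    for path, value in flatten(json.loads(raw_json)).items():
        if path in DROP_FIELDS:
            continue
        row[path] = pseudonymize(value) if path in PSEUDONYMIZE_FIELDS else value
    return row


print(sorted(to_warehouse_row(RAW)))
```

Because the hash is deterministic, loans by the same patron can still be counted together in the warehouse without storing the original identifier.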

Sharon: Is data integrity a challenge? Joe: it is not an issue that has arisen so far; will have to think carefully about where Metridoc sits in FOLIO; Tod: the problem has to be solved in FOLIO

Sharon: Have you looked at open source tools such as BIRT? Joe: no, but there is interest; the range of tools that work with Metridoc should be broad; want to integrate as many different types of data as possible (e.g., gate counts)

Joe: assembling data sources has required a lot of effort; collection of reference and instructional data (mapping scheme to acquire data from Insights); seeing that kind of data alongside other data types provides for powerful analysis (Suma (NC State): https://www.lib.ncsu.edu/projects/suma)

Metridoc: middleware, a place for posting data, a normalized, federated data source; FOLIO members need to determine the interface and the transformation of data exported from FOLIO for import to Metridoc

Metridoc can only do so much; we need to determine the structure of the data to figure out how this will work; we want to have a set of logical normalizations; data flowing to Metridoc with identifiers (microservices design)

Where does data integrity break down? When data comes out in a way that can't be normalized; emphasis on testing will start to inform us, helping us determine the standards needed for the data warehouse

Many thanks to Joe and Kate for helping us understand the potential for Metridoc with FOLIO!

Link Your "Yes - In-App" Reports (All)

(Reminder)

Important Notes to Reporting SIG members:

  • If your report is identified as a "Yes - In-App Report": Please drop a copy in the appropriate functional area folder in the Library Stats folder in the Reporting SIG Google Drive folder and link it in the "Link to Sample" column in the Reporting SIG Master Spreadsheet, if you haven't yet. It really helps Product Owners to have a report to view to understand what is needed for a given in-app report.
  • Please keep track of all the reports you added to the spreadsheet, as Product Owners may add questions and remarks to the spreadsheet.
  • If you do not agree with a certain report being identified as "No In-App report", please contact Holly.
Topics for Future Meetings (All)

Review and update Topics for Future Reporting SIG Meetings

Other Topics? (All): Any other topics to discuss today?

Action items

  •