2018-08-13 Reporting SIG notes

Date

Aug 13, 2018

Attendees

Present?  Name                      Organization
X         Sharon Beltaine           Cornell University
          Peter Murray              Index Data
          Elizabeth Berney          Duke University
          Erin Nettifee             Duke University
          Joyce Chapman             Duke University
          Karen Newbery             Duke University
          Elizabeth Edwards         University of Chicago
X         Tod Olson                 University of Chicago
X         Claudius Herkt-Januschek  SUB Hamburg
X         Scott Perry               University of Chicago
X         Doreen Herold             Lehigh University
          Robert Sass               Qulto
X         Anne L. Highsmith         Texas A&M
X         Simona Tabacaru           Texas A&M
          Vince Bareau              EBSCO
          Mark Veksler              EBSCO
          Harry Kaplanian           EBSCO
X         Kevin Walker              The University of Alabama
          Ingolf Kuss               hbz
          Charlotte Whitt           Index Data
          Lina Lakhia               SOAS
X         Michael Winkler           OLE
X         Joanne Leary              Cornell University
          Uschi Klute               GBV
X         Michael Patrick           The University of Alabama
X         Holly Mistlebauer         Cornell University
X         Nassib Nassar             Index Data
X         Chris Manly               Cornell University


Guests: Joseph Zucca and Kate, UPenn Libraries



Discussion items

Item

Who

Notes

Assign Notetaker, Take Attendance, Review agenda

Sharon

Today's notetaker: Doreen Herold

Last week's notetaker: Anne Highsmith

MetriDoc

Joe Zucca

Joe Zucca and Kate from UPenn Libraries provided an introduction to the University of Pennsylvania Libraries' Metridoc Data Farm Project: Open Source Data Warehousing. See https://metridoc.library.upenn.edu/ for more information.

The Metridoc team is looking to redevelop it to integrate with various reporting tools (e.g. Tableau, Origin, or any similar application that can interact with MySQL); the redevelopment is under the auspices of IvyPlus

Originally developed with funds from the IMLS

From the interface, data can be downloaded to a spreadsheet; one example of captured data is data from OCLC Relais, which is imported nightly

Data can be expensive to capture (a pretty significant problem)

Data is hard to get to: it comes from a wide variety of sources, and there is a desire to co-locate it for comprehensive analysis

Metridoc is a platform that integrates and normalizes data so that linkages can be drawn between data from different sources: extract, transform, load, then support analysis

Metridoc provides a basic framework but is built on a technology that's difficult to work with; the rewrite will make it more sustainable and create an opportunity to do analysis across institutions

Sharon noted that Metridoc as an OLF project empowers members to set up a local data warehouse facilitated by Metridoc; because Metridoc is currently being rebuilt, this could delay the robust data warehouse environment we're looking for to support testing

Tod: How do we turn FOLIO data into data that lands in a data warehouse? What needs to be anonymized? What needs to be scrubbed? Metridoc provides a framework for ingesting and transforming data, but we would be responsible for making FOLIO a source of data for Metridoc (work that could be done by an OLE developer?)

What about JSON? It has not been experimented with yet; Anne says it could potentially be supported. Sharon: the task is figuring out how to transform data after it comes out of FOLIO; we must decide how to define boundaries and work through privacy issues
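As a rough illustration of the transformation and privacy questions raised above, the sketch below flattens one JSON record and replaces a patron identifier with a salted hash before the row would be loaded into a warehouse. The record layout, the field names (userId, itemId, loanDate), and the salt are hypothetical; they are not taken from FOLIO's actual schemas or from Metridoc.

```python
import hashlib
import json

# Hypothetical salt; a real deployment would keep this secret and out of source control.
SALT = "example-salt"


def pseudonymize(value: str, salt: str = SALT) -> str:
    """Return a salted SHA-256 digest so the raw identifier never reaches the warehouse."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def flatten(record: dict, prefix: str = "") -> dict:
    """Flatten nested JSON objects into a single-level dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def transform(raw_json: str, private_fields=("userId",)) -> dict:
    """Parse one JSON record, flatten it, and pseudonymize the listed fields."""
    row = flatten(json.loads(raw_json))
    for field in private_fields:
        if row.get(field) is not None:
            row[field] = pseudonymize(str(row[field]))
    return row


# Hypothetical loan record, invented for illustration only.
sample = (
    '{"userId": "patron-123", "itemId": "item-456", '
    '"loan": {"loanDate": "2018-08-13", "status": "Open"}}'
)
row = transform(sample)
```

Hashing of this kind is pseudonymization rather than full anonymization: the same patron still maps to the same value across records. Which fields need to be scrubbed is exactly the boundary question the group still has to decide.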

Sharon: is data integrity a challenge? Joe: it is not an issue that has arisen so far; we will have to think carefully about where Metridoc sits in FOLIO; Tod: the problem has to be solved in FOLIO

Sharon: have you looked at open source tools such as BIRT? Joe: no, but there's interest; the range of tools that work with Metridoc should be broad, and we want to integrate as many different types of data as possible (e.g. gate counts)

Joe: assembling data sources has required a lot of effort; one example is the collection of reference and instructional data (a mapping scheme to acquire data from Insights); seeing that kind of data alongside other data types provides for powerful analysis (Suma (NC State): https://www.lib.ncsu.edu/projects/suma)

Metridoc is middleware: a place for posting data and a normalized, federated data source. FOLIO members need to determine the interface and the transformation of data exported from FOLIO for import into Metridoc

Metridoc can only do so much; we need to determine the structure of the data to figure out how this will work; we want to have a set of logical normalizations, with data flowing to Metridoc with identifiers (microservices design)

Where does data integrity break down? When data comes out in a way that can't be normalized. An emphasis on testing will start to inform us and help us determine the standards needed for the data warehouse
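One way to picture the "logical normalizations" and the integrity question above is a small mapping step that brings records from different sources into one common shape keyed by an identifier, and that rejects any record it cannot normalize. This is a minimal sketch under stated assumptions: the source names, field names, and the NormalizedEvent shape are all hypothetical and do not describe Metridoc's or FOLIO's actual data models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalizedEvent:
    """Hypothetical common shape shared by all sources."""
    source: str
    record_id: str
    event_date: str  # ISO 8601 date string
    event_type: str


# Hypothetical per-source mappings: normalized field -> field name in the source record.
FIELD_MAPS = {
    "circulation": {"record_id": "id", "event_date": "loanDate", "event_type": "action"},
    "gate_count": {"record_id": "entryId", "event_date": "date", "event_type": "kind"},
}


def normalize(source: str, record: dict) -> NormalizedEvent:
    """Map one source record to the common shape, failing loudly if it cannot be normalized."""
    mapping = FIELD_MAPS[source]
    missing = [src for src in mapping.values() if src not in record]
    if missing:
        raise ValueError(f"{source} record is missing fields: {missing}")
    fields = {target: str(record[src]) for target, src in mapping.items()}
    return NormalizedEvent(source=source, **fields)


# Invented records for illustration only.
loan = normalize("circulation", {"id": "a1", "loanDate": "2018-08-13", "action": "checkout"})
visit = normalize("gate_count", {"entryId": "g7", "date": "2018-08-13", "kind": "entry"})
```

Running real exports through a check like this during testing is one way to surface where records fail to normalize, which in turn suggests what standards the warehouse would need.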

Many thanks to Joe and Kate for helping us understand the potential for Metridoc with FOLIO!

Link Your "Yes - In-App" Reports

All

(Reminder)

Important Notes to Reporting SIG members:

  • If your report is identified as a "Yes - In-App Report": Please drop a copy in the appropriate functional area folder in the Library Stats folder in the Reporting SIG Google Drive folder and link it in the "Link to Sample" column in the Reporting SIG Master Spreadsheet, if you haven't yet. It really helps Product Owners to have a report to view to understand what is needed for a given in-app report.

  • Please keep track of all the reports you added to the spreadsheet as Product Owners may add questions and remarks to the spreadsheet.

  • If you do not agree with a certain report being identified as "No In-App Report", please contact Holly.

Topics for Future Meetings

All

Review and update Topics for Future Reporting SIG Meetings

Other Topics?

All

Any other topics to discuss today?

Action items