2018-02-19 Reporting SIG Notes

Date

Feb 19, 2018

Attendees

Present?  Name                      Organization
          Vince Bareau              EBSCO
          Katalin Lovagne Szucs     Qulto
X         Sharon Beltaine           Cornell University
          John McDonald             EBSCO
          Elizabeth Berney          Duke University
          Peter Murray              Index Data
          Ginny Boyer               Duke University
          Erin Nettifee             Duke University
          Joyce Chapman             Duke University
X         Karen Newbery             Duke University
          Elizabeth Edwards         University of Chicago
X         Tod Olson                 University of Chicago
          Claudius Herkt-Januschek  SUB Hamburg
          Scott Perry               University of Chicago
X         Doreen Herold             Lehigh University
          Robert Sass               Qulto
X         Anne L. Highsmith         Texas A&M
X         Simona Tabacaru           Texas A&M
          Filip Jakobsen            Index Data
X         Mark Veksler              EBSCO
          Harry Kaplanian           EBSCO
          Kevin Walker              The University of Alabama
X         Ingolf Kuss               hbz
          Charlotte Whitt           Index Data
          Lina Lakhia               SOAS
          Michael Winkler           Cornell University
X         Joanne Leary              Cornell University
          Christine Wise            SOAS
          Michael Patrick           The University of Alabama

Goals

  • Discuss Data Lake Proof of Concept Project Design & Goals

  • Review Reporting Tools

  • Resource/Format working group needs rep(s) from Reporting SIG

  • Plan future topics

Discussion items

Item

Who

Notes

Assign Notetaker, Take Attendance, Review agenda

Sharon

Previous Notetaker: Doreen Herold

Today's Notetaker: Simona Tabacaru

 POC/Data Lake Project

Sharon, Tod, Anne, Mark, Vince, Joanne, Doreen, Scott, Karen

  • structure of POC/Data Lake Project

  • current status of Tod's data loader Python script

    • Tod Olson has a script for creating loans on GitHub:
      https://github.com/todolson/folio-loan-tool.git

    • It's not complete, but it does this:

      • 1. authenticates

      • 2. gets a list of users

      • 3. gets a list of items

      • 4. creates a list of user/ item pairs to loan

      • 5. POSTs the loan request (not yet working)

    • It's on GitHub, so people can see how it works.

    • Each step is simplistic. A number of things are marked TODO, which suggests areas for further work if we can get some help fleshing it out.

    • schedule another small group meeting?

  • important considerations for a data lake environment

questions?
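
As a rough illustration of the five steps listed for the loan script, the sketch below builds the HTTP requests such a script might send. It is not taken from Tod's repository. The base URL, tenant name, endpoint paths (`/authn/login`, `/users`, `/item-storage/items`, `/loan-storage/loans`), header names and loan fields are all assumptions made for the example and should be checked against the FOLIO environment actually in use.

```python
import json
import urllib.request

# Assumptions for illustration only: adjust to the real environment.
OKAPI_URL = "http://localhost:9130"
TENANT = "diku"


def build_login_request(username, password):
    """Step 1: request that would authenticate and return a token."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        OKAPI_URL + "/authn/login",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-Okapi-Tenant": TENANT},
    )


def build_list_request(path, token, limit=10):
    """Steps 2 and 3: request a page of records, e.g. users or items."""
    return urllib.request.Request(
        f"{OKAPI_URL}{path}?limit={limit}",
        method="GET",
        headers={"X-Okapi-Tenant": TENANT, "X-Okapi-Token": token},
    )


def build_loan_request(user_id, item_id, loan_date, token):
    """Step 5: request that would POST one loan for a user/item pair."""
    # Hypothetical payload shape; the real loan schema may require more fields.
    loan = {
        "userId": user_id,
        "itemId": item_id,
        "loanDate": loan_date,
        "action": "checkedout",
    }
    return urllib.request.Request(
        OKAPI_URL + "/loan-storage/loans",
        data=json.dumps(loan).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Okapi-Tenant": TENANT,
            "X-Okapi-Token": token,
        },
    )


def send(request):
    """Send a prepared request and decode the JSON body (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from `send` means the payloads can be inspected and tested without a running FOLIO instance; step 4 (pairing users with items) is plain list handling between the list requests and the loan requests.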

Notes:

  • Sharon gave a short overview of the POC/Data Lake Project.

  • A small work group met to discuss the Proof of Concept (POC) for a Data Lake on 2/13/2018.

  • Notes and a recording of that meeting are available.

  • The POC/Data Lake is a 3-week project; we are in the 2nd week, and the deadline for completion is March 2nd.

  • The goal is to design a Data Lake environment. The group clarified what kind of data will go into the Data Lake and what type of report should be built. The report should include information from all three areas of FOLIO (patron, circulation and inventory).

  • Tod Olson will write a Python script to load the data.

  • The working group decided to use an open source tool, BIRT, as the reporting tool for this project. Chris Creswell will write the BIRT report.

Reporting will be done in 2 steps:

  • EBSCO will give Chris a data extract from the Data Lake

  • Chris will try to connect BIRT to the Data Lake
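
To make the target report more concrete, here is a minimal sketch of a report that combines the three areas mentioned above (patron, circulation and inventory). It uses an in-memory SQLite database rather than the Data Lake or BIRT, and every table name, column name and sample row is hypothetical; it only shows the kind of join such a report implies.

```python
import sqlite3


def build_sample_db():
    """Create a tiny in-memory database with made-up patron, item and loan data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (id TEXT PRIMARY KEY, patron_group TEXT);
        CREATE TABLE items (id TEXT PRIMARY KEY, title TEXT, material_type TEXT);
        CREATE TABLE loans (id TEXT PRIMARY KEY, user_id TEXT, item_id TEXT, loan_date TEXT);

        INSERT INTO users VALUES ('u1', 'faculty'), ('u2', 'undergrad');
        INSERT INTO items VALUES
            ('i1', 'Book A', 'book'), ('i2', 'DVD B', 'dvd'), ('i3', 'Book C', 'book');
        INSERT INTO loans VALUES
            ('l1', 'u1', 'i1', '2018-02-01'),
            ('l2', 'u2', 'i2', '2018-02-02'),
            ('l3', 'u2', 'i3', '2018-02-03');
        """
    )
    return conn


def loans_by_group_and_type(conn):
    """Count loans per patron group and material type (patron + circulation + inventory)."""
    query = """
        SELECT u.patron_group, i.material_type, COUNT(*) AS loan_count
        FROM loans AS l
        JOIN users AS u ON u.id = l.user_id
        JOIN items AS i ON i.id = l.item_id
        GROUP BY u.patron_group, i.material_type
        ORDER BY u.patron_group, i.material_type
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in loans_by_group_and_type(build_sample_db()):
        print(row)
```

The same query shape would apply whether the report is run against a data extract or a live connection; only the connection step would differ.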

Tod Olson shared his notes about the Python script. The scope of the script is to create loans automatically. The idea is to:

  • pull users from user storage

  • pull items from item storage

  • make random loans
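
One way the "make random loans" step could work is sketched below. This is an assumption about the approach, not a description of Tod's script: each item is loaned at most once, a user may receive several items, and the function name and parameters are hypothetical.

```python
import random


def make_random_loans(user_ids, item_ids, count, seed=None):
    """Return up to `count` (user_id, item_id) pairs chosen at random.

    Each item appears at most once; users may repeat.
    Passing a seed makes the selection reproducible for testing.
    """
    user_ids = list(user_ids)
    item_ids = list(item_ids)
    if not user_ids or not item_ids:
        return []
    rng = random.Random(seed)
    count = max(0, min(count, len(item_ids)))
    chosen_items = rng.sample(item_ids, count)
    return [(rng.choice(user_ids), item_id) for item_id in chosen_items]


if __name__ == "__main__":
    pairs = make_random_loans(["u1", "u2"], ["i1", "i2", "i3"], count=2, seed=42)
    for user_id, item_id in pairs:
        print(user_id, "borrows", item_id)
```

Sampling items without replacement avoids creating two open loans for the same item, which a circulation system would be expected to reject.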

Tod will connect with Matt Reno to work on this.

Someone from Texas A&M has some Python skills and could take a look at the script.

  • Sharon will help set up meetings to support Tod's work on the Python script:

  • One meeting between Tod and the person from Texas A&M

  • One meeting between Tod and Matt Reno

  • One meeting between Matt Reno & Chris Creswell

An update on this project will be given at the next meeting.

 

Current Reporting Tools

All

Review of Current Reporting Tools used by Reporting SIG participants

  • any additional tools?

Most of the Reporting SIG participants use SQL against a relational database system to generate reports:

  • Duke University uses a combination of SQL + Perl + IBM Cognos + some Ex Libris canned reports. Data source: Aleph

  • University of Alabama: same

  • University of Chicago: Access + Excel. The assessment librarian uses Tableau

Sharon: We don’t have experience with these open source tools. Do we need to do an analysis of the current reporting tools? We should create a short list of tools that will work well in this environment.

  • Is there someone from FOLIO/EBSCO/Index Data that can help us come up with the short list of tools?

  • We will probably be tasked to provide training on these tools. This will be part of our roles.

Mark: This request should go to the Product Council – they might be able to assign resources to help with the tool analysis request.

  • Our goal is to recommend either commercial reporting tools or open source reporting tools. This would likely be a decision made by each institution participating in FOLIO.

  • We need to determine what skills are needed to use the open source reporting tools.

  • We need to have a list of tools that we are recommending.

Questions to consider:

  • How does BIRT work?

  • Should we test each tool with the current set-up?

  • Do we need a list of current issues (top 3 or top 5) known for each reporting tool? For example, one known problem with Cognos is that data goes through 2 transformations; for invoicing packages it’s difficult to get item data.

  • Is it worth spending time compiling a list of issues, given that we are moving to a new environment?

Some issues are documented on the Master Spreadsheet. For example, the Metadata management tab discusses data integrity and consistency checking (a category of reports); we need to be able to run these kinds of reports in FOLIO. We’ll use the Master Spreadsheet as the source for this analysis.

 

Resource/Format Working Group

Sharon

update from 2/13/18 meeting

  • This group is working on an inventory set-up (how the data will be structured and how the data will look)

  • Sharon is attending as a placeholder, but we need someone from the Reporting SIG to attend their meetings

  • We’ll try to build a report using the structure that they are using

  • We need to look through those data elements

More information will follow at a future meeting.

Wishlist

Sharon

Reporting SIG Master Spreadsheet

  • including separate tab for wishlist functionality (what you'd like but do not actually have yet)

Reporting SIG Master Spreadsheet/ Import-export tab

  • On the Import-Export tab, the column "Link to sample/Wishlist" has been renamed to "Link to sample".

  • The Wishlist has been moved to a separate tab of the Master Spreadsheet.

Additional Topics?

All

Other topics? - None suggested

Future Topics

Sharon

Topics for Future Reporting SIG Meetings

  • Email Sharon or add your topic directly on the wiki page – Topics for Future Reporting SIG Meetings

  • Ask around about good reporting tools that we could use.

  • Sharon will get in touch with Mark to see if someone is documenting the data architecture set-up.

Action items