2018-09-10 Reporting SIG notes
Date
2018-09-10
Attendees
| Present? | Name | Organization | Present? | Name | Organization |
|---|---|---|---|---|---|
| X | Sharon Beltaine | Cornell University | | Peter Murray | Index Data |
| | Elizabeth Berney | Duke University | | Erin Nettifee | Duke University |
| | Joyce Chapman | Duke University | | Karen Newbery | Duke University |
| | Elizabeth Edwards | University of Chicago | X | Tod Olson | University of Chicago |
| X | Claudius Herkt-Januschek | SUB Hamburg | X | Scott Perry | University of Chicago |
| X | Doreen Herold | Lehigh University | | Robert Sass | Qulto |
| | Anne L. Highsmith | Texas A&M | X | Simona Tabacaru | Texas A&M |
| | Vince Bareau | EBSCO | | Mark Veksler | EBSCO |
| | Harry Kaplanian | EBSCO | X | Kevin Walker | The University of Alabama |
| X | Ingolf Kuss | hbz | | Charlotte Whitt | Index Data |
| | Lina Lakhia | SOAS | X | Michael Winkler | OLE |
| X | Joanne Leary | Cornell University | | Uschi Klute | GBV |
| X | Michael Patrick | The University of Alabama | X | Holly Mistlebauer | Cornell University |
| X | Nassib Nassar | Index Data | X | Angela Zoss | Duke University |
| X | Veit Köppen | University Magdeburg | | | |
Discussion items
Item | Who | Notes |
---|---|---|
Assign Notetaker, Take Attendance, Review agenda | Sharon | Today's notetaker: Ingolf. Last week's notetakers: Tod Olson, Sharon Beltaine. |
Welcome New Members | Angela, Veit | Please welcome our new Reporting SIG members, Angela Zoss from Duke University and Veit Köppen from the library at the University of Magdeburg. Introductions and background: Angela is the Assessment and Data Visualization Analyst in the Assessment and User Experience department at Duke University Libraries. She has been working on data visualization for many years. Veit has been working at the University of Magdeburg for 4 years. He is an expert and lecturer on data warehouse technologies. He is also the Head of IT Applications at the University Libraries. |
Data Warehouse Architecture | Nassib | Nassib will walk the Reporting SIG through "A Library Data Platform Architecture for FOLIO", which provides an outline for the requirements, architecture, and implementation of an environment to support the extensive data analysis and reporting needs of institutions implementing FOLIO. This provides the next step after the groundwork laid in our planning for a reference data warehouse environment. Nassib would like your feedback, so please bring your questions. Meeting Notes: Nassib provides an overview of the Library Data Platform Architecture for FOLIO. The architecture is divided into FOLIO Core, the Data Analysis Platform, and the Reporting Tool. In addition, there may be external data sources that stream data into the platform. An important concept of the microservice architecture is that not all data are stored in a single location. Transaction processing has to be distinguished from analytical processing. For transaction processing, operational data are fragmented across fairly granular databases. Access is quick and responsive, which is achieved through database indexes. Analytical processing, in contrast, is done for reporting. One is interested in a set of columns and therefore reads those columns across many records. Analytical processing competes for resources with transaction processing (resource contention) and is very slow. Most library data are highly structured and therefore suitable for databases. They need an ETL (extract, transform, load) process. Because the data are spread over multiple database stores, referential integrity is a concern that has to be taken into account. There are two types of data extraction for reporting from the storage modules. One type is Batch ETL, which extracts data from the storage modules in batch mode. Batch ETL will also be used to extract external data. The other type is Streaming ETL. It streams the data arising from operational transactions (in Okapi) through a message queue, applies some kind of ETL to them, and feeds the result into the Reporting Database. Streaming ETL can be time-consuming to implement, and its maintenance is also costly. The development team will need more developers to do this. A "star schema" is an example of a design for analytical processing. Another idea is to use column stores, which are more suitable for analytical processing. Discussion / Concerns |
Report Prototyping Update | Report Prototype Workgroup | To support the initial steps for development of a reporting data warehouse environment, a small workgroup has formed to prototype some simple reports in the functional areas of loans, inventory, and users. As part of this effort, the workgroup has mapped and diagrammed the data elements required for two circulation reports with the assistance of Emma Boettcher, PO for Loans, and Charlotte Whitt, PO for Inventory. This effort lays the groundwork for the Reporting SIG to begin prototyping additional reports to support the development required for our future reporting environment. Members of the workgroup will describe the steps taken to create these initial report prototypes. Feedback and questions are encouraged as we step through this process. Meeting Notes: Joanne, Charlotte, and Emma worked on an operational structure in which we can develop the prototypes. They first focused on loans, inventory, and users. They present the first two prototypes, which represent two basic types of reports, to the group. See the documents in the Google Drive folder of the Report Prototypes Group. The Circulation Detail Report has owning library, item ID, charge date, and patron group as selection fields. One needs to look at the loan rules, which capture much information. The Item Detail Report needs many details from Inventory. It lists titles that have been changed in a specific date range. To build a data dictionary, we need to find the data for these reports in the JSON schemas and the RAML definitions (Tod). The schemas and definitions need to be updated by the developers. Discussion about de-referencing data: Concern (Tod): "I expect a direct link to the patron itself, not just a link to the loan rule. It would be cleaner if we didn't have to parse the loan rule. The loan transaction has to have a reference to a patron and to an item." Response (Joanne): "A circ matrix ID links to the loan rule; it creates a link to the patron group. The item ID is recorded with the Circ Transaction directly." We may want to identify the candidates for stable reference data. It would be nice to have a fair amount of data that need not be de-referenced. |
Topics for Future Meetings | All | Review and update Topics for Future Reporting SIG Meetings. During our September 17 meeting, Holly Mistlebauer will walk us through the upload of our data warehouse report information into the FOLIO JIRA system. |
Other Topics? | All | Any other topics to discuss today? |
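
Illustration for the Data Warehouse Architecture item: the two extraction paths described in the notes (Batch ETL from a storage module and Streaming ETL through a message queue) could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: a Python list stands in for a storage module, `queue.Queue` stands in for the message queue, a dict stands in for the Reporting Database, and all record fields and function names are hypothetical rather than FOLIO or Okapi APIs.

```python
import queue

# Hypothetical stand-ins; none of these names are FOLIO or Okapi APIs.
storage_module = [
    {"id": "loan-1", "itemId": "item-1", "action": "checkedout"},
    {"id": "loan-2", "itemId": "item-2", "action": "checkedin"},
]
reporting_db = {}               # stands in for the Reporting Database
message_queue = queue.Queue()   # stands in for the message queue


def transform(record):
    """Illustrative transform step: keep only the fields a report needs."""
    return {
        "loan_id": record["id"],
        "item_id": record["itemId"],
        "action": record["action"],
    }


def batch_etl(source, target):
    """Batch ETL: read every record from the source in one pass."""
    for record in source:
        row = transform(record)
        target[row["loan_id"]] = row
    return len(source)


def streaming_etl(q, target):
    """Streaming ETL: process events currently waiting on the queue."""
    processed = 0
    while True:
        try:
            record = q.get_nowait()
        except queue.Empty:
            break
        row = transform(record)
        target[row["loan_id"]] = row
        processed += 1
    return processed


batch_count = batch_etl(storage_module, reporting_db)

# A later operational transaction is published to the queue and picked up
# by the streaming path without re-reading the storage module.
message_queue.put({"id": "loan-3", "itemId": "item-3", "action": "checkedout"})
stream_count = streaming_etl(message_queue, reporting_db)

print(batch_count, stream_count, sorted(reporting_db))
```

In this sketch both paths share the same transform step and differ only in how records arrive. A real streaming path would additionally need a long-running consumer and error handling, which may be part of why the notes describe it as more expensive to build and maintain.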
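
Illustration for the de-referencing discussion under Report Prototyping Update: the sketch below shows what resolving a loan into the four Circulation Detail Report selection fields (owning library, item ID, charge date, patron group) might look like when the patron group is reachable only through the loan rule. All IDs, field names, and the `dereference_loan` helper are hypothetical and are not taken from the FOLIO JSON schemas or RAML definitions.

```python
# Hypothetical reference data; IDs and field names are illustrative only
# and are not taken from the FOLIO schemas.
patron_groups = {"pg-1": {"name": "Undergraduate"}}
loan_rules = {"rule-1": {"patronGroupId": "pg-1"}}
items = {"item-1": {"owningLibrary": "Main Library"}}

# The loan stores the item ID directly, but reaches the patron group only
# through the loan rule, as described in Joanne's response.
loan = {
    "id": "loan-1",
    "itemId": "item-1",
    "loanRuleId": "rule-1",
    "chargeDate": "2018-09-10",
}


def dereference_loan(loan, items, loan_rules, patron_groups):
    """Resolve the IDs on a loan into the four report selection fields."""
    item = items[loan["itemId"]]                    # one hop: loan -> item
    rule = loan_rules[loan["loanRuleId"]]           # first hop: loan -> loan rule
    group = patron_groups[rule["patronGroupId"]]    # second hop: rule -> group
    return {
        "owning_library": item["owningLibrary"],
        "item_id": loan["itemId"],
        "charge_date": loan["chargeDate"],
        "patron_group": group["name"],
    }


report_row = dereference_loan(loan, items, loan_rules, patron_groups)
print(report_row)
```

Each extra hop (loan to loan rule to patron group) is one more lookup the reporting environment has to perform, which illustrates why data that need not be de-referenced would be convenient.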