2017-11-27 Reporting SIG Notes

Date

Attendees

Goals

  • learn more about technical architecture design for Reporting in Folio

Discussion items

Time | Item | Who | Notes
Assign Notetaker, take attendance, add agenda items | Tod

Notetaker: Anne L. Highsmith

Convener: Tod Olson

Data Lakes and Data Warehouses | Peter, All

 Peter Murray to discuss Data Lakes and Data Warehouses (see discussion thread here)

Peter Murray (PM) showed part of a draft presentation designed to give a Deep Technical View of FOLIO architecture. The recording of the presentation can be found on the Reporting SIG's Google Drive space at:


When the presentation was complete, there was some discussion among committee members, including the following points:

  • Tod Olson (TO) said that the presentation raised some concerns. The first was referential integrity among the identifiers of a transaction, e.g. the user/circ/inventory pieces of a circulation checkout, and how that integrity must be preserved. Their experience with other systems has been that maintaining such referential integrity is a problem. For example, research on a library's own data is one of its principal uses of reporting capability. Peter Murray (PM) responded that such integrity is maintained via the FOLIO APIs. The APIs guarantee it because the Okapi layer will not allow the installation of a microservice that violates data integrity and consistency among microservices.
  • Michael Winkler (MW) commented that PM's statements imply that there will need to be an initial, massive load of data into the reporting system, since otherwise there will be no data to report on, and that this initial load probably won't happen through the Okapi layer because of the time it would take. TO added that the storage designer would have to provide an Extract, Transform, Load (ETL) process based on a standardized framework. PM responded that he disagreed, at least in part: the API has to have an openly published definition, and that definition will provide such a framework.
  • TO made the point that he didn't feel complete denormalization of codes to labels was necessary, since resolving codes to labels is a function at which relational databases could excel, and the reporting database should be left to do it. MW interjected that such denormalization is one of the functions of a data warehouse: one shouldn't have to do a table join to determine what a code means. He added that the data warehouse captures what is valid at a point in time; if the label for a location code changes over time, for instance, the data warehouse should capture that.
  • TO raised the issue that different tenants will want the data refinery (the "landing place" in the reporting system for raw data before it's integrated into the warehouse) to reflect their data, which will require work with the storage designers of both the data refinery itself and the microservice modules. PM repeated that the emphasis should be on working with those who design storage in the reporting system; designers who work with the storage microservices in the operational system must be free to select whatever storage format is appropriate, e.g. a relational database in some cases or a document-oriented one in others.
  • Ingolf Kuss (IK) asked why the transactions are not optimized for reporting (a point PM had mentioned earlier). PM replied that it was very much an operational issue: the Okapi layer is optimized for speed of transactions.
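
The denormalization point in the discussion above can be illustrated with a minimal sketch. It assumes a hypothetical location-code lookup and a hypothetical checkout record shape; none of the names below come from FOLIO or from the presentation. The idea shown is that the label is copied into the warehouse row at load time, so a later rename of the code's label does not change rows already loaded, and no join is needed to read a row.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class WarehouseRow:
    """Hypothetical denormalized reporting row (not a FOLIO schema)."""
    item_id: str
    location_code: str
    location_label: str  # label as it was valid when the row was loaded
    loaded_on: date


def denormalize(records, labels, loaded_on):
    """Copy the label that is valid at load time into each warehouse row."""
    return [
        WarehouseRow(
            item_id=r["item_id"],
            location_code=r["location_code"],
            location_label=labels[r["location_code"]],
            loaded_on=loaded_on,
        )
        for r in records
    ]


# Hypothetical operational data: one checkout and one code-to-label lookup.
labels = {"MAIN": "Main Library"}
checkouts = [{"item_id": "i1", "location_code": "MAIN"}]

first_load = denormalize(checkouts, labels, date(2017, 11, 27))

# The label for the same code is renamed in the operational system later.
labels["MAIN"] = "Central Library"
second_load = denormalize(checkouts, labels, date(2017, 12, 4))

# Rows from the first load keep the label that was valid at that time.
print(first_load[0].location_label)
print(second_load[0].location_label)
```

Under TO's alternative, the warehouse row would carry only `location_code`, and the label would be resolved by a join in the reporting database at query time, which always yields the current label unless label history is stored separately.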

CODEX Design | Vince, All | Vince Bareau to discuss how the design of the CODEX will impact reporting functionality

Other topics?

Next Meeting: Mon Dec 4

Build Agenda for Dec 4 Meeting:

-George Stachokas to talk about how he has included Acquisitions data elements into the Reporting SIG Master Spreadsheet (or has captured this in another document)

-"Paradigm Shift" Subgroup reports on strategies for developing Reporting requirements given the proposed within-module/cross-module technical design for reporting functionality

-lead for MARC Fields Subgroup needed

Action items

  •  