Project updates
The Reporting SIG is using small working groups to address priorities and complete our work. Each week, we will provide updates to the Reporting SIG from these various reporting-related groups and efforts. Please include updates on specific JIRA issues for prototype or query development workflow.
Additional JIRA tickets for Angela to create for prototypes or queries
Community & Coordination:
- reminder: Please make sure your institution has a Reporting Representative listed on the Reporting Survey
FOLIO Reporting development
- Overview of development work on derived tables (NN)
- If you have large data tables at your institution, you can install cstore_fdw (open source) on your LDP to speed up queries against those tables. Note that cstore_fdw is not supported by AWS RDS. See the PostgreSQL with Columnar Tables section of the LDP User Guide for more information.
Reporting Data Privacy Working Group:
- update on request to Product Council to have FOLIO create centralized documentation for fields containing personal data (Sharon, Ingolf)
- see Central Store for Data Privacy Fields
- suggestion: Ask the developers to annotate the JSON schemas and find a way to extract a list of personal data fields? However, the JSON schemas are not in predictable locations, so they are difficult to track.
- From NN:
The issue is that the reporting community in FOLIO currently support data privacy and GDPR compliance by anonymizing personal data in the reporting database; but they have no way to know which fields in FOLIO contain personal data, other than by periodically reviewing the entire data model - consisting of more than 100 tables - and doing their own bottom-up analysis. This means that for libraries using FOLIO's reporting database, the database can become non-compliant under GDPR if FOLIO's schema changes. So the reporting community need some way to be notified about which fields contain personal data as FOLIO schemas change over time; otherwise libraries regulated by GDPR will have difficulties.
This issue could be addressed by (1) continuously updated documentation, in such a way that the reporting community can receive automatic notification of changes; or it could be addressed by (2) engineering, by making the information available electronically at runtime.
Let me clarify, with respect to JSON schemas, that the reporting model is to store the data from FOLIO's internal databases, not transformed data provided by APIs. The JSON schemas currently used in FOLIO correspond to the API-based data and not necessarily to the internal data. So extending the current JSON schemas with annotations would not address the problem.
Engineering a runtime solution is of course possible, and would be ideal, but it also would be (a) unlikely to be prioritized by FOLIO's product management and (b) probably designed not in consultation with the reporting community. In the interest of supporting GDPR-regulated libraries in the near term, I suggest that the documentation approach may be more pragmatic.
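As a rough illustration of the documentation approach (1) in the message above, the sketch below compares two snapshots of a documented list of personal data fields and reports what changed. It assumes the documentation can be exported as a list of (table, field) pairs. The function name and all table and field names are hypothetical and are not taken from FOLIO or the LDP.

```python
# Minimal sketch: detect changes between two versions of a documented
# list of personal data fields. All table/field names are hypothetical.

def diff_personal_data_fields(previous, current):
    """Return (added, removed) as sorted lists of (table, field) pairs."""
    prev_set = set(previous)
    curr_set = set(current)
    added = sorted(curr_set - prev_set)
    removed = sorted(prev_set - curr_set)
    return added, removed


# Snapshot from the earlier documentation version (illustrative only).
previous_fields = [
    ("example_users", "email"),
    ("example_users", "last_name"),
]

# Snapshot from the later documentation version (illustrative only).
current_fields = [
    ("example_users", "email"),
    ("example_users", "last_name"),
    ("example_requests", "requester_barcode"),
]

added, removed = diff_personal_data_fields(previous_fields, current_fields)

for table, field in added:
    print(f"New personal data field to anonymize: {table}.{field}")
for table, field in removed:
    print(f"No longer listed as personal data: {table}.{field}")
```

A check like this could run whenever the documentation is updated, so that the reporting community hears about newly listed fields without reviewing the whole data model by hand.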
RA/UM Working Group
- no updates this week; working on derived tables
MM Working Group
- Jennifer Eustis is the new liaison to the MM Reports Subgroup
- several queries are being created and tested against both folio_snapshot and UChicago
ERM Working Group