2022-01-31 Reporting SIG Meeting notes

Date

Attendees

Present?  Name              Organization
x         Arthur Aguilera   University of Colorado, Boulder
x         Sharon Beltaine   Cornell University
x         Erin Block        University of Colorado, Boulder
          Nancy Bolduc      Cornell University
          Lloyd Chittenden  Marmot
x         Axel Doerrer      University Mainz
          Shelley Doljack   Stanford University
x         Stefan Dombek     Leipzig University
x         Jennifer Eustis   U. Massachusetts Amherst / Five College
          Alissa Hafele     Stanford University
          Ingolf Kuss       hbz
          Kim Laine         Cornell University
          Joanne Leary      Cornell University
x         Eliana Lima       Fenway Library Organization
          Eric Luhrs        Lehigh University
x         Linda Miller      Cornell University
          Nassib Nassar     Index Data
x         Elena O'Malley    Emerson
          Tod Olson         University of Chicago
x         Jean Pajerek      Cornell University
x         Michael Patrick   The University of Alabama
x         Eric Pennington   Texas A&M
          Scott Perry       University of Chicago
x         Natalya Pikulik   Cornell University
x         Vandana Shah      Cornell University
          Amelia Sutton     U. Massachusetts
          Simona Tabacaru   Texas A&M
x         Kevin Walker      The University of Alabama
x         Angela Zoss       Duke University


Discussion Items

Item

Who

Notes

Attendance & Notes (Who: Angela)

  • Today's attendance-taker: Linda Miller
  • Today's note-takers: Team Leads for project updates

Announcements / Reminders (Who: Angela)

Recruiting New Query Developers

  • The Reporting SIG is always on the lookout for new query developers. Please let us know if you are interested in doing query development, or if others at your institution might be a good fit.


Planning future meeting topics (Who: All)

What should we schedule for SIG meetings over the next couple of weeks?

  • Intro to SQL (today, more?)
    • nice to keep this in mind, have refreshers and see how others are teaching this
  • Git/GitHub
    • also worth scheduling
    • Git plugin for DBeaver
    • possibly somewhat urgent: both Git/GitHub and the DBeaver plugin came up in the MM Working Group
    • first foundational Git
    • then maybe GitHub Desktop
    • then maybe command line
    • remember, we have a wiki page with some details
    • then focus on SIG-specific GitHub practices
  • Share experiences on hosting options for reporting databases
    • might inform how we develop a vision and strategy
    • the more information we have on how the LDP tools are used, the better we will be able to communicate that back to the community
    • Cornell: Sharon will check
    • Duke: Angela will check
    • Texas A&M: probably Eric, maybe Jason
    • MSU: maybe Mark Arnold to talk about LDLite
    • Chicago: Tod? Scott? Christie?
    • European institutions mostly waiting for Metadb, not really hosting yet (direct access to FOLIO database)
    • Open question: when will people have time for this; could be a while
      • maybe target March?
  • Develop a FOLIO Reporting Vision and Strategy
    • not urgent, but really important (quadrant 2)
    • might be good to have some discussions about past/current issues before planning
  • Advanced SQL topics
    • extracting from JSON (arrays of objects vs arrays of scalars; scaling to larger queries)
  • Others?
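
The "arrays of objects vs arrays of scalars" distinction noted under Advanced SQL topics can be illustrated outside the database. The following is a minimal sketch in Python rather than SQL; the record layout, field names, and helper functions are hypothetical and are not taken from the FOLIO or LDP data model. In both cases the goal is one output row per array element, but an array of objects needs an extra key lookup on each element.

```python
import json

# Hypothetical record; the field names are illustrative, not a real FOLIO schema.
record = json.loads("""
{
  "id": "inst-1",
  "subjects": ["History", "Maps"],
  "contributors": [
    {"name": "Doe, Jane", "primary": true},
    {"name": "Roe, Richard", "primary": false}
  ]
}
""")


def explode_scalars(rec, field):
    """Return one (id, ordinality, value) row per element of an array of scalars."""
    return [(rec["id"], i, value) for i, value in enumerate(rec.get(field, []), start=1)]


def explode_objects(rec, field, keys):
    """Return one row per element of an array of objects, pulling the named keys from each."""
    return [
        (rec["id"], i) + tuple(element.get(k) for k in keys)
        for i, element in enumerate(rec.get(field, []), start=1)
    ]


subject_rows = explode_scalars(record, "subjects")
contributor_rows = explode_objects(record, "contributors", ["name", "primary"])

for row in subject_rows + contributor_rows:
    print(row)
```

A SQL query against JSON columns would follow the same two shapes: expand the array into rows first, then either use each element directly (scalars) or extract named fields from it (objects).
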
Intro to SQL, pt. 3 (Who: Angela)

Angela will lead some introductory SQL sessions for anyone wishing to participate.

  • Can use DBeaver and connect to folio_snapshot LDP
  • If not using DBeaver:
    • Use a browser-based database tool (sqliteonline)
    • Download copy ("dump") of the folio_snapshot LDP (SQL file)
    • Angela will walk through how to load the data into sqliteonline
  • Slides from Library Carpentry SQL lessons
    • Basic idea behind relational databases
    • Selecting and sorting
    • Filtering and limiting
    • Aggregating and calculating (next time, start with GROUP BY)
    • Ordering and commenting
    • Aliases
    • Joins
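
For anyone reviewing between sessions, the lesson topics listed above can be tried out with Python's built-in sqlite3 module. This is only a sketch: the two toy tables (patron_groups and loans) and their contents are made up for illustration and are not the folio_snapshot LDP schema.

```python
import sqlite3

# Hypothetical toy tables; these are NOT the folio_snapshot LDP schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patron_groups (id INTEGER PRIMARY KEY, group_name TEXT);
    CREATE TABLE loans (
        id INTEGER PRIMARY KEY,
        group_id INTEGER,
        item_title TEXT,
        renewals INTEGER
    );
    INSERT INTO patron_groups VALUES (1, 'faculty'), (2, 'undergrad');
    INSERT INTO loans VALUES
        (1, 1, 'SQL Basics', 0),
        (2, 1, 'Data Models', 2),
        (3, 2, 'SQL Basics', 1),
        (4, 2, 'Git Handbook', 0),
        (5, 2, 'Metadata 101', 3);
""")

# Selecting, filtering, sorting, and limiting.
renewed = conn.execute("""
    SELECT item_title, renewals
    FROM loans
    WHERE renewals > 0          -- filtering
    ORDER BY renewals DESC      -- sorting
    LIMIT 2                     -- limiting
""").fetchall()

# Joining two tables, aliasing tables and columns, and aggregating with GROUP BY.
per_group = conn.execute("""
    SELECT pg.group_name AS patron_group,
           COUNT(*) AS loan_count,
           SUM(l.renewals) AS total_renewals
    FROM loans AS l
    JOIN patron_groups AS pg ON pg.id = l.group_id
    GROUP BY pg.group_name
    ORDER BY pg.group_name
""").fetchall()

print(renewed)    # [('Metadata 101', 3), ('Data Models', 2)]
print(per_group)  # [('faculty', 2, 2), ('undergrad', 3, 4)]
```

The same SELECT statements can be pasted into DBeaver or a browser-based tool once the table and column names are swapped for real ones.
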

See also:


Updates and Query Demonstrations from Various Reporting-Related Groups and Efforts (Who: Community & Coordination, Reporting Subgroup Leads)

Project updates

Reporting development uses small subgroups to address priorities and complete work on report queries. Each week, these groups share reports and queries with the Reporting SIG. Reporting development team leads are encouraged to enter a summary of their work group activities below.

RA/UM Working Group


MM Working Group

  • Meetings are on the first Tuesday of the month, 12-1pm ET, via Zoom using the usual FOLIO password. Our lab sessions are open to everyone. Please bring your questions, examples, and comments about reporting and metadata.
  • Our goals this year:
    • Work on converting our LDP derived tables and queries to Metadb
    • Work on learning GitHub, Git, and more SQL


ERM Working Group

  • Discussed ERM Goals 2022
    • Complete documentation on existing derived tables
    • Migrate ERM LDP queries for Metadb use
    • How to get the ERM SIG more involved in...
      • (real-life) reporting requirements
      • checking report results against real data
    • How can development for the integration of eHoldings data be funded?
  • Housekeeping: leftover GitHub issues, how long to keep meeting recordings, etc.
  • ERM Prototype and Query Development Status
  • Meetings are biweekly on Tuesdays at 11am ET, alternating with the RM Working Group
    • Next meeting will be on February 1
    • Contact Axel Dörrer if you would like to get a calendar invitation.


RM Working Group

  • group is working on RM-related derived tables and queries for Metadb
  • working on LDP to Metadb table mapping spreadsheet to prepare for transition to Metadb
  • looking at using https://mermaid-js.github.io/mermaid/#/ to document data models in RM areas of FOLIO 
  • several queries for RM completed, but still need documentation, testing, and review
  • for latest updates, see RM Prototype and Query Development Status
  • Meetings are biweekly on Tuesdays, 11am-noon ET; contact Sharon Markus if you would like to join us


Reporting SIG Documentation Subgroup

  • Honeysuckle documentation is live on https://docs.folio.org/docs/
  • Iris documentation is in progress, due December 15
  • Additional Context
    • The Reporting SIG has representation on the Documentation Working Group, which is building end-user documentation for https://docs.folio.org/docs/ (mostly linking to existing documentation over on GitHub)


External Statistics Working Group

  • no updates currently
  • new organizational/tracking scheme for JIRA, with pointers to queries in folio-analytics repository
  • New organizational structure for External Statistics reports
    • external statistics reports (e.g., ACRL) typically require running queries from different functional reporting areas
    • these reports will be captured in JIRA under one UXPROD-XXXX report cluster issue, then the descriptions will point to each of the queries required to run them on the folio-analytics repository
    • institutions will need to rank each of these 8 new UXPROD-XXXX report cluster issues
    • each reporting development team will take responsibility for the queries in their area for the external statistics clusters


Product Council



For all recent work on FOLIO Reporting SQL development:


Topics for Future Meetings (Who: All)
  • Follow-up on MARC status, quickMARC/Data Import conflicts
  • How to strengthen connections to SIGs and their developers to be kept in the loop about changes to the data model
  • Show and tell
    • how are institutions using the LDP
    • examples of using the local schema
    • Cornell's report ticketing system
    • Rollout plans from institutions
    • Ask someone on the sysadmin side to talk about LDP administration (Jason Root?)
    • What is done in JIRA? (JIRA clean up)
  • Training topics
    • adding test data in FOLIO snapshot
    • How to do ad hoc querying with the derived tables
    • How to use the LDP app
    • using KNIME to build reports (LDP edition)
    • use of local schema for custom tables
    • more on MARC (ask Jennifer)
    • using different applications (other than DBeaver)
    • Insomnia for API queries?
    • SQL
    • LDP to Metadb
    • Metadb implications
      • if schema changes, will that be relatively seamless? old fields still in history, but new fields in current?
      • if fields are deprecated and removed, what happens if they come back again?
      • how will deleted records show up?
    • Git/GitHub
    • Panorama
    • LDLite (again?)
  • Discussion:
    • how to rank clusters with institutional rankings going away?
    • consortia SIG is talking about a central office app of some kind; that app might need to deal with consortia-wide reporting
    • how to train:
      • wiki pages with examples? form a small project team? (e.g., how to pull from MARC, how to use local schema)
      • each dev team create training about the data structures in that area?
    • Query style (ask Nassib)
    • Gathering institutional query repositories; someone can propose a new page or a new addition to an existing page on FOLIO Analytics?
    • Follow up on how devs can work with SMEs better to decide on and advertise data model changes
  • Upcoming:
    • SQL advice/query optimization (Axel)
    • query demo - MARC (Tod)
    • Intro to SQL training (as time allows)
    • revisit discussion on openness/transparency in communication and open source software; decide as a group what we would like to advocate for
    • reports from implementers on experiences with hosting solutions


Review and update Topics for Future Reporting SIG Meetings 





  • A test Action Item (Ingolf)