2024-09-16 Reporting SIG Meeting notes

Date

Sep 16, 2024

How to Join the Meeting

Meetings are held on Zoom on the first 4 Mondays of each month at 11:00 am Eastern U.S. time (see this time in your time zone). The first and third Mondays focus on Reporting Development topics, and the second and fourth Mondays focus on Reporting Business topics.

Here are the Zoom details:

Join from PC, Mac, Linux, iOS or Android: https://openlibraryfoundation.zoom.us/j/601231377?pwd=ZVFtQWxUaTFLb1J3b1JPdlZqZU1lQT09

Or iPhone one-tap (US Toll):  +13017158592,,601231377# or +13126266799,,601231377#

Or Telephone:

    Dial: +1 408 638 0968 (US Toll) or +1 646 558 8656 (US Toll)

    Meeting ID: 601 231 377

   Find your local number: https://openlibraryfoundation.zoom.us/u/ketrxUP4uW

Attendees

 

| Present? | Name | Organization |
|   | Arthur Aguilera | University of Colorado, Boulder |
|   | Sharon Beltaine | Cornell University |
|   | Erin Block | University of Colorado, Boulder |
|   | Shannon Burke | Texas A&M |
|   | Suzette Caneda | Stanford University |
|   | Dung-Lan Chen | Skidmore College |
|   | Lloyd Chittenden | Marmot |
| x | Ann Crowley | Cornell University |
|   | Tim Dannay | Mount Holyoke College |
|   | Danielle Dempsey | Villanova University |
|   | Axel Doerrer | University Mainz |
|   | Shelley Doljack | Stanford University |
| x | Stefan Dombek | Leipzig University |
|   | Jennifer Eustis | U. Massachusetts Amherst / Five College |
|   | Lynne Fors | Wellesley College |
|   | Lisa Furubotten | Texas A&M |
| x | Mike Gorrell | Index Data |
|   | Alissa Hafele | Stanford University |
|   | Lucy Harrison | GALILEO |
|   | Kara Hart | Wellesley College |
|   | Corrie Hutchinson | Index Data |
|   | Jamie Jesanis | MCPHS |
|   | Jeanette Kalchik | Stanford University |
|   | Harry Kaplanian | EBSCO |
| x | Sarah Kasten | University of Chicago |
|   | Tim Kiser | Michigan State University |
| x | Kevin Kishimoto | Stanford University |
|   | Ingolf Kuss | HBZ |
|   | Alexander Lao | Stanford University |
|   | Joanne Leary | Cornell University |
| x | Eliana Lima | Fenway Library Organization |
|   | Eric Luhrs | Lehigh University |
|   | Kathy McCarthy | EBSCO |
|   | Lisa McColl | Lehigh University |
| x | Linda Miller | Cornell University |
|   | Joseph Molloy | Spokane Public Library |
|   | Kathleen Moore |   |
|   | Nassib Nassar | Index Data |
| x | Elena O'Malley | Emerson |
|   | Tod Olson | University of Chicago |
| x | Jean Pajerek | Cornell University |
| x | Kimberly Pamplin | Texas A&M University |
| x | Scott Perry | University of Chicago |
|   | Natalya Pikulik | Cornell University |
|   | Emily Sanford | Michigan State University |
| x | Bob Scheier | Holy Cross |
|   | Vandana Shah | Cornell University |
|   | Linnea Shieh | Stanford University |
|   | Rebekah Silverstein | Oklahoma State University |
|   | Susie Skowronek | Oakland University |
|   | Ken Smith | Valdosta State University |
| x | Kimberly Smith | Middle Tennessee State University |
|   | Clare Spitzer | Stanford University |
|   | Amelia Sutton | U. Massachusetts |
|   | Simona Tabacaru | Texas A&M |
|   | Huey-Ning Tan | Stanford University |
|   | Vitus Tang | Stanford University |
|   | Christie Thomas | University of Chicago |
|   | Irina Trapido | Stanford University |
|   | Catherine Tuohy | Emmanuel College |
|   | Patrick Waite | U. Mass Amherst |

Visitors:

 

 

 

 

 

Discussion Items

Item

Who

Notes

Attendance & Notes

Scott

Attendance & Notes

  • Today's attendance-taker: Linda (or substitute)

  • Today's note-takers:  Team Leads for project updates

 

Announcements/Reminders

Scott

Announcements:

  • No meeting next Monday due to WOLFcon, and no workshopping of queries today

  • WOLFcon 2024 coming up in September

    • Here is a summary of information about WOLFcon 2024 from Jesse Koennecke:

      • Location and Dates: Senate House, University of London, September 24-26, 2024

      • Registration Open: Join us at Senate House, University of London. September 24-26, 2024. Register now through July 31, 2024 for an early bird discounted rate.

      • Learn more about WOLFcon 2024: Want to learn more about the Open Library Foundation and WOLFcon? Be sure to visit our website, where you can learn more about the foundation, member projects, communities, and the annual conference.

 

  • About the Reporting SIG meeting schedule

    • Meetings are held on the first 4 Mondays of each month at 11:00 am Eastern U.S. time (see this time in your time zone). The first and third Mondays focus on Reporting Development topics, and the second and fourth Mondays focus on Reporting Business topics. 

    • "business" means topics like presentations on reporting functionality and new features, new reporting applications, surveys and studies on reporting, etc.

    • "development" means working on derived tables and report queries for the folio-analytics GitHub repository

    • "workshopping" queries could be scheduled during any of these meetings, and it would be great to have topics and/or questions in advance so we can prepare to walk through the answers/approaches, such as "how do I fix this inventory query to get rid of the duplicates?" or "what is the best way to calculate totals in this finance query?"

     

Ongoing Topics:

  • Workshopping your queries

    • part of each Reporting SIG business meeting will be devoted to working through any query questions you may have

    • please reach out to @Christie Thomas if you have a question you would like to "workshop" during an upcoming Reporting SIG meeting

 

  • Impacts of New Fields and Features (Sharon)

 

  • Upcoming Reporting SIG meeting topics (tentative)

    • Reporting App use at various institutions

    • More Metadb training

    • WOLFcon debrief

 

  • Any new members?

    • Welcome/introductions

 

 

SIG Recruitment:

We will need to recruit for a variety of roles in the coming months. Please consider whether you would be interested, and reach out to @Scott Perry or @Sharon Beltaine with any questions.

  • Representative for the Documentation Working Group 

  • Query developers

 

Review upcoming and already implemented changes to Metadb

Mike Gorrell

Software Updates

We have a few things in a pre-1.4 or 1.3.2 release stage that will be tested:

  • Performance enhancements during Syncs - one related to processing data that has been seen before but hasn’t changed, and another that reads Kafka messages using concurrent consumers during sync

  • Environment variable to override the default FOLIO Analytics version during a Metadb build.

We are working on a feature that will move to a test phase soon: “Full JSON Transformation”. This feature will allow the Metadb administrator to identify JSON that will be transformed into Metadb tables. We hope to deploy a version to our Beta Test environment in October.

  • This feature will NOT result in automatically transforming 100% of the JSON

    • That may be computationally expensive

    • May be wasteful: just as some tables/schemas may be skipped, some JSON is not needed, and there is no value in transforming it.

  • This will replace the need for some of the derived table work/SQL

  • We will target which JSON gets extracted based on Derived Tables and their usage

  • Testing will require a selection of tables/JSON Paths to be transformed and then:

    • Verify the JSON is unpacked correctly

    • Note the additional time the sync takes

    • Verify the data continues to be updated after the sync

    • Is anyone interested in testing as well as helping determine what the priority would be for tables/JSON for our testing efforts?
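For anyone thinking about the "Verify the JSON is unpacked correctly" step, here is a minimal sketch of the general idea of pulling only selected JSON paths into a flat row. It is an illustration only and not how Metadb implements the feature: the dot-separated path syntax, the function names `value_at_path` and `flatten_selected`, the underscore-joined column names, and the sample record are all assumptions made up for this example.

```python
import json


def value_at_path(document, path):
    """Follow a dot-separated path of object keys; return None if a key is missing."""
    current = document
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def flatten_selected(document, paths):
    """Build one flat row from only the selected paths (hypothetical column naming)."""
    return {path.replace(".", "_"): value_at_path(document, path) for path in paths}


# Made-up sample record, not taken from any FOLIO module.
record = json.loads(
    '{"id": "abc", "metadata": {"createdDate": "2024-09-16", "updatedByUserId": "u1"}}'
)

# Only the paths an administrator selected are unpacked; the rest stays in the JSON.
selected = ["id", "metadata.createdDate"]
row = flatten_selected(record, selected)
print(row)
```

A tester could compare a row like this against the columns that appear in the transformed table for the same record, one selected path at a time.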

More information to come on September 30th.

Related questions:

  • Will this end the issue of new fields not being picked up? I am not sure what this issue is, but it is unlikely that this feature will affect new fields.

  • What happens to deprecated fields? To be confirmed, but I imagine they will stay represented in Metadb but become null when they are deprecated.

  • Does anyone want to share their usage?

 

Which tables and indexes are being used in your Metadb/LDP?

See the full description: https://pgpedia.info/p/pg_stat_user_tables.html

  • seq_scan = number of sequential scans of this table, i.e. the table was used

  • idx_scan = number of index scans of this table, i.e. an index was used

  • Metadb:

SELECT schemaname, relname, seq_scan, last_seq_scan, idx_scan, last_idx_scan
FROM pg_stat_user_tables
WHERE schemaname LIKE 'folio_%'
  AND relname NOT LIKE 'zzz\_\_\_%'
  AND relname NOT LIKE '%\_\_'
ORDER BY schemaname, relname;

  • LDP:

SELECT schemaname, relname, seq_scan, idx_scan
FROM pg_stat_user_tables
WHERE schemaname = 'public' OR schemaname = 'folio_reporting';
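Once the query results are exported, a short script can list the tables that show no recorded scans. The sketch below assumes the rows arrive as (schemaname, relname, seq_scan, idx_scan) tuples, as in the LDP query above; the function name `unused_tables` and the sample values are made up for illustration. It treats a NULL idx_scan as zero, on the assumption that a table without indexes may report NULL there. Keep in mind that these counters accumulate since the last statistics reset, so a zero only means "not scanned since then".

```python
from typing import NamedTuple, Optional


class TableStats(NamedTuple):
    schemaname: str
    relname: str
    seq_scan: Optional[int]
    idx_scan: Optional[int]


def unused_tables(rows):
    """Return schema-qualified names of tables with no recorded scans."""
    unused = []
    for row in rows:
        stats = TableStats(*row)
        # Treat NULL (None) counters as zero.
        total = (stats.seq_scan or 0) + (stats.idx_scan or 0)
        if total == 0:
            unused.append(f"{stats.schemaname}.{stats.relname}")
    return sorted(unused)


# Made-up sample values for illustration, not real usage data.
sample_rows = [
    ("folio_inventory", "instance__t", 12, 340),
    ("folio_inventory", "holdings_record__t", 0, None),
    ("folio_orders", "po_line__t", 0, 0),
]

for name in unused_tables(sample_rows):
    print(name)
```

Institutions willing to share their usage could run the same kind of summary locally and report only the table names.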

 

Other questions:

  • When will LDP support end?

 

Recurring Items (Updated weekly, but not always discussed in meeting)

Item

Who

Notes

Review of In-Progress Projects (Recurring)

 

 

Review the release notes for FOLIO Analytics, LDP, LDLite, LDP Reporting App, ldpmarc, Metadb Projects (Recurring)

 

 

Updates and Query Demonstrations from Various Reporting Related Groups and Efforts Projects (Recurring)

Community & Coordination, Reporting Subgroup Leads

Project updates

Reporting development is using small subgroups to address priorities and complete work on report queries.  Each week, these groups will share reports/queries with the Reporting SIG.  Reporting development team leads are encouraged to enter a summary of their work group activities below.

D-A-CH Working Group (D-Reporting)

  • Ongoing topics: Workshopping, DBS statistics

    • Unification of reference data when mapping data from German union catalogs to FOLIO. Collaboration with the MM working group (D-A-CH).

    • Statistics after data anonymization according to GDPR. Collaboration with the RA/UM working group (D-A-CH).

  • Preparations for the German conference FOLIO Days in Bamberg

  • Meetings: Contact @Stefan Dombek if you would like to get a calendar invitation

 

Product Council

 

Reporting SIG Documentation Subgroup

  • Poppy documentation is live on https://docs.folio.org/docs/

  • Quesnelia documentation is in development

  • Additional Context

    • The Reporting SIG has representation on the Documentation Working Group, which is building end-user documentation for https://docs.folio.org/docs/ (mostly linking to existing documentation over on GitHub)

 

For all recent work on FOLIO Reporting SQL development: