2023-01-19 Reporting SIG Meeting notes

2023-01-19 Reporting SIG Meeting notes

Date

Jan 19, 2023

Attendees

 Present?

Name

Organization

Present?

Name

Organization

 Present?

Name

Organization

Present?

Name

Organization

x

Arthur Aguilera

University of Colorado, Boulder

 

Eric Luhrs

Lehigh University

x

Sharon Beltaine

Cornell University

x

Linda Miller

Cornell University

 

Erin Block

University of Colorado, Boulder

 

Nassib Nassar

Index Data

 

Nancy Bolduc

Cornell University

x

Elena O'Malley

Emerson

 

Shannon Burke

Texas A&M

 

Tod Olson

University of Chicago

 

Suzette Caneda

Stanford University

 

Jean Pajerek

Cornell University

 

Lloyd Chittenden

Marmot

 

Michael Patrick

The University of Alabama

 

Tim Dannay

Mount Holyoke College

 

Eric Pennington

Texas A&M

 

Axel Doerrer

University Mainz

x

Scott Perry

University of Chicago

 

Shelley Doljack

Stanford University

x

Natalya Pikulik

Cornell University

 

Stefan Dombek

Leipzig University

 

Bob Scheier

Holy Cross

 

Jennifer Eustis

U. Massachusetts Amherst / Five College

x

Vandana Shah

Cornell University

x

Lynne Fors

Wellesley College

 

Linnea Shieh

Stanford University

 

Lisa Furubotten

Texas A&M

 

Clare Spitzer

Stanford University

x  

Alissa Hafele

Stanford University

 

Amelia Sutton

U. Massachusetts

x

Kara Hart

Wellesley College

 

Simona Tabacaru

Texas A&M

x

Jamie Jesanis

Wellesley College

 

Huey-Ning Tan

Stanford University

 

Jeanette Kalchik

Stanford University

 

Vitus Tang

Stanford University

 

Kevin Kishimoto

Stanford University

 

Irina Trapido

Stanford University

 

Ingolf Kuss

hbz

 

Catherine Tuohy

Emmanuel College

x

Alexander Lao

Stanford University

 

Kevin Walker

The University of Alabama

x

Joanne Leary

Cornell University

x

Angela Zoss

Duke University

x

Eliana Lima

Fenway Library Organization

 

 

 

Discussion Items

Item

Who

Notes

Item

Who

Notes

Attendance & Notes

Angela

Attendance & Notes

  • Today's attendance-taker: Linda (or substitute)

  • Today's note-takers:  Team Leads for project updates

Announcements /
Reminders

Angela

New: Join our Review Board

  • Great way to start learning to run LDP1 and Metadb queries, get comfortable with GitHub

  • No experience required! We will offer training.

  • Contact @Sharon Markus or @Angela Zoss (Old) if you have questions or would like to volunteer

 

Homework #1: Feedback on derived table documentation project

This summary from the project team gives some information about the state of the project and asks for feedback on two questions. We will have a discussion on these questions during the SIG meeting on Monday, January 23. Please review the document in advance. If you won’t be attending the Jan 23 meeting, feel free to share feedback in advance (via Slack, email, or comments on the document).

 

Homework #2: Reporting SIG member survey

Sharon and Angela would love to hear things we could do to improve your experience of the SIG. This survey is your chance to give us feedback! Please do take the survey, even if you are only somewhat active with the SIG. We would like responses by Friday, January 27.

 

Update on 2023 Goals:

  • Goals for 2023

  • After our Jan 5 meeting to discuss goals, we also discussed them on Monday, Jan 9

  • Overall there was agreement, but in the Monday meeting we added some rough categories to summarize the main areas for our goals

  • Many goals fall on the development teams, so please do consider joining a development team if you're not already active on one

  • Largely hoping to complete our in-progress projects, but a few additional priorities have been outlined

 

Any new members?

  • Welcome/introductions

  • Would anyone like a buddy? Like to be a buddy?

 

How to find our latest recordings

 

(Always) Recruiting New Query Developers

  • The Reporting SIG is always on the look-out for new query developers. Please let us know if you are interested in doing query development or if there are others at your institution who might be a good fit.

 

Tips on querying date fields

@Sharon Markus @Joanne Leary 

  • Slides: Working with Dates in SQL in the Library Data Platform

  • Basic date and time formats:

    • there are many formats for date and times, and you can use SQL to transform between different formats

    • date, interval, text/VARCHAR, timestamp, timestamptz

    • note about timezone - it will appear as an offset from UTC; so, -0500 is 5 hours behind UTC; Eastern US timezone flips between -0500 or -0400, depending on the time of the year and the daylight savings adjustments

  • Example of something you might want to do: create list of orders made on March 1, 2022

    • to be able to filter on a date, you might try to just type the date as text, but if the field is a timestamp, that won't necessarily work. there is extra time information in the field, and SQL isn't going to automatically extract just the simple day from the timestamp.

    • you can cast the timestamp to a date with ::DATE. 

  • Another task: filter on a range of dates

    • in our FOLIO Analytics report queries, we often use a common table expression (CTE, or a way of defining a set of values inside the query) at the top of the query to specify parameters, and those often include date ranges (one parameter for start date, another for end date)

    • then you can filter to records where the dates are greater than or equal to the start date and less than the end date (there is a special way to call the parameters in these queries, check slides for examples)

  • CURRENT_DATE is a function that returns the current date as a DATE; can include that as part of your SELECT statement if you want it to appear in your table of results

    • there is also a CURRENT_TIMESTAMP

  • NOW() returns a timestamptz

  • INTERVAL is something like week, day, month, etc.; e.g., INTERVAL '1 week'

    • can combine CURRENT_DATE with INTERVAL, e.g., CURRENT - INTERVAL '1 week' as start date and CURRENT_DATE as end date to get data from previous week

  • TO_CHAR

    • sorting or filtering dates in Excel might require dates that aren't "pretty"; might want to wait to format the date until Excel

    • if you do want to format in SQL, can use TO_CHAR, which will convert the date to something that looks nicer (e.g., mm/dd/yyyy)

    • But these can't be sorted as dates

    • This works with time components, too: need to use formatting codes like "mm/dd/yyyy hh:mi am"

    • if you're not sure if your date is a true date or just text that looks like a date, dates are aligned to the right in a column, and text is aligned to the left

    • If you want to use dates for calculations, sorting, or filtering, you probably need to keep them in their native format or cast them to a date or timestamp format

  • dates from JSON

    • If dates are in a data array, you have to pull them out from the JSON, and you will only get the date as text (the LDP1 software won't have done the date conversion for you); use casting to convert it to the format that matches both the data you're getting and also the final date you want

  • extracting date parts

    • this works on actual DATE/TIMESTAMP/TIMESTAMPTZ data, not just text

    • options include year, month, day, hour, minute, second, dow (this is day of week)

    • note that extract year may result in number formatted like 1,992; could convert those to VARCHAR if you don't need to do any additional calculation

    • can also get the names for the month and weekday

      • extract month number or day number or day of week number with DATE_PART

      • with TO_CHAR, can also specify a date part (e.g., "Mon", "Dy") to get the text version of the month or day

      • EXTRACT works like DATE_PART, just slightly different syntax; returns numbers

  • measuring length of date interval

    • AGE function can calculate days and times elapsed between two dates

      • start with end date, then start date for the age in positive numbers

    • Can also just subtract dates (easiest with simple dates)

    • Can also use INTERVAL to calculate a specific date based on a length of time relative to today (or another fixed date)

  • documentation for PostgreSQL date/time functions

    • sometimes PostgreSQL is different from other flavors of SQL, so maybe look specifically for postgres advice

Review of In-Progress Projects (Recurring)

 

Review the release notes for FOLIO Analytics, LDP1, LDLite, LDP Reporting App, ldpmarc, Metadb Projects (Recurring)

 

Updates and Query Demonstrations from Various Reporting Related Groups and Efforts Projects (Recurring)

Community & Coordination, Reporting Subgroup Leads

Project updates

Reporting development is using small subgroups to address priorities and complete work on report queries.  Each week, these groups will share reports/queries with the Reporting SIG.  Reporting development team leads are encouraged to enter a summary of their work group activities below.

RA/UM Working Group

 

MM Working Group

  • Meetings are 1st Tuesday of the month, 12-1pm ET via zoom using the usual FOLIO password. Our lab sessions are open to everyone. Please bring your questions, examples, and comments about reporting and metadata. MM notes

  • We will be fixing the LDP1 derived table instance_ext.sql which has a duplicate field (record_source and instance_source). The column record_source will be removed and comments will be added to the derived table.

  • If you are able to test ldpmarc 1.6 new release, please do so.

 

ERM Working Group

 

RM ACQ Working Group

 

Reporting SIG Documentation Subgroup

  • Morning Glory documentation is live on https://docs.folio.org/docs/

  • Nolana documentation is in review

  • Orchid documentation will be in progress soon, and plans are underway to include beta-level documentation for Metadb

  • Additional Context

    • The Reporting SIG has representation on the Documentation Working Group, which is building end-user documentation for https://docs.folio.org/docs/ (mostly linking to existing documentation over on GitHub)

 

External Statistics Working Group

  • no updates currently

  • new organizational/tracking scheme for JIRA, with pointers to queries in folio-analytics repository

  • New organizational structure for External Statistics reports

    • external statistics reports (e.g., ACRL) typically require running queries from different functional reporting areas

    • these reports will be captured in JIRA under one UXPROD-XXXX report cluster issue, then the descriptions will point to each of the queries required to run them on the folio-analytics repository

    • institutions will need to rank each of these 8 new UXPROD-XXXX report cluster issues

    • each reporting development team will take responsibility for the queries in their area for the external statistics clusters

 

Product Council

 

For all recent work on FOLIO Reporting SQL development:

 

Topics for Future Meetings

All

  • Jan 23: discuss derived table documentation homework

  • Feb 2 (tentative): Review Board Training

    • Identify a pull request that needs a review

    • Assign yourself to review a pull request

    • Reviewing a pull request using the check list

    • Communicating with a PR submitter

    • Differences between LDP1 and Metadb queries to look out for

  • Feb 13: review LDP1 Hosting Tips document

  • Exploring new recruitment/onboarding strategies (e.g., buddy system)

  • More work on asynchronous collaboration, how to engage in discussions and question answering more broadly

    • consider connecting discuss.folio.org with a Slack channel, to make sure any forum topics get highlighted on Slack as well?

  • Open question: should we update FOLIO LDP1-based Reporting First Implementers Grid