2025-05-05 Better Sample Data Meeting notes

 Date

Apr 22, 2025

 Participants

  • @Yogesh Kumar, @Lee Braginsky, @Charlotte Whitt, @Kristin Martin, @Autumn Faulkner (regrets), @Tod Olson, @Shelley Doljack

 Goals

  • Plan the presentation at PC on 5/8/2025

  • Update on conversation with Owen on ERM data - plan for attending an upcoming meeting (5/20)

  • Follow up on the status of discussion topics and tasks

 Discussion topics

Time

Item

Notes

 

General:

 

 

 

  1. Eureka platform to be rolled out with the Sunflower release (TBD)

Sunflower - New GA date TBD

 

  1. Planning the presentation for the PC

Plan for 15 minutes:
1. Super short intro (background) - 2 min

  1. Lee gives a refresher on the anonymization work - reached out to Community members

  2. Shelley to present the work on anonymization

  3. Finally (if time permits) - talk about the work on improving data in the FOLIO reference environments, such as FOLIO Snapshot

    1. ERM data (https://folio-org.atlassian.net/wiki/spaces/ERMSIG/pages/955711505/ERM+Sample+Data)

  4. PC - a review and update process (going forward)

 

Charlotte can start a slide deck and save it in our shared Google folder.

 

Updates for FOLIO Snapshot

Charlotte is making progress on updating the Inventory instance records (36 instance records in total with Source ID = FOLIO). As of 4/22, 20 records have been cataloged:

https://folio-quesnelia.dev.folio.org/inventory?filters=staffSuppress.false%2Csource.FOLIO&qindex=instanceAdministrativeNotes&query=better%20sample%20data&sort=title

The 20 cataloged titles have been backed up as JSON files.

Add 100 MARC records

Set up data import jobs.

  • Decided to create a wiki page documenting how the FOLIO Snapshot data is built, along with other relevant information for test users. @Autumn Faulkner has started a Google doc to keep track of the changes we have made.

 

 

  1. Stanford is working on sample authority data to be ready soon for the snapshot environment.

  • @Autumn Faulkner - Waiting on MARC records from you, to be loaded using DI, making sure they connect correctly with MARC authority (hopefully SRS will do the right thing).

  • @Shelley Doljack - Has already uploaded the authority data to the new snapshot environment.

 

Anonymization of data in Bugfest environments:

Lee Braginsky: Good news on the data anonymization front: Stanford University is going to field a team of developers for 3 weeks. @Shelley Doljack is leading this effort. In the first week, the data analysis was done. There is a spreadsheet listing which tables to preserve vs. anonymize.

The focus for 2 weeks will be on PII, users, user custom fields, vendor names, vendor contacts, interface credentials, etc.

Anonymization requirements:

Shelley has put together a wiki page to gather requirements for anonymization - https://folio-org.atlassian.net/wiki/x/BQA4K
@Lee Braginsky - will present the open questions to this group before the Stanford developers can start on this project.

 

Review timeline document

 

Other topics

Any items to discuss?

 Action items (updated 4/22/2025)

@Lee Braginsky will update the track for scripts to anonymize the data set.
@Lee Braginsky will publicize developers' requirements in the #folio-implementers slack channel.
@Yogesh Kumar will create a wiki page documenting how the FOLIO Snapshot data is built, along with other relevant information for test users.
@Charlotte Whitt will set up a working meeting with @Autumn Faulkner, @Shelley Doljack, and @Kristin Martin to get the Data Import profiles for loading MARC data, write up the two job profiles (one for open orders and one for pending status), and get order data added to the Quesnelia environment.
@Charlotte Whitt will update the Patron notice templates and basic functionality, and update the Circ rules accordingly.
@Kristin Martin - will reach out to Owen and ask him to attend an upcoming meeting to present his script for loading agreement data. Will review the data.
@Charlotte Whitt will update the Circ rules in the Quesnelia environment
@Charlotte Whitt - will look into adding bound-with data to the Quesnelia environment. Among the 100 MARC records there is one bound-with record. Charlotte will ask Lehigh whether we can use 5-10 more sample records from their collection.

 Decisions

Lee, Yogesh, and Shelley will keep the working group informed about the discussions and progress on developing the anonymization tool. Lee, Yogesh, Shelley, and Noah meet every Monday.

We meet every other week, on odd-numbered weeks.