2025-02-11 Better Sample Data Meeting notes


 Date

Feb 11, 2025

 Participants

  • @Yogesh Kumar, @Lee Braginsky, @Charlotte Whitt, @Kristin Martin (regrets), @Autumn Faulkner (regrets), @Tod Olson (regrets), @Shelley Doljack (regrets)

 Goals

  • Follow up on the status of discussion topics and tasks

 Discussion topics

Time

Item

Notes


 

General:

  1. Letter to Risk Office at Michigan State University Libraries

  1. Eureka platform to be rolled out with the Sunflower release (4/28/2025)


FOLIO Snapshot:

  1. Environment update is WIP

  1. Stanford is working on sample authority data to be ready soon for the snapshot environment.

Anonymization of data in Bugfest environments:

@Autumn Faulkner sent the finalized letter to the dean at MSU in early December.

The letter was sent to the risk management group; no response yet.

 

Document written up by the TC: Report on the timeline for adoption of the Eureka platform

Is there any update from the Tri-Council meeting on the decision on when to roll out Eureka?

No news.

 

Topics:

  • Are circulation rules supposed to be anonymized? We will come back to that topic next time.

    • In FOLIO Snapshot, the circ rules should be minimal.

  • FOLIO Snapshot is rebuilt every 24 hours, so the sample data must adhere to that. When talking about loan patterns, how do we ensure that open loans adhere to the circ rules for testing things like bills, notices, aging to lost, etc.?

  • At today’s call we decided to push the conversation to the next meeting.

Updates on FOLIO Snapshot?

At the TC meeting yesterday, Craig asked about the status of the PC working group's work. The TC was pleased to hear we are working on improving the sample data.

Latest Update:

Charlotte still has to update the existing 36 instance records with Source ID = FOLIO. It has been difficult to carve out time.

  • Update the existing 36 records with source FOLIO.

Add 100 MARC records, with MM-SIG eyes on the 100 bibs. We want to ensure that these records have the right mix of misc. types to cover the basics, incl. bound-with.

  • Instances are to have holdings, or holdings and items. Also include examples of items with no barcode, items on order, and items with regular barcodes.

  • Shelley to add authority data. Try to overlap with inventory records, both source FOLIO (?) and source MARC. Added 100 authority records via data import.

  • Shelley will provide the job profile used in the institution.

  • Autumn has provided sample data for music records and serials records with multiple holds and multiple items. Holdings statements are to be added. These will be loaded into the env. shortly.

    These records are loaded to this group's shared drive.

  • Autumn:

  • Order/Order lines data. Kristin will provide screenshots in Edit mode to Autumn. Autumn will create two job profiles, one for open orders and one for orders with pending status.

  • ERM-SIG to help by providing data and reviewing both the entered data and the data to be added. Kristin to reach out.

  • Decided to create a wiki page where we document how the FOLIO Snapshot data is built, and other relevant information for test users.

 

Updates on the work to upgrade Stanford's env. to the Q release.

Does anonymizing matter for Inventory records?

  • E.g. notes (drop administrative notes, and all notes with private information - this needs to be spec’ed out in our documentation),

  • local tags, digital book plate information, donor information (can be dropped)

    • This working group should provide the guidelines for this. Shelley asks where the tool needs to be flexible.

  • donor or bookplate data in MARC tags? Or other inventory record fields?

    • yes

  • any donor info at all?
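A rough sketch of what dropping these fields could look like, for discussion only. It assumes inventory instance records are handled as JSON with `administrativeNotes`, `notes` (each with a `staffOnly` flag and an `instanceNoteTypeId`), and `tags` fields, and it treats staff-only notes as the "private" ones. Both the field names and that rule are assumptions to be confirmed in the spec; `scrub_instance` and `private_note_type_ids` are hypothetical names, not part of any existing tool.

```python
import copy


def scrub_instance(instance: dict, private_note_type_ids=frozenset()) -> dict:
    """Return a copy of an instance record with potentially private fields removed.

    Field names are assumptions about the record layout, not a confirmed spec.
    """
    result = copy.deepcopy(instance)

    # Administrative notes and local tags are dropped entirely.
    result.pop("administrativeNotes", None)
    result.pop("tags", None)

    # Keep only notes that are neither staff-only nor of a note type
    # the working group has flagged as private (e.g. donor or bookplate notes).
    if "notes" in result:
        result["notes"] = [
            note
            for note in result["notes"]
            if not note.get("staffOnly", False)
            and note.get("instanceNoteTypeId") not in private_note_type_ids
        ]
    return result
```

The set of private note types is passed in rather than hard-coded, which is one place where the tool could stay flexible per institution.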

Lee has written up a job description for the task Shelley is working on. Shelley will work 50% on this. Shelley will work with the new community member.

Yogesh will present for the ARLEF group.

 

Lee’s group will be tied up in Ramsons and Sunflower work.

Lee suggests focusing on one area, e.g. patron and user group data. A tool can be used to fake data (names, phone numbers, addresses - all PII data).
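As an illustration of the kind of faking Lee describes, here is a minimal sketch using only the Python standard library. It is not the tool under development. The record layout (`personal.firstName`, `personal.lastName`, `personal.email`, `personal.phone`, `personal.addresses`, `username`, `barcode`) is an assumption about how user records are shaped, and the name lists, the salt, and the `fake_user` function are made up for the example.

```python
import copy
import hashlib
import random

# Small illustrative pools; a real tool would use much larger ones.
FIRST_NAMES = ["Alex", "Sam", "Jordan", "Taylor", "Morgan"]
LAST_NAMES = ["Rivers", "Stone", "Field", "Brook", "Hill"]
CITIES = ["Springfield", "Riverton", "Lakeside"]


def fake_user(user: dict, salt: str = "demo-salt") -> dict:
    """Return a copy of a user record with PII replaced by generated values."""
    # Seed from salt + record id so the same record always gets the same fake
    # values, which keeps repeated builds of the data set stable.
    digest = hashlib.sha256((salt + str(user.get("id", ""))).encode("utf-8")).hexdigest()
    rng = random.Random(digest)

    result = copy.deepcopy(user)  # ids and patron group are left untouched
    personal = result.setdefault("personal", {})
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    personal["firstName"] = first
    personal["lastName"] = last
    personal["email"] = f"{first}.{last}.{digest[:8]}@example.org".lower()
    personal["phone"] = f"555-01{rng.randrange(100):02d}"
    personal["addresses"] = [
        {
            "addressLine1": f"{rng.randrange(1, 999)} Sample Street",
            "city": rng.choice(CITIES),
        }
    ]
    result["username"] = f"user_{digest[:8]}"
    result["barcode"] = "".join(str(rng.randrange(10)) for _ in range(10))
    return result
```

Keeping the record id and patron group unchanged means loans and other linked data still resolve to the faked user.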

Shelley has been pulled out for work on Stanford’s upgrade to Quesnelia. She will get back and pick up the work again. Shelley has put together a wiki page to gather requirements for anonymization: https://folio-org.atlassian.net/wiki/x/BQA4K

Shelley has pulled out all reference data. Tod mentioned that maybe Chicago could contribute a Python developer too.

Lee will send the job ad to Tod in the Slack channel.

Shelley would need to have the technical requirements written up. Will start with the document provided by Lee and his project on the POC.

 

 

Review timeline document

 

Other topics

Any items to discuss?

 Action items

@Charlotte Whitt will update the FOLIO Snapshot track in the Timeline doc
@Lee Braginsky will update the track for Scripts to anonymize data set.
@Lee Braginsky will publicize developers' requirements in the #folio-implementers slack channel.
@Charlotte Whitt will add to the agenda for next time 2/11/2025 the talk on anonymization of circulation rules and loan data
@Charlotte Whitt @Yogesh Kumar Create a wiki page where we document how the FOLIO Snapshot data is built, and other relevant information for test users.

 Decisions

Lee, Yogesh, and Shelley will keep the working group informed about the talks and progress on developing the anonymization tool. Lee, Yogesh, Shelley, and Noah meet every Monday.

We are meeting every other week, on odd-numbered weeks.