2024-10-15 Better Sample Data Meeting notes
Date
Oct 15, 2024
Participants
@Yogesh Kumar, @Lee Braginsky, @Charlotte Whitt, @Kristin Martin (regrets), @Autumn Faulkner (NA), @Alissa Hafele, @Tod Olson
Goals
Follow up on the status of discussion topics and tasks
Discussion topics
Time | Item | Notes |
---|---|---|
|
| The topic was on the agenda for today's meeting with the OLF Officers. The plan was to attend and present the purpose of the letter and our draft document. We missed it. @Autumn Faulkner is to forward the letter to Simeon Warner, and we will then have the conversation on Slack instead. Kristin sent FOLIO and OLF images for the letterhead. @Autumn Faulkner: We need to finalize the Google Doc with the revised version of the letter to be brought to the Discussion: Tod has had a conversation with Shawn about contributing MARC records with holdings information in 952 (see also the Slack channel). Alissa mentioned that Stanford can also contribute data with holdings data.
a. Golden Copy / Yogesh Kumar: The environment was created by the Kitfox team and updated by QA. The Kitfox team is now working on fixing the issues listed here: https://folio-org.atlassian.net/wiki/spaces/DQA/pages/203685917 All work is now done, except removing the duplicate records. Removing them would also delete the associated items and holdings. Folijet (Ryan Taylor) is to discuss with the team what to do. There is no simple way to identify the duplicates. Charlotte can check with Charles Ledvina how he would recommend identifying duplicates.
b. Snapshot environment. Update from Charlotte Whitt: Still waiting for the monthly build environment, https://folio-quesnelia.dev.folio.org/ , to be built. A new DevOps engineer started at Index Data yesterday, so hopefully this work will be picked up. Recruit members from MM-SIG as extra eyes to review the records updated in FOLIO Instance = FOLIO. Status: Autumn has provided sample data for music records and for serials records with multiple holdings and multiple items. Holdings statements are still to be added. These records are loaded to this group's shared drive. Write up Data Import Job Profiles - Kristin Martin? We need a Data Import Job Profile that can import ~100 bibs in MARC 21 and create instance, holdings, and item records (corresponding to the locations we have set up). MM-SIG eyes on the 100 bibs, to check that these records have the right mix of miscellaneous types to cover the basics, including bound-with. We will ask whether any MM-SIG member has experience with MFHD. Stanford doesn't have MFHD-formatted holdings data in the current export workflow. Alissa Hafele mentioned that Stanford could probably provide these data, maybe ~25-50 examples. Load MARC Authority data: are there any libraries that can contribute this to Snapshot? Alissa will check with Darsi; Stanford has authority data but has not loaded it yet. Later we can move on to Order/Order lines data. We will ask the Acquisition SIG (Kristin Martin will take the lead on this) and the ERM-SIG to review the entered data.
c. Anonymization script. Update from @Lee Braginsky: Right now we have a POC. Lee is requesting 2 developers from the community to help write a Java/SQL and FOLIO schema tool to anonymize the data. Lee: "We have successfully completed the POC on dataset anonymization. I can present results at the next meeting." Yogesh will check with Lee later today. Alissa asked whether there is more work to happen beyond the POC. Yogesh said that this is 90% done. Yogesh/Lee will post the link to the document on anonymization. When Stanford is ready to deliver data, Yogesh and Lee can set up a meeting. Alissa will check with her colleagues.
|
|
|
We will split it up into the three tracks listed above. Yogesh will update the Golden Copy (Bugfest) section. Charlotte will update the FOLIO Snapshot section (stuck until we get the environment up and running). Lee will update the timeline doc for his work on anonymization scripts. |
|
| Should we continue to meet every week? Tod suggests we leave the weekly meetings in the calendar as is, and then check in on Friday to see whether to meet the following Tuesday.
|