Time

Item

Notes

  1. Letter to Risk Office at Michigan State University Libraries

  1. Follow up on activities since last meeting:

    1. Update on - Golden copy for Bugfest environment.

    2. Update on - Work on a small data set for FOLIO Snapshot.

    3. Update on POC findings - write robust data anonymization scripts

Continue talk/input on Autumn Faulkner's letter to the Risk Office at Michigan State University Libraries.

The topic was on the agenda for today’s meeting with the OLF Officers. The plan was to attend and present the purpose of the letter and our draft document.

We missed it - Autumn Faulkner is to forward the letter to Simeon Warner and then we will have the conversation on Slack instead.

Kristin sent Folio and OLF images for the letterhead.

Autumn Faulkner- We need to finalize the Google Doc with the revised version of the letter to be brought to the Folio Officers meeting on October 15th. It would be wonderful if you could attend the meeting.

Autumn will get a revised version of the letter.

A couple other questions from my Dean re: technical processes.

Michigan State University Libraries will aim for providing data from Inventory and anonymized data from other domains.

Autumn Faulkner asks if Yogesh and Lee can provide more details about the scrambling tools.

Discussion:

The team spent two days on the POC, using the JavaFaker framework. They replaced all the PII pieces in patrons and users and scrambled the vendors and loan history. The tool is a work in progress; we need community developer help to write a tool that can live in the folio-org repository for all libraries to use.

Tod raised a question: How can we get support from the community to work on the anonymization project? Can the community council help? Kristin is to check with the council and may present at the Oct 28th meeting.

a. Golden Copy / Yogesh Kumar

The environment was created by the Kitfox team and updated by QA. The Kitfox team is now working on taking a backup.

Kristin and Charlotte will review this environment and suggest any data corrections.

One suggestion made is to remove duplicate records.

Link to wiki - fixing the issues listed here /wiki/spaces/DQA/pages/203685917

Some feedback has come in from POs, but not enough.

Jira tickets have been written up for the Kitfox team - Sprint 199. Jira ticket: BF-752 (Epic).

This environment will then be the copy we use for Bugfest Ramsons. Expect to have this done by end of September.

b. Snapshot environment / Charlotte Whitt

Update: Still waiting for the environment to be built. Charlotte has reached out to ID DevOps re. adding data to the monthly build environment: https://folio-quesnelia.dev.folio.org/ (the link is not up and running yet)

Index Data’s DevOps team is working on this as we speak:

FOLIO-4071: Create Quesnelia reference environment
  • The environment will be in Quesnelia

  • Data will not be overwritten.

  • Charlotte will provide access to Autumn, Kristin and SMEs helping out.

Charlotte to fix the 36 instances with Instance source = FOLIO - and will do this as soon as the environment is ready to be built. A new DevOps started at Index Data yesterday, so hopefully this work will be picked up.

Recruit members from MM-SIG - extra eyes, review the records updated in FOLIO Instance = FOLIO.

Status:

Autumn has a music cataloger background. Will find records in OCLC; will aim for next week. Provided sample data for music records and serials records with multiple holdings and multiple items. Holdings statements to be added.

These records are loaded to this group's shared drive.

Also include Serial records, with multiple holdings each with multiple items. Add holdings statements.

Write up Data Import Job profiles - Kristin Martin?

Data Import Job Profile which can import ~100 bibs in MARC 21 and create instance, holdings, item (corresponding to the locations we have set up). MM-SIG eyes on the 100 bibs, to check that these records have the right mix of misc. types to cover the basics, incl. bound-with.

/wiki/spaces/DQA/pages/203685917

Will ask if MM-SIG member has experience with MFHD.

Stanford has this. Charlotte will check in with Darsi and Alexis about MFHD-formatted holdings data - Alissa Hafele.

Load MARC Authority data - this task we will come back to after WOLFcon. Are there any libraries that can contribute this to Snapshot?

Then later we can move on to Order/Order lines data. We will ask the Acquisition SIG (Kristin Martin will take the lead on this) and the ERM-SIG to review the entered data.

c. Anonymization script

Update: from Lee Braginsky

Right now we have a POC. Lee is requesting 2 developers from the community to help write a Java/SQL and FOLIO schema tool to anonymize the data.

We have successfully completed the POC on dataset anonymization. I can present results at the next meeting. Next steps - need 2 Java developers, with SQL and folio schema knowledge to help develop the tool.

Status on writing robust data anonymization scripts / Lee Braginsky. Did get a university to provide data, which will be anonymized/scrambled

  • Replace PII with randomly generated data

  • Scramble loan history

  • Scramble orders, invoice amounts, fund codes

  • Replace vendor names with randomized names

  • Strip out staff notes with initials, etc.

  • One set of data for the general environment, and perhaps a second sample set for the ECS environment

    • Get this from a consortium!
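For discussion, a minimal sketch of three of the steps listed above (replace PII, scramble loan history, replace vendor names). This is not the POC code: the POC used JavaFaker, while this sketch swaps in a small hard-coded name list and the Java standard library only, so that it runs without extra dependencies. The `User` and `Loan` records, their fields, and the method names are hypothetical simplifications, not the FOLIO schema.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Random;

public class AnonymizeSketch {

    // Hypothetical, simplified records; real FOLIO user and loan records have many more fields.
    record User(String id, String firstName, String lastName, String email) {}
    record Loan(String loanId, String userId, String itemId) {}

    // Stand-in for a fake-data library such as JavaFaker (assumption: any name source works here).
    private static final String[] FIRST = {"Alex", "Sam", "Jordan", "Taylor", "Morgan"};
    private static final String[] LAST = {"Reed", "Stone", "Hill", "Brook", "Lane"};

    // Replace PII with randomly generated values; the id is kept so references still resolve.
    static User replacePii(User u, Random rnd) {
        String first = FIRST[rnd.nextInt(FIRST.length)];
        String last = LAST[rnd.nextInt(LAST.length)];
        String email = (first + "." + last + rnd.nextInt(10_000) + "@example.org")
                .toLowerCase(Locale.ROOT);
        return new User(u.id(), first, last, email);
    }

    // Scramble loan history: shuffle which user each loan points to,
    // keeping the number of loans and the overall set of user ids unchanged.
    static List<Loan> scrambleLoans(List<Loan> loans, Random rnd) {
        List<String> userIds = new ArrayList<>();
        for (Loan l : loans) {
            userIds.add(l.userId());
        }
        Collections.shuffle(userIds, rnd);
        List<Loan> out = new ArrayList<>();
        for (int i = 0; i < loans.size(); i++) {
            Loan l = loans.get(i);
            out.add(new Loan(l.loanId(), userIds.get(i), l.itemId()));
        }
        return out;
    }

    // Replace a vendor name with a generic numbered placeholder.
    static String replaceVendorName(int index) {
        return "Vendor " + String.format("%04d", index);
    }

    public static void main(String[] args) {
        Random rnd = new Random(42); // fixed seed so repeated runs give the same output
        User original = new User("u1", "Jane", "Doe", "jane.doe@university.edu");
        System.out.println(replacePii(original, rnd));

        List<Loan> loans = List.of(
                new Loan("l1", "u1", "i1"),
                new Loan("l2", "u2", "i2"),
                new Loan("l3", "u3", "i3"));
        System.out.println(scrambleLoans(loans, rnd));
        System.out.println(replaceVendorName(7));
    }
}
```

Keeping record ids stable while replacing the values around them is one possible design choice; it lets links between users, loans and items keep working after anonymization. Whether shuffling alone is sufficient for loan history is an open question for the tool developers.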

  1. Review timeline document

  • Draft - timeline document - https://docs.google.com/spreadsheets/d/18jLjWHO-sVTAGq7c6HUyW2COxvFvY76rTulMyg-7xuo/edit?gid=0#gid=0

We will split it up into the three tracks listed above.

Yogesh will update the Golden copy (Bugfest) section.

Charlotte will update the FOLIO Snapshot section (stuck until we get the environment up and running)

Lee will update the timeline doc for his work on anonymization scripts.
