2023-01-25 Meeting notes: workflow management with Prefect

Date

Jan 25, 2023

Housekeeping

  • Convener and notes: @Martina Schildt

  • Next meeting: Jan 30, 2023

Discussion items

  1. Using Prefect for managing workflows | @Jenn Colt from Cornell

 

Minutes

  • order workflow

    • Prefect is a Python-based workflow engine

    • uses the cloud-based, hosted version on the free tier

    • Owen: so Prefect can basically talk to multiple systems?

    • PROOF - local system for ordering

    • retrieve files, edit MARC for Lehigh; save to S3 (Amazon’s Simple Storage Service)

    • S3 is remote storage provided by Amazon; it has an API, so you can read files from it and write files to it

    • submit to Lehigh App

    • FOLIO data import via mod-copycat

  • PDA

    • load PDA records

    • Blacklight request button; links to Prefect

    • workflows can be kicked off in different ways; the request button is one of them

    • passes HRID of instance to be ordered

    • Brooks in chat: How are you all handling your PDA records for discovery? Are you loading them into FOLIO and harvesting them?

      • indexed to Blacklight

  • Cataloger makes metadata corrections in FOLIO

    • instance UUIDs placed in folder by digi pres staff

    • poll for files, export MARC and further steps; send emails

  • Setting holdings in OCLC

    • use LDP; core function of a lot of Cornell's automation

    • working on Coutts sync

  • Owen: Do you know how the ‘watching the s3 folder’ part works? E.g. is it a time-scheduled check, or something more proactive?

    • Jenn will find out

  • Data cleaning

    • Single record imports with OCLC prefixes; remove prefixes

    • reaches out to data export API

    • runs every day

  • LDP

    • reports needed every morning

    • template to keep working on

  • Lambda function

    • watches ArchivesSpace for changes

    • not developed in Prefect, but relates to others
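The data cleaning item above (removing OCLC prefixes from single record imports) could look roughly like the following. This is a minimal sketch, not Cornell's script: it assumes the prefixes in question are the OCLC control number prefixes ocm, ocn and on, optionally preceded by the "(OCoLC)" marker, and the name strip_oclc_prefix is hypothetical.

```python
import re

# Assumption: the prefixes to remove are "ocm", "ocn" and "on", directly
# before the digits, optionally preceded by the "(OCoLC)" marker.
_OCLC_PREFIX = re.compile(r"^(\(OCoLC\))?\s*(?:ocm|ocn|on)(?=\d)", re.IGNORECASE)


def strip_oclc_prefix(identifier: str) -> str:
    """Remove an ocm/ocn/on prefix; keep the "(OCoLC)" marker if present."""
    cleaned = identifier.strip()
    # group(1) is the "(OCoLC)" marker, or None when the marker is absent.
    return _OCLC_PREFIX.sub(lambda match: match.group(1) or "", cleaned)


if __name__ == "__main__":
    for raw in ["(OCoLC)ocm12345678", "ocn123456789", "(OCoLC)12345678"]:
        print(raw, "->", strip_oclc_prefix(raw))
```

Identifiers that carry no prefix are returned unchanged, so the function can be run over all records every day without harm.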

 

  • Kristin: how difficult was it to set this up?

    • Jenn is in Technical Services; a lot of work with people in IT

    • no help needed from the hosting provider

    • work done outside of FOLIO

    • not plug and play; everything has a Python script behind it

    • it is a lot of IT work

    • frontend workflow engine would be nice to have

    • the free tier allows 3 users; moving to more users would cost a lot of money

  • Kristin: how much time is saved with Prefect for the ACQ workflow?

    • Jenn: a lot

    • Dung-Lan: Yes, I can see what we were able to do in Voyager with Coutts EDI ordering is replicated in the Prefect process Jenn just described.

 

  • Owen: what was the decision-making process for choosing Prefect?

    • team has been skilling up in Python, so wanted to use a Python framework

  • Owen: are the Prefect scripts something that can be shared and reused with modifications?

    • Jenn: really shareable; built things in chunks

    • all chunks are reusable

  • Owen: on the FOLIO side, when interacting with FOLIO, does Prefect log in as a user?

    • Jenn: it has users

    • user for Prefect

    • user for discovery

    • possible improvement for FOLIO

  • Martina S: just to confirm: however well this works, is there still a need for a UI in FOLIO?

    • Jenn: yes, there is a Prefect UI, you can click and say: run this now

    • there is a connector from FOLIO

  • Martina S: it seems institutions have found different ways and external tools that fulfill workflow tasks

    • maybe we are talking about integrating them all into FOLIO rather than building something completely new in FOLIO

  • Owen:

    • FOLIO community could decide on an external tool to use to integrate with FOLIO

    • building FOLIO communities around different external tools

  • it would not be possible to force institutions to pay for specific tools

  • Brooks: implementing a language-agnostic scripting engine for FOLIO would be… a lift

    • I’m not generally a fan of the kitchen sink approach… some things rightfully belong outside FOLIO

  • Owen:

    • talked with GBV about how we integrate GOKb

    • Prefect would be a perfect tool for integrating with external KBs

    • if there was consolidation around some tools (like Prefect), it would make things easier
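Jenn's remark that the scripts were built in reusable chunks, together with the order workflow steps in the minutes (retrieve files, edit MARC, save to storage), can be illustrated with a small sketch. It uses plain Python functions so it runs without any dependencies; in Prefect 2 such functions would typically carry the `@task` and `@flow` decorators from the `prefect` package. All function names, file names and the dict standing in for remote storage are hypothetical.

```python
from functools import reduce
from typing import Callable, Dict, Iterable, List, Optional

Records = List[str]
Chunk = Callable[[Records], Records]


def retrieve_files(_: Records) -> Records:
    """Stand-in for fetching order files from a vendor or local system."""
    return ["order-001.mrc", "order-002.mrc"]


def edit_marc(records: Records) -> Records:
    """Stand-in for MARC edits; here it only tags each record name."""
    return [f"{name}#edited" for name in records]


def make_saver(store: Dict[str, bytes]) -> Chunk:
    """Build a chunk that writes to `store`, a dict standing in for remote storage."""
    def save(records: Records) -> Records:
        for name in records:
            store[name] = b""  # real code would upload the file contents
        return records
    return save


def run_flow(chunks: Iterable[Chunk], start: Optional[Records] = None) -> Records:
    """Run the chunks in order, feeding each one the previous result."""
    return reduce(lambda records, chunk: chunk(records), chunks, list(start or []))


if __name__ == "__main__":
    storage: Dict[str, bytes] = {}
    # Order workflow: all three chunks in sequence.
    print(run_flow([retrieve_files, edit_marc, make_saver(storage)]))
    # A second workflow reuses two of the chunks on records it already has.
    print(run_flow([edit_marc, make_saver(storage)], ["fix-001.mrc"]))
    print(sorted(storage))
```

Because each chunk only takes and returns a list of records, another institution could swap in its own retrieval or editing step and keep the rest.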

Chat

18:09:08 From Owen Stephens to Everyone:
    Could you say what Prefect is?
18:09:42 From Owen Stephens to Everyone:
    OK - thanks. So it can basically talk to multiple systems?
18:09:50 From Owen Stephens to Everyone:
    OK - thank you
18:10:11 From Kristin Martin to Everyone:
    What is S3?
18:10:13 From Owen Stephens to Everyone:
    Sorry - CyberDuck?
18:10:14 From Brooks Travis to Everyone:
    Cyberduck FTW!
18:10:31 From Brooks Travis to Everyone:
    Amazon’s Simple Storage Service
18:10:49 From Owen Stephens to Everyone:
    S3 is remote storage which is provided by Amazon
18:11:02 From Owen Stephens to Everyone:
    It has an API so you can read/write files from it
18:11:21 From Brooks Travis to Everyone:
    A lot of other cloud service providers have S3-compatible storage services, as well.
18:11:47 From Brooks Travis to Everyone:
    (Except for Azure 😕)
18:14:32 From Brooks Travis to Everyone:
    How are you all handling your PDA records for discovery? Are you loading them into FOLIO and harvesting them?
18:17:48 From Owen Stephens to Everyone:
    Do you know how the ‘watching the s3 folder’ part works? e.g. is a time scheduled check? Or something more pro-active?
18:19:21 From Owen Stephens to Everyone:
    Thanks - just curious
18:30:51 From Dung-Lan Chen to Everyone:
    Yes, I can see what we were able to do in Voyager with Coutts EDI ordering is replicated in the Prefect process Jenn just described.
18:34:10 From Dennis Bridges to Everyone:
    Just thought I would confirm that “import orders in MARC format” is coming in Orchid 👍🏻
18:37:01 From Kristin Martin to Everyone:
    Right, but import invoices in MARC format is coming when?
18:43:25 From Brooks Travis to Everyone:
    Yeah, because implementing a language-agnostic scripting engine for FOLIO would be… a lift
18:45:03 From Brooks Travis to Everyone:
    I’m not generally a fan of the kitchen sink approach… some things rightfully belong outside FOLIO
18:45:31 From Dung-Lan Chen to Everyone:
    Jenn, I am wondering if you and your Acquisitions folks who handle the Coutts/OASIS ordering process using Prefect and Lehigh App, etc. can do a presentation about the acquisitions part in one of the upcoming ACQ SIG meetings?
18:45:44 From Jenn Colt to Everyone:
    I think so
18:46:24 From Dung-Lan Chen to Everyone:
    Thank you!!
18:50:56 From Brooks Travis to Everyone:
    Providing an external event notification service would be very helpful…
18:50:57 From Laura Daniels to Everyone:
    it's great having you join us, Jenn
18:51:10 From Dung-Lan Chen to Everyone:
    Yes!!
18:51:14 From Brooks Travis to Everyone:
    In FOLIO
18:51:18 From Jenn Colt to Everyone:
    Yeah brooks a list like that -events, auth management, those things
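Owen's question in the chat contrasts a time-scheduled check with something more proactive; the notes do not record which one Cornell uses. To make the first option concrete, here is a minimal sketch of a time-scheduled poll. It assumes a local folder in place of an S3 bucket so that it runs with the standard library only, and the names poll_once and watch are hypothetical. The more proactive alternative would be event-driven, for example S3 event notifications that fire when an object is created.

```python
import tempfile
import time
from pathlib import Path
from typing import Callable, List, Optional, Set


def poll_once(folder: Path, seen: Set[str]) -> List[str]:
    """Return names of files that were not present at any earlier poll."""
    current = {path.name for path in folder.iterdir() if path.is_file()}
    new_files = sorted(current - seen)
    seen.update(new_files)
    return new_files


def watch(
    folder: Path,
    handle: Callable[[Path], None],
    interval_seconds: float = 60.0,
    max_polls: Optional[int] = None,
) -> None:
    """Check the folder every `interval_seconds` and hand new files to `handle`."""
    seen: Set[str] = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for name in poll_once(folder, seen):
            handle(folder / name)
        polls += 1
        # Skip the final sleep when a poll limit has been reached.
        if max_polls is None or polls < max_polls:
            time.sleep(interval_seconds)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        (folder / "uuids-1.txt").write_text("example\n")
        watch(folder, lambda p: print("new file:", p.name),
              interval_seconds=0, max_polls=2)
```

A poll like this is simple but only notices files at the next interval, which is the trade-off behind the question.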

Transcript

Future topics

  • Topic proposal by @Owen Stephens for October:

    • Use of shortcut keys and macros for more effective cross-app working - it would also be good to have UX and Stripes/dev knowledge for this discussion I think. I know @Laura (she/they) uses macros so might have insights into the potential for cross-app working

    • Potential for external 'workflow' solutions for cross-app interactions

      • I think 'workflow' is a dangerous term here - in this context it's more about automation than user workflows, although I think there is overlap

      • I was particularly struck by the solution in production at TAMU (Jeremy Huff and Sebastian Hammer presented, the recording is at https://prod-zoom-recordings-openlibraryfoundation-org.s3.amazonaws.com/50dc6c87-3912-43fa-8287-56ec73b12bbb%2Fshared_screen_with_speaker_view%28CC%29.mp4 starting at 3 hrs, 14 min) - I think getting someone from TAMU to talk about how this is used would be v interesting

      • There was also a presentation on the use of a tool called Airflow at Stanford for "bibliographic workflow" but I've not watched that yet so not 100% sure if it is completely applicable - I think the core use case there was systems migration but it may go beyond that

      • Jenn Colt on using Prefect

      • does not need to be workflow across apps

  • UX/UI and implementers topics

    • should be Wednesdays

  • Comprehensive look at where data is copied and stored as opposed to live data | how it is represented

  • Date filters and how they work in different apps

Attendees

Present | Name                    | Home Organization
x       | Brooks Travis           | EBSCO
x       | Charlotte Whitt         | Index Data
x       | Dennis Bridges          | EBSCO
x       | Dung-Lan Chen           | Skidmore College
        | Erin Nettifee           | Duke
        | Gill Osguthorpe         | UX/UI Designer - K-Int
        | Heather McMillan Thoele | TAMU
        | Ian Ibbotson            | Developer Lead - K-Int
        | Jag Goraya              | K-Int
x       | Jana Freytag            | VZG, Göttingen
x       | Jenn Colt               | Cornell
        | Khalilah Gambrell       | EBSCO
        | Kimberly Pamplin        | TAMU
        | Kirstin Kemner-Heek     | VZG, Göttingen
x       | Kristin Martin          | Chicago
x       | Laura Daniels           | Cornell
        | Lloyd Chittenden        | Marmot Library Network
        | Marc Johnson            | K-Int
x       | Martina Schildt         | VZG, Göttingen
x       | Martina Tumulla         | hbz, Cologne
x       | Maura Byrne             | Chicago
        | Mike Gorrell            | Index Data
x       | Owen Stephens           | Product Owner - Owen Stephens Consulting
        | Patty Wanninger         | Product owner Users app
        | Rachel A Sneed          | TAMU
        | Sara Colglazier         | Five Colleges / Mount Holyoke College Library
x       | Susanne Schuster        | BSZ Konstanz
        | John Coburn             | EBSCO
        | Zak Burke               | EBSCO
        | Daniel Huang            | Lehigh
        | Maccabee Levine         | Lehigh
x       | Robert Scheier          | Holy Cross

Action items