2023-01-25 Meeting notes: workflow management with Prefect

Date

2023-01-25

Housekeeping

Discussion items

  1. Using Prefect for managing workflows | Jenn Colt from Cornell


Minutes

  • order workflow
    • Prefect is a Python-based workflow engine
    • Cornell uses the cloud-based (hosted) version on the free tier
    • Owen (in chat): so Prefect can basically talk to multiple systems?
    • PROOF - local system for ordering
    • retrieve files, edit MARC for Lehigh; save to S3 (Amazon’s Simple Storage Service)
    • S3 is remote storage provided by Amazon; it has an API so you can read and write files
    • submit to Lehigh App
    • FOLIO data import via mod-copycat
  • PDA
    • load PDA records
    • Blacklight request button; links to Prefect
    • workflows can be kicked off in different ways; the request button is one of them
    • passes HRID of instance to be ordered
    • Brooks in chat: How are you all handling your PDA records for discovery? Are you loading them into FOLIO and harvesting them?
      • indexed to Blacklight
  • Cataloger makes metadata corrections in FOLIO
    • instance UUIDs placed in folder by digi pres staff
    • poll for files, export MARC and further steps; send emails
  • Setting holdings in OCLC
    • use LDP; core function of a lot of Cornell's automation
    • working on Coutts sync
  • Owen: Do you know how the ‘watching the s3 folder’ part works? e.g. is a time scheduled check? Or something more pro-active?
    • Jenn will find out
  • Data cleaning
    • Single record imports with OCLC prefixes; remove prefixes
    • reaches out to data export API
    • runs every day
  • LDP
    • reports relevant for every morning
    • template to keep working on
  • Lambda function
    • watches ArchivesSpace for changes
    • not developed in Prefect, but relates to others
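
The data cleaning step noted above (removing OCLC prefixes from single record imports) could look roughly like the following. This is a minimal sketch, not Cornell's script: the function name `strip_oclc_prefix` is hypothetical, and it assumes the prefixes in question are the common `ocm`, `ocn` and `on` forms, optionally wrapped in `(OCoLC)`. Fetching records through the data export API and writing them back to FOLIO are left out.

```python
import re

# Optional "(OCoLC)" wrapper, then an ocm/ocn/on prefix, then the digits.
# Assumption: these are the prefixes the notes refer to.
_OCLC_PREFIX = re.compile(r"^(\(OCoLC\))?\s*(?:ocm|ocn|on)(\d+)\s*$", re.IGNORECASE)


def strip_oclc_prefix(value: str) -> str:
    """Return the identifier without its ocm/ocn/on prefix.

    Values that do not match the pattern are returned unchanged, so the
    function is safe to run over identifiers that are already clean.
    """
    match = _OCLC_PREFIX.match(value.strip())
    if match is None:
        return value
    wrapper = match.group(1) or ""  # keep "(OCoLC)" if it was present
    return f"{wrapper}{match.group(2)}"


if __name__ == "__main__":
    for raw in ["(OCoLC)ocm12345678", "ocn123456789", "on1234567890", "(OCoLC)12345678"]:
        print(raw, "->", strip_oclc_prefix(raw))
```

Because unmatched values pass through untouched, a job like this can run every day over the same records without changing anything that is already clean.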


  • Kristin: how difficult was it to set this up?
    • Jenn is in Technical Services; it took a lot of work with people in IT
    • no help was needed from the hosting provider
    • work is done outside of FOLIO
    • not plug and play; everything has a Python script behind it
    • it is a lot of IT work
    • a frontend workflow engine would be nice to have
    • the free tier allows 3 users; moving to more users would cost a lot of money
  • Kristin: how much time is saved with Prefect for the ACQ workflow?
    • Jenn: a lot
    • Dung-Lan: Yes, I can see what we were able to do in Voyager with Coutts EDI ordering is replicated in the Prefect process Jenn just described.


  • Owen: what was the decision-making process for choosing Prefect?
    • the team has been skilling up in Python, so they wanted to use a Python framework
  • Owen: are the Prefect scripts something that can be shared and reused with modifications?
    • Jenn: really shareable; things were built in chunks
    • all chunks are reusable
  • Owen: on the FOLIO side, when interacting with FOLIO, does Prefect log in as a user?
    • Jenn: it has users
    • user for Prefect
    • user for discovery
    • possible improvement for FOLIO
  • Martina S: just to confirm: however well this works, is there still a need for a UI in FOLIO?
    • Jenn: yes, there is a Prefect UI, you can click and say: run this now
    • there is a connector from FOLIO
  • Martina S: it seems institutions have found different ways and external tools that fulfill workflow tasks
    • maybe we are talking about integrating them all into FOLIO rather than building something completely new in FOLIO
  • Owen:
    • FOLIO community could decide on an external tool to use to integrate with FOLIO
    • building FOLIO communities around different external tools
  • it would not be possible to force institutions to pay for specific tools
  • Brooks: implementing a language-agnostic scripting engine for FOLIO would be… a lift
    • I’m not generally a fan of the kitchen sink approach… some things rightfully belong outside FOLIO
  • Owen:
    • talked with GBV about how we integrate GOKb
    • Prefect would be a perfect tool for integrating with external KBs
    • if there was consolidation around some tools (like Prefect), it would make things easier
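
Jenn's point above that the scripts are built in reusable chunks can be illustrated with a small sketch. It is not Cornell's code: the step names (`normalize_title`, `tag_vendor`), the record fields and the `COUTTS` vendor code are hypothetical. It is written in plain Python so it runs without Prefect installed; in Prefect 2, functions like these would typically be wrapped with the `@task` and `@flow` decorators.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Step = Callable[[Record], Record]


def normalize_title(record: Record) -> Record:
    """Hypothetical chunk: trim whitespace around the title."""
    return {**record, "title": record.get("title", "").strip()}


def tag_vendor(vendor: str) -> Step:
    """Hypothetical chunk factory: build a step that stamps a vendor code."""
    def step(record: Record) -> Record:
        return {**record, "vendor": vendor}
    return step


def run_workflow(records: List[Record], steps: List[Step]) -> List[Record]:
    """Apply each step, in order, to every record and collect the results."""
    results = []
    for record in records:
        for step in steps:
            record = step(record)
        results.append(record)
    return results


# Two workflows that reuse the same chunk in different combinations.
order_steps = [normalize_title, tag_vendor("COUTTS")]
cleanup_steps = [normalize_title]

if __name__ == "__main__":
    incoming = [{"title": "  Moby Dick "}]
    print(run_workflow(incoming, order_steps))
    print(run_workflow(incoming, cleanup_steps))
```

Each step takes a record and returns a new one, so a chunk written for one workflow can be reused in another, or shared with another institution, without knowing what runs before or after it.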

Chat

18:09:08 From Owen Stephens to Everyone:
    Could you say what Prefect is?
18:09:42 From Owen Stephens to Everyone:
    OK - thanks. So it can basically talk to multiple systems?
18:09:50 From Owen Stephens to Everyone:
    OK - thank you
18:10:11 From Kristin Martin to Everyone:
    What is S3?
18:10:13 From Owen Stephens to Everyone:
    Sorry - CyberDuck?
18:10:14 From Brooks Travis to Everyone:
    Cyberduck FTW!
18:10:31 From Brooks Travis to Everyone:
    Amazon’s Simple Storage Service
18:10:49 From Owen Stephens to Everyone:
    S3 is remote storage which is provided by Amazon
18:11:02 From Owen Stephens to Everyone:
    It has an API so you can read/write files from it
18:11:21 From Brooks Travis to Everyone:
    A lot of other cloud service providers have S3-compatible storage services, as well.
18:11:47 From Brooks Travis to Everyone:
    (Except for Azure 😕)
18:14:32 From Brooks Travis to Everyone:
    How are you all handling your PDA records for discovery? Are you loading them into FOLIO and harvesting them?
18:17:48 From Owen Stephens to Everyone:
    Do you know how the ‘watching the s3 folder’ part works? e.g. is a time scheduled check? Or something more pro-active?
18:19:21 From Owen Stephens to Everyone:
    Thanks - just curious
18:30:51 From Dung-Lan Chen to Everyone:
    Yes, I can see what we were able to do in Voyager with Coutts EDI ordering is replicated in the Prefect process Jenn just described.
18:34:10 From Dennis Bridges to Everyone:
    Just thought I would confirm that “import orders in MARC format” is coming in Orchid 👍🏻
18:37:01 From Kristin Martin to Everyone:
    Right, but import invoices in MARC format is coming when?
18:43:25 From Brooks Travis to Everyone:
    Yeah, because implementing a language-agnostic scripting engine for FOLIO would be… a lift
18:45:03 From Brooks Travis to Everyone:
    I’m not generally a fan of the kitchen sink approach… some things rightfully belong outside FOLIO
18:45:31 From Dung-Lan Chen to Everyone:
    Jenn, I am wondering if you and your Acquisitions folks who handle the Coutts/OASIS ordering process using Prefect and Lehigh App, etc. can do a presentation about the acquisitions part in one of the upcoming ACQ SIG meetings?
18:45:44 From Jenn Colt to Everyone:
    I think so
18:46:24 From Dung-Lan Chen to Everyone:
    Thank you!!
18:50:56 From Brooks Travis to Everyone:
    Providing an external event notification service would be very helpful…
18:50:57 From Laura Daniels to Everyone:
    it's great having you join us, Jenn
18:51:10 From Dung-Lan Chen to Everyone:
    Yes!!
18:51:14 From Brooks Travis to Everyone:
    In FOLIO
18:51:18 From Jenn Colt to Everyone:
    Yeah brooks a list like that -events, auth management, those things

Transcript

Future topics

  • Topic proposal by Owen Stephens for October:
    • Use of shortcut keys and macros for more effective cross-app working - it would also be good to have UX and Stripes/dev knowledge for this discussion I think. I know @Laura (she/they) uses macros so might have insights into the potential for cross-app working
    • Potential for external 'workflow' solutions for cross-app interactions
      • I think 'workflow' is a dangerous term here - in this context it's more about automation than user workflows, although I think there is overlap
      • I was particularly struck by the solution in production at TAMU (Jeremy Huff and Sebastian Hammer presented, the recording is at https://prod-zoom-recordings-openlibraryfoundation-org.s3.amazonaws.com/50dc6c87-3912-43fa-8287-56ec73b12bbb%2Fshared_screen_with_speaker_view%28CC%29.mp4 starting at 3 hrs, 14 min) - I think getting someone from TAMU to talk about how this is used would be v interesting (tick)
      • There was also a presentation on the use of a tool called Airflow at Stanford for "bibliographic workflow" but I've not watched that yet so not 100% sure if it is completely applicable - I think the core use case there was systems migration but it may go beyond that
      • Jenn Colt on using Prefect (tick)
      • does not need to be workflow across apps
  • UX/UI and implementers topics
    • should be Wednesdays
  • Comprehensive look at where data is copied and stored as opposed to live data | how it is represented
  • Date filters and how they work in different apps

Attendees

Present | Name                    | Home Organization
x       | Brooks Travis           | EBSCO
x       | Charlotte Whitt         | Index Data
x       | Dennis Bridges          | EBSCO
x       | Dung-Lan Chen           | Skidmore College
        | Erin Nettifee           | Duke
        | Gill Osguthorpe         | UX/UI Designer - K-Int
        | Heather McMillan Thoele | TAMU
        | Ian Ibbotson            | Developer Lead - K-Int
        | Jag Goraya              | K-Int
x       | Jana Freytag            | VZG, Göttingen
x       | Jenn Colt               | Cornell
        | Khalilah Gambrell       | EBSCO
        | Kimberly Pamplin        | TAMU
        | Kirstin Kemner-Heek     | VZG, Göttingen
x       | Kristin Martin          | Chicago
x       | Laura Daniels           | Cornell
        | Lloyd Chittenden        | Marmot Library Network
        | Marc Johnson            | K-Int
x       | Martina Schildt         | VZG, Göttingen
x       | Martina Tumulla         | hbz, Cologne
x       | Maura Byrne             | Chicago
        | Mike Gorrell            | Index Data
x       | Owen Stephens           | Product Owner - Owen Stephens Consulting
        | Patty Wanninger         | Product owner Users app
        | Rachel A Sneed          | TAMU
        | Sara Colglazier         | Five Colleges / Mount Holyoke College Library
x       | Susanne Schuster        | BSZ Konstanz
        | John Coburn             | EBSCO
        | Zak Burke               | EBSCO
        | Daniel Huang            | Lehigh
        | Maccabee Levine         | Lehigh
x       | Robert Scheier          | Holy Cross

Action items

  •