2023-01-25 Meeting notes: workflow management with Prefect
Date
2023-01-25
Housekeeping
- Convener and notes: Martina Schildt
- Next meeting:
Discussion items
- Using Prefect for managing workflows | Jenn Colt from Cornell
Minutes
- order workflow
- Prefect is a Python-based workflow engine
- Cornell uses the cloud-based (hosted) version on the free tier
- Owen: can Prefect basically talk to multiple systems?
- PROOF - local system for ordering
- retrieve files, edit MARC for Lehigh; save to S3 (Amazon’s Simple Storage Service)
- S3 is remote storage provided by Amazon; it has an API for reading and writing files
- submit to Lehigh App
- FOLIO data import via mod-copycat
- PDA
- load PDA records
- Blacklight request button; links to Prefect
- workflows can be kicked off in different ways; this is one of them
- passes HRID of instance to be ordered
- Brooks in chat: How are you all handling your PDA records for discovery? Are you loading them into FOLIO and harvesting them?
- indexed to Blacklight
- Cataloger makes metadata corrections in FOLIO
- instance UUIDs placed in folder by digi pres staff
- poll for files, export MARC and further steps; send emails
- Setting holdings in OCLC
- use LDP; core function of a lot of Cornell's automation
- working on Coutts sync
- Owen: Do you know how the ‘watching the S3 folder’ part works? E.g. is it a time-scheduled check, or something more proactive?
- Jenn will find out
- Data cleaning
- Single record imports with OCLC prefixes; remove prefixes
- reaches out to data export API
- runs every day
- LDP
- reports that are relevant every morning
- template to keep working on
- Lambda function
- watches ArchivesSpace for changes
- not developed in Prefect, but relates to others
- Kristin: how difficult was it to set this up?
- Jenn is in Technical Services; a lot of work with people in IT
- no help needed from the hosting provider
- work done outside of FOLIO
- not plug and play; everything has a Python script behind it
- it is a lot of IT work
- frontend workflow engine would be nice to have
- the free tier allows 3 users; moving to more users would cost a lot of money
- Kristin: how much time is saved with Prefect for the ACQ workflow?
- Jenn: a lot
- Dung-Lan: Yes, I can see what we were able to do in Voyager with Coutts EDI ordering is replicated in the Prefect process Jenn just described.
- Owen: what was the decision-making process for choosing Prefect?
- team has been skilling up in Python, so wanted to use a Python framework
- Owen: are the Prefect scripts something that can be shared? | re-use with modifications?
- Jenn: really shareable; built things in chunks
- all chunks are reusable
- Owen: on the FOLIO side, when interacting with FOLIO, does Prefect log in as a user?
- Jenn: it has users
- user for Prefect
- user for discovery
- possible improvement for FOLIO
- Martina S: just to confirm: however well this works, there is still a need for a UI in FOLIO?
- Jenn: yes, there is a Prefect UI, you can click and say: run this now
- there is a connector from FOLIO
- Martina S: it seems institutions have found different ways and external tools that fulfill workflow tasks
- maybe we are talking about integrating them all into FOLIO rather than building something completely new in FOLIO
- Owen:
- FOLIO community could decide on an external tool to use to integrate with FOLIO
- building FOLIO communities around different external tools
- would not be possible to force institutions into payment for specific tools
- Brooks: implementing a language-agnostic scripting engine for FOLIO would be… a lift
- I’m not generally a fan of the kitchen sink approach… some things rightfully belong outside FOLIO
- Owen:
- talked with GBV about how we integrate GOKb
- Prefect would be a perfect tool for integrating with external KBs
- if there was consolidation around some tools (like Prefect), it would make things easier
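To illustrate the "everything has a Python script behind it" and "built things in chunks" points, here is a minimal sketch of how the order workflow steps in the minutes (retrieve files, edit records, save to storage) might be laid out as Prefect tasks inside a flow. It is not Cornell's code. The function names, the dict-based records, and the `orders/<id>.json` key layout are all hypothetical, and real MARC editing, S3 writes, submission to the order app, and the FOLIO import are replaced by stand-ins. If Prefect is not installed, the decorators fall back to no-ops so the sketch still runs.

```python
try:
    # Prefect 2.x or later provides these decorators.
    from prefect import flow, task
except ImportError:
    # Fallback so the sketch runs without Prefect: decorators become no-ops.
    def task(fn):
        return fn

    def flow(fn):
        return fn


@task
def retrieve_files(source):
    """Stand-in for fetching order files; `source` is already a list of dicts."""
    return [dict(record) for record in source]


@task
def edit_record(record):
    """Stand-in for the MARC edit step: tidy the title and flag the record."""
    edited = dict(record)
    edited["title"] = edited["title"].strip()
    edited["edited"] = True
    return edited


@task
def save_record(record):
    """Stand-in for the S3 write: return the key and body that would be stored."""
    key = f"orders/{record['id']}.json"  # hypothetical key layout
    return key, record


@flow
def order_workflow(source):
    """Chain the reusable chunks; each chunk could be shared between flows."""
    stored = {}
    for record in retrieve_files(source):
        key, body = save_record(edit_record(record))
        stored[key] = body
    # Submitting to the order app and the FOLIO import would follow here (omitted).
    return stored


if __name__ == "__main__":
    print(order_workflow([{"id": "a1", "title": "  Example title  "}]))
```

Keeping each step as its own small function is what makes the chunks reusable: a different flow (for example one started from a discovery request button) could call the same `edit_record` or `save_record` steps with different inputs.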
Chat
18:09:08 From Owen Stephens to Everyone:
Could you say what Prefect is?
18:09:42 From Owen Stephens to Everyone:
OK - thanks. So it can basically talk to multiple systems?
18:09:50 From Owen Stephens to Everyone:
OK - thank you
18:10:11 From Kristin Martin to Everyone:
What is S3?
18:10:13 From Owen Stephens to Everyone:
Sorry - CyberDuck?
18:10:14 From Brooks Travis to Everyone:
Cyberduck FTW!
18:10:31 From Brooks Travis to Everyone:
Amazon’s Simple Storage Service
18:10:49 From Owen Stephens to Everyone:
S3 is remote storage which is provided by Amazon
18:11:02 From Owen Stephens to Everyone:
It has an API so you can read/write files from it
18:11:21 From Brooks Travis to Everyone:
A lot of other cloud service providers have S3-compatible storage services, as well.
18:11:47 From Brooks Travis to Everyone:
(Except for Azure 😕)
18:14:32 From Brooks Travis to Everyone:
How are you all handling your PDA records for discovery? Are you loading them into FOLIO and harvesting them?
18:17:48 From Owen Stephens to Everyone:
Do you know how the ‘watching the s3 folder’ part works? e.g. is a time scheduled check? Or something more pro-active?
18:19:21 From Owen Stephens to Everyone:
Thanks - just curious
18:30:51 From Dung-Lan Chen to Everyone:
Yes, I can see what we were able to do in Voyager with Coutts EDI ordering is replicated in the Prefect process Jenn just described.
18:34:10 From Dennis Bridges to Everyone:
Just thought I would confirm that “import orders in MARC format” is coming in Orchid 👍🏻
18:37:01 From Kristin Martin to Everyone:
Right, but import invoices in MARC format is coming when?
18:43:25 From Brooks Travis to Everyone:
Yeah, because implementing a language-agnostic scripting engine for FOLIO would be… a lift
18:45:03 From Brooks Travis to Everyone:
I’m not generally a fan of the kitchen sink approach… some things rightfully belong outside FOLIO
18:45:31 From Dung-Lan Chen to Everyone:
Jenn, I am wondering if you and your Acquisitions folks who handle the Coutts/OASIS ordering process using Prefect and Lehigh App, etc. can do a presentation about the acquisitions part in one of the upcoming ACQ SIG meetings?
18:45:44 From Jenn Colt to Everyone:
I think so
18:46:24 From Dung-Lan Chen to Everyone:
Thank you!!
18:50:56 From Brooks Travis to Everyone:
Providing an external event notification service would be very helpful…
18:50:57 From Laura Daniels to Everyone:
it's great having you join us, Jenn
18:51:10 From Dung-Lan Chen to Everyone:
Yes!!
18:51:14 From Brooks Travis to Everyone:
In FOLIO
18:51:18 From Jenn Colt to Everyone:
Yeah Brooks, a list like that - events, auth management, those things
Transcript
Future topics
- Topic proposal by Owen Stephens for October:
- Use of shortcut keys and macros for more effective cross-app working - it would also be good to have UX and Stripes/dev knowledge for this discussion I think. I know @Laura (she/they) uses macros so might have insights into the potential for cross-app working
- Potential for external 'workflow' solutions for cross-app interactions
- I think 'workflow' is a dangerous term here - in this context it's more about automation than user workflows, although I think there is overlap
- I was particularly struck by the solution in production at TAMU (Jeremy Huff and Sebastian Hammer presented, the recording is at https://prod-zoom-recordings-openlibraryfoundation-org.s3.amazonaws.com/50dc6c87-3912-43fa-8287-56ec73b12bbb%2Fshared_screen_with_speaker_view%28CC%29.mp4 starting at 3 hrs, 14 min) - I think getting someone from TAMU to talk about how this is used would be very interesting
- There was also a presentation on the use of a tool called Airflow at Stanford for "bibliographic workflow" but I've not watched that yet so not 100% sure if it is completely applicable - I think the core use case there was systems migration but it may go beyond that
- Jenn Colt on using Prefect
- does not need to be workflow across apps
- UX/UI and implementers topics
- should be Wednesdays
- Comprehensive look at where data is copied and stored as opposed to live data | how it is represented
- Date filters and how they work in different apps
Attendees
| Present | Name | Home Organization |
|---|---|---|
| x | Brooks Travis | EBSCO |
| x | Charlotte Whitt | Index Data |
| x | Dennis Bridges | EBSCO |
| x | Dung-Lan Chen | Skidmore College |
|  | Erin Nettifee | Duke |
|  | Gill Osguthorpe | UX/UI Designer - K-Int |
|  | Heather McMillan Thoele | TAMU |
|  | Ian Ibbotson | Developer Lead - K-Int |
|  | Jag Goraya | K-Int |
| x | Jana Freytag | VZG, Göttingen |
| x | Jenn Colt | Cornell |
|  | Khalilah Gambrell | EBSCO |
|  | Kimberly Pamplin | TAMU |
|  | Kirstin Kemner-Heek | VZG, Göttingen |
| x | Kristin Martin | Chicago |
| x | Laura Daniels | Cornell |
|  | Lloyd Chittenden | Marmot Library Network |
|  | Marc Johnson | K-Int |
| x | Martina Schildt | VZG, Göttingen |
| x | Martina Tumulla | hbz, Cologne |
| x | Maura Byrne | Chicago |
|  | Mike Gorrell | Index Data |
| x | Owen Stephens | Product Owner - Owen Stephens Consulting |
|  | Patty Wanninger | Product owner Users app |
|  | Rachel A Sneed | TAMU |
|  | Sara Colglazier | Five Colleges / Mount Holyoke College Library |
| x | Susanne Schuster | BSZ Konstanz |
|  | John Coburn | EBSCO |
|  | Zak Burke | EBSCO |
|  | Daniel Huang | Lehigh |
|  | Maccabee Levine | Lehigh |
| x | Robert Scheier | Holy Cross |