2022-02-25 - Sys Ops & Management SIG Agenda and Notes

Date

Attendees

Ingolf Kuss

Lisa Sjögren (EBSCO) 

Kyle Banerjee 

Tod Olson 

Hkaplanian 

jpnelson 

Nils Olof Paulsson

Brandon Tharp 

Anton Emelianov (Deactivated) 

Steffen Köhler 

jroot 

Florian Gleixner 

Ian Walls 

Philip Robinson 

Goals

Discussion items

Time | Item | Who | Notes

Find a note taker

30 | Data Migration tests with Apache Airflow | jpnelson

Apache Airflow, originally created at Airbnb.

A workflow for migrating MARC records from Symphony, using the Okapi inventory-storage.

DAG = Directed a... graph
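To illustrate what a DAG gives a migration workflow, here is a minimal sketch using only the Python standard library rather than Airflow itself. The task names are hypothetical and only loosely follow the steps described in these notes; the point is that tasks declare what they depend on, and a valid run order is derived from that.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each key maps to the set of tasks it depends on.
dependencies = {
    "transform_marc": {"extract_from_symphony"},
    "convert_to_json": {"transform_marc"},
    "post_instances": {"convert_to_json"},
}


def run_order(deps):
    """Return one valid execution order for the tasks in the graph."""
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

If the dependencies contained a cycle, `TopologicalSorter` would raise `graphlib.CycleError`, which is why the graph has to be free of cycles.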

Some libraries only import instances, not holdings and items

Calling transformers from folio-migration-tools

Converts to valid JSON

POST the records in bunches of 1,000
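Splitting records into batches of 1,000 could look roughly like the sketch below. The helper name `batched` and the sample records are assumptions for illustration; the notes do not show the actual code or the endpoint each batch is sent to.

```python
from itertools import islice


def batched(records, size=1000):
    """Yield successive lists of at most `size` records."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Hypothetical stand-in records; real ones would be the converted JSON objects.
records = [{"id": str(n)} for n in range(2500)]
batch_sizes = [len(batch) for batch in batched(records)]
print(batch_sizes)  # [1000, 1000, 500]
```

Each yielded batch would then be sent in a single POST request.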

Each task in Airflow has its own log. You can set up retries.
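In Airflow, retries are usually configured through the task arguments `retries` and `retry_delay`, often supplied once for all tasks via the DAG's `default_args`. The values below are illustrative assumptions, not the settings used in the demo.

```python
from datetime import timedelta

# Illustrative values only; the notes do not record the real settings.
default_args = {
    "retries": 3,                         # re-run a failed task up to 3 more times
    "retry_delay": timedelta(minutes=5),  # wait this long between attempts
}

# In a DAG file this dict would be passed along as DAG(..., default_args=default_args).
print(default_args)
```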

The DAG has environment variables such as FOLIO_USER/PASSWD, OKAPI_URL, ...
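A task might read such settings along the lines of the sketch below. `OKAPI_URL` and `FOLIO_USER` are taken from the notes; `FOLIO_PASSWD` is an assumed reading of "FOLIO_USER/PASSWD", and the function name and sample values are hypothetical.

```python
import os


def load_settings(env=os.environ):
    """Collect connection settings, failing early if any are missing."""
    # FOLIO_PASSWD is an assumed variable name, see the note above.
    names = ["OKAPI_URL", "FOLIO_USER", "FOLIO_PASSWD"]
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in names}


# Usage with a hypothetical environment instead of the real process environment.
settings = load_settings({
    "OKAPI_URL": "https://okapi.example.org",
    "FOLIO_USER": "migrator",
    "FOLIO_PASSWD": "secret",
})
print(settings["OKAPI_URL"])
```

Failing early on a missing variable keeps a misconfigured run from getting as far as the posting step.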

Most of the time is spent posting the records (even though the posting runs in parallel).

We have 2 parallel posting instances.

The bottleneck is Okapi.

It is not like a Unix pipeline.

We get out-of-memory errors from Okapi if we increase the number of parallel processes.
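One way to keep the number of simultaneous requests fixed, so that the server is not overloaded, is a bounded worker pool. This is a sketch under assumptions: `post_batch` is a hypothetical stand-in that performs no HTTP call, and the notes do not say how the two parallel posting instances were actually implemented.

```python
from concurrent.futures import ThreadPoolExecutor


def post_batch(batch):
    """Hypothetical stand-in for the HTTP POST; returns the number of records sent."""
    return len(batch)


def post_all(batches, max_workers=2):
    """Post batches with at most `max_workers` requests in flight at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(post_batch, batches))


# Four hypothetical batches of 1,000 records each.
batches = [[{"id": n}] * 1000 for n in range(4)]
results = post_all(batches)
print(results)
```

Raising `max_workers` increases throughput only as long as the receiving side can keep up.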

Jeremy is migrating multiple libraries.

Phil: We are using Prefect.

Are you using this for processing data imports? We use it for collections. We use Airflow to extract and populate our Solr or Blacklight indexes.

Ian: This allows the librarians to see what is happening and to make changes.

Lisa: What if you want to run only part of these processes? You can work with "failed statuses". The DAG stops and can then continue. You can re-run a single step. We have an Alma, a Symphony and a FOLIO integration. All of those are managed by Airflow.

Jason: I have similar things going on with VuFind and our workflow engine, post-FOLIO migration.

Ian: I found that my migration toolkit environment lived on after the migration to run data processing jobs.

The code is written in Python (bib_records.py). It transforms CSV to TSV. The custom code is in the "FOLIO Plugin", which uses the EBSCO FOLIO migration tools to do that work.

FOLIO is just a small plugin. There are plugins available for many different systems; they are simply provided to you.

The mapping is being done by the EBSCO transformers.



Action items

  •  

