Date
Attendees
@Nils Olof Paulsson
Meeting Link
- https://zoom.us/j/591934220
- Password: folio-lsp
Goals
Discussion items
Time | Item | Who | Notes |
---|---|---|---|
| Find a note taker | | |
30 | Data migration tests with Apache Airflow | jpnelson | Apache Airflow (originally by Airbnb). A workflow for migrating MARC records from Symphony, using the Okapi inventory-storage. DAG = Directed a... graph. Some libraries only import instances, not holdings and items. The workflow calls the transformers from folio-migration-tools, converts the output to valid JSON, and POSTs the records in batches of 1,000. Each task in Airflow has a log, and you can set up retries. The DAG has environment variables such as FOLIO_USER/PASSWD, OKAPI_URL, ... Most of the time is spent posting the records, even though posting runs in parallel. We have 2 parallel posting instances. The bottleneck is Okapi; it is not like a Unix pipelining process. Okapi runs out of memory if we increase the number of parallel processes. Jeremy is migrating multiple libraries. Phil: We are using Prefect. Are you using this for data import processing? We use it for collections. We use Airflow to extract and populate our Solr or Blacklight indexes. Ian: This lets the librarians see what is happening and make changes. Lisa: What if you want to run only part of these processes? You can work with "failed" statuses. The DAG will stop and can then continue. You can re-run a single step. We have an Alma, a Symphony and a FOLIO integration. All of those are managed by Airflow. Jason: I have similar things going on with VuFind and our workflow engine, post FOLIO migration. Ian: I found that my migration toolkit environment lived on after the migration to run data processing jobs. The code is written in Python (bib_records.py) and transforms CSV to TSV. The custom code is in the "FOLIO Plugin", which uses the EBSCO FOLIO migration tools to do that work. FOLIO is just a small plugin. Plugins are available for many different systems; they are simply provided to you. The mapping is done by the EBSCO transformers. https://github.com/FOLIO-FSE/folio-migration-tools and the customer migration app have been checked out inside the Airflow container. Mapping (holdings, items, instances) is in the migration app. Tod: Fantastic to use folio-migration-tools and NOT re-code. Jason: Awesome to see a container-driven local development environment. 🙂 Theodor: The customization of the migration tools is documented here: https://github.com/FOLIO-FSE/migration_repo_template |
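
The notes mention posting records in batches of 1,000 and setting up retries per task. The following is a minimal Python sketch of that batching idea only, not the code shown in the meeting. The names `batches`, `post_in_batches`, and the `post` callable are hypothetical, and the simple retry loop stands in for Airflow's own task retries. In a real run, `post` would presumably wrap an HTTP POST against the Okapi instance configured through OKAPI_URL.

```python
from typing import Callable, Iterable, Iterator

BATCH_SIZE = 1000  # batch size mentioned in the notes


def batches(records: Iterable[dict], size: int = BATCH_SIZE) -> Iterator[list]:
    """Yield consecutive lists of at most `size` records."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly shorter batch
        yield batch


def post_in_batches(
    records: Iterable[dict],
    post: Callable[[list], None],  # hypothetical: sends one batch to FOLIO
    size: int = BATCH_SIZE,
    retries: int = 2,
) -> int:
    """Send every batch through `post`, retrying a failed batch up to
    `retries` extra times. Returns the number of records posted."""
    posted = 0
    for batch in batches(records, size):
        for attempt in range(retries + 1):
            try:
                post(batch)
                break
            except Exception:
                if attempt == retries:
                    raise  # give up; the caller (e.g. a task) is marked failed
        posted += len(batch)
    return posted


if __name__ == "__main__":
    # Stand-in poster that only reports batch sizes (no network access).
    sample = ({"id": str(i)} for i in range(2500))
    total = post_in_batches(sample, lambda b: print(f"posting {len(b)} records"))
    print(f"posted {total} records in total")
```

Because batches are built lazily from an iterable, only one batch is held in memory at a time on the client side. This does not address the Okapi memory limits discussed above, which depend on how many batches are posted in parallel.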