...

Time | Item | Who | Notes
5 min | Announcements | All

WOLFcon early bird registration is open until June 18th. The call for proposals is open until June 5th; the aim is to have an agenda ready before early bird registration closes. Draft proposals are okay; they can be revised later.

FOLIO Council election results should be coming out soon after the Community Council meeting on Monday.

70 min

Data Import: Current State to Desired State

Background documents:

ARLEF Data Import Report, 2023-04-12

EBSCO's Report to the ARLEF Report, 2023-04-24

Report on Data Import, by Corrie Hutchinson, 2023-05-05

EBSCO's Update on Data Import Troubleshooting, 2023-05-23

From Data Import/MM:

Desired outcomes:

  • Shared understanding of severe shortcomings of data import
  • Shared understanding of next steps for addressing data import
  • Shared understanding of timeline for improvement
  • Determination of how data import remediation will be prioritized

This comes out of various documents and the meeting in Stanford, and includes two reports from EBSCO.

Problem statement: we don't have a reliable and performant solution for loading records to support daily workflows and monthly/quarterly workflows. (Leaving aside system migrations for the moment.) Record loads of fewer than 1,000 records can time out. Libraries are unable to manage basic workflows such as loading electronic records. FOLIO users must plan to work off-hours to complete import jobs.

Slide 6 describes in-progress efforts across these topics: architecture, infrastructure, development, and product management.

"Large" loads are about 100,000 records, and there need to be plans to address loads of that size as well as much smaller ones. Note also that some types of loads perform better than others, so the problem is not necessarily the number of records. "Chunking" can also allow smaller loads to run interspersed with a larger load. (Single-record imports jump the queue right now.)
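
As an illustration only, the chunking idea could be sketched as follows. This is a minimal round-robin sketch, not how FOLIO's data import actually schedules work; the function names, job names, and chunk size are all hypothetical.

```python
from collections import deque
from itertools import islice


def chunked(records, size):
    """Yield successive lists of at most `size` records."""
    it = iter(records)
    while chunk := list(islice(it, size)):
        yield chunk


def interleave_jobs(jobs, chunk_size):
    """Round-robin over jobs, yielding (job_name, chunk) one chunk at a time.

    `jobs` is a list of (name, records) pairs. Hypothetical scheduler:
    each job gets one chunk per turn, so a small job is not stuck
    behind every chunk of a large one.
    """
    queue = deque((name, chunked(records, chunk_size)) for name, records in jobs)
    while queue:
        name, chunks = queue.popleft()
        chunk = next(chunks, None)
        if chunk is None:
            continue  # this job has no chunks left; drop it from the rotation
        yield name, chunk
        queue.append((name, chunks))


# Hypothetical jobs: a "large" load of 10 records and a "small" load of 3.
jobs = [("large", range(10)), ("small", range(3))]
order = [(name, len(chunk)) for name, chunk in interleave_jobs(jobs, chunk_size=4)]
print(order)
```

In this sketch the small job completes after the first chunk of the large job rather than waiting for the whole large load to finish.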

In addition to the documentation prepared, interviews with current users of FOLIO help make the issues more understandable. (There is a governance question of who the "customer" or user is, and whether that includes all hosting providers and those not using a hosting provider.)

Performance is one issue, as are the capabilities of the logic that is part of data import.

Questions from the meeting include how bugs are prioritized and whether there can be a dashboard of work that is happening on Data Import. (A dashboard is in the works.) Libraries need to be able to know when the functionality will be implemented so they can plan for their local workflows, including when functionality is deemed out-of-scope.

Short term: complete work in progress and address critical production needs (continue performance improvements and reliability/scalability improvements). Collaborate with other hosting providers and self-hosted institutions. Provide realistic performance benchmarks. Define an approach for addressing failed records and logging issues.

Mid-term: review the data import roadmap; continue architecture and infrastructure improvements; development to address failed records and logging issues.

Longer-term (12 months): continue architecture, infrastructure, and development improvements

Proposal for release changes: extend the Poppy release to the current Q-release date (November 2023). This decision is driven by functionality, other than data import, that is not yet ready.


PPT - Data_Import_Improvements.pptx

10 min

Proposal for combined Poppy/Quesnelia release

See notes from Release Management Stakeholders


Further discussion will happen over Slack, with the aim of reaching a decision on Monday.
5 min | Agenda topics

...

00:03:35    Tod Olson:    I think also "good middle of the night" for a few people.
00:05:08    Kristin Martin:    Earlybird through June 18
00:08:15    Kristin Martin:    https://wiki.folio.org/display/PC/2023-06-01+Product+Council+Meeting+notes
00:14:04    Christopher Spalding (EBSCO):    Kristin, we’ll send a link to this deck so it can be added to the documentation.
00:14:36    Kristin Martin:    Reacted to "Kristin, we’ll send ..." with 👍
00:31:21    Charlotte Whitt:    Those P1/P2 bug tickets are they all tagged with the label `support` - and monitored by the Support SIG?
00:31:48    Erin Nettifee:    Related - is there a jira dashboard for this work in progress that could be shared?
00:32:09    Charlotte Whitt:    Reacted to "Related - is there a..." with 💯
00:33:56    Erin Nettifee:    Yes - a board would help with transparency of the work
00:36:03    Gang Zhou | SHL:    Reacted to "Yes - a board would ..." with 👍
00:36:05    Jennifer Eustis:    Reacted to "Related - is there a..." with 💯
00:36:25    Jenn Colt:    It does. I thought it was already there.
00:36:49    Jennifer Eustis:    I thought it was already there as well. It's to allow the single record imports/overlays to go through
00:37:29    Jenn Colt:    Closed https://issues.folio.org/browse/MODSOURMAN-662
00:37:54    Felix Hemme:    Why can't multiple jobs run in parallel?
00:38:24    Mark Veksler:    Replying to "I thought it was alr..."

...

This is happening for us in Nolana. Single record imports will process while a data import file is also processing.
00:39:49    Felix Hemme:    Reacted to "This is happening fo..." with 👍🏼
00:40:29    Jennifer Eustis:    That would be great.
00:40:45    Laura Daniels:    Reacted to "That would be great." with ➕
00:40:50    Colin V. (he/him):    Reacted to "That would be great." with ➕
00:40:50    Lynn Gullickson Spencer:    Reacted to "That would be great." with ➕
00:40:53    Erin Nettifee:    Is there a document anywhere that lists the requirements for data import and what has been implemented and what hasn't? I'm aware of https://wiki.folio.org/display/FOLIOtips/Data+Import+Functionality+Implementation+Table but it is so high-level, and where the issues really come out is in the nitty gritty of what data import needs to be able to do.
00:41:08    Lloyd (He/Him):    100k is the most I would want to do at one time anyway. More would choke my discovery system.
00:41:11    Christopher Spalding (EBSCO):    Replying to "Why can't multiple j..."

...

That is what is missing - the list of requirements that the app is supposed to meet. Without that, it is very confusing when someone is looking at a thing that doesn't work. You can't know where to start if you don't know if the app is supposed to be able to do the thing that didn't work.
00:56:26    Thomas Trutt:    Reacted to "That is what is miss..." with 👍🏻
00:56:33    Corrie Hutchinson:    Reacted to "That is what is miss..." with 👍🏻
00:57:48    Thomas Trutt:    But that flow logic sounds like a base requirement.
01:00:21    Jenn Colt:    Burn it down is always appealing in theory
01:01:28    Steph Buck:    https://wiki.folio.org/pages/viewpage.action?spaceKey=DQA&title=Defect+Priority+Definition+for+Functional+Issues
01:01:50    Aleksey Petrenko:    Reacted to "https://wiki.folio.o..." with 👍
01:01:54    Thomas Trutt:    🙂 Would it make sense to simplify and streamline DI and allow for plugins, that add functionality, instead of trying to put all the features into one container?? (Just throwing thoughts out there..)
01:01:59    Thomas Trutt:    Reacted to "Burn it down is alwa..." with 😃
01:03:09    Jenn Colt:    Replying to "🙂 Would it make sen..."

...