Batch Importer (Bib/Acq) (UXPROD-47)

[UXPROD-4440] Control OCLC import or overlay (edge-connection or Data Import solution?) Created: 09/Apr/21  Updated: 28/Dec/23

Status: Draft
Project: UX Product
Components: None
Affects versions: None
Fix versions: TBD
Parent: Batch Importer (Bib/Acq)

Type: New Feature Priority: TBD
Reporter: Adam Dickmeiss Assignee: Ryan Taylor
Resolution: Unresolved Votes: 0
Labels: back-end, data-import, delegate_candidate, epam-folijet, inventory-single-record-import
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Defines
defines UXPROD-47 Batch Importer (Bib/Acq) Analysis Complete
Release: Not Scheduled
Epic Link: Batch Importer (Bib/Acq)
Development Team: Folijet
PO Rank: 0

 Description   

So far, edge-connexion has used the create single record profile, hence only allowing creation of new records.

We have two options here:

  • edge-connexion solution: read the OCLC number from the incoming MARC (from 035), search mod-inventory-storage for instances where identifier =/@identifierType=oclc 'xxxxx', and if exactly one record with that number exists (more than one is a failure), pass the Instance UUID to mod-copycat (so that DI updates rather than creates a new record). Alternatively, to avoid introducing a mod-inventory-storage dependency in edge-connexion, we can do the "match" check in mod-copycat
  • merge create and update profiles in Data Import: a single profile that automatically controls if records are created or updated would make this simpler
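
A minimal sketch of the first step in the edge-connexion option, reading the OCLC number out of the 035 field. It assumes the 035 $a values have already been pulled from the MARC record as plain strings, and that OCLC numbers appear as "(OCoLC)" followed by an optional ocm/ocn/on prefix. The function name and the stripping of leading zeros are illustrative assumptions, not the behaviour of edge-connexion or mod-copycat.

```python
import re

# Assumed shape of an 035 $a value carrying an OCLC number:
# "(OCoLC)" + optional "ocm"/"ocn"/"on" prefix + digits.
_OCLC_035 = re.compile(r"^\(OCoLC\)\s*(?:ocm|ocn|on)?0*(\d+)$", re.IGNORECASE)


def extract_oclc_numbers(subfield_a_values):
    """Return the distinct OCLC numbers found in a list of 035 $a strings.

    Hypothetical helper: leading zeros are dropped so that
    "(OCoLC)ocm00012345" and "(OCoLC)12345" compare equal.
    """
    numbers = []
    for value in subfield_a_values:
        match = _OCLC_035.match(value.strip())
        if match and match.group(1) not in numbers:
            numbers.append(match.group(1))
    return numbers
```

A caller would then search Inventory for each returned number and treat anything other than zero or one hit as a failure.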


 Comments   
Comment by Adam Dickmeiss [ 09/Apr/21 ]

If SRS cannot do this "OCLC import profile" that looks in 035 and does the right thing (tm), then we'll have to do it, and it's probably best to extend both mod-copycat and edge-connexion a bit.  I have the feeling that this could be useful for non-OCLC cases too. It's a step towards mod-copycat doing matching, which IMHO seems like a duplicate of what SRS does. It is doable and self-contained and can be done in a day.

Example: import for mod-copycat could benefit too (if you search for an OCLC identifier that already exists). Not to mention that if it does NOT do that, the edge-connexion import will fail (if there are multiple records with the same OCLC identifier!).

Comment by Jakub Skoczen [ 12/Apr/21 ]

Adam Dickmeiss Mike Taylor Wayne Schneider Agree that it makes sense to implement matching in mod-copycat rather than edge-connexion. Would we make the "action" (create vs update) explicit in the /copycat/import request? Right now the action is implicit depending on whether the internalIdentifier is provided or not, correct?

I guess there are two ways to go about it:

  • add action parameter to the /copycat/import request, let the client (UI or edge-connexion) control the action and, effectively, selection of the DI profile
  • choose the action automatically: look up instances in Inventory via the externalIdentifier:
    1. if 0 found, perform "create"
    2. if 1 found, perform "update"
    3. if >1 found, return an error

Comment by Mike Taylor [ 12/Apr/21 ]

Right now the action is implicit depending on whether the internalIdentifier is provided or not, correct?

Correct.

But I don't see how that's not explicit.

If we introduce an action parameter too, then we have two new error cases to deal with: action=update with no ID specified, and action=create with an ID specified. I think the API is fine as it is: no ID => create, ID => update.
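
To make the two extra error cases concrete, here is a small sketch comparing an explicit action parameter with the current implicit rule. The function names and error messages are hypothetical and are not part of the mod-copycat API.

```python
def validate_request(action, internal_identifier):
    """Return an error message for an inconsistent request, else None.

    Hypothetical check for a request that carries an explicit action.
    """
    has_id = internal_identifier is not None
    if action == "update" and not has_id:
        return "action=update requires an ID"
    if action == "create" and has_id:
        return "action=create must not carry an ID"
    return None


def implicit_action(internal_identifier):
    """Current rule as described in the thread: no ID => create, ID => update."""
    return "update" if internal_identifier is not None else "create"
```

With the implicit rule every input maps to a valid action, so there is nothing to reject.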

Comment by Jakub Skoczen [ 12/Apr/21 ]

Mike Taylor Not quite. If internalIdentifier is specified then, yes, we could continue with an 'update' in all circumstances. But the more interesting case here is if the internalIdentifier is not specified. Right now it always means 'create' but we don't attempt any matching on externalIdentifier. With "matching" it would become:

  • no internalIdentifier AND externalIdentifier matches == 0 records -> create
  • no internalIdentifier AND externalIdentifier matches == 1 records -> update
  • no internalIdentifier AND externalIdentifier matches > 1 records -> failure

The outcome gets a little messy, depending on the values of inputs.
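
The matching rules in the list above can be sketched as a single decision function. This is an illustration only: the names are hypothetical, and the list of matching instance IDs is assumed to come from an Inventory lookup on the externalIdentifier that is not shown here.

```python
from enum import Enum


class Action(Enum):
    CREATE = "create"
    UPDATE = "update"


class AmbiguousMatchError(Exception):
    """Raised when the external identifier matches more than one instance."""


def decide_action(internal_identifier, matching_instance_ids):
    """Pick the import action and the instance to target.

    internal_identifier: instance UUID supplied by the client, or None.
    matching_instance_ids: instance UUIDs found for the externalIdentifier
        (hypothetical lookup result).
    """
    if internal_identifier is not None:
        # An explicit ID always means update, as today.
        return Action.UPDATE, internal_identifier
    if len(matching_instance_ids) == 0:
        return Action.CREATE, None
    if len(matching_instance_ids) == 1:
        return Action.UPDATE, matching_instance_ids[0]
    raise AmbiguousMatchError(
        f"{len(matching_instance_ids)} instances match the external identifier"
    )
```

The result now depends on both inputs, which is the messiness noted above: a request without an ID can end as a create, an update, or a failure.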

Comment by Adam Dickmeiss [ 12/Apr/21 ]

Something somewhere needs to do a match on OCLC number. It's not possible at the moment. Regardless of where it's done, ideally it should be atomic. If not, here's what's going to happen.

  1. Somebody does a push.
  2. The record is parsed and the OCLC number is checked. It's not found.
  3. The import is in progress.
  4. Somebody does a push on the same record.
  5. The record is parsed and the OCLC number is checked. It's not found.
  6. The 2nd import is in progress.
  7. The 1st import eventually hits mod-inventory-storage (may take minutes).
  8. The 2nd import eventually hits mod-inventory-storage.

The system now has two records with the same OCLC record number, and import can no longer be performed.
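
The eight steps above can be replayed as a deterministic simulation. This is a toy model: a Python list stands in for mod-inventory-storage, and the function name is hypothetical.

```python
def simulate_unsynchronized_pushes(oclc_number):
    """Replay two pushes whose checks both run before either import lands."""
    inventory = []  # stands in for instances in mod-inventory-storage

    def check(number):
        return [rec for rec in inventory if rec == number]

    # Steps 1-3: first push checks, finds nothing, starts a "create" import.
    first_decision = "create" if not check(oclc_number) else "update"
    # Steps 4-6: second push checks before the first import has landed.
    second_decision = "create" if not check(oclc_number) else "update"
    # Steps 7-8: both imports finally reach storage.
    for decision in (first_decision, second_decision):
        if decision == "create":
            inventory.append(oclc_number)
    return inventory
```

Both checks see an empty inventory, so both imports create a record and the store ends up with two entries for the same number.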

Comment by Jakub Skoczen [ 12/Apr/21 ]

Adam Dickmeiss Good point on the atomicity. Could we use a vert.x distributed Lock on the ID from mod-copycat for that?
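
A rough in-process analogue of locking on the ID, written with Python threads rather than the Vert.x API. It assumes the lock is held for the whole check-and-import sequence. A real deployment would need a lock shared across all module instances and held until the import is visible in storage. All names here are hypothetical.

```python
import threading


class LockedImporter:
    """Serialises check-and-import per OCLC number (in-process stand-in)."""

    def __init__(self):
        self._registry_lock = threading.Lock()
        self._locks = {}      # OCLC number -> lock
        self.instances = {}   # OCLC number -> count of stored instances
        self.actions = []     # actions taken, in completion order

    def _lock_for(self, oclc_number):
        # Guard the registry so two threads never create two locks for one key.
        with self._registry_lock:
            return self._locks.setdefault(oclc_number, threading.Lock())

    def push(self, oclc_number):
        # The check and the write happen under the same per-number lock.
        with self._lock_for(oclc_number):
            if self.instances.get(oclc_number, 0) == 0:
                self.instances[oclc_number] = 1
                action = "create"
            else:
                action = "update"
            self.actions.append(action)
            return action


def run_concurrent_pushes(oclc_number, pushes=8):
    """Push the same number from several threads and return the importer."""
    importer = LockedImporter()
    threads = [
        threading.Thread(target=importer.push, args=(oclc_number,))
        for _ in range(pushes)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return importer
```

However the threads interleave, only the first holder of the lock creates a record, and every later push sees it and updates instead.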

Comment by Molly Driscoll [ 11/Feb/22 ]

Autumn Faulkner, Timothy Watters (please feel free to expand on these use cases)

Michigan State and the Library of Michigan would like to be able to customize the job profile that is used when pushing from Connexion. This would support the ability to overlay, as well as additional actions, like creating a corresponding holding and item from the bibliographic data.

Kay Granskog Shawn Nicholson I'm not sure that this is a rankable feature, but if you want to comment on the ranking for MSU-LM, please do.

Comment by Molly Driscoll [ 11/Feb/22 ]

Janet Baldwin this is the feature that addresses your questions about whether "OCLC Push" can overlay - please feel free to add comments!

Comment by Autumn Faulkner [ 11/Feb/22 ]

Apologies for what is likely to be a use case novella, but I hope it's helpful in understanding the need for this functionality.

So let's say my Acquisitions folks get a title "Jane Eyre" and choose from among several OCLC record options to bring in a stub Instance linked to the PO in the Orders app. By the time this title gets to me, some upgrades have happened in WorldCat and a different OCLC record than the one originally used ends up being the better choice. 

I also have several local collection, provenance, and item-specific notes to add to the record that would not be appropriate for the OCLC master record. I edit the record locally and they take me a little while to construct, because I have to go hunt down the info. I keep the record in my save file for a day or two while I nail down details, updating occasionally. When it's ready for export, I need to overlay on the stub Instance, matching on something other than the OCLC number, because the incoming record will have a different OCLC number than the stub. 

At the point of overlay, I want to update the cataloged date of the instance to today, and also create attached holding and item records.

All of this could indeed be done with the export of a .mrc file from Connexion and a Data Import with a custom job profile, but that's several extra steps per title (and most of our catalogers work title by title, since their materials are varied enough in location, format, etc., to make batch work more trouble than it's worth). On a broad scale, the extra time spent on moving files around could have a real impact on efficiency.

Since the edge-connexion module is already set up to work with a job profile stored locally (I think?) in each library's tenant, would the effort involved for allowing it to accept other local job profiles be prohibitive? Balanced against that, it would make a HUGE difference to folks at MSU (and some other institutions, from what I hear in Slack) to be able to select from a list of profiles rather than being locked into one.

Comment by Lynne Fors [ 13/Apr/22 ]

Being able to overlay and update using single record import (push from Connexion client) and matching on OCLC number is a core component of our cataloging workflow at Wellesley.  We work on a title to title basis and being able to update and create using Connexion client push is the most efficient way for us to accomplish our work.

Comment by Kyle Banerjee [ 17/Jun/22 ]

Being able to overlay and update using single record import (push from Connexion client) and matching on OCLC number is a core component of our cataloging workflow at Wellesley.  We work on a title to title basis and being able to update and create using Connexion client push is the most efficient way for us to accomplish our work.

This is super common. As in every OCLC library I've worked with has this workflow – are there any that don't?

Comment by Vitus Tang [ 23/Jun/23 ]

Stanford would also like to be able to overlay an existing Folio instance record (source either MARC or Folio) with a push from an OCLC app. For our catalogers, there is always an existing acquisitions record to overlay. Without this capability, we basically cannot use the push method in our cataloging workflow and would always have to use the pull method. And we do want to use the push method because we want to capture the $0 from OCLC for controlled headings and we can do that only by a push from the Record Manager. We cannot get the $0s via the Z39.50 route used by the pull method. So, yes, we very much support this new feature proposal.

Comment by Ann-Marie Breaux (Inactive) [ 18/Aug/23 ]

Moved from EDGCONX project to UXPROD project, since this is a feature

Comment by Charlotte Whitt [ 05/Dec/23 ]

Hi Ann-Marie Breaux - I removed Mjolnier as development team, and assigned this to Folijet. Please let me know if this is not your area, then I can take it back as assigned to Thor as the development team.

Comment by Ann-Marie Breaux (Inactive) [ 05/Dec/23 ]

Hi Charlotte Whitt I think it's fine to be assigned to Folijet for now. I don't see any library rankings on it, so it will just be in Folijet's backlog for now. Thank you! cc: Ryan Taylor

Generated at Fri Feb 09 00:39:57 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.