[FOLIO-1331] Define and describe the architecture for how to keep data in sync across multiple apps Created: 09/Jul/18  Updated: 21/Jan/22

Status: Open
Project: FOLIO
Components: None
Affects versions: None
Fix versions: None

Type: Umbrella Priority: P1
Reporter: Charlotte Whitt Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: acquisitions, crossplatform, crossrmapps, integration, inventory, marccat, receiving
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Cloners
clones FOLIO-1273 Define and describe the architecture ... Open
is cloned by FOLIO-1756 Define and describe the architecture ... Closed
Relates
relates to UXPROD-2196 NFR: PubSub enhancements (BE) Open
relates to UXPROD-967 eholdings - Order Integration : Autom... Open
relates to UXPROD-689 Ability to update item record with ca... Closed
relates to UXPROD-1806 NFR: Data Import Pub-Sub (Event Drive... Closed
relates to UXPROD-2012 NFR: Data Import Pub-Sub (Event Drive... Closed
relates to UIU-2082 [BE] Data corruption. When holdings/i... Closed
relates to UXPROD-138 Locally-stored metadata records for e... Closed
relates to UXPROD-187 Receive Item and update details regar... Closed
relates to UXPROD-691 Allow Item status to be set during re... Closed
relates to UXPROD-693 Ability to update item record with Ba... Closed
relates to UXPROD-961 Allow Inventory Instances to generate... Closed
relates to UXPROD-968 eholdings - ERM Integration Closed
relates to UXPROD-1080 Locally-stored metadata records for e... Closed
relates to UXPROD-127 A process is needed to periodically r... Draft
relates to UXPROD-139 Avoid creation of duplicative metadat... Draft
relates to UXPROD-151 Ebook packages - relationship to indi... Draft
relates to UIREQ-589 [BE] Data corruption. When holding/it... Open
relates to UIREQ-650 [BE] When inventory data (e.g. barcod... Open
relates to UXPROD-150 Automatic update of Bib records updat... Closed
relates to UXPROD-683 Create Instance Record in inventory f... Closed
relates to UXPROD-684 Ability to create an Item Record in i... Closed
relates to UXPROD-686 Create Item Record in inventory for p... Closed
relates to UXPROD-687 Create Holding Record in inventory fo... Closed
relates to UXPROD-2473 Automatic update of Holdings records ... Closed
relates to UXPROD-2474 Automatic update of Authority records... Closed
relates to UXPROD-3399 When Relevant User Record Attributes ... Draft
Sprint:
Development Team: Core: Platform

 Description   

Purpose: Define the process for keeping data seamlessly in sync when it is exchanged and shared by two or more apps.

NOTE: This ticket complements FOLIO-1273 (Open).

This area is relevant for interaction between, e.g.:

  • the Acquisition app and Inventory, e.g. when a brief record is created in the Acquisition app, and then pushed to Inventory when the order is sent to the vendor
  • the eHoldings app and Inventory app
  • the ERM app and Inventory app, e.g. selected holdings records
  • the Inventory > Item record and the Check in screen, e.g. piece number, Piece Description, and condition data such as missing and damaged items
  • the Inventory > Item record and the Check out screen, e.g. piece number, Piece Description, and condition data such as missing and damaged items
  • the Courses > Reserves record and Inventory>Item
  • the Inventory > Item record and the Request Screen, e.g. Call number, Volume and Enumeration (UIREQ-81, Closed)
  • the MARC Batch Loader and various apps (SRS, Inventory, Acq (orders/invoices))
  • exchange of records between Inventory and SRS e.g.
    • bibliographic records created and maintained in quickMARC accessing the SRS MARC directly, and pushed to Inventory
    • holdings records maintained as MFHD - MARC 21 Format for Holdings Data, https://www.loc.gov/marc/holdings/ and then shared with Inventory > Holdings
  • eHoldings app and ERM app
  • eHoldings app and Order app
  • How does mod-source-storage interact with quickMARC and Inventory?
  • Do we need additional storage beyond mod-source-storage for MARC holdings?

Documentation:



 Comments   
Comment by Charlotte Whitt [ 20/Jul/18 ]

Jakub Skoczen Cate Boerema who can we assign this to?

See also latest Discuss post by Tod Olson (uChicago); with comments from VBar and Jakub Skoczen: https://discuss.folio.org/t/on-distributed-updates-and-eventual-consistency/1966

Comment by Cate Boerema (Inactive) [ 20/Jul/18 ]

Wasn't the idea that the POs would work out the user requirements (including stories and designs) for the cross-domain dependencies identified in this document and then, if dev analysis is needed, a spike story would be added for that technical discussion and design?

In general, I think it's ideal if we can be more specific about what problem we are asking the developers to solve. Curious what Jakub Skoczen's take is...

Comment by Charlotte Whitt [ 20/Jul/18 ]

True, yes. The POs are mapping out areas where we have identified dependencies across apps: we write Jiras for the interaction (see all the related UXPRODs to this story), we map out conceptual data models, we define where we have data elements in common, etc.

See: documents gathered in the Cross RM app subfolder: https://drive.google.com/drive/folders/148aEoxuuDMKI9-RVgTQvIBBw26ITyB82

Somehow (at least I think) we need to have the architecture described; so I'm just trying to understand ownership.
In the dialogue with the developers and with SMEs this pops up on a regular basis, most recently in Tod Olson's Discuss post published this week.

But maybe there is a consensus about all of this from a dev perspective, and it's just me and other POs who are not entirely clear on how data synchronization is expected to be implemented.

Comment by Ian Ibbotson (Use this one) [ 20/Jul/18 ]

The danger for me here is that in talking to users the POs completely miss really important standardisation initiatives like NISO FASTEN working group where specific functional requirements for connecting very specific subsystems are being elaborated. By describing this as a generic problem we really run a serious risk of dumbing down the nuances of each edge in the graph. Personally, I think it might be better to take a wider standards based view of module interoperability - right now it all feels very tightly coupled to a FOLIO world view.

Comment by Khalilah Gambrell [ 20/Jul/18 ]

POs are defining the business requirements for the work and data flows across apps. We are not trying to prescribe in any way a technical approach. We can work with our respective teams to satisfy app dependencies but if there is an overall standard/approach that needs to be followed then we need to know from an architect or technical council.

Comment by Hkaplanian [ 23/Jul/18 ]

When we had this discussion/meeting in Madrid, Jakub and Vince both agreed that the different apps would need to call the APIs of the other apps that needed updates or to retrieve data. The specific example of acquisitions needing to update inventory with a holding was used. Is this too late to even change at this stage?

Comment by Mike Taylor [ 24/Jul/18 ]

I think the reason it's cropped up again – not just now, but on other occasions – is because we have new people joining the project all the time, and they come with concerns that are not directly addressed by any of our documents. We really need a FOLIO FAQ that explains why we made some of the decisions we made, so we can point newcomers to it and avoid repeatedly rehearsing the same issues.

Comment by Ann-Marie Breaux (Inactive) [ 25/Jul/18 ]

Hi all - and in Madrid, we didn't yet have the idea of the MARCcat app fleshed out, or the fact that we'll need to store/edit MARC holdings records, etc. From a data flow standpoint, we need to be clear on what the storage apps are named, how long it takes for an inventory instance to reflect a change made over in MARCcat and synchronized to mod-source-storage, and similar. I'm putting together a meeting for next week to make sure that we're aligned on some basics, so that we can ensure it's documented enough to help the MM-related POs (A-M, Charlotte, Tiziana) to be able to write appropriate user stories for the developers.

Comment by Charlotte Whitt [ 05/Nov/18 ]

Conversation on Slack #coreteam on November 2, 2018:

Charlotte Whitt [4:06 PM]
Hi @david-crossley, @jakub and others - do we have documentation on how we'll perform exchange of records across apps, and especially how we'll update records with only partial information? Why I'm asking is, e.g., the following use case: in the eHoldings app there is a package record, and in Inventory we have a matching container record. Then the title gets updated in the eHoldings package record, and we'd like to have that update pushed to the Inventory container record. But only the update of the name/title, since the Container record in Inventory will have a more complete description, with more metadata elements, than the eHoldings package record, and we of course don't want to lose all the data added by a cataloger.

The relevant Jiras will be: FOLIO-1331 (Open) and all related UXPROD stories

Jakub Skoczen [4:12 PM]
@charlotte I don’t think we would have such use case documented, but it’s an interesting case, it sounds to me like we would need some sort of “override” functionality in there

Khalilah Gambrell [4:15 PM]
@charlotte, do you mean the update of the package name OR update of the title in the package?

Charlotte Whitt [4:17 PM]
It can be both @kgambrell
It's not the specific element which is the key here, but that we want an update of only one element, coming from a package record in eHoldings, and this one element must only overwrite one element in Inventory container record

Marc Johnson [4:24 PM]
@charlotte Is the expected behaviour that when an eHoldings `package` `title` property is changed, the inventory `container` `title` property should also change to match it?

Charlotte Whitt [4:27 PM]
Yes, except in eHoldings terminology it's probably `packageName` changing `containerName`
I'm not that familiar with the correct item names here :wink:

Marc Johnson [4:29 PM]
@charlotte Thanks, that helps. I imagine there are some non-name/title-based matching rules which mean we can work out which packages are represented as which containers in inventory?

@jakub This seems like an edit of a container record, is that what you meant by override, or something else?

Charlotte Whitt [4:30 PM]
the thing is, if we pass the complete package record (json file), then the matching container record will be overwritten with a record with less data, and we'll lose all the data added by catalogers who have edited the container record in Inventory

Marc Johnson [4:34 PM]
@charlotte Can you expand upon what you mean by pass a package record?

That seems like we are envisaging the backend part of eholdings (`mod-kb-ebsco`?) somehow sending a package record to a backend part of inventory, and that knowing how to interpret it?

Charlotte Whitt [4:42 PM]
@marcjohnson - yes. We are envisioning records, and update of single data elements, being pushed from one record in one app to a matching record in another app. E.g. updates coming from a package record in the eHoldings app being pushed to Inventory (and here the matching record is a container record).
The same scenario can also be: a brief order instance record in the Order app creating an Instance record in Inventory. Then the staff member who submitted the order record discovers a misspelling and corrects the title in the Order app, and the update needs to be pushed to the Inventory instance record as well.

Marc Johnson [4:45 PM]
@charlotte Ok, is it reasonable to say then that your question is about how this `push` (AKA synchronisation in other conversations?) mechanism is going to work on the FOLIO platform?

Charlotte Whitt [4:45 PM]
Yes :slightly_smiling_face:
The thing is, that we have scenarios where it's a full record, but also scenarios where it's only one data element

Marc Johnson [5:00 PM]
@charlotte As @jakub suggested above, I don’t think we have any mechanism or standard for this at the moment (@jakub were you referring to this inter-module `synchronisation` mechanism?).

Where a module has needed to perform an activity like this so far, it has been via a business logic module (in the context where the change happens) making a request to the API of the other context to be updated. For example, `mod-circulation` updates the status of an item in `mod-inventory-storage` (via the `item-storage` API) when a check out or check in is performed.

I imagine what this could look like in the scenario described above would be that `mod-kb-ebsco` would make requests to the `inventory-container-storage` API (made up name, as I believe this does not exist at the moment) to find the appropriate container and then update its name.

The important part about this is that `inventory` does not know anything about `circulation` or `e-holdings` (and so does not understand their models and records), but they need to understand inventory’s model and APIs. Whereas pushing a package record to inventory relies on inventory understanding what to do with a package. So there are different approaches, which introduce different kinds of dependencies between the different parts.
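
The request-based approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `InventoryContainerStorage` is an in-memory stand-in for the made-up `inventory-container-storage` API, and the field names (`packageId`, `name`, `notes`) and the function `on_package_renamed` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class InventoryContainerStorage:
    """In-memory stand-in for a hypothetical `inventory-container-storage` API."""
    containers: dict = field(default_factory=dict)

    def find_by_package_id(self, package_id):
        # Match on a stable identifier rather than on name/title.
        for container in self.containers.values():
            if container.get("packageId") == package_id:
                return dict(container)  # return a copy, as an HTTP GET would
        return None

    def put(self, container):
        # Replace the stored record, as an HTTP PUT would.
        self.containers[container["id"]] = dict(container)


def on_package_renamed(storage, package_id, new_name):
    """Runs on the eHoldings side: it knows inventory's model, not the reverse."""
    container = storage.find_by_package_id(package_id)
    if container is None:
        return False
    container["name"] = new_name  # change only the one derived property
    storage.put(container)        # every other field is sent back unchanged
    return True


storage = InventoryContainerStorage()
storage.put({"id": "c1", "packageId": "p1", "name": "Old name", "notes": "added by cataloger"})
updated = on_package_renamed(storage, "p1", "New name")
```

Because the caller fetches the full container before writing it back, the cataloger's data survives the update, but the calling module has to understand inventory's record shape.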

Is it your expectation that FOLIO-1331 Open will define a general mechanism / pattern for this kind of interaction (and may well be different to what I’ve described above, when messaging and event publishing are introduced)?

Charlotte Whitt [5:06 PM]
Okay, thank you so much for digging into this @marcjohnson - I'll also cc: @nielserik to this conversation too

Comment by Marc Johnson [ 05/Nov/18 ]

Thanks Charlotte Whitt for posting that conversation here.

To try to add more clarity, I was attempting to describe how some FOLIO modules have currently interacted with other modules in order to fetch (and in some cases store) additional information about related records, or change the state of related records. I am not making a proposal as to how this should be done, within the scope of this issue.

Comment by Jakub Skoczen [ 05/Nov/18 ]

Marc Johnson Charlotte Whitt I understand that there are two issues that need to be resolved to address the particular use case Charlotte is talking about:

  • syncing changes across modules (where the mechanism used could include a direct API call or a publish/subscribe mechanism)
  • allow part of the record to be updated (we would need a mechanism to control which parts those are and the ability to see previous values)

Am I right?
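
For comparison with a direct API call, a publish/subscribe mechanism could look roughly like the sketch below. It is a minimal in-process illustration, not a description of any existing FOLIO component: `EventBus`, the event name `package.renamed`, and the record fields are all assumptions.

```python
from collections import defaultdict


class EventBus:
    """Minimal in-process publish/subscribe; a real system would use a message broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self._subscribers[event_type]:
            handler(payload)


containers = {"c1": {"packageId": "p1", "name": "Old name", "notes": "added by cataloger"}}
history = []  # (container id, field name, previous value)


def handle_package_renamed(event):
    # Inventory-side subscriber: updates only the name and keeps the previous value.
    for container_id, container in containers.items():
        if container["packageId"] == event["packageId"]:
            history.append((container_id, "name", container["name"]))
            container["name"] = event["newName"]


bus = EventBus()
bus.subscribe("package.renamed", handle_package_renamed)
bus.publish("package.renamed", {"packageId": "p1", "newName": "New name"})
```

Here the publisher does not need to know who consumes the event; the subscriber decides which part of its own record changes and records what the value was before.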

Comment by Charlotte Whitt [ 05/Nov/18 ]

Hi Marc Johnson - I understand that.
Just wanted to capture the talk we had Friday, so we don't need to dig it up at a later time.

Addressing these questions and coming up with proposals - would that be something the Back End team would discuss, and bring forward - or how should we best kick-start this topic, Jakub Skoczen?

CC: Khalilah Gambrell, Dennis Bridges, Ann-Marie Breaux

Comment by Charlotte Whitt [ 05/Nov/18 ]

(...) two issues that need to be resolved to address the particular use case Charlotte is talking about:

  • syncing changes across modules (where the mechanism used could include a direct API call or a publish/subscribe mechanism)
  • allow part of the record to be updated (we would need a mechanism to control which parts those are and the ability to see previous values)

Hi Jakub Skoczen - yes, that's correct!

Comment by Marc Johnson [ 05/Nov/18 ]

Charlotte Whitt It was more for new readers coming to this issue later

I think how this process starts is up to Jakub Skoczen (it seems similar and related to AES, and so I would suggest it starts with the Technical Council).

Jakub Skoczen My thoughts on your summary:

syncing changes across modules (where the mechanism used could include a direct API call or a publish/subscribe mechanism)

Would it be reasonable to summarise the example from last week as:

  • An authoritative source of packages is e-Holdings
  • An inventory container can be related to a package and some of its information is derived from that
  • When a package's name is changed in e-Holdings, a related container's name should change in inventory

To me, this could be viewed as two steps: synchronising inventory's view of packages; and deriving a container's name from a related package.

There seem to be different kinds of synchronisation described in these examples:

  1. Two contexts have different views of a type of record, one is authoritative and the other has a (possibly partial) copy, e.g. the item information that circulation has to support loan/request processing or presentation
  2. A kind of record in one context is (possibly partially) based upon a different kind of record in a different context, and some properties are derived based upon that relationship e.g. the above container and package relationship, where only name is derived and the rest of the container record is owned by inventory

Do those different kinds make sense? Do they both fall in the remit of synchronisation and this issue?
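
The two kinds listed above can be contrasted in a small sketch. The function names and record fields (`barcode`, `status`, `notes`) are assumptions chosen for illustration only.

```python
def project_item_for_circulation(item):
    """Kind 1: circulation holds a partial copy of an item that inventory owns."""
    wanted = ("id", "barcode", "status")
    return {key: item[key] for key in wanted if key in item}


def derive_container_name(container, package):
    """Kind 2: only `name` is derived from the package; inventory owns the rest."""
    return {**container, "name": package["name"]}


item = {"id": "i1", "barcode": "123", "status": "Available", "notes": "inventory-only detail"}
circulation_view = project_item_for_circulation(item)

container = {"id": "c1", "name": "Old name", "notes": "added by cataloger"}
synced = derive_container_name(container, {"id": "p1", "name": "New name"})
```

In the first case the copy is the same kind of record with fewer fields; in the second, two different kinds of record share a single derived property.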

The example I provided in the conversation last week is different, it described a situation where a process (check out) in one context (circulation) impacts the state (status changes from Available to Checked out) of a record (an item) in a different context (inventory).

Does this also fall under synchronisation? I'm inclined to suggest it doesn't, even if the technical mechanisms for achieving these turn out to be similar. (The container and package example can be reworded to be like this).

allow part of the record to be updated (we would need a mechanism to control which parts those are and the ability to see previous values)

I think the ability to partially update a record is a separate (but potentially related) topic. It might be useful for performing some kinds of changes, depending upon the approach chosen.
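
A partial update with control over which parts may change, and with the previous values kept visible, might be sketched like this. `apply_partial_update` and its `allowed_fields` parameter are hypothetical names, not an existing FOLIO API.

```python
def apply_partial_update(record, changes, allowed_fields):
    """Return (updated_record, previous_values) without mutating `record`."""
    rejected = set(changes) - set(allowed_fields)
    if rejected:
        raise ValueError(f"fields not open to update: {sorted(rejected)}")
    previous = {name: record.get(name) for name in changes}
    return {**record, **changes}, previous


container = {"id": "c1", "name": "Old name", "notes": "added by cataloger"}
updated, previous = apply_partial_update(
    container, {"name": "New name"}, allowed_fields={"name"}
)
```

The allow-list is one possible way to control which parts of a record an outside module may overwrite; returning the previous values leaves room for an audit trail or an undo.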

Comment by Jakub Skoczen [ 28/Jan/19 ]

The general architecture for synchronising state across multiple modules will include:

As Marc Johnson explained, partial record updates are likely a separate topic.

Comment by Charlotte Whitt [ 30/Jan/19 ]

Hi Jakub Skoczen, and Marc Johnson - sounds good. I have created: FOLIO-1756 Closed - Define and describe the architecture for how to keep data in sync across multiple apps for partial record updates

Comment by Ann-Marie Breaux (Inactive) [ 19/Jan/21 ]

Hi Charlotte Whitt I'm not sure how you want to change the description to reflect that MARCcat is no longer being developed, so I left it as-is. Please update however you think is appropriate. Note that quickMARC will be accessing the SRS MARC directly, rather than having separate storage like MARCcat.

Comment by Kelly Drake [ 26/Mar/21 ]

Hi all,  I updated the priority to P1.  After discussion among a number of POs, it was felt that the time has come in FOLIO's maturity to address this issue.

Comment by Ian Walls [ 23/Jun/21 ]

A minority opinion, and probably too late to matter, but I will still speak up.

I think we could save ourselves a LOT of headaches by making a generalized "storage service", rather than module-specific ones.  There would need to be a model-registration function so that different models could be added by different services.   In some cases, dependencies would exist (you can't logically have a Loan model without User and Item models), and this would be where foreign keys would come into play.  The Loan model wouldn't necessarily know anything about the User or Item model, other than it's linked to them.

I think this would probably be easier to do in GraphQL than REST; this pushes the request of specific linked-entity attributes onto the client instead of the individual module's APIs having to either stay entirely within their module (and thus do multiple API calls to different endpoints to get all the connected data) or make assumptions of another module's model.

This would not preclude any developer from making a module that uses its own storage method, but it would provide a common core utility in the platform that would reduce both data synchronization issues and HTTP overhead. 
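
The model-registration idea above could be sketched as follows. This is a toy in-memory illustration under assumptions: `StorageService`, its methods, and the `userId`/`itemId` fields are invented names, and a real service would sit on a database rather than dictionaries.

```python
class StorageService:
    """Hypothetical shared storage with model registration and foreign-key checks."""

    def __init__(self):
        self._models = {}   # model name -> {field name: referenced model name}
        self._records = {}  # model name -> {record id: record}

    def register_model(self, name, foreign_keys=None):
        foreign_keys = foreign_keys or {}
        missing = [m for m in foreign_keys.values() if m not in self._models]
        if missing:
            raise ValueError(f"{name} depends on unregistered models: {missing}")
        self._models[name] = foreign_keys
        self._records[name] = {}

    def save(self, model, record):
        # A linked record must exist, but its contents are never inspected.
        for field_name, target in self._models[model].items():
            if record[field_name] not in self._records[target]:
                raise KeyError(f"{field_name}={record[field_name]!r} not found in {target}")
        self._records[model][record["id"]] = record

    def get(self, model, record_id):
        return self._records[model][record_id]


service = StorageService()
service.register_model("User")
service.register_model("Item")
service.register_model("Loan", foreign_keys={"userId": "User", "itemId": "Item"})
service.save("User", {"id": "u1"})
service.save("Item", {"id": "i1", "status": "Available"})
service.save("Loan", {"id": "l1", "userId": "u1", "itemId": "i1"})
```

Registration fails if a model's dependencies are not yet registered, which mirrors the point that a Loan model cannot logically exist without User and Item models.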

Generated at Thu Feb 08 23:12:33 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.