DR-000030 - Data consistency and message driven approach

Submitted Date

  

Approved Date


Status

DRAFT

Impact

MEDIUM

 

Overrides/Supersedes 

This decision was migrated from the Tech Leads Decision Log as part of a consolidation process.  The original decision record can be found here.

RFC 

N/A

Stakeholders

  • Front-end and back-end developers who encounter data consistency issues

Contributors

Raman Auramau 

Approvers

Background/Context

This solution design document describes the FOLIO cross-module data consistency problem, the alternatives considered, and the resulting decision.

Data consistency refers to whether copies of the same data kept in different places match. As a distributed, microservices-based platform, FOLIO consists of separate modules, each with its own data schema and storage. FOLIO follows the principle that each module’s persistent data is private to that module and accessible only via its API. A module’s transactions involve only its own database.

This approach has some drawbacks, among them:

  • Implementing business transactions that span multiple services is not straightforward,
  • Implementing queries that join data that is now in multiple databases is challenging.

ARCH-5 - Spike : Data consistency approach for Folio OPEN

The currently known FOLIO data consistency issues show that FOLIO is affected by both of the shortcomings mentioned above. The issues fall into the following groups:

  • PRIORITY 1: eventual consistency for redundant / duplicated data, when some data is duplicated in two storages and must be kept synchronized,
  • PRIORITY 2: consistency for dangling / lost references, when an item is deleted from one module, leaving lost references to it in other modules; this problem is succinctly, if frustratingly, captured in the PR discussion related to UITEN-128,
  • PRIORITY 3: data consistency during distributed business operations, when data in several separate storages must be modified (mod-finance-storage, mod-invoice-storage, mod-orders-storage),
  • update collisions ... (check with Jacub?)

Eventual consistency for duplicated data

Brief context: a source module owns an entity, and a particular field of that entity is duplicated into one or more entities of another module (e.g., for search, filtering, or sorting). If the original field value changes in the source module, the change must be replicated to every copy.

Identified cases:

  1. The pair of RefNumber and RefType should be kept consistent between the POL and the invoice line. MODORDERS-421 - Spike : User should be able to edit Pair of "refNumber" and "refNumberType" in the POL and see that update on the Invoice line OPEN
    1. mod-orders → mod-invoice
    2. 1-to-1 relation: one pair of refNumber/refNumberType to one invoice record ((question) not sure, needs to be confirmed)
  2. VendorCode should be kept consistent between the Organization record and purchaseOrder.vendorCode. MODORDERS-398 - Data consistency needed : Update "vendorCode" in related purchase orders BLOCKED
    1. mod-organizations → mod-orders
    2. 1-to-many relation: one vendor code can be used in many orders
  3. FundCode should be kept consistent between the Fund record and pol.fundDistribution.code.
    1. mod-finance → mod-orders
    2. 1-to-many relation: one fund code can be used in many orders
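
The 1-to-many vendor code case can be illustrated with a minimal in-memory sketch. It is not a FOLIO implementation: the `FieldChanged` event, the `EventBus` class, the `OrdersStore` class, and the `vendorId` field are hypothetical names introduced only for illustration, and the bus delivers events synchronously, whereas a real message broker would deliver them asynchronously. The sketch assumes that the source module publishes a change event and that the recipient module updates every record holding the duplicated value.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class FieldChanged:
    """Hypothetical domain event: a field of a source-owned entity changed."""
    entity_type: str  # e.g. "organization"
    entity_id: str
    field_name: str   # e.g. "code"
    new_value: str


class EventBus:
    """In-memory stand-in for a message broker; delivery is synchronous here."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[FieldChanged], object]]] = {}

    def subscribe(self, entity_type: str, handler: Callable[[FieldChanged], object]) -> None:
        self._handlers.setdefault(entity_type, []).append(handler)

    def publish(self, event: FieldChanged) -> None:
        for handler in self._handlers.get(event.entity_type, []):
            handler(event)


@dataclass
class OrdersStore:
    """Recipient side: purchase orders that hold a duplicated vendorCode."""
    orders: List[dict] = field(default_factory=list)

    def on_organization_changed(self, event: FieldChanged) -> int:
        """Update the duplicated code; return how many orders were changed."""
        if event.field_name != "code":
            return 0
        updated = 0
        for order in self.orders:
            # "vendorId" is an assumed link from an order to its organization.
            if order["vendorId"] == event.entity_id and order["vendorCode"] != event.new_value:
                order["vendorCode"] = event.new_value
                updated += 1
        return updated


bus = EventBus()
store = OrdersStore(orders=[
    {"id": "po-1", "vendorId": "org-1", "vendorCode": "OLD"},
    {"id": "po-2", "vendorId": "org-1", "vendorCode": "OLD"},
    {"id": "po-3", "vendorId": "org-2", "vendorCode": "OTHER"},
])
bus.subscribe("organization", store.on_organization_changed)
bus.publish(FieldChanged("organization", "org-1", "code", "NEW"))
```

The handler is written to be idempotent: delivering the same event twice changes nothing the second time, which matters if the transport may redeliver messages.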

Outstanding questions

  • (plus) Raman Auramau: How many rows of data can be affected?
    • Raman Auramau: The assumption is up to tens of thousands (e.g., changing a vendor or fund code can affect thousands of orders or more).
  • (warning) Raman Auramau: Are there any specific performance requirements (i.e., how fast must data be synchronized)?
  • (warning) Raman Auramau: What is the expected behavior if synchronization fails? Options: rollback, continue from the same record, retry (1..N times), report an error.
  • (warning) Raman Auramau: What is the allowable lag between changing a value in the source module and updating it in the recipient module?
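
The failure-handling question above is still open. As one possible combination of the listed options (retry 1..N times, then report an error and continue with the next record, without rollback), a minimal sketch might look like the following. The function `sync_records`, its parameters, and the `flaky_update` helper are hypothetical and only illustrate the idea; they do not represent a decision.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def sync_records(
    record_ids: Iterable[str],
    apply_update: Callable[[str], None],
    max_attempts: int = 3,
) -> Tuple[List[str], List[str]]:
    """Apply an update to each record, retrying each one up to max_attempts.

    Returns (synced_ids, failed_ids). Failed records are reported to the
    caller; already synced records are not rolled back.
    """
    synced: List[str] = []
    failed: List[str] = []
    for record_id in record_ids:
        for attempt in range(1, max_attempts + 1):
            try:
                apply_update(record_id)
                synced.append(record_id)
                break
            except Exception:
                # Give up on this record after the last attempt and move on.
                if attempt == max_attempts:
                    failed.append(record_id)
    return synced, failed


# Usage: "po-2" fails once and then succeeds, "po-3" always fails.
attempts: Dict[str, int] = {}


def flaky_update(record_id: str) -> None:
    attempts[record_id] = attempts.get(record_id, 0) + 1
    if record_id == "po-2" and attempts[record_id] < 2:
        raise RuntimeError("transient failure")
    if record_id == "po-3":
        raise RuntimeError("permanent failure")


synced, failed = sync_records(["po-1", "po-2", "po-3"], flaky_update)
```

With tens of thousands of affected rows, a real implementation would likely also need batching and a delay between retries; both are left out here to keep the sketch short.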

Assumptions

N/A

Constraints

N/A

Rationale

Decision

Implications

  • Pros
    • N/A
  • Cons
    • N/A

Other Related Resources

References

On distributed updates and eventual consistency

2021-05-12 Meeting notes - there was a lot of discussion on related issues

Meeting notes

Agenda

  • (plus) Review current context of FOLIO Data Consistency
  • (plus) Review identified groups of Data Consistency issues - are there any to be added?
    • added a new one for update collisions
  • (plus) What are the priorities?
  • (plus) Review a proposed solution for Eventual consistency for redundant data
  • (plus) Identify next steps

Tech Leads meeting

  • (plus) Communicate the proposed solution for Eventual consistency for redundant data on Tech Leads meeting
  • (plus) Next steps from my end - no objections from Tech Leads to go ahead
    • I propose considering a FOLIO Domain Event Bus on top of Kafka as a potential pattern (without firmly committing to it right now)
    • This needs to be worked through in full detail during implementation for a particular case (Thunderjet team has some examples)
    • After that - document and review the design and implementation, and present on Tech Leads meeting as a detailed design
  • Action items:
    • (warning) Sync with Charlotte W regarding the cross apps consistency
    • (warning) Chat with Mikhail F and Vladimir S regarding Kafka usage including topic naming conventions etc.
    • (plus) Meet with Thunderjet team and Dennis B to agree on capacity and planning
    • (warning) Raman to continue design work and keep in mind Jacub's proposal (see in comments)

Thunderjet grooming session

  • (plus) Raman shared current status and suggested plan; Thunderjet is ok to go ahead
  • UIREC-135 - Update create, edit and receive piece forms with additional fields CLOSED - this is the top-priority data consistency issue for duplicated data and is required to complete the current Thunderjet feature; agreed to start with this issue
  • Raman to work with Andrei Makaranka on details