Page Properties

  • Submitted Date:
  • Approved Date:
  • Status: DRAFT
  • Impact: MEDIUM
  • Overrides/Supersedes:

...

Rationale

Evaluated Alternatives

Alternative: API Composition

API Composition is a pattern for implementing queries that span services in a microservice-based platform. In this approach the application performs the data join rather than the database. For example, a service (or the API gateway) could retrieve a customer and their orders by first retrieving the customer from the customer service and then querying the order service for the customer's most recent orders.

Reasoning: Redundant data are currently used for both visualization and filtering/search. The API Composition approach would add internal API calls, which might impact overall performance and efficiency. Certain rethinking of the workflow would also be required. (minus)
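As an illustration of the composition step, here is a minimal Python sketch: the service calls are stubbed with in-memory data, and all names and record shapes are hypothetical, not actual FOLIO APIs. The extra internal calls replacing a single database join are the performance cost noted above.

```python
# Illustrative sketch of API Composition: the composer (e.g. an API
# gateway) joins data in application code rather than in the database.
# In a real platform these stubs would be HTTP calls to other services.

CUSTOMERS = {"c1": {"id": "c1", "name": "Acme"}}
ORDERS = [
    {"id": "o1", "customerId": "c1", "createdAt": "2023-05-02"},
    {"id": "o2", "customerId": "c1", "createdAt": "2023-05-01"},
]

def get_customer(customer_id):
    """Stub for a call like GET /customers/{id} on the customer service."""
    return CUSTOMERS[customer_id]

def get_recent_orders(customer_id, limit=10):
    """Stub for a call like GET /orders?customerId=... on the order service."""
    matching = [o for o in ORDERS if o["customerId"] == customer_id]
    return sorted(matching, key=lambda o: o["createdAt"], reverse=True)[:limit]

def get_customer_with_orders(customer_id):
    # Two internal calls instead of one DB join - each composed query
    # multiplies the number of round trips between services.
    customer = dict(get_customer(customer_id))
    customer["orders"] = get_recent_orders(customer_id)
    return customer
```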

Alternative: Command Query Responsibility Segregation (CQRS)

CQRS maintains one or more materialized views that contain data from multiple services. The views are kept by services that subscribe to the events each service publishes when it updates its data. For example, the online store could implement a query that finds customers in a particular region and their recent orders by maintaining a view that joins customers and orders; the view is updated by a service that subscribes to customer and order events.

Reasoning: The view mechanism requires certain effort to implement and resources to keep the views up to date, while the currently known use cases are pretty straightforward. So this approach seems too complex for this particular case. (minus)
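For comparison, a minimal sketch of the materialized-view mechanism, with a plain dictionary standing in for the view store and direct function calls standing in for event subscriptions (all names and event shapes are illustrative):

```python
# Sketch of the CQRS alternative: a view service subscribes to customer
# and order events and maintains a denormalized, queryable view.

view = {}  # customerId -> {"customer": ..., "orders": [...]}

def on_customer_event(event):
    # Applied whenever the customer service publishes an update.
    entry = view.setdefault(event["id"], {"customer": None, "orders": []})
    entry["customer"] = event["data"]

def on_order_event(event):
    # Applied whenever the order service publishes a new order.
    entry = view.setdefault(event["customerId"],
                            {"customer": None, "orders": []})
    entry["orders"].append(event["data"])

def find_customers_in_region(region):
    # The query side reads only the pre-joined view.
    return [e for e in view.values()
            if e["customer"] and e["customer"].get("region") == region]

# Sample event stream.
on_customer_event({"id": "c1", "data": {"name": "Acme", "region": "EU"}})
on_order_event({"customerId": "c1", "data": {"id": "o1", "total": 100}})
```

The cost flagged in the reasoning above lives in the upkeep: every publisher needs events, and the view service must stay subscribed and consistent.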

Alternative: Domain-event pattern for change notifications, plus data normalization to simplify data synchronization

Implement a notification channel for simple though guaranteed delivery of changes, and improve data normalization to achieve 1-to-1 updates.

Reasoning: This enables keeping the current FOLIO approach with certain data redundancy while solving the eventual consistency issue with minimal effort. (plus)

The solution consists of two parts:

...

[Drawio diagram: Data Consistency and Message Driven Approach]

Note that Kafka imposes no explicit limit on the number of consumer groups that can be instantiated for a particular topic. However, be aware that the more consumer groups there are, the bigger the impact on network utilization (http://kafka.apache.org/090/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html).

Option 1 - Data normalization

(warning) RA - Is it possible to use this approach with a JSON-B structure and to sort by vendor_code located in another table? How might this affect query performance?

Raman A

A quick POC demonstrated that it is not possible to use the SQL sort operator under these conditions with JSON-B.

So, this proposal is only feasible in the case of migration from JSON-B to a standard relational structure for all modules-recipients.

It is worth noting that, according to some performance analysis conducted by Taras Spashchenko (not sure if it was formally documented (sad)), the relational structure demonstrates much better performance than the JSON-B one.

With that, migration to a standard relational structure looks efficient, though it takes certain effort for refactoring and testing.

At the moment, records in modules-recipients contain redundant data explicitly, i.e. an order may contain an explicit vendor code or fund code. This approach admits a 1-to-many relation, where one and the same code from a module-source is used in many records of a module-recipient. Therefore, a change of one value requires updates of many records, which raises such concerns as update consistency and synchronization lag.

The solution is to slightly improve data normalization and transform these relations into a 1-to-1 pattern, which significantly improves the overall experience.

For that, it is proposed to move the explicit redundant data from the main data of modules-recipients into new separate tables-vocabularies, and to add relations between them via unique identifiers, as shown in the picture below:


Data processing (search, filtering, etc.) should then be implemented in the DB engine via a join operation. Keeping in mind that a table-vocabulary is expected to have a small number of records, it can easily be indexed, so the join should not negatively impact query performance.
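A sketch of the proposed layout, using Python's built-in sqlite3 in place of PostgreSQL. Table and column names here are assumptions for illustration, not the actual FOLIO schema; the point is that a code change becomes a single-row UPDATE, while reads (including sorting by vendor_code, per the question above) go through the join.

```python
import sqlite3

# Vendor code moves out of each order row into a small vocabulary table;
# orders keep only a foreign key to it.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE vendor_vocabulary (
        id INTEGER PRIMARY KEY,
        vendor_code TEXT NOT NULL
    );
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        vendor_id INTEGER NOT NULL REFERENCES vendor_vocabulary(id)
    );
    CREATE INDEX idx_orders_vendor ON orders(vendor_id);
""")
db.execute("INSERT INTO vendor_vocabulary VALUES (1, 'GOBI')")
db.executemany("INSERT INTO orders VALUES (?, 1)",
               [(n,) for n in range(1000)])

# A vendor code change now touches exactly one row...
db.execute("UPDATE vendor_vocabulary SET vendor_code = 'GOBI-NEW' WHERE id = 1")

# ...yet every order observes the new value through the join, and
# ORDER BY on the joined vendor_code column works.
rows = db.execute("""
    SELECT o.id, v.vendor_code
    FROM orders o
    JOIN vendor_vocabulary v ON o.vendor_id = v.id
    ORDER BY v.vendor_code, o.id
""").fetchall()
```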

This approach isolates external data (which can potentially change at any point in time) into a separate table-vocabulary. In turn, in case of changes:

  • only the table-vocabulary needs to be updated, without any impact on other data;
  • a single record is updated instead of dozens or thousands;
  • database usage (e.g. disk space and indexing) can also potentially improve.

In cases with 1-to-1 relations the described additional normalization is likely not required.

[Drawio diagram: Eventual Consistency For Redundant Data - Option 1]

Work breakdown draft structure

  • Design data model for improved normalization
  • Implement a migration script (either SQL or Liquibase)
  • Test migration script
  • Update existing business logic to enable new data model (most likely - update SQL statements to support join)
  • Test updated business logic
  • Add support of domain event and Kafka client in module-source
  • Add support of domain event and Kafka client in module-recipient
  • Implement a logic to handle domain event and update data model
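The last work item - handling a domain event and updating the data model - could look roughly like this sketch. The event shape and field names are assumptions for illustration, not a FOLIO contract, and a dictionary stands in for the table-vocabulary.

```python
# Sketch: a module-recipient applies a domain event received from a
# module-source (via Kafka) to its local vocabulary data.

vocabulary = {"v1": {"id": "v1", "vendor_code": "GOBI"}}

def handle_domain_event(event):
    # Hypothetical event shape:
    #   {"type": "UPDATED", "id": ..., "new": {...}} or
    #   {"type": "DELETED", "id": ...}
    if event["type"] == "UPDATED":
        vocabulary[event["id"]].update(event["new"])
    elif event["type"] == "DELETED":
        vocabulary.pop(event["id"], None)
```

Because the redundant value lives in exactly one vocabulary record, the handler performs a single 1-to-1 update per event.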

Option 2 - De-normalized data processing

It is still possible to handle all the changes even without explicit normalization. For that, it is proposed to split the reaction to a change event into two phases. Phase 1 receives an event, queries the data storage to retrieve the full list of items to be updated, and pushes information about every item (at least its ID) as a separate message to a Kafka topic. Phase 2 receives all such messages one by one (with a single processor, or a pool of them for scalability and performance) and makes the required update.
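The two phases can be sketched as follows, with an in-memory queue standing in for the per-item Kafka topic; the record shapes and field names are illustrative assumptions.

```python
from queue import Queue

# Sketch of Option 2: de-normalized storage keeps the vendor code
# embedded in every order, so one source change fans out into many
# per-item update messages.

item_updates = Queue()  # stands in for the per-item Kafka topic

records = {
    "o1": {"vendor_code": "GOBI"},
    "o2": {"vendor_code": "GOBI"},
    "o3": {"vendor_code": "EBSCO"},
}

def phase1(event):
    # Phase 1: on a change event, query storage for every affected
    # record and publish one message per record ID.
    for rec_id, rec in records.items():
        if rec["vendor_code"] == event["old"]:
            item_updates.put({"id": rec_id, "vendor_code": event["new"]})

def phase2():
    # Phase 2: a single processor (or a pool, for scalability) drains
    # the topic and applies each update individually.
    while not item_updates.empty():
        msg = item_updates.get()
        records[msg["id"]]["vendor_code"] = msg["vendor_code"]

phase1({"old": "GOBI", "new": "GOBI-NEW"})
phase2()
```

Splitting the fan-out (phase 1) from the per-item work (phase 2) lets the updates be retried and parallelized message by message instead of as one large transaction.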

[Drawio diagram: Eventual Consistency for Redundant Data - Option 2]

Consistency for Dangling References

Deletion of core-module records may leave dangling references from non-core modules.

  • UICR-125 - Data corruption. When holding/item data are moved in Inventory, then the item in Courses is not updated accordingly (OPEN)
  • MODORDERS-642 - Data corruption. When holding/item data are moved in Inventory, then the connected Order lines are not updated accordingly (IN REFINEMENT)
  • UIREQ-589 - [BE] Data corruption. When holding/item data are moved in Inventory, then the connected Request is not updated accordingly (OPEN)
  • UIU-2082 - [BE] Data corruption. When holdings/item data are moved in Inventory, then the connected Fee/Fine is not updated accordingly (CLOSED)

From some perspective, this case is similar to the case of eventual consistency for duplicated data - a robust notification channel is required to notify modules-recipients about changes that happen in modules-sources.
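For illustration, a module-recipient's reaction to a core-record deletion event delivered over such a channel might look like this sketch; the record shapes and the choice to null out the reference (rather than, say, flag it for review) are hypothetical.

```python
# Sketch: a non-core module (e.g. Courses) clears references to a
# deleted core record so they do not dangle.

course_reserves = {
    "r1": {"itemId": "i1"},
    "r2": {"itemId": "i2"},
}

def on_item_deleted(item_id):
    # Triggered by a deletion domain event from the core module.
    for reserve in course_reserves.values():
        if reserve.get("itemId") == item_id:
            reserve["itemId"] = None  # or re-link / flag for review
```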

Consistency for Distributed Business Operations

...