DRAFT! DRAFT! DRAFT! THIS IS NOT A PROPOSAL AT THE MOMENT

...

  • maximum number of loans per user is limited to 10 by automated patron blocks configuration
  • a user has no open loans at the moment
  • the user checks out an item, but the ITEM_CHECKED_OUT event does NOT reach mod-patron-blocks (which keeps count of loans for every user)
  • over the next few months the user checks out 10 more items, and each time the corresponding event reaches mod-patron-blocks successfully
  • the library notices that the user has 11 open loans, while the limit is 10
  • the library reports a bug in mod-patron-blocks, the most likely culprit from the user's perspective
  • during investigation a developer discovers that the block was not imposed because of a failed event delivery that took place months ago
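
The drift described in this scenario can be reproduced with a small simulation. This is only an illustrative sketch: the function name `simulate` and the rule "refuse a check-out once the counted loans reach the limit" are assumptions made for the example, not the actual mod-patron-blocks logic.

```python
LOAN_LIMIT = 10  # limit taken from the scenario above


def simulate(delivered_flags):
    """Return (actual_open_loans, counted_open_loans) after a series of check-outs.

    delivered_flags[i] tells whether the ITEM_CHECKED_OUT event for check-out i
    reached the module that counts loans. A check-out is refused as soon as
    the *counted* number of loans has reached the limit.
    """
    actual = 0
    counted = 0
    for delivered in delivered_flags:
        if counted >= LOAN_LIMIT:
            break  # block imposed, no further check-outs
        actual += 1
        if delivered:
            counted += 1
    return actual, counted


# The first event is lost, the next ten are delivered.
print(simulate([False] + [True] * 10))  # (11, 10)
# With reliable delivery the eleventh check-out is refused.
print(simulate([True] * 11))            # (10, 10)
```

A single lost event is enough to leave the counter permanently one behind, which is why the problem surfaces only months later.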

Consequences of the Push mechanism during Data Import

The existing PubSub is a Push mechanism. Source Record Manager would place large numbers of messages (one per record) into the queue during a large import job. Mod-pubsub would then push these into the callback function provided by mod-inventory. There was no means for mod-inventory to say “enough already”, so it would get overloaded and crash. This was discussed with Folijet previously, and no viable solution was found.
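
The difference between the two delivery styles can be sketched with an in-memory simulation. The classes `PushConsumer` and `PullConsumer` and the capacity of 100 are hypothetical names and values chosen for illustration; no real mod-pubsub or Kafka API is used here. The push side models a burst that arrives faster than it can be processed, so nothing is drained during the burst.

```python
from collections import deque


class PushConsumer:
    """Receives whatever is pushed to it; pending work grows without bound."""

    def __init__(self):
        self.pending = deque()
        self.peak_pending = 0

    def on_event(self, event):
        # Callback invoked by the sender; the consumer cannot refuse the call.
        self.pending.append(event)
        self.peak_pending = max(self.peak_pending, len(self.pending))


class PullConsumer:
    """Fetches at most `capacity` events per poll, and only when it is ready."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.peak_pending = 0
        self.processed = 0

    def poll(self, log):
        batch = [log.popleft() for _ in range(min(self.capacity, len(log)))]
        self.peak_pending = max(self.peak_pending, len(batch))
        self.processed += len(batch)  # stand-in for real processing
        return batch


records = list(range(10_000))  # one message per imported record

push = PushConsumer()
for record in records:
    push.on_event(record)

log = deque(records)  # messages wait in the log, not in the consumer
pull = PullConsumer(capacity=100)
while log:
    pull.poll(log)

print(push.peak_pending)  # 10000
print(pull.peak_pending)  # 100
```

In the pull variant the backlog stays in the log, so the size of the import job no longer determines how much work the consumer holds at once.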

The proposed scheme of modules interaction through Direct Kafka

...

[Diagram: 35 Direct Kafka]

Requirements Addressing

The key benefits are listed below:

  • Guaranteed delivery provided by Kafka addresses the reliability concern
  • Improved data consistency, since Kafka preserves message order within a partition and a consumer reads messages in offset order
  • Better performance by eliminating the overhead of multiple HTTP calls per event dispatch
  • Enabling good HA, since every new Event Consumer instance connects to Kafka within a consumer group, so that the load is distributed evenly
  • Improved manageability thanks to easier investigation, less data inconsistency, and a fail-fast approach
  • The Pull mechanism provided by Direct Kafka (as implemented in Data Import): this implementation places the consumer code in mod-inventory, which pulls messages from Kafka when it has capacity
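
How load spreads across a consumer group can be illustrated with a simplified assignment function. This is a sketch only: the `assign` helper and the instance names are hypothetical, and Kafka's real partition assignors are more elaborate than plain round-robin.

```python
def assign(partitions, consumers):
    """Spread partitions over group members round-robin.

    Returns a dict mapping each consumer to its list of partitions.
    """
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = list(range(6))

# Two instances share six partitions.
print(assign(partitions, ["inventory-1", "inventory-2"]))
# {'inventory-1': [0, 2, 4], 'inventory-2': [1, 3, 5]}

# A third instance joins the group and takes over part of the load.
print(assign(partitions, ["inventory-1", "inventory-2", "inventory-3"]))
# {'inventory-1': [0, 3], 'inventory-2': [1, 4], 'inventory-3': [2, 5]}
```

Each partition is read by only one member of a group at a time, so instances beyond the number of partitions would stay idle.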

...

Limitations, Risks and Assumptions

  • Configuration (including Kafka, topics, consumer groups, authorization) is more complicated than with PubSub
  • While Kafka supports exactly-once delivery, the at-least-once implementation is simpler and more manageable. In turn, at-least-once means that the Event Consumer must be prepared to handle potential duplicate events
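
One common way to tolerate duplicates under at-least-once delivery is to make the consumer idempotent by remembering the identifiers of events it has already applied. The sketch below assumes every event carries a unique `id` field; that field, `userId`, and the class name are hypothetical and not taken from the actual event schema.

```python
class IdempotentLoanCounter:
    """Counts open loans per user and ignores redelivered events."""

    def __init__(self):
        self.seen_ids = set()   # in a real module this would need to be persisted
        self.open_loans = {}

    def handle(self, event):
        """Apply the event once; return False if it was a duplicate."""
        if event["id"] in self.seen_ids:
            return False
        self.seen_ids.add(event["id"])
        user = event["userId"]
        self.open_loans[user] = self.open_loans.get(user, 0) + 1
        return True


counter = IdempotentLoanCounter()
event = {"id": "e-1", "type": "ITEM_CHECKED_OUT", "userId": "u-1"}

print(counter.handle(event))  # True, first delivery is applied
print(counter.handle(event))  # False, redelivery is ignored
print(counter.open_loans)     # {'u-1': 1}
```

For this to hold across restarts, the set of seen identifiers would have to be stored together with the state change it protects.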

...

  • All modules involved will have a Kafka client and "know" that Kafka is being used as the transport mechanism. As a result, if it becomes necessary to move to another transport mechanism at some point in the future, changes will be required in all the modules involved. This risk can be partially mitigated by placing all the logic required to work through Direct Kafka in a separate library with designated interfaces. In this case, the logic of interaction through Direct Kafka will, in a sense, still be hidden from the business logic of the modules involved. Note: there is folio-kafka-wrapper, which provides some useful functionality; for Spring-based modules this should be much easier
  • At the moment there is no implemented approach to address security concerns (including authorization) for Kafka; a general solution will have to be followed once it is defined
  • There are concerns about the 'informal' way of managing dependencies on specific versions of events given that any new subscriber could subscribe to any given published event - we don't know who all the subscribers are.
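
The mitigation suggested above for the transport lock-in risk, a separate library with designated interfaces, could look roughly like the following. All names here (`EventTransport`, `InMemoryTransport`, `register_loan_counter`) are hypothetical and do not reflect the API of folio-kafka-wrapper; the in-memory class stands in for a Kafka-backed implementation.

```python
from abc import ABC, abstractmethod
from collections import defaultdict


class EventTransport(ABC):
    """The only interface that business logic is allowed to depend on."""

    @abstractmethod
    def publish(self, topic, event):
        ...

    @abstractmethod
    def subscribe(self, topic, handler):
        ...


class InMemoryTransport(EventTransport):
    """Stand-in implementation; a Kafka-backed one would expose the same methods."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def publish(self, topic, event):
        for handler in self._handlers[topic]:
            handler(event)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)


def register_loan_counter(transport, counts):
    """Business logic: it sees only EventTransport, never the concrete transport."""

    def on_checked_out(event):
        user = event["userId"]
        counts[user] = counts.get(user, 0) + 1

    transport.subscribe("ITEM_CHECKED_OUT", on_checked_out)


counts = {}
transport = InMemoryTransport()
register_loan_counter(transport, counts)
transport.publish("ITEM_CHECKED_OUT", {"userId": "u-1"})
print(counts)  # {'u-1': 1}
```

Swapping the transport would then mean providing another `EventTransport` implementation, while functions such as `register_loan_counter` stay unchanged.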

Modules affected

Below is the list of modules participating in Circulation where refactoring will be required:

...