...

This page analyzes the current experience of using the PubSub mechanism, in particular in the Circulation application, and studies the feasibility of moving to the Direct Kafka approach.

Link to Jira: UXPROD-3764

Additional information on the topic

...

  • maximum number of loans per user is limited to 10 by automated patron blocks configuration
  • a user has no open loans at the moment
  • user checks out an item, but ITEM_CHECKED_OUT event does NOT reach mod-patron-blocks (which keeps count of loans for every user)
  • over the next few months user checks out 10 more items, each time a corresponding event reaches mod-patron-blocks successfully
  • library notices that user has 11 open loans, while the limit is 10
  • library reports a bug in mod-patron-blocks - the most likely culprit from the user's perspective
  • during investigation a developer discovers that the block was not imposed because of a failed event delivery which took place months ago
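The scenario above can be reproduced with a small simulation. This is a minimal sketch, not mod-patron-blocks code: `LoanCounter`, `MAX_LOANS` and `simulate` are hypothetical names, the block is assumed to apply once the counted loans reach the limit, and the failed delivery is modelled by simply skipping one event.

```python
MAX_LOANS = 10  # limit from the scenario above


class LoanCounter:
    """Hypothetical stand-in for the per-user loan count kept by mod-patron-blocks."""

    def __init__(self):
        self.open_loans = 0

    def on_item_checked_out(self):
        self.open_loans += 1

    def is_blocked(self):
        # Assumption: the user is blocked once the counted loans reach the limit.
        return self.open_loans >= MAX_LOANS


def simulate(checkouts, lost_event_indexes):
    """Return (actual open loans, counted open loans, per-attempt allowed flags)."""
    counter = LoanCounter()
    actual = 0
    allowed = []
    for i in range(checkouts):
        # The checkout is refused only if the counter already shows a block.
        if counter.is_blocked():
            allowed.append(False)
            continue
        actual += 1
        allowed.append(True)
        # A lost ITEM_CHECKED_OUT event never reaches the counter.
        if i not in lost_event_indexes:
            counter.on_item_checked_out()
    return actual, counter.open_loans, allowed


# One lost event (the very first checkout) lets an 11th loan through.
actual_lost, counted_lost, allowed_lost = simulate(11, {0})

# With reliable delivery the 11th attempt is refused.
actual_ok, counted_ok, allowed_ok = simulate(11, set())
```

In the first run the counter ends at 10 while the user actually holds 11 loans, and nothing in the counter's state points back to the lost event, which is why such problems are hard to investigate.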

One more real-life example: sometimes the system blocks a user because a patron block limit is triggered, even though the block is not justified. In such cases, FSE receives support tickets in Jira. Currently there are several dozen similar tickets from different libraries on this topic. According to the FSE team, in the vast majority of cases the problem is solved by re-synchronization between mod-circulation and mod-patron-blocks, performed by running a special Jenkins job.

Assumption: Based on the information received from the FSE and Vega teams, their experience in analyzing the cases described, and the available statistics, it is reasonable to assume that these cases are caused by the PubSub issues described above.

Consequences of the Push mechanism during Data Import

The existing PubSub is a Push mechanism. During a large import job, Source Record Manager would place large numbers of messages (one per record) into the queue. Mod-pubsub would then push these into the callback provided by mod-inventory. Mod-inventory had no means to say “enough already”, so it would get overloaded and crash. This was discussed with Folijet previously, and no viable solution was found.
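The difference between the two delivery styles can be illustrated with a toy model. This is a sketch under stated assumptions, not code from mod-pubsub or mod-inventory: the `Consumer` class, its `capacity` (assumed to be positive) and the overload behaviour are all hypothetical.

```python
from collections import deque


class Overloaded(Exception):
    """Raised when more work is forced on the consumer than it can hold."""


class Consumer:
    def __init__(self, capacity):
        self.capacity = capacity  # max records held in memory at once (assumed)
        self.in_flight = deque()
        self.processed = 0

    def on_push(self, record):
        """Push style: the sender decides when a record arrives."""
        if len(self.in_flight) >= self.capacity:
            raise Overloaded("no way to tell the sender to slow down")
        self.in_flight.append(record)

    def pull_from(self, queue):
        """Pull style: take only as many records as there is free capacity."""
        free = self.capacity - len(self.in_flight)
        for _ in range(min(free, len(queue))):
            self.in_flight.append(queue.popleft())

    def work(self):
        """Process everything currently held."""
        self.processed += len(self.in_flight)
        self.in_flight.clear()


def run_push(records, capacity):
    consumer = Consumer(capacity)
    for record in records:  # the sender pushes the whole import at once
        consumer.on_push(record)
    consumer.work()
    return consumer.processed


def run_pull(records, capacity):
    consumer = Consumer(capacity)
    queue = deque(records)
    while queue:  # the consumer sets the pace
        consumer.pull_from(queue)
        consumer.work()
    return consumer.processed
```

In this model `run_pull(range(1000), 50)` works through all records in batches of at most 50, whereas `run_push(range(1000), 50)` raises `Overloaded` as soon as the 51st record arrives before any work has been done.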

The proposed scheme of modules interaction through Direct Kafka

...

  • Guaranteed delivery provided by Kafka addresses the reliability concern
  • Improved data consistency, since Kafka preserves message order within a partition and a consumer moves on to newer messages only after older ones are acknowledged
  • Better performance by eliminating the overhead of multiple HTTP calls per event dispatch
  • Enabling good HA, since every new Event Consumer instance connects to Kafka within a consumer group, so that the load is distributed evenly
  • Improved manageability because of easier investigation capabilities, less data inconsistency, and following a fail-fast approach
  • The Pull mechanism provided by Direct Kafka (as implemented in Data Import): this implementation places the consumer code in mod-inventory, which pulls messages from Kafka when it has capacity
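How offset-based consumption supports redelivery after a failure can be sketched with an in-memory log. This models the general idea only (an append-only log plus a committed offset per consumer group); it does not use a Kafka client, and `TopicLog`, `poll`, `commit`, `consume` and the group name `patron-blocks` are hypothetical names chosen for illustration.

```python
class TopicLog:
    """Toy single-partition log with one committed offset per consumer group."""

    def __init__(self):
        self.records = []
        self.committed = {}  # group id -> next offset to read

    def append(self, record):
        self.records.append(record)

    def poll(self, group, max_records):
        start = self.committed.get(group, 0)
        return start, self.records[start:start + max_records]

    def commit(self, group, offset):
        self.committed[group] = offset


def consume(log, group, handler, max_records=2):
    """Process records in order; commit an offset only after the handler succeeds."""
    while True:
        start, batch = log.poll(group, max_records)
        if not batch:
            return
        for i, record in enumerate(batch):
            handler(record)                   # may raise
            log.commit(group, start + i + 1)  # record is now acknowledged


log = TopicLog()
for event in ["A", "B", "C"]:
    log.append(event)

seen = []
failed_once = set()


def flaky_handler(record):
    # Fails the first time it sees "B", succeeds on every later attempt.
    if record == "B" and record not in failed_once:
        failed_once.add(record)
        raise RuntimeError("simulated consumer crash")
    seen.append(record)


try:
    consume(log, "patron-blocks", flaky_handler)
except RuntimeError:
    pass  # the consumer instance died; only "A" had been committed

consume(log, "patron-blocks", flaky_handler)  # a restart resumes at "B"
```

Because the offset for "B" was never committed, the restarted consumer receives "B" again before "C", so no event is silently lost and the order is preserved. The trade-off in such a scheme is that a handler may see the same record more than once and should therefore be idempotent.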

Limitations, Risks and Assumptions

...

Quick T-shirt estimate: L to XXL.

Spike scope

  • How to configure and connect to Kafka broker and topic?
  • How to create Kafka topics?
  • How many Kafka topics should be used: one topic per event type, or one for all types?
  • ..