...
- Delivery must be guaranteed with an at-least-once approach
- Firebird - check whether they can handle at-least-once delivery while processing the circulation log (deduplicating by a UUID field, or by a hash over a set of key fields)
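The deduplication idea in the Firebird item above can be sketched as follows. This is only an illustration, assuming the event is a flat JSON object: the field names (`id`, `userId`, `itemId`, `action`, `date`) and the `IdempotentLogWriter` class are hypothetical, and the in-memory set stands in for whatever persistent uniqueness check the module would really use.

```python
import hashlib
import json


def event_key(event, key_fields=("userId", "itemId", "action", "date")):
    """Return a stable identity for an event.

    Prefers an explicit UUID in the (hypothetical) "id" field; otherwise
    falls back to a SHA-256 hash over a fixed set of key fields.
    """
    if event.get("id"):
        return event["id"]
    subset = {name: event.get(name) for name in key_fields}
    # sort_keys makes the hash independent of field order in the payload
    canonical = json.dumps(subset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class IdempotentLogWriter:
    """Stores each event at most once, even if it is delivered repeatedly."""

    def __init__(self):
        self._seen = set()  # stand-in for a persistent uniqueness check
        self.records = []

    def handle(self, event):
        """Store the event; return False if it was a duplicate delivery."""
        key = event_key(event)
        if key in self._seen:
            return False
        self._seen.add(key)
        self.records.append(event)
        return True
```

With at-least-once delivery the same event may arrive more than once, so `handle` is safe to call repeatedly with the same payload; only the first call adds a record.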
Performance
AFAIK, PubSub performance was an issue for data import. To my knowledge, Vega hasn't had any issues with PubSub performance because we mainly use it for events triggered manually by users (check-out, manual fee/fine charge, etc.). But it can become an issue for libraries that use fixed schedules and have thousands of loans "ageing to lost" or being charged a fine at the same time.
Folijet - what are the performance requirements?
Retention policy
Nothing specific; the default should be enough
Versioning
Versioning should be supported
Payload size
Assumption: up to 100 KB
Vega - small JSON payloads (less than 1 KB)
Folijet - small JSON payloads
Firebird (Remote Storage) -
The existing scheme of module interaction through PubSub
...
- module decoupling - modules interact and exchange messages through standard Okapi calls
- the potential to replace the underlying transport mechanism (Kafka) with something else without having to refactor the client modules (i.e. the modules that use this PubSub). Open question: are there any other benefits?
- ability to use FOLIO permissions to control access
...
- It won't deliver newer messages until older ones are acknowledged, which helps with data consistency.
- Kafka's at-least-once semantics address reliability.
- Better performance, though it wasn't a problem in our case.
- Good HA, since every new instance connects to Kafka within a consumer group and events are distributed well across instances.
It will be good for customers because there will be fewer bugs (and even if not fewer, they will be easier to investigate and won't cause data inconsistency) - fail fast!
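The ordering, at-least-once and "fail fast" points above can be illustrated with a toy model. This is a plain-Python sketch that does not use a Kafka client: the list stands in for a single partition, `committed` stands in for the committed offset, and the `InOrderConsumer` name is hypothetical.

```python
class InOrderConsumer:
    """Toy model of in-order, at-least-once consumption with fail-fast."""

    def __init__(self, log, handler):
        self.log = log          # list of messages, standing in for one partition
        self.handler = handler  # callable that processes a single message
        self.committed = 0      # offset of the next message to process

    def poll(self):
        """Process messages in order and stop at the first failure.

        Returns the number of messages acknowledged in this call.
        """
        acknowledged = 0
        while self.committed < len(self.log):
            message = self.log[self.committed]
            try:
                self.handler(message)
            except Exception:
                # Fail fast: do not skip ahead; the same message is
                # delivered again on the next poll.
                break
            self.committed += 1  # acknowledge only after success
            acknowledged += 1
        return acknowledged
```

Because the offset advances only after the handler succeeds, a message that fails halfway through is delivered again, and nothing after it is processed in the meantime. That is why handlers need to tolerate duplicates.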
Configuration is more complex
Limitations, Risks and Assumptions
- All modules involved will have a Kafka client and will "know" that Kafka is being used as the transport mechanism. As a result, if it becomes necessary to move to another transport mechanism at some point in the future, changes will be required in all the modules involved. This risk can be mitigated by placing all the logic required to work through Direct Kafka in a separate library with designated interfaces. In that case, the logic of interaction through Direct Kafka will, in a sense, still be hidden from the business logic of the modules involved.
- In some cases clients may prefer the current mod-pubsub approach, which keeps delivering events even after failures, to the "fail fast" logic.
- folio-kafka-wrapper - is it for RMB? For Spring it should be much easier.
- There is no implemented approach for authorizing events in Kafka yet - a general solution will have to be followed once one is available...
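The mitigation for the first risk (a separate library with designated interfaces) could look roughly like this. It is a sketch under assumptions, not the API of folio-kafka-wrapper or any existing library: `EventTransport`, `InMemoryTransport`, `FeeFineNotifier` and the event type string are all hypothetical names, and the in-memory class only stands in for a Kafka-backed implementation.

```python
from abc import ABC, abstractmethod
from collections import defaultdict


class EventTransport(ABC):
    """The only interface that module business logic is allowed to see."""

    @abstractmethod
    def publish(self, event_type, payload):
        """Send a payload to all subscribers of event_type."""

    @abstractmethod
    def subscribe(self, event_type, handler):
        """Register a callable to receive payloads of event_type."""


class InMemoryTransport(EventTransport):
    """Stand-in implementation; a Kafka-backed class would replace it."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def publish(self, event_type, payload):
        for handler in list(self._handlers[event_type]):
            handler(payload)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)


class FeeFineNotifier:
    """Example business logic that depends only on EventTransport."""

    def __init__(self, transport):
        self._transport = transport

    def balance_changed(self, user_id, balance):
        # Hypothetical event type and payload shape, for illustration only
        self._transport.publish(
            "FEE_FINE_BALANCE_CHANGED",
            {"userId": user_id, "balance": balance},
        )
```

Switching transports would then mean providing another `EventTransport` implementation in the shared library, while classes such as `FeeFineNotifier` stay unchanged.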
Modules affected
Need to list modules participating in Circulation where refactoring will be required.
- mod-circulation - SOURCE / CONSUMER (from mod-feesfines)
- mod-feesfines (Vega) - SOURCE
- mod-patron-blocks (Vega) - CONSUMER
- mod-audit (Firebird) - CONSUMER
- mod-remote-storage (Firebird) - CONSUMER
Time and effort estimates
We also need to think about how the behavior can be tested / validated after implementation.
Is it possible to tune PubSub to cover all the needs? - No: there is no business value in building another Kafka, and guaranteed delivery over HTTP is not a trivial task