2020-02-28 - System Operations and Management SIG Notes

Date

Attendees

Goals

  • Learning about PubSub integration

Discussion items

Time  Item  Who  Notes
5  Welcome  Ingolf
  • Find a note taker
10  Support SIG

Looking for a SysOps & Management SIG representative.

Support SIG Charge

30+  PubSub integration

Vince will talk about PubSub integration.

The initial slide has links to the Pub/Sub design. The code is available on GitHub.

Slides from today's presentation are available: link to slides.

Pub/Sub helps move FOLIO away from tightly coupled microservices and helps eliminate the need to know which other microservices and APIs must be called.

Modules in FOLIO can register during tenant configuration as publishers of particular events and subscribers of events.

Currently FOLIO has:

  • Event descriptors that describe event types
  • Multiple subscribers of events
  • Multiple publishers
  • Time to live for events
  • Optional callbacks to publishers if an event has not been distributed to any subscriber by the time it times out
  • Callback to the publisher when the last subscriber deletes its subscription
  • PubSub client can provide utilities to manage registrations and publishing
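
The registration and fan-out model described above can be sketched with a small in-memory stand-in. This is an illustration only, not the FOLIO Pub/Sub API: the class `InMemoryPubSub`, its method names, the module names and the event type `RECORD_CREATED` are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Set

Handler = Callable[[dict], None]


class InMemoryPubSub:
    """Toy broker used only to illustrate the idea; not the FOLIO API."""

    def __init__(self) -> None:
        self.event_types: Dict[str, str] = {}
        self.publishers: Dict[str, Set[str]] = defaultdict(set)
        self.subscribers: Dict[str, Dict[str, Handler]] = defaultdict(dict)

    def declare_event_type(self, event_type: str, description: str) -> None:
        # An event descriptor: here just a name plus a description.
        self.event_types[event_type] = description

    def register_publisher(self, module: str, event_type: str) -> None:
        self._require_known(event_type)
        self.publishers[event_type].add(module)

    def register_subscriber(self, module: str, event_type: str,
                            handler: Handler) -> None:
        self._require_known(event_type)
        self.subscribers[event_type][module] = handler

    def publish(self, module: str, event_type: str, payload: dict) -> int:
        """Deliver payload to every subscriber; return the delivery count."""
        if module not in self.publishers[event_type]:
            raise PermissionError(f"{module} may not publish {event_type}")
        handlers = list(self.subscribers[event_type].values())
        for handler in handlers:
            handler(payload)
        return len(handlers)

    def _require_known(self, event_type: str) -> None:
        if event_type not in self.event_types:
            raise KeyError(f"unknown event type: {event_type}")


# Usage: one publisher, two subscribers (all names are hypothetical).
bus = InMemoryPubSub()
bus.declare_event_type("RECORD_CREATED", "A record was created")
bus.register_publisher("mod-publisher", "RECORD_CREATED")
received = []
bus.register_subscriber("mod-subscriber-a", "RECORD_CREATED", received.append)
bus.register_subscriber("mod-subscriber-b", "RECORD_CREATED", received.append)
delivered = bus.publish("mod-publisher", "RECORD_CREATED", {"id": "1"})
```

The publisher only names an event type; it never references the subscribing modules, which is the decoupling the presentation emphasized.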

The slides list the currently available APIs for registration, administration, publishing and history.

Can provide an audit trail.

Slide presents PubSub data structures for event type descriptor and event descriptor
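
The notes do not reproduce the structures shown on the slide, so the following is only a rough sketch of what an event type descriptor and an event might carry. Every field name here (`event_type`, `ttl_seconds`, `tenant_id`, `published_by`, `payload`, `event_id`) is an assumption chosen for illustration and should be checked against the Pub/Sub design documents.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class EventTypeDescriptor:
    """Describes a kind of event (hypothetical fields)."""
    event_type: str
    description: str
    ttl_seconds: int


@dataclass(frozen=True)
class Event:
    """A single published event (hypothetical fields)."""
    event_type: str
    tenant_id: str
    published_by: str
    payload: str  # opaque to the broker; the publisher decides the content
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))


descriptor = EventTypeDescriptor("RECORD_CREATED", "A record was created",
                                 ttl_seconds=3600)
event = Event(descriptor.event_type, tenant_id="tenant-a",
              published_by="mod-publisher",
              payload=json.dumps({"id": "1"}))
wire = json.dumps(asdict(event))  # what might travel over the wire
```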

Various events can be tied together to form a workflow.

Current limitations:

  • Event type definitions don't support versioning.  You need to create a new event type rather than modify an existing one.  This is on purpose, for simplicity: it eliminates the need to force updates across all services.
  • No guaranteed delivery mechanism
  • Security model (waiting on availability of system users)
  • No cross tenant event distribution. Currently a FOLIO limitation across the board.

The messages are currently stored in Kafka, but we are not providing a general-purpose Kafka queue for all of FOLIO, only for Pub/Sub.

Is coding required to configure it?  Right now, configuration is done via the Pub/Sub client, by editing a JSON file.

Jon Miller wants to see this used for bulk updates, e.g. a million records all at once.  This is currently the pilot case used for testing.  It provides the decoupling and the queue as well as the distribution.  How performant is it?  That depends on the resources available.  Some testing was done.  This is not the migration use case.  The performance tests will be re-run since multiple fixes have been implemented.  Current changes allow scaling to increase performance.

How are rollbacks handled?  Technically, Pub/Sub is not responsible for this.  It's not a workflow engine.  The logic for that must exist in the module, which then notifies the queue to hold.  The publisher can support the SAGA pattern, which implies it can receive notifications about pausing or stopping.

If you publish something, you don't know how many subscribers there are.  One of the subscribers might not receive the message.  An alternative pattern is the publisher acting as a controller.
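
A controller-style saga, where rollback logic lives in the module rather than in Pub/Sub, might look roughly like the sketch below. It is a generic illustration of the pattern, not anything presented in the meeting; `run_saga` and the step names are hypothetical.

```python
from typing import Callable, List, Tuple

# (name, action, compensation): compensation undoes a completed action.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> List[str]:
    """Run steps in order; on failure, undo completed steps in reverse."""
    log: List[str] = []
    completed: List[Tuple[str, Callable[[], None]]] = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"undone:{done_name}")
            break
        log.append(f"done:{name}")
        completed.append((name, compensate))
    return log


def fail() -> None:
    raise RuntimeError("simulated failure in a downstream module")


state = {"reserved": False}
saga_log = run_saga([
    ("reserve", lambda: state.update(reserved=True),
     lambda: state.update(reserved=False)),
    ("charge", fail, lambda: None),
    ("notify", lambda: None, lambda: None),
])
```

Here the controller, not the message broker, knows the order of the steps and how to compensate for each one.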

Kafka can archive messages for as long as you need them.  If after a couple of weeks you notice a module has been missing messages, Kafka can be told to replay the events once the situation is fixed.  This is a huge advantage for regaining consistency.  Vince: This is true, but we should be careful here, since this is at the Kafka level and it's more powerful than Pub/Sub currently is.  You might not have easy access to this in the current state.
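
The replay idea can be shown with an in-memory append-only log. This toy does not use Kafka; it only mimics the concept of a retained log with offsets, comparable in spirit to seeking a Kafka consumer back to an earlier offset. `EventLog` and `Consumer` are hypothetical names.

```python
from typing import Any, List, Tuple


class EventLog:
    """Append-only log that retains every record (toy stand-in)."""

    def __init__(self) -> None:
        self._records: List[Any] = []

    def append(self, record: Any) -> int:
        self._records.append(record)
        return len(self._records) - 1  # offset of the new record

    def read_from(self, offset: int) -> List[Tuple[int, Any]]:
        return list(enumerate(self._records))[offset:]


class Consumer:
    def __init__(self, log: EventLog) -> None:
        self.log = log
        self.offset = 0
        self.healthy = True
        self.seen: List[Any] = []

    def poll(self) -> None:
        for offset, record in self.log.read_from(self.offset):
            if self.healthy:  # a broken consumer silently drops records
                self.seen.append(record)
            self.offset = offset + 1

    def replay(self, from_offset: int) -> None:
        """Rewind to an earlier offset and process again."""
        self.offset = from_offset
        self.poll()


log = EventLog()
consumer = Consumer(log)
log.append("a")
consumer.poll()
consumer.healthy = False          # the module starts missing messages
first_missed = log.append("b")
log.append("c")
consumer.poll()
consumer.healthy = True           # the situation is fixed
consumer.replay(first_missed)     # regain consistency
```

Replaying from an offset earlier than the first missed record would process some records twice, so subscribers would need to tolerate duplicates in that case.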

Why would we not interface with Kafka directly?  We want to abstract this so that we can change our mind in the future.  The wrapper also simplifies things: the developer does not need to know all the details of working with Kafka and instead has a simple set of interfaces.  Are we handicapping ourselves with this?  We are losing some of Kafka's power here.  Vince: This is our first step into this space.  When we are ready, there is nothing stopping us from making those Kafka features available.  We are really focused on a first version right now.
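
One way to picture the wrapper is a client that depends only on a narrow transport interface, so the backing technology can be swapped later. This is a sketch under assumptions: `Transport`, `InMemoryTransport` and `PubSubClient` are hypothetical names, and a Kafka-backed transport is deliberately left out.

```python
import json
from abc import ABC, abstractmethod
from typing import List, Tuple


class Transport(ABC):
    """The only surface the client sees; hides the broker technology."""

    @abstractmethod
    def send(self, channel: str, message: bytes) -> None:
        ...


class InMemoryTransport(Transport):
    """Test double; a Kafka-backed class could implement the same method."""

    def __init__(self) -> None:
        self.sent: List[Tuple[str, bytes]] = []

    def send(self, channel: str, message: bytes) -> None:
        self.sent.append((channel, message))


class PubSubClient:
    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def publish(self, event_type: str, payload: dict) -> None:
        body = json.dumps(payload, sort_keys=True).encode("utf-8")
        self._transport.send(event_type, body)


transport = InMemoryTransport()
client = PubSubClient(transport)
client.publish("RECORD_CREATED", {"id": "1"})
```

The trade-off raised in the discussion shows up directly: anything not expressed in the `Transport` interface, such as replay, is unavailable to callers until the interface is extended.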

What is the point of the TTL in this system?  1. You want automatic cleanup of messages in the system.  2.  It does provide a feedback mechanism to the publisher about messages not being received by anyone. 
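
Both purposes of the TTL can be seen in a small cleanup pass: expired events are dropped, and the publisher is told about any expired event that nobody received. The sketch below is illustrative only; `StoredEvent`, `sweep_expired` and the callback are hypothetical and do not describe how Pub/Sub implements this.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StoredEvent:
    event_id: str
    expires_at: float   # absolute time at which the TTL runs out
    deliveries: int = 0  # how many subscribers received it


def sweep_expired(events: List[StoredEvent], now: float,
                  on_undelivered: Callable[[str], None]) -> List[StoredEvent]:
    """Return events still alive; report expired ones nobody received."""
    alive: List[StoredEvent] = []
    for event in events:
        if event.expires_at > now:
            alive.append(event)
        elif event.deliveries == 0:
            on_undelivered(event.event_id)  # feedback to the publisher
    return alive


undelivered: List[str] = []
store = [
    StoredEvent("e1", expires_at=10.0),                # expired, never delivered
    StoredEvent("e2", expires_at=10.0, deliveries=2),  # expired, delivered
    StoredEvent("e3", expires_at=100.0),               # still within its TTL
]
store = sweep_expired(store, now=50.0, on_undelivered=undelivered.append)
```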

What process do you recommend for cleanup?  It's simply an abstraction that ....

From Taras_Spashchenko to Everyone: (10:36 AM)
https://microservices.io/patterns/data/saga.html

Right now inventory and SRS are hard coded.  The implementation of PubSub is being developed now.

Inventory is being modified to receive these events.

Right now import must know about inventory, acquisitions and other parts of FOLIO and their data structures.  This is being changed now.

Can an event contain an array of data?  There is nothing here that prescribes what is in a message.  It's up to whoever publishes the message.

Might be interesting to have bulk operation messages for performance.
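
Since the payload is up to the publisher, a bulk message could simply carry a list of records. The sketch below groups records into fixed-size batches before publishing; the function `chunk_records`, the event type `RECORDS_UPDATED` and the message shape are assumptions made for illustration.

```python
from typing import Any, Dict, Iterable, Iterator, List


def chunk_records(records: Iterable[Any], size: int) -> Iterator[List[Any]]:
    """Yield lists of at most `size` records, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[Any] = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly smaller batch
        yield batch


def to_bulk_messages(records: Iterable[Any], size: int) -> List[Dict[str, Any]]:
    # One message per batch instead of one message per record.
    return [{"eventType": "RECORDS_UPDATED", "payload": batch}
            for batch in chunk_records(records, size)]


messages = to_bulk_messages(range(7), size=3)
```

With a batch size of 3, seven records travel in three messages rather than seven, which reduces per-message overhead at the cost of larger payloads.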

If an app crashes, the pub/sub will continue to try and deliver the message until the app restarts.  You just need to reconnect.  As long as you are in the time frame, the messages will be delivered.
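
The retry-within-a-time-frame behavior can be sketched as a loop that keeps attempting delivery until the subscriber accepts the message or the TTL runs out. This is a simplified illustration, not Pub/Sub's actual delivery logic; `deliver_until` and its parameters are hypothetical, and the clock is injected so the example runs without real waiting.

```python
import time
from typing import Callable


def deliver_until(handler: Callable[[dict], None], payload: dict,
                  ttl_seconds: float, retry_interval: float,
                  clock: Callable[[], float] = time.monotonic,
                  sleep: Callable[[float], None] = time.sleep) -> bool:
    """Retry delivery until it succeeds or the TTL would be exceeded."""
    deadline = clock() + ttl_seconds
    while True:
        try:
            handler(payload)
            return True
        except ConnectionError:
            if clock() + retry_interval > deadline:
                return False  # next attempt would fall outside the TTL
            sleep(retry_interval)


class FakeClock:
    """Deterministic clock so the example does not actually sleep."""

    def __init__(self) -> None:
        self.now = 0.0

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds


attempts = []


def flaky_subscriber(payload: dict) -> None:
    attempts.append(payload)
    if len(attempts) < 3:  # the app is "down" for the first two attempts
        raise ConnectionError("subscriber not reachable")


fake = FakeClock()
ok = deliver_until(flaky_subscriber, {"id": "1"}, ttl_seconds=10,
                   retry_interval=1, clock=fake.time, sleep=fake.sleep)
```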

Is this a branch for now?  It was ready in Q4 but not included in the release.  In Q1, there is now an import dependency on this and it will be included in the release.



Action items

  • Ingolf Kuss will move architectural diagram to the wiki space
  • Ingolf Kuss will edit Notes on Conceptual Architectural Diagram