Date

03-Mar-2021

Attendees


Discussion items


Follow up from Kafka Security

RE: Data Import's and ElasticSearch's use of Kafka:

Data import: Each tenant has its own dedicated topic for each event type (a topic is created per environment / tenant / event type combination).
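
A minimal sketch (all names hypothetical) of how such a topic name might be composed; the actual Data import naming scheme may use a different delimiter or ordering:

    // Hypothetical sketch of the environment-tenant-event-type topic naming
    // described above; not the actual Data import implementation.
    public class TopicNames {
        static String topicName(String env, String tenantId, String eventType) {
            return String.join(".", env, tenantId, eventType);
        }

        public static void main(String[] args) {
            // Prints "folio.diku.DI_COMPLETED"
            System.out.println(topicName("folio", "diku", "DI_COMPLETED"));
        }
    }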

However, there is no capability to provide dedicated Kafka user credentials to each producer / consumer within the Data import modules. All producers and consumers leverage the same set of Kafka client settings defined in the ModuleDescriptor. The Data import modules should be updated to make sure each tenant has dedicated Kafka credentials, along the lines of the sketch below.
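
For illustration, a minimal sketch of what per-tenant credentials could look like: a lookup from tenant id to its own SASL login, layered onto the shared client settings. The credential map and all names here are assumptions, not existing FOLIO API:

    import java.util.Map;
    import java.util.Properties;

    public class TenantKafkaCredentials {
        // Merges a tenant's own SASL login (hypothetical lookup) into the
        // shared client settings, instead of one login for all tenants.
        static Properties forTenant(Properties shared,
                                    Map<String, String[]> credsByTenant,
                                    String tenantId) {
            String[] userPass = credsByTenant.get(tenantId); // {user, password}
            Properties props = new Properties();
            props.putAll(shared);
            props.put("security.protocol", "SASL_PLAINTEXT");
            props.put("sasl.mechanism", "SCRAM-SHA-512");
            props.put("sasl.jaas.config",
                "org.apache.kafka.common.security.scram.ScramLoginModule required "
                + "username=\"" + userPass[0] + "\" "
                + "password=\"" + userPass[1] + "\";");
            return props;
        }
    }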

ElasticSearch: A different multi-tenancy approach is implemented there. The ElasticSearch module leverages a single topic for all tenants; tenants' messages are distinguished from one another by metadata (the tenant id) attached to each message.
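
A minimal sketch of that single-topic approach using the standard kafka-clients API; the topic name and header key are assumptions, not the module's actual values:

    import java.nio.charset.StandardCharsets;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class TenantAwareSend {
        // Publishes to one shared topic, carrying the tenant id as a message
        // header so consumers can tell tenants' messages apart.
        static void send(KafkaProducer<String, String> producer,
                         String tenantId, String payload) {
            ProducerRecord<String, String> record =
                new ProducerRecord<>("search.events", payload);
            record.headers().add("tenant-id",
                tenantId.getBytes(StandardCharsets.UTF_8));
            producer.send(record);
        }
    }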

As you can see, multi-tenancy on the Kafka side is implemented differently across the modules, so making the changes that unify the multi-tenancy approach will take time.

However, the direct Kafka connections should be secured in R1, so I propose implementing a simplified version of the solution for now. What I propose:

1) Add module-level Kafka user credentials support to Data import, ElasticSearch, Remote storage and PubSub modules.

A single credential should be provided to all producers and consumers of a module, along with the other Kafka client settings (see the sketch after this item).

Changes in PubSub are required because, once Kafka authentication and authorization are enabled, PubSub will need to authenticate as well.
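
A minimal sketch of proposal (1), assuming the credential is delivered through environment variables (the variable names are assumptions): one module-level SASL login merged into the shared settings that every producer and consumer in the module uses:

    import java.util.Properties;

    public class ModuleKafkaConfig {
        // Reads one module-level credential (hypothetical env var names) and
        // merges it into the shared Kafka client settings.
        static Properties withModuleCredentials(Properties shared) {
            Properties props = new Properties();
            props.putAll(shared);
            props.put("security.protocol", "SASL_PLAINTEXT");
            props.put("sasl.mechanism", "SCRAM-SHA-512");
            props.put("sasl.jaas.config",
                "org.apache.kafka.common.security.scram.ScramLoginModule required "
                + "username=\"" + System.getenv("KAFKA_USER") + "\" "
                + "password=\"" + System.getenv("KAFKA_PASSWORD") + "\";");
            return props;
        }
    }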

2) Add TLS support to the same modules. 

As above, the TLS settings should be provided to all producers and consumers of a module along with the other Kafka client settings.
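
A minimal sketch of proposal (2), layered on the credentials sketch above; switching to SASL_SSL combines the SASL login with TLS encryption (the truststore env var names are assumptions):

    import java.util.Properties;

    public class ModuleKafkaTls {
        static Properties withTls(Properties shared) {
            Properties props = new Properties();
            props.putAll(shared);
            // SASL_SSL = SASL authentication over a TLS-encrypted connection.
            props.put("security.protocol", "SASL_SSL");
            props.put("ssl.truststore.location",
                System.getenv("KAFKA_SSL_TRUSTSTORE"));
            props.put("ssl.truststore.password",
                System.getenv("KAFKA_SSL_TRUSTSTORE_PASSWORD"));
            return props;
        }
    }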

-----

In the TC's review and discussion: there is work that would need to be done to adopt Vasily's recommendations. There seem to be three choices to recommend:

1) Accept the risk of not making any changes to Data Import and ElasticSearch, and look to improve their handling in the next release

2) Push for a "Hot fix" to add capabilities that would allow a deployment to leverage these new capabilities

3) Push to either get it into R1/Iris or delay R1/Iris so that this work can be included

Note that we should have an idea of the scope of this work in the next few days.

Module release dates are such that a small implementation could possibly fit.

Also note that we'll need to assess the DevOps impact for R1 too.

If we end up recommending the 2nd option (a hot fix that is optionally adopted), then we will be introducing a testing dilemma: either testing assumes one configuration or the other, or we test both methods.

After discussion, we agreed that we are comfortable with the level of risk, so we will push all of this work to R2 and look to implement the full plan that Vasily outlined last week (to be linked here in these notes).



New Technology requests - evaluation criteria (TC)

Didn't have time to discuss this item.

The TC needs to develop a set of criteria / a checklist to be reviewed when new technologies are suggested or requested, so that we can make consistent and transparent decisions.

As an example, our previous conversation about the following item sparked the need to establish these criteria:

During today’s (20-Jan-2021) Tech Leads meeting a question was raised regarding the Spring Batch framework and whether the TC needs to approve its use in FOLIO. Spring Batch has been proposed as a building block for the new Data Import architecture (Data export by using Spring Batch (aka Export Manager)).

Brainstorming Notes:




Other


Future Topic:
What is our process for accepting a team and/or its work into the FOLIO distribution (folio `core` and/or folio `complete`, as defined here: FOLIO Module/JIRA project-Team-PO-Dev Lead responsibility matrix)?