[FOLIO-3448] SPIKE: Prevent FolioExecutionContext from being initialized with wrong tenant id Created: 14/Mar/22  Updated: 05/Mar/23  Resolved: 02/Feb/23

Status: Closed
Project: FOLIO
Components: None
Affects versions: None
Fix versions: None

Type: Tech Debt Priority: P1
Reporter: Julian Ladisch Assignee: Julian Ladisch
Resolution: Done Votes: 0
Labels: security, security-reviewed
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Blocks
blocks FOLSPRINGS-86 Enabling two tenants in parallel fails Closed
Continues
is continued by MODEXPW-163 End spring execution context Closed
is continued by FOLSPRINGS-95 non-public beginFolioExecutionContext... Awaiting release
Relates
relates to MODEXPW-375 Job runs by user not the one who crea... Closed
relates to FOLSPRINGS-58 Async FolioExecutionContext helper Closed
relates to MODEBSNET-58 Delete beginFolioExecutionContext fro... Closed
relates to MODEXPS-123 Close FolioExecutionContext if it is ... Closed
relates to MODEXPW-173 Close FolioExecutionContext if it is ... Closed
relates to MODRS-130 Close FolioExecutionContext if it is ... Closed
relates to MODEXPW-144 Export eHoldings: use correct tenantId Closed
relates to MODEXPW-67 FolioExecutionContext is initialized ... Closed
Sprint:
Development Team: Spring Force
RCA Group: TBD

 Description   

MODEXPW-67 Closed = "FolioExecutionContext is initialized with wrong tenant id if Spring Batch job launches asynchronously in multi tenant cluster"

MODEXPW-67 Closed has fixed a wrong tenant id.

How could it happen that a wrong tenant id was used?

Is it possible that other code has similar tenant id issues?

How can we prevent this from happening again? Can we make the library that handles tenant ids for spring bullet-proof? Can we publish a how-to or best practices on wiki.folio.org, dev.folio.org or the relevant spring github repositories?

Tenant separation is a critical security feature that must be reliable.



 Comments   
Comment by Julian Ladisch [ 14/Mar/22 ]

Oleksandr Bozhko Can you advise how FOLIO can prevent a wrong tenant id in spring based modules?

Comment by Jakub Skoczen [ 17/Mar/22 ]

Oleksii Petrenko who from the Spring Way team should investigate this?

Comment by Axel Dörrer [ 24/Mar/22 ]

Oleksii Petrenko the security team would like to know if this issue is already in investigation somehow, thanks.

Comment by Oleksandr Bozhko [ 24/Mar/22 ]

Julian Ladisch From what I understood during the work on MODEXPW-67 Closed , this issue comes from the usage of async logic here (mod-data-export-worker module) together with this synchronization logic (folio-spring-base library). This requires more in-depth investigation.

Comment by Andrei Makaranka [ 19/Apr/22 ]

Priority was increased to P1 because it is affect EDIFACT and BULK_EDIT schedulers in the mod-data-export-spring.
Related issues : https://folio-org.atlassian.net/browse/MODEXPW-67 and Issue fixed only for specific case, but not for all.

Comment by Axel Dörrer [ 21/Apr/22 ]

Jakub Skoczen since this got a P1 the security team would like to ask you if you could expedite it

Comment by Craig McNally [ 28/Apr/22 ]

Jakub Skoczen please see above.

Comment by Craig McNally [ 12/May/22 ]

Oleksii Petrenko Pavlo Smahin what needs to happen to make progress on this?

Comment by Mikhail Fokanov [ 02/Jun/22 ]

Julian Ladisch I added documentation to readme.md and Javadoc in corresponding java classes. This documentation addresses the following questions:

How could it happen that a wrong tenant id was used?

Is it possible that other code has similar tenant id issues?

How can we prevent this from happening again?

Taras Spashchenko is working on wiki page for documentation of the Spring approach (e.g. Quick start guide, Guidelines etc.). In scope of these work the following question will be addressed:

Can we publish a how-to or best practices on wiki.folio.org, dev.folio.org or the relevant spring github repositories?

Comment by Pavlo Smahin [ 02/Jun/22 ]

Mikhail Fokanov, please review issue FOLSPRINGS-86 Closed . There is also a problem with the execution context. 

Comment by Julian Ladisch [ 03/Jun/22 ]

I've created FOLSPRINGS-58 Closed "Async FolioExecutionContext helper" to make it as easy as possible to avoid this issue in new/changed code.

Comment by Julian Ladisch [ 03/Jun/22 ]

Is it possible that other code has similar tenant id issues?

How can we ensure that all existing code that uses FolioExecutionContext and runs asynchronously properly sets and unsets the FolioExecutionContext?

Do we need a code review Jira for each module that might be affected?

Comment by Julian Ladisch [ 05/Jun/22 ]

This GitHub search for beginFolioExecutionContext lists all modules and all code files that might be affected:

https://github.com/search?q=org%3Afolio-org+beginFolioExecutionContext&type=code

Comment by Mikhail Fokanov [ 08/Jun/22 ]

https://github.com/search?q=org%3Afolio-org+beginFolioExecutionContext&type=code

Julian Ladisch I analyzed all the code that can be found by it. It is used in a right way everywhere.

How can we ensure that all existing code that uses FolioExecutionContext and runs asynchronously properly sets and unsets the FolioExecutionContext? Do we need a code review Jira for each module that might be affected?

I arranged a meeting for this Friday in order to explain all the details to Epam Team leads. I will collect questions and outcome of this meeting and then I will arrange a meeting for other community Team leads. There was a gap in the documentation regarding the FolioExecutionContext , that is why this mistake was done by developers.

Julian Ladisch What additional actions would you suggest?

Comment by Julian Ladisch [ 09/Jun/22 ]

https://github.com/folio-org/mod-data-export-spring/blob/v1.3.5/src/main/java/org/folio/des/config/FolioExecutionContextHelper.java#L50-L73
FolioExecutionContextHelper.initScope() sets the FolioExecutionContext, but there isn't an endFolioExecutionContext() when the task is finished.

Comment by Julian Ladisch [ 09/Jun/22 ]

https://github.com/folio-org/mod-data-export-spring/blob/v1.3.5/src/main/java/org/folio/des/config/kafka/KafkaConsumerInterceptor.java#L33
sets the FolioExecutionContext but there isn't an endFolioExecutionContext() when the task is finished.

Comment by Julian Ladisch [ 09/Jun/22 ]

https://github.com/folio-org/mod-data-export-worker/blob/v1.3.5/src/main/java/org/folio/dew/config/kafka/KafkaConsumerInterceptor.java#L34
sets the FolioExecutionContext but there isn't an endFolioExecutionContext() when the task is finished.

Comment by Julian Ladisch [ 09/Jun/22 ]

https://github.com/folio-org/mod-remote-storage/blob/v1.5.0/src/main/java/org/folio/rs/service/AccessionQueueService.java#L97
sets the FolioExecutionContext but there isn't an endFolioExecutionContext() when the task is finished.

https://github.com/folio-org/mod-remote-storage/blob/v1.5.0/src/main/java/org/folio/rs/integration/KafkaMessageListener.java#L27
has two instances of FolioExecutionContext; one in the private variable, one set by processAccessionQueueRecord(events) in the thread.

Comment by Julian Ladisch [ 09/Jun/22 ]

https://github.com/folio-org/mod-data-export-worker/pull/217/files#diff-f1d0438cfd3ca0dc21fbf1e14570cecd7322e0179e3ff153e0d41350cadf43c9R96
sets the FolioExecutionContext but there isn't an endFolioExecutionContext() when the task is finished.

Comment by Mikhail Fokanov [ 09/Jun/22 ]

Julian Ladisch, after you comment about endFolioExecutionContext() I added this information about it to readme.md, but actually calling endFolioExecutionContext() is not mandatory because when some one process/request will be executed, the headers will be rewritten with the new ones from the process/request (after beginFolioExecutionContext() call).
My comment "I analyzed all the code that can be found by it. It is used in a right way everywhere." was just about it. This mentioned code will work well without calling of endFolioExecutionContext() (e.g. it is not a bug in the code).

Comment by Julian Ladisch [ 10/Jun/22 ]

It's a bug in the code when the tentant id (in FolioExecutionContext) is read from thread local but it hasn't been set for the current task.
This bug should fail the unit test.
The unit test fails if the previous task removes the tenant id from thread local.
Therefore the tenant id should always been removed.
Not removing the tenant id is not a bug but makes it possible that a bug in other code that runs next on the same thread goes unnoticed.

Comment by Mikhail Fokanov [ 15/Jun/22 ]

Julian Ladisch OK. I will ask Firebird (Siarhei Charniak, Viachaslau Khandramai) to fix the code, that you mentioned. I created a draft story: https://folio-org.atlassian.net/browse/MODEXPW-163

Comment by Julian Ladisch [ 21/Jun/22 ]

Mikhail Fokanov Thanks for tje Jira to improve the MODEXPW code.

Is there agreement that the FolioExecutionContextSetter should be merged and that it should replace  beginFolioExecutionContext() and endFolioExecutionContext()?

FOLSPRINGS-58 Closed =  https://github.com/folio-org/folio-spring-base/pull/56/files

Comment by Mikhail Fokanov [ 30/Jun/22 ]

Hello Julian Ladisch. We had a call with Epam guys regarding FolioExecutionContext usage. We agreed that in each case if the context is opened, it must be closed. Guys are ready to fix the issues that you highlighted in comments to this jira on June 9th as soon as they can (e.g. this week).
In order to fix them we have the following plan:

  • To avoid breaking changes before the release, in Morning glory guys will call the existing "endFolioExecutionContext" method, where it is needed. It is tiny amount of work.
  • In the next release we will release new (5.0) version of Spring-Folio-Base and all modules will migrate to Autocloseable FolioExecutionContextSetter.
Comment by Julian Ladisch [ 30/Jun/22 ]

Good plan, thanks!

Comment by Craig McNally [ 15/Sep/22 ]

Waiting to merge this so it doesn't end up in Morning Glory (a workaround is used in that release), but does end up in Nolana.

Comment by Craig McNally [ 02/Feb/23 ]

See continuation story

Generated at Thu Feb 08 23:28:13 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.