2024-05-08 - Architectural PoC part 2




Craig McNally

+ about 33 attendees

Recording: https://recordings.openlibraryfoundation.org/folio/tech-council/2024-05-08T10:55/

Discussion items

1 min | Scribe | All

Jenn Colt

Reminder:  Please copy/paste the Zoom chat into the notes.  If you miss it, this is saved along with the meeting recording, but having it here has benefits. 

Q&A on the PoC

Discussion on working with authentication and authorization with Keycloak:

  • Diagram of user that has permissions
  • Keycloak authorization is slightly different. It is built around the concept of a capability: an explicit grant of the ability to perform a specific scope against a particular resource, with the resource kept separate from the scope. A permission isn't granted permanently; instead a policy is evaluated at run time to decide whether to grant it. Policies can be user policies or role policies, and the Keycloak implementation can deliver a capability through either one, each tied to a capability.
  • permission has one resource and one scope and multiple policies
  • policies are reusable
  • interaction with this model happens strictly through FOLIO, which exposes a more familiar level of abstraction: roles and capabilities. capabilities are defined by which endpoints a module has and which scopes are allowed on them, so capabilities are system defined.
  • what admins can do is create roles. allow particular capabilities that you want to make available to a role and then grant that role to a user.
  • a drawback of the current model is that it can be hard to determine what has been granted to a user: you have to drill into layers of permission sets, and it is hard to evaluate whether the user has the right level of permissions. in the new model it is easier to see exactly which capabilities are provided when granting a role to a user. roles are not nested, so it is possible to clearly audit and manage what each role has
  • what has to happen in development for this to work? shouldn't have to do anything, should be able to use existing modules without modification
  • what does that mean in terms of migrating my current setup to this method? there are some abilities to do that. existing perms can be interpreted and turned into the appropriate capabilities. because the assignment is nested and because capabilities generally get assigned as a group, capabilities can be bundled into system-generated capability sets. existing UI perm sets can be mapped onto predetermined capability sets, and that would be a first step in migration
  • capabilities are derived from what the module offers so if module offers resource with a scope that results in the creation of a capability
  • don't define capabilities in descriptors, etc. use existing module descriptors
  • some modules are inconsistent, so it is a challenge to build a comprehensive set. it would help for some modules to provide declarations and to be consistent in naming
  • it would be better if things were more uniform, but maybe some modules have subtly different needs. does the new way prohibit modules from taking their own approach to bundling permissions? it shouldn't. modules can still declare permissions. the inconsistency is stuff like GET vs READ: ambiguous action words that might mean different things in different modules have to be interpreted. we do have some formalization around this and it would be good to have more of it. some perms probably existed before the formalization
  • back end permissions vs front end permissions - the current model has permissions, back end perm sets, and front end perm sets. in Eureka perms are capabilities: back end sets = capabilities, front end sets = roles. roles are things you can create. capability sets can be aggregated into roles.
  • to operate optimally in Eureka, would modules still be backward compatible? there's no way to put a capability in a module descriptor, so basically you keep writing descriptors as you do now, with endpoints and the perms attached to them, and Eureka will create the capabilities
  • where is authorization enforced? in the sidecars
  • the call starts in the UI with a user and token and goes to Kong, which uses routing tables to figure out how to direct it. the tables point to the sidecar, so the call goes there next. the sidecar is aware of the capabilities needed to allow the call to proceed, so it calls Keycloak, which is where enforcement happens. Keycloak validates the token and then checks Keycloak storage to see whether that user has the needed roles/capabilities/policies. if so, it tells the sidecar, and the sidecar sends the request to the module. Kong then passes the response back to the UI
  • these new components are single purpose
  • the current approach is less linear, more hub and spoke with okapi at the center
  • is there overhead with sidecars? probably less because there are fewer back and forth calls in the more linear eureka model than the hub and spoke okapi model
  • authorization is closer to the module with less traffic in the api gateway
  • with module-to-module communication, how do the sidecars know where the other sidecars are? at start up each sidecar checks in: it calls the manager and gets information from the tenant manager about which module it is associated with and the required permissions. it also learns discovery information for the modules its module depends on, and it subscribes to a Kafka topic, so if a new tenant gets enabled or discovery info changes, the new information arrives via Kafka.
  • describing the required interfaces accurately in the module descriptor now actually matters. in classic FOLIO some inaccuracies didn't matter, but in Eureka they matter a lot more because that is how sidecars get the information they need to route from module to module.
  • right now there are a bunch of missing interface dependencies, and the team has been filing JIRAs for them. however, there is also a temporary mode where, if a sidecar doesn't know what to do, it can send the request to Kong instead, which will know how to route it. this logs a specific warning. fallback mode should be noisy.
  • the module sidecars discover based on the interface. what happens if different versions of modules offer the same interface version? some modules have a large number of interfaces that don't necessarily increment. the sidecar needs to learn the discovery info for mod-foo etc. this is not tied to a version of the interface, it's tied to what the tenant has as a specific implementation. there can only be one instance of an interface for a tenant at a given time. the sidecar gets the bootstrap information that shows its dependencies and gets their discovery info for that tenant. right now you could deploy two versions of the same module in the same installation, with different tenants having different versions enabled. tenant entitlement is part of the bootstrap info. not sure if it is keyed off the tenant ID as well. are you required to have all tenants in one environment on the same release? ID will have environments with tenants on different flower releases. FSE doesn't do that. two tenants can currently run different sets of modules in the same environment. at A&M, when testing upgrades and previously when doing migrations, they would leave one tenant alone, make a new tenant, and enable new and existing modules all in the same instance of FOLIO. trying to understand how deployments need/don't need to change. not sure that it is a problem to do. Craig McNally will check on it.
  • computational load of one Okapi vs many, many sidecars? is hub and spoke more cost effective than many sidecars? the number of API calls is lower. that does not necessarily mean fewer resources, but sidecars are intended to have a very small footprint, and Quarkus allows compiling to native binaries, which helps reduce the footprint compared to jars. is the new version more resource hungry? not sure. the previous answer was more about traffic. the same tasks need to be performed regardless of pattern, so it is not clear how much different it is. the design did aim to avoid increasing resource use, which is why it went with small-footprint sidecars, but there are no numbers yet.
  • can it continue to run on single machine for folks using that deployment for their development work? this is the vagrant box.
  • can we still develop using vagrant boxes? the team created docker images that worked. how many modules you can run depends on your machine. all the parts are included in the setup, so all the containers get created.
  • can communication between Kong and the sidecar use TLS (module authentication)? Kong isn't terminating TLS; the sidecar is doing the termination. Kong needs to look into the request, though. Craig McNally will get back to Julian Ladisch
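
To make the role/capability discussion above concrete, here is a minimal Python sketch of how capabilities could be derived from the handlers a module descriptor already declares, and how a role bundles them. It is an illustration only, not the Eureka or Keycloak implementation: the `Capability` and `Role` classes, the `derive_capabilities` and `is_allowed` helpers, the `mod-example` descriptor, and the choice of path pattern as resource and HTTP method as scope are all assumptions made for this example. In the real system the decision is made by Keycloak policies evaluated at run time, not by a set lookup.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Capability:
    """One resource plus one scope (hypothetical shape for this sketch)."""
    resource: str  # assumed here to be the endpoint path pattern
    scope: str     # assumed here to be the HTTP method


@dataclass
class Role:
    """Admin-created bundle of capabilities; flat, not nested."""
    name: str
    capabilities: set = field(default_factory=set)


def derive_capabilities(descriptor):
    """Collect one capability per (path, method) pair a module offers."""
    caps = set()
    for interface in descriptor.get("provides", []):
        for handler in interface.get("handlers", []):
            for method in handler.get("methods", []):
                caps.add(Capability(handler["pathPattern"], method))
    return caps


def is_allowed(user_roles, resource, scope):
    """True if any role granted to the user carries the capability."""
    wanted = Capability(resource, scope)
    return any(wanted in role.capabilities for role in user_roles)


# Hypothetical, heavily trimmed module descriptor for illustration.
descriptor = {
    "id": "mod-example-1.0.0",
    "provides": [{
        "id": "example",
        "version": "1.0",
        "handlers": [
            {"methods": ["GET"], "pathPattern": "/example/items",
             "permissionsRequired": ["example.items.collection.get"]},
            {"methods": ["POST"], "pathPattern": "/example/items",
             "permissionsRequired": ["example.items.item.post"]},
        ],
    }],
}

# Capabilities are system defined: they fall out of the descriptor.
system_caps = derive_capabilities(descriptor)

# An admin creates a role from a subset of them and grants it to a user.
viewer = Role("example-viewer", {Capability("/example/items", "GET")})

print(is_allowed([viewer], "/example/items", "GET"))
print(is_allowed([viewer], "/example/items", "POST"))
```

Because roles hold capabilities directly rather than nesting other sets, auditing what a user can do amounts to listing the capabilities of each granted role.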

Questions for other venues

The goal for today is to have a technical discussion. If questions around product or governance arise, let's record them here for discussion in another forum:

Diagrams shown during the session

Zoom Chat

11:01:57 From Craig McNally to Everyone:
ok back
11:03:59 From vbar to Everyone:
Sorry audio mic problems
11:04:07 From vbar to Everyone:
I’ll reconnect
11:10:50 From Wayne Schneider to Everyone:
Just to be clear, AFAIK currently FOLIO permissions are not just attached to endpoints, they are also attached to methods (e.g. POST).
11:11:30 From Craig McNally to Everyone:
Replying to "Just to be clear, AF..."

Correct. In the previous diagram this was indicated by the permission being associated with a resource and scope
11:15:23 From Tod Olson to Everyone:
Sorry I'm late.
11:29:21 From Thomas Trutt to Everyone:
Will the change to KeyCloak solve the issues of a user having to sign out and back in to gain newly assigned permissions?
11:33:07 From Martin Scholz to Everyone:
Replying to "Will the change to K..."

To my understanding, this is a limitation of Stripes, not the backend. So it will most likely remain the same.
11:37:51 From Martin Scholz to Everyone:
I wonder if one powerful Okapi would still be more computation/cost-effective than hundreds? of sidecars 🤔
11:39:56 From Charlotte Whitt to Everyone:
Reacted to "I wonder if one powe..." with 🤔
11:41:18 From Thomas Trutt to Everyone:
Replying to "I wonder if one powe..."

You would have to do load tests, but your still running all traffic through one module, which can become a bottle neck.
11:44:44 From Julian Ladisch to Everyone:
Replying to "I wonder if one powe..."

Running tenant a+b on okapi 1, tenant c+d on okapi 2, etc. might be an option.
11:44:54 From Olamide to Everyone:
You cant enable the multiple versions of the same module for a tenant. RIght?
11:45:07 From Wayne Schneider to Everyone:
Replying to "You cant enable the ..."

11:45:51 From Martin Scholz to Everyone:
Replying to "I wonder if one powe..."

I'm not a sysop and just curious what would cause less Computational load? Running a lot of small sidecars with Little resources or one Okapi with lots of resources
11:47:21 From Zak Burke to Everyone:
Replying to "Will the change to K..."

Less a “limitation of stripes” than a “bug in stripes”. This was mostly, but not completely, addressed in https://github.com/folio-org/stripes-core/pull/1425 for STCOR-813. The permissions are now correctly assigned but discovery still doesn’t update until signing out and back in.

We are still working on that and the fix should land both Eureka and community branches.
11:47:22 From Florian Gleixner to Everyone:
Replying to "You cant enable the ..."

We use that too
11:47:54 From Martin Scholz to Everyone:
Reacted to "Less a “limitation o..." with 👍
11:48:31 From Thomas Trutt to Everyone:
Reacted to "Less a “limitation o..." with 👍
11:55:54 From Maccabee Levine to Everyone:
Sorry for the awful audio...referring to the vagrant boxes, confirming the resource loads don't change that
11:58:07 From Maccabee Levine to Everyone:
Replying to "Sorry for the awful ..."

Thanks Taras
11:58:21 From Wayne Schneider to Everyone:
Replying to "Sorry for the awful ..."

And where would developers find this?
12:00:48 From Mike Taylor to Everyone:
Did I get the start-time wrong? I was here dead on 5pm UK, and it feels like the meeting has already got deep into technical details.
12:01:01 From Maccabee Levine to Everyone:
I think my two questions in the doc are still open questions, but they don't need to be answered here / today
12:01:18 From Maccabee Levine to Everyone:
Will the single-server installation documents (fresh install and upgrade) be updated to reflect the various architectural changes?
12:01:23 From Maccabee Levine to Everyone:
Will the Vagrant boxes be updated to support the various architectural changes?
12:01:26 From Wayne Schneider to Everyone:
Replying to "Did I get the start-..."

Sorry, Mike, yes you are an hour late
12:01:32 From Mike Taylor to Everyone:
Oh, I am A WHOLE HOUR out!!
12:01:41 From Mike Taylor to Everyone:
Did this get recorded?
12:01:47 From Craig McNally to Everyone:
Replying to "Did this get recorde..."