Hosting / SysOps Findings
Overview
As participants in the Eureka early adopter program gain experience with hosting Eureka, we expect them to uncover details that would be helpful to share with others. The purpose of this page is to capture those details and eventually incorporate them into the official documentation.
NOTE: a separate page will be created for end-user findings. This page focuses on the hosting/system operations side of the early adopter program.
Findings
Early adopters should add their hosting-related findings, inconsistencies, missing information, and other observations here. Feel free to organize this however makes sense to you; this is intended to be a collaborative effort.
GBV: 2025-01-22
We have tested the following script for deploying Keycloak and Kong into a Kubernetes cluster: https://github.com/folio-org/eureka-k8s/tree/master
We were able to get Kong and Keycloak deployed, but certain aspects need to be covered more precisely in the manual:
The environment variable KC_HTTPS_STORE_PASSWORD is listed as a value to set in keycloak-credentials, but if you set it to anything other than the default, the container crashes.
Also, if BCFKS encryption is still being enforced, this needs to be mentioned, and a source for the Java libraries needed to use that encryption should be provided.
A way to provide your own keystore file is missing and needs to be described.
In setup.sh:
Regarding the KONG_URL, it would be helpful to provide guidance on how this URL is formed during deployment. The pattern is http://<deploymentname>.<namespace>.clusterurl:8001 in our example.
For the create secret parts: there should be a "delete if exists" clause or a similar mechanism to handle cases where the script is rerun. Without this, changes won't take effect if parameters are updated in the script after the initial run. In general, implementing a "cleanup on failure" mechanism would also be beneficial.
There is an issue with the final checks in the script for the running pods. The script uses the labels release=keycloak and release=kong, but the deployments do not set these labels, so the checks can never succeed.
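One way to get the "delete if exists" behaviour for the secret-creation steps is to delete each secret with kubectl's --ignore-not-found flag before recreating it. The following is a minimal Python sketch of that idea, not a patch for setup.sh. The helper names and the example keys are hypothetical; the secret name keycloak-credentials and the namespace eureka are taken from the notes on this page.

```python
import subprocess


def recreate_secret_commands(name, namespace, literals):
    """Return the kubectl commands that replace a generic secret.

    `literals` maps secret keys to values. Deleting first with
    --ignore-not-found keeps the sequence safe to rerun: the delete
    succeeds whether or not the secret already exists.
    """
    delete_cmd = [
        "kubectl", "delete", "secret", name,
        "--namespace", namespace, "--ignore-not-found",
    ]
    create_cmd = [
        "kubectl", "create", "secret", "generic", name,
        "--namespace", namespace,
    ]
    # Sort the keys so the generated command is deterministic.
    for key, value in sorted(literals.items()):
        create_cmd.append(f"--from-literal={key}={value}")
    return [delete_cmd, create_cmd]


def recreate_secret(name, namespace, literals):
    """Run the commands in order; check=True raises on the first failure."""
    for cmd in recreate_secret_commands(name, namespace, literals):
        subprocess.run(cmd, check=True)
```

With this approach, calling recreate_secret("keycloak-credentials", "eureka", {...}) a second time with changed values should replace the existing secret instead of failing because it already exists. Because check=True raises on the first failing command, a caller could also hang a cleanup step on that exception.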
GBV: 2025-01-29
If deployed with your new setup.sh, Keycloak does not fully start. It attempts (50 times) to log in as KEYCLOAK_ADMIN_USER, but this fails. We had to deploy Keycloak in development mode using /opt/keycloak/bin/kc.sh start-dev. In this mode, the login works, and KC_FOLIO_BE_ADMIN_CLIENT_ID is set up correctly.
In PROD-mode we tried to log in manually inside the container:
sh-5.1$ /opt/keycloak/bin/kcadm.sh config credentials --server http://localhost:8080 --realm master --user admin --password "${KEYCLOAK_ADMIN_PASSWORD}"
Logging into http://localhost:8080 as user admin of realm master
Failed to send request - Connect to localhost:8080 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused
But it worked in DEV-mode:
folman@k8s-node-01:~$ login2pod.sh keycloak-0 eureka
Defaulted container "keycloak" out of: keycloak, k8tz (init)
sh-5.1$ /opt/keycloak/bin/kcadm.sh config credentials --server http://keycloak:8080 --realm master --user admin --password "${KEYCLOAK_ADMIN_PASSWORD}"
Logging into http://keycloak:8080 as user admin of realm master
Back in PROD-mode, everything is fine (except the MeterFilters …):
Appending additional Java properties to JAVA_OPTS
2025-01-29 14:47:58,044 INFO [org.keycloak.common.Profile] (main) Preview features enabled: admin-fine-grained-authz:v1, scripts:v1, token-exchange:v1
2025-01-29 14:47:58,097 ERROR [org.keycloak.quarkus.runtime.configuration.mappers.PropertyMappers] (main) Hostname v1 options [proxy] are still in use, please review your configuration
2025-01-29 14:47:58,226 INFO [org.keycloak.common.Profile] (main) Preview features enabled: admin-fine-grained-authz:v1, scripts:v1, token-exchange:v1
2025-01-29 14:48:01,644 WARN [io.micrometer.core.instrument.composite.CompositeMeterRegistry] (main) A MeterFilter is being configured after a Meter has been registered to this registry. All MeterFilters should be configured before any Meters are registered. If that is not possible or you have a use case where it should be allowed, let the Micrometer maintainers know at https://github.com/micrometer-metrics/micrometer/issues/4920. Enable DEBUG level logging on this logger to see a stack trace of the call configuring this MeterFilter.
2025-01-29 14:48:01,756 INFO [org.keycloak.broker.provider.AbstractIdentityProviderMapper] (main) Registering class org.keycloak.broker.provider.mappersync.ConfigSyncEventListener
2025-01-29 14:48:01,815 INFO [org.keycloak.quarkus.runtime.storage.infinispan.CacheManagerFactory] (main) Starting Infinispan embedded cache manager
2025-01-29 14:48:01,921 INFO [org.infinispan.CONTAINER] (main) Virtual threads support enabled
2025-01-29 14:48:02,096 INFO [org.infinispan.CONTAINER] (main) ISPN000556: Starting user marshaller 'org.infinispan.commons.marshall.ImmutableProtoStreamMarshaller'
2025-01-29 14:48:02,268 WARN [org.jgroups.stack.Configurator] (main) JGRP000014: ThreadPool.thread_dumps_threshold has been deprecated: ignored
2025-01-29 14:48:02,285 INFO [org.infinispan.CLUSTER] (main) ISPN000078: Starting JGroups channel ISPN with stack jdbc-ping
2025-01-29 14:48:02,286 INFO [org.jgroups.JChannel] (main) local_addr: 29c18646-8fcb-4208-bb26-bb332087c918, name: keycloak-0-42083
2025-01-29 14:48:02,295 INFO [org.jgroups.protocols.FD_SOCK2] (main) server listening on *.57800
2025-01-29 14:48:02,332 INFO [org.jgroups.protocols.pbcast.GMS] (main) keycloak-0-42083: no members discovered after 35 ms: creating cluster as coordinator
2025-01-29 14:48:02,665 INFO [org.infinispan.CLUSTER] (main) ISPN000094: Received new cluster view for channel ISPN: [keycloak-0-42083|0] (1) [keycloak-0-42083]
2025-01-29 14:48:02,771 INFO [org.infinispan.CLUSTER] (main) ISPN000079: Channel ISPN local address is keycloak-0-42083, physical addresses are [10.42.7.208:7800]
2025-01-29 14:48:03,231 INFO [org.keycloak.connections.infinispan.DefaultInfinispanConnectionProviderFactory] (main) Node name: keycloak-0-42083, Site name: null
2025-01-29 14:48:04,075 WARN [io.agroal.pool] (main) Datasource '<default>': JDBC resources leaked: 3 ResultSet(s) and 0 Statement(s)
2025-01-29 14:48:04,233 INFO [io.quarkus] (main) Keycloak 26.1.0 on JVM (powered by Quarkus 3.15.2) started in 6.786s. Listening on: http://0.0.0.0:8080. Management interface listening on http://0.0.0.0:9000.
2025-01-29 14:48:04,233 INFO [io.quarkus] (main) Profile prod activated.
2025-01-29 14:48:04,234 INFO [io.quarkus] (main) Installed features: [agroal, cdi, hibernate-orm, jdbc-postgresql, keycloak, micrometer, narayana-jta, opentelemetry, reactive-routes, rest, rest-jackson, smallrye-context-propagation, smallrye-health, vertx]
Using a NodePort and NGINX as a reverse proxy, we finally got it online: https://keycloak.folio.gbv.de/ (you need a whitelisted IP to access it). To make this work, we had to adjust KC_HOSTNAME and KC_HOSTNAME_BACKCHANNEL to https://keycloak.folio.gbv.de/.
GBV: 2025-02-12
We have the mgr-apps in place. Unfortunately, the security setup does not work yet. If we set SECURITY_ENABLED=true for mgr-applications with EPHEMERAL, the container crashes. We guess the secrets must be stored in Vault, but the existing setup instructions give no hint on how to do that.
Suggestion: We tried to start the mgr-applications deployment but have failed so far. We currently assume that mgr-apps tries to retrieve a token from Keycloak and fails during what seems to be the bootstrap step triggered by KC_IMPORT_ENABLED=true. This produces the following error in the mgr-applications log:
ENV:11:16:15 ERROR SpringApplication Application run failed
jakarta.ws.rs.ProcessingException: jakarta.ws.rs.NotAuthorizedException: HTTP 401 Unauthorized
at org.jboss.resteasy.client.jaxrs.internal.ClientInvocation.filterRequest(ClientInvocation.java:652) ~[resteasy-client-6.2.9.Final.jar!/:6.2.9.Final]
at org.jboss.resteasy.client.jaxrs.internal.ClientInvocation.invoke(ClientInvocation.java:424) ~[resteasy-client-6.2.9.Final.jar!/:6.2.9.Final]
at org.jboss.resteasy.client.jaxrs.internal.proxy.ClientInvoker.invokeSync(ClientInvoker.java:134) ~[resteasy-client-6.2.9.Final.jar!/:6.2.9.Final]
at org.jboss.resteasy.client.jaxrs.internal.proxy.ClientInvoker.invoke(ClientInvoker.java:103) ~[resteasy-client-6.2.9.Final.jar!/:6.2.9.Final]
at org.jboss.resteasy.client.jaxrs.internal.proxy.ClientProxy.invoke(ClientProxy.java:102) ~[resteasy-client-6.2.9.Final.jar!/:6.2.9.Final]
at jdk.proxy2/jdk.proxy2.$Proxy288.findByClientId(Unknown Source) ~[?:?]
at org.folio.security.integration.keycloak.service.KeycloakImportService.getClientIfExists(KeycloakImportService.java:205) ~[folio-security-1.5.8.jar!/:1.5.8]
at org.folio.security.integration.keycloak.service.KeycloakImportService.importData(KeycloakImportService.java:57) ~[folio-security-1.5.8.jar!/:1.5.8]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) ~[?:?]
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) ~[?:?]
at java.base/java.lang.reflect.Method.invoke(Unknown Source) ~[?:?]
at org.springframework.context.event.ApplicationListenerMethodAdapter.doInvoke(ApplicationListenerMethodAdapter.java:365) ~[spring-context-6.1.13.jar!/:6.1.13]
at org.springframework.context.event.ApplicationListenerMethodAdapter.processEvent(ApplicationListenerMethodAdapter.java:237) ~[spring-context-6.1.13.jar!/:6.1.13]
at org.springframework.context.event.ApplicationListenerMethodAdapter.onApplicationEvent(ApplicationListenerMethodAdapter.java:168) ~[spring-context-6.1.13.jar!/:6.1.13]
at org.springframework.context.event.SimpleApplicationEventMulticaster.doInvokeListener(SimpleApplicationEventMulticaster.java:185) ~[spring-context-6.1.13.jar!/:6.1.13]
at org.springframework.context.event.SimpleApplicationEventMulticaster.invokeListener(SimpleApplicationEventMulticaster.java:178) ~[spring-context-6.1.13.jar!/:6.1.13]
at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:156) ~[spring-context-6.1.13.jar!/:6.1.13]
at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:452) ~[spring-context-6.1.13.jar!/:6.1.13]
at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:385) ~[spring-context-6.1.13.jar!/:6.1.13]
at org.springframework.boot.context.event.EventPublishingRunListener.ready(EventPublishingRunListener.java:109) ~[spring-boot-3.3.4.jar!/:3.3.4]
at org.springframework.boot.SpringApplicationRunListeners.lambda$ready$6(SpringApplicationRunListeners.java:80) ~[spring-boot-3.3.4.jar!/:3.3.4]
at java.base/java.lang.Iterable.forEach(Unknown Source) ~[?:?]
at org.springframework.boot.SpringApplicationRunListeners.doWithListeners(SpringApplicationRunListeners.java:118) ~[spring-boot-3.3.4.jar!/:3.3.4]
at org.springframework.boot.SpringApplicationRunListeners.doWithListeners(SpringApplicationRunListeners.java:112) ~[spring-boot-3.3.4.jar!/:3.3.4]
at org.springframework.boot.SpringApplicationRunListeners.ready(SpringApplicationRunListeners.java:80) ~[spring-boot-3.3.4.jar!/:3.3.4]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:349) [spring-boot-3.3.4.jar!/:3.3.4]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1363) [spring-boot-3.3.4.jar!/:3.3.4]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1352) [spring-boot-3.3.4.jar!/:3.3.4]
at org.folio.am.ApplicationManagerApplication.main(ApplicationManagerApplication.java:15) [!/:1.4.1]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) ~[?:?]
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) ~[?:?]
at java.base/java.lang.reflect.Method.invoke(Unknown Source) ~[?:?]
at org.springframework.boot.loader.launch.Launcher.launch(Launcher.java:102) [mgr-applications-fat.jar:1.4.1]
at org.springframework.boot.loader.launch.Launcher.launch(Launcher.java:64) [mgr-applications-fat.jar:1.4.1]
at org.springframework.boot.loader.launch.JarLauncher.main(JarLauncher.java:40) [mgr-applications-fat.jar:1.4.1]
Caused by: jakarta.ws.rs.NotAuthorizedException: HTTP 401 Unauthorized
at org.jboss.resteasy.client.jaxrs.internal.ClientInvocation.handleErrorStatus(ClientInvocation.java:238) ~[resteasy-client-6.2.9.Final.jar!/:6.2.9.Final]
at org.jboss.resteasy.client.jaxrs.internal.ClientInvocation.extractResult(ClientInvocation.java:216) ~[resteasy-client-6.2.9.Final.jar!/:6.2.9.Final]
at org.jboss.resteasy.client.jaxrs.internal.proxy.extractors.BodyEntityExtractor.extractEntity(BodyEntityExtractor.java:59) ~[resteasy-client-6.2.9.Final.jar!/:6.2.9.Final]
at org.jboss.resteasy.client.jaxrs.internal.proxy.ClientInvoker.invokeSync(ClientInvoker.java:136) ~[resteasy-client-6.2.9.Final.jar!/:6.2.9.Final]
at org.jboss.resteasy.client.jaxrs.internal.proxy.ClientInvoker.invoke(ClientInvoker.java:103) ~[resteasy-client-6.2.9.Final.jar!/:6.2.9.Final]
at org.jboss.resteasy.client.jaxrs.internal.proxy.ClientProxy.invoke(ClientProxy.java:102) ~[resteasy-client-6.2.9.Final.jar!/:6.2.9.Final]
at jdk.proxy2/jdk.proxy2.$Proxy271.grantToken(Unknown Source) ~[?:?]
at org.keycloak.admin.client.token.TokenManager.grantToken(TokenManager.java:95) ~[keycloak-admin-client-26.0.4.jar!/:26.0.4]
at org.keycloak.admin.client.token.TokenManager.getAccessToken(TokenManager.java:71) ~[keycloak-admin-client-26.0.4.jar!/:26.0.4]
at org.keycloak.admin.client.token.TokenManager.getAccessTokenString(TokenManager.java:66) ~[keycloak-admin-client-26.0.4.jar!/:26.0.4]
at org.keycloak.admin.client.resource.BearerAuthFilter.filter(BearerAuthFilter.java:52) ~[keycloak-admin-client-26.0.4.jar!/:26.0.4]
at org.jboss.resteasy.client.jaxrs.internal.ClientInvocation.filterRequest(ClientInvocation.java:644) ~[resteasy-client-6.2.9.Final.jar!/:6.2.9.Final]
The Keycloak side gives the following warning:
WARN [org.keycloak.events] (executor-thread-8) type="CLIENT_LOGIN_ERROR", realmId="dc4df79e-3491-48ba-9a06-5179cb1c4a06", realmName="master", clientId="folio-backend-admin-client", userId="null", ipAddress="10.42.3.212", error="invalid_client_credentials", grant_type="client_credentials"
This suggests that the provided credentials are wrong. However, the same credentials can be used with curl to successfully get a token. Trying to use username/password and grant_type=password results in the same problem, even with the Keycloak admin user. We also compared the Keycloak settings for the folio-backend-admin-client with the eureka-platform-bootstrap for single-server backend-user and found them to be identical. Comparing the token from our Kubernetes deployment with the eureka-platform-bootstrap token reveals no difference between the two. We therefore rule out that the credentials or the token itself are at fault here. To rule out leftover misconfiguration on the database side from previous attempts, we restarted the whole process on a clean database for Keycloak and mgr-apps, with the same results.
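For anyone who wants to repeat the manual token check outside of mgr-applications, the client_credentials request can be sketched in Python as below. This is an illustration under assumptions, not the exact curl call we ran: the function names are hypothetical, and the base URL and client secret must come from your own deployment. The token endpoint path follows the /realms/<realm>/protocol/openid-connect/ layout that also appears in the certs URL elsewhere on this page.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(base_url, realm, client_id, client_secret):
    """Build (but do not send) a client_credentials token request."""
    url = f"{base_url.rstrip('/')}/realms/{realm}/protocol/openid-connect/token"
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    # Supplying a body makes urllib send this request as a POST.
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def fetch_access_token(base_url, realm, client_id, client_secret, timeout=10):
    """Send the request and return the access_token field of the JSON reply.

    urllib raises HTTPError for a 401, which corresponds to the
    invalid_client_credentials event on the Keycloak side.
    """
    req = build_token_request(base_url, realm, client_id, client_secret)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["access_token"]
```

Running fetch_access_token from a pod in the same namespace as mgr-applications, with the same client ID and secret, is one way to check whether the network path or the secret actually delivered to the pod differs from what curl uses.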
Stanford University: Aug/Sept 2025
When entitling platform-complete we ran into several issues related to modules' system users, specifically around creating the user in okapi db mod_users.users and keycloak db user_entity. The READMEs for these troublesome modules do not include information about setting environment variables to turn off system user creation. We use the folio-helm-v2 charts to deploy and found that we needed to add the following override:
extraJavaOpts:
  - "-XX:MaxRAMPercentage=75.0"
  - "-DSYSTEM_USER_ENABLED=false"
to these modules (these were NOT documented on the Ramsons release notes pages):
mod-entities-links
mod-lists
Our sidecars couldn’t talk to the secret store we are using, Vault. Using the folio-helm-v2 charts, we had to add the following override values:
sidecarContainers:
  eureka:
    extraEnvVars:
      - name: SECRET_STORE_VAULT_TOKEN
        value: ${password}
      - name: SECRET_STORE_VAULT_ADDRESS
        value: ${url}
It would be beneficial for the community if the folio-helm-v2 _sidecar.tpl file allowed for valueFrom.secretKeyRef.
Some of the endpoints in mgr-tenant-entitlement just plain don’t work (or our issuer in the JWT didn’t match what was expected). We tried reinstall/applications and reinstall/modules and they didn’t work, so we had to delete entitlements to redo some steps.
There was a lot of brokenness around the users in Vault and the entitlement process trying to create them in Keycloak. When we asked the folio-eureka-ea-agent AI agent about “The request to the mgr-* module endpoints (/tenant, /applications, /entitlements) DO work, but the requests to the mod-*-keycloak endpoints (/roles, /users) DO NOT work.” it replied:
## **Summary**
**mgr-* endpoints** = Management/administrative operations with relaxed auth
**mod-*-keycloak endpoints** = Business operations requiring user identity and capabilities
**Your current approach works perfectly** for mgr-* endpoints
**For mod-*-keycloak**, you need **user-based authentication** with **proper capabilities**
The “system” users were not created in Keycloak when we entitled app-platform-minimal. They made it to Vault but not Keycloak. However, the capabilities for *-keycloak modules DID get created. To get past some of these issues, we ended up creating these users by hand in Keycloak. We will see what happens when we start over with the Eureka deployment after clearing out the Keycloak db.
General observation regarding authorization: the AI agent said:
## **Root Cause: Different Authentication Requirements**
The issue is **NOT with your token or configuration** - it's due to **fundamentally different authentication architectures** between these module types:
### **mgr-* Module Endpoints (Working)**
- **mgr-applications**, **mgr-tenants**, **mgr-tenant-entitlements**
- These are **management modules** with **relaxed authentication**
- **GET requests** work **without authorization headers**
- **POST requests** require Authorization tokens (as of May 12, 2025)
- Use **simple token validation**
### **mod-*-keycloak Endpoints (Failing)**
- **mod-roles-keycloak**, **mod-users-keycloak**
- These are **business logic modules** with **strict authentication**
- **ALL requests** require **proper tenant context and user permissions**
- Require **specific FOLIO capability-based authorization**
- Need **user identity validation** beyond just token validation
I have to wonder how this is better than securing the supertenant with Okapi. Shouldn’t the POST requests to mgr modules require authentication in Keycloak and have specific roles for the realm?
Stanford University: Oct 2025
When we try to login to the FOLIO UI with Eureka backend, we see the following error in the mod-login-keycloak logs:
Caused by: org.jose4j.lang.UnresolvableKeyException: SRJWT07003: Failed to load a key from https://keycloak-folio-dev.stanford.edu/realms/master/protocol/openid-connect/certs
2025-10-01T09:37:00.068-07:00 at io.smallrye.jwt.auth.principal.AbstractKeyLocationResolver.reportLoadKeyException(AbstractKeyLocationResolver.java:218)
2025-10-01T09:37:00.068-07:00 at io.smallrye.jwt.auth.principal.KeyLocationResolver.<init>(KeyLocationResolver.java:49)
2025-10-01T09:37:00.068-07:00 at io.smallrye.jwt.auth.principal.DefaultJWTTokenParser.getVerificationKeyResolver(DefaultJWTTokenParser.java:275)
2025-10-01T09:37:00.068-07:00 at io.smallrye.jwt.auth.principal.DefaultJWTTokenParser.parseClaims(DefaultJWTTokenParser.java:120)
2025-10-01T09:37:00.068-07:00 ... 151 more
2025-10-01T09:37:00.068-07:00 Caused by: java.net.SocketException: Connection reset
Our Kubernetes cluster does not have hairpin mode turned on, so requests from the sidecar to the Keycloak FQDN to check the public key don’t resolve. There is a question of whether the public key could be mounted into the sidecars, but the code that checks the cert parses the JWT for the issuer URL and appends /protocol/openid-connect/certs to it in order to look up the public key. For requests coming from outside the Kubernetes cluster, e.g. the client’s browser using the FOLIO UI, it makes sense that the issuer URL is the Keycloak FQDN. It’s still an open question in my mind (@Shelley Doljack) whether the sidecars should be able to use the Keycloak Kubernetes service to resolve the cert instead of having to go out to the internet to do so. I wonder what the security implications would be if the sidecars didn’t use the issuer URL from the JWT originating from the FOLIO UI to check the cert.
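To make the lookup described above concrete, here is a minimal Python sketch of how a key URL can be derived from a token's issuer claim. It is not the sidecar's actual code: the function name is hypothetical, and the sketch deliberately skips signature verification because it only shows where the certs request ends up going.

```python
import base64
import json


def jwks_url_from_token(token):
    """Derive the key-set URL from the (unverified) iss claim of a JWT."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a compact JWT with three segments")
    payload_b64 = parts[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    issuer = claims["iss"].rstrip("/")
    return f"{issuer}/protocol/openid-connect/certs"
```

Because the URL comes from the token itself, a token issued through the public Keycloak FQDN sends the key lookup to that same FQDN, even when the caller runs inside the cluster next to the Keycloak service.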
Being able to configure Keycloak with a backchannel URL, so that certificate requests can go through an internal network service instead of the public FQDN, seems to be contentious within the wider community since it breaks the OpenID Connect specification: https://github.com/keycloak/keycloak/issues/27660#issuecomment-2202611587
Hairpin mode is actually enabled by default for our Kubernetes cluster, and the problem occurred anyway. After reading further about hairpin mode, turning it on wouldn’t have fixed the problem anyway, because it applies to a request coming back in on the same interface; in other words, the request to the service would have to come from the same pod that serves it (e.g. Keycloak making a request to itself). The sidecar containers run in different module pods in the cluster, so hairpin mode doesn’t come into play at all.
The real problem was with the F5 load balancer in front of the NGINX ingress controller. We had to enable SNAT mode on the load balancer for the network. After doing so, the sidecars were able to verify the cert at the keycloak FQDN when logging in as eureka_admin via the UI.