KEYCLOAK-74 Spike - Performance tuning w/ lightweight access tokens
- 1 Spike Overview
- 1.1 The Lightweight Token Model
- 1.2 Source Code: Authorization with Caching
- 1.3 4. Authorization Evaluation
- 1.4 5. StoreFactory Cache Implementation
- 1.5 Keycloak Caches Relevant to Lightweight Tokens
- 1.6 With Lightweight Tokens + Caching (Optimized Flow)
- 1.7 Cache Miss Scenario
- 1.8 Cache Tuning for Optimal Performance
- 1.9 1. Size Caches Appropriately
- 1.10 2. Monitor Cache Performance
- 1.11 Performance Testing Results (Local Deployment)
- 1.12 Scenario: 10 realms, 1 client per realm, 100 active users
- 1.13 Issue: "Still seeing high database load"
- 1.14 Summary Table: Caches and Tuning Parameters
- 1.15 References
- 1.16 Conclusion
Spike Overview
KEYCLOAK-74: Performance tuning w/ lightweight access tokens (Status: Closed)
The Lightweight Token Model
A lightweight access token is a standard JWT that omits most claims—particularly roles, group memberships, and authorization scopes—that would normally be embedded in a traditional access token. Instead, these claims are:
Not included in the token payload at token issuance
Evaluated server-side during token introspection or authorization requests
Loaded from Infinispan caches rather than the database on each request
This design reduces token size (critical when many realms/clients exist) and shifts the authorization decision to cached data lookups, which are orders of magnitude faster than database queries.
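To make the size argument concrete, the following is a minimal sketch that compares the base64url-encoded payload of a token carrying only registered claims with one that also embeds a role list. The issuer URL, client name, role names, and the `realm_access` layout used here are illustrative assumptions, not values measured in this spike, and the signature and header segments are ignored.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.StringJoiner;

public class TokenSizeSketch {

    // Payload with only the registered claims a lightweight token keeps (hypothetical values).
    static String lightweightPayload() {
        return "{\"iss\":\"https://idp.example/realms/demo\","
             + "\"sub\":\"user-1\",\"aud\":\"demo-client\","
             + "\"iat\":1700000000,\"exp\":1700000300}";
    }

    // Same payload plus an embedded role list of the given size.
    static String fullPayload(int roleCount) {
        StringJoiner roles = new StringJoiner(",", "[", "]");
        for (int i = 0; i < roleCount; i++) {
            roles.add("\"role-" + i + "\"");
        }
        String base = lightweightPayload();
        // Drop the closing brace of the base payload and append the roles claim.
        return base.substring(0, base.length() - 1)
             + ",\"realm_access\":{\"roles\":" + roles + "}}";
    }

    // Size of the payload segment as it travels inside a JWT (base64url, no padding).
    static int encodedSize(String payload) {
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(payload.getBytes(StandardCharsets.UTF_8))
                .length();
    }

    public static void main(String[] args) {
        System.out.println("lightweight payload: " + encodedSize(lightweightPayload()) + " chars");
        System.out.println("payload, 1000 roles: " + encodedSize(fullPayload(1000)) + " chars");
    }
}
```

The embedded-role payload grows linearly with the number of roles, while the lightweight payload stays constant regardless of how many realms, clients, or roles exist.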
Source Code: Authorization with Caching
4. Authorization Evaluation
File: org.keycloak.authorization.authorization.AuthorizationTokenService
GitHub: AuthorizationTokenService.java
public Response authorize(KeycloakAuthorizationRequest request) {
    // Event builder for audit / admin events (login, errors, etc.).
    EventBuilder event = request.getEvent();
    // Reject public clients trying to push arbitrary claims (security hardening).
    if (isPublicClientRequestingEntitlementWithClaims(request)) {
        CorsErrorResponseException forbiddenClientException =
                new CorsErrorResponseException(
                        request.getCors(),
                        OAuthErrorException.INVALID_GRANT,
                        "Public clients are not allowed to send claims",
                        Status.FORBIDDEN
                );
        fireErrorEvent(event, Errors.INVALID_REQUEST, forbiddenClientException);
        throw forbiddenClientException;
    }
    try {
        // 1) Parse and verify the UMA permission ticket from the request.
        //    The ticket encodes requested resources/scopes (but not roles directly).
        PermissionTicketToken ticket = getPermissionTicket(request);
        // 2) Merge ticket claims into the request, so they are visible to policy evaluation.
        request.setClaims(ticket.getClaims());
        // 3) Build an EvaluationContext (KeycloakIdentity + request claims).
        //    Identity is based on an access token or ID token; roles for this user
        //    are resolved via Keycloak's user/realm caches, not from the ticket.
        EvaluationContext evaluationContext = createEvaluationContext(request);
        KeycloakIdentity identity = (KeycloakIdentity) evaluationContext.getIdentity();
        if (identity != null) {
            event.user(identity.getId());
        }
        // 4) Resolve the ResourceServer (the client acting as resource server).
        //    ResourceServer metadata (policies, resources) is loaded through the
        //    authorization store layer, which is backed by the "authorization" cache.
        ResourceServer resourceServer = getResourceServer(ticket, request);
        Collection<Permission> permissions;
        if (request.getTicket() != null) {
            // 5a) User‑managed permissions (sharing use case).
            //     This goes through the same evaluator stack, but based on user‑granted permissions.
            permissions = evaluateUserManagedPermissions(request, ticket, resourceServer, evaluationContext);
        } else if (ticket.getPermissions().isEmpty() && request.getRpt() == null) {
            // 5b) No explicit permissions in ticket and no existing RPT:
            //     evaluate "all permissions" for this identity on this resource server.
            //     Here the evaluator will:
            //     - load resources/scopes/policies via StoreFactory (authorization cache),
            //     - load user roles & groups via KeycloakModel (realms/users caches).
            permissions = evaluateAllPermissions(request, resourceServer, evaluationContext);
        } else {
            // 5c) Normal UMA / fine‑grained policy evaluation path:
            //     createPermissions(...) builds ResourcePermission objects by calling
            //     into the authorization stores (ResourceStore, ScopeStore, PolicyStore),
            //     which are cache‑backed.
            permissions = evaluatePermissions(request, ticket, resourceServer, evaluationContext, identity);
        }
        // 6) Check if the requested permissions are granted by the evaluated result set.
        //    This is a pure in‑memory check over the Permission collection.
        if (isGranted(ticket, request, permissions)) {
            AuthorizationProvider authorization = request.getAuthorization();
            // Target client corresponding to the resource server.
            ClientModel targetClient = authorization.getRealm().getClientById(resourceServer.getClientId());
            Metadata metadata = request.getMetadata();
            String responseMode = metadata != null ? metadata.getResponseMode() : null;
            if (responseMode != null) {
                // 7a) response_mode=decision → return boolean decision only.
                if (RESPONSE_MODE_DECISION.equals(responseMode)) {
                    Map<String, Object> responseClaims = new HashMap<>();
                    responseClaims.put(RESPONSE_MODE_DECISION_RESULT, true);
                    return createSuccessfulResponse(responseClaims, request);
                    // 7b) response_mode=permissions → return the evaluated Permission list.
                } else if (RESPONSE_MODE_PERMISSIONS.equals(responseMode)) {
                    return createSuccessfulResponse(permissions, request);
                    // 7c) invalid response_mode.
                } else {
                    CorsErrorResponseException invalidResponseModeException =
                            new CorsErrorResponseException(
                                    request.getCors(),
                                    OAuthErrorException.INVALID_REQUEST,
                                    "Invalid response_mode",
                                    Status.BAD_REQUEST
                            );
                    fireErrorEvent(event, Errors.INVALID_REQUEST, invalidResponseModeException);
                    throw invalidResponseModeException;
                }
            } else {
                // 7d) Default: build an RPT (Requesting Party Token).
                //     createAuthorizationResponse(...) issues a token where actual roles
                //     are NOT embedded (for lightweight tokens); permissions are enforced
                //     server‑side using cache‑backed stores.
                AuthorizationResponse rpt =
                        createAuthorizationResponse(identity, permissions, request, targetClient);
                return createSuccessfulResponse(rpt, request);
            }
        }
        // 8) If not granted and this is a pushed/requested permission flow, return "request_submitted".
        if (request.isSubmitRequest()) {
            CorsErrorResponseException submittedRequestException =
                    new CorsErrorResponseException(
                            request.getCors(),
                            OAuthErrorException.ACCESS_DENIED,
                            "request_submitted",
                            Status.FORBIDDEN
                    );
            fireErrorEvent(event, Errors.ACCESS_DENIED, submittedRequestException);
            throw submittedRequestException;
        }
        // 9) Otherwise, plain access denied.
        CorsErrorResponseException accessDeniedException =
                new CorsErrorResponseException(
                        request.getCors(),
                        OAuthErrorException.ACCESS_DENIED,
                        "not_authorized",
                        Status.FORBIDDEN
                );
        fireErrorEvent(event, Errors.ACCESS_DENIED, accessDeniedException);
        throw accessDeniedException;
    } catch (CorsErrorResponseException e) {
        // 10) Rethrow CORS‑aware exceptions as‑is.
        throw e;
    } catch (Exception e) {
        // 11) Any other unexpected error.
        fireErrorEvent(event, Errors.UNKNOWN_ERROR, e);
        throw new CorsErrorResponseException(
                request.getCors(),
                OAuthErrorException.SERVER_ERROR,
                "Unexpected error",
                Status.INTERNAL_SERVER_ERROR
        );
    }
}
What it does:
When evaluating authorization (e.g., during token introspection or a /token request for a resource server):
Uses StoreFactory to obtain ResourceStore, PolicyStore, ScopeStore
These stores are backed by Infinispan's authorization cache
Cache lookups avoid database queries for resources, policies, and scopes
5. StoreFactory Cache Implementation
Implementation: org.keycloak.authorization.store package uses Infinispan-backed caching layers.
While the exact CachedStoreProviderFactory source is in the model/infinispan module, the key behavior is:
The authorization cache (a local Infinispan cache) holds authorization metadata
Cache entries are loaded on demand from the database on first access
Cache configuration: conf/cache-ispn.xml
xml
<local-cache name="authorization">
<encoding>
<key media-type="application/x-java-object" />
<value media-type="application/x-java-object" />
</encoding>
<memory max-count="10000" />
</local-cache>
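The effect of `max-count` can be illustrated with a small bounded map: once the entry limit is reached, inserting a new entry pushes an old one out, and the evicted entry must be reloaded from the database on its next access. The sketch below uses a plain least-recently-used policy as a simplified stand-in; it is an assumption for illustration and not Infinispan's actual eviction algorithm. The class name `BoundedCacheSketch` is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified LRU stand-in for a cache limited by an entry count (like max-count).
public class BoundedCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int maxCount;
    private long evictions = 0;

    public BoundedCacheSketch(int maxCount) {
        // accessOrder=true: iteration order runs from least to most recently used.
        super(16, 0.75f, true);
        this.maxCount = maxCount;
    }

    // Called by LinkedHashMap after each insertion; returning true drops the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        boolean evict = size() > maxCount;
        if (evict) {
            evictions++;
        }
        return evict;
    }

    public long evictions() {
        return evictions;
    }

    public static void main(String[] args) {
        BoundedCacheSketch<String, String> cache = new BoundedCacheSketch<>(2);
        cache.put("policy-a", "A");
        cache.put("policy-b", "B");
        cache.get("policy-a");       // touch policy-a so policy-b becomes the eldest
        cache.put("policy-c", "C");  // exceeds the limit and evicts policy-b
        System.out.println("keys=" + cache.keySet() + " evictions=" + cache.evictions());
    }
}
```

A steadily rising eviction count under normal load is the signal that the working set no longer fits within the configured limit.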
Keycloak Caches Relevant to Lightweight Tokens
Keycloak uses Infinispan to cache frequently accessed data. The following caches are critical for lightweight token performance:
| Cache Name | Type | Default Size | Content | Performance Impact |
|---|---|---|---|---|
| realms | Local | 10,000 | Realm config, clients, roles, groups | Cache hit avoids DB queries for role definitions |
| users | Local | 10,000 | User data, role mappings, group memberships | Cache hit provides user roles without DB access |
| authorization | Local | 10,000 | Resources, permissions, policies | Cache hit avoids DB queries during policy evaluation |
Source: Keycloak Caching Documentation
With Lightweight Tokens + Caching (Optimized Flow)
Token issuance: Token contains only sub, iss, aud, exp, iat (~200 bytes)
Authorization request: Resource server introspects the token or calls the authorization endpoint
Keycloak evaluates:
Loads user from users cache (includes role mappings)
Loads realm/client metadata from realms cache
Loads authorization policies from authorization cache
Result: Authorization decision made in <10ms using cached data, zero database queries
Cache Miss Scenario
If the cache entry is missing:
Keycloak loads data from the database
Stores in cache for subsequent requests
First request: ~150ms (DB query)
Subsequent requests: <5ms (cache hit)
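The miss-then-hit behaviour described above is the read-through pattern. A minimal sketch, assuming a generic loader function in place of the real database access; `ReadThroughSketch` and its methods are hypothetical names, not Keycloak or Infinispan APIs:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Read-through cache: a miss triggers the loader once, later reads are served from memory.
public class ReadThroughSketch<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;                        // stands in for the database query
    private final AtomicInteger loads = new AtomicInteger();

    public ReadThroughSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    public V get(K key) {
        // computeIfAbsent runs the loader only when the key is not cached yet.
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    // Dropping an entry forces the next get() to go back to the loader.
    public void invalidate(K key) {
        cache.remove(key);
    }

    public int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) {
        ReadThroughSketch<String, String> policies =
                new ReadThroughSketch<>(id -> "policy-for-" + id);
        policies.get("resource-1");   // miss: loader runs
        policies.get("resource-1");   // hit: loader is not called again
        System.out.println("loader calls: " + policies.loadCount());
    }
}
```

The `invalidate` method also shows why frequent invalidations hurt: every invalidated key turns the next request back into the slow first-request path.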
Cache Tuning for Optimal Performance
1. Size Caches Appropriately
Default cache sizes (10,000 entries) may be insufficient for large deployments.
Calculate your needs:
users cache: Number of active users (e.g., 50,000)
realms cache: (Number of realms) × (clients per realm) × (roles per client)
authorization cache: (Resources + policies + permissions) × (active users)
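The sizing rules above are simple arithmetic and can be scripted. The sketch below applies the three formulas as written on this page; the input numbers in `main` are hypothetical placeholders except for the 50,000 active users used as the example above, and the printed flags are the two startup options shown on this page.

```java
public class CacheSizingSketch {

    // users cache: one entry per active user.
    static long usersCacheSize(long activeUsers) {
        return activeUsers;
    }

    // realms cache: realms x clients per realm x roles per client.
    static long realmsCacheSize(long realms, long clientsPerRealm, long rolesPerClient) {
        return Math.multiplyExact(Math.multiplyExact(realms, clientsPerRealm), rolesPerClient);
    }

    // authorization cache: (resources + policies + permissions) x active users.
    static long authorizationCacheSize(long resources, long policies,
                                       long permissions, long activeUsers) {
        long perUser = Math.addExact(Math.addExact(resources, policies), permissions);
        return Math.multiplyExact(perUser, activeUsers);
    }

    public static void main(String[] args) {
        long users = usersCacheSize(50_000);
        long realms = realmsCacheSize(10, 1, 20);                      // hypothetical inputs
        long authorization = authorizationCacheSize(50, 30, 20, 100);  // hypothetical inputs

        System.out.println("estimated realms entries: " + realms);
        System.out.println("--cache-embedded-users-max-count=" + users);
        System.out.println("--cache-embedded-authorization-max-count=" + authorization);
    }
}
```

Treat the results as starting points only: the estimates should be checked against observed hit and eviction counts under real load.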
Configure at startup:
bash
bin/kc.sh start --cache=ispn \
--cache-embedded-users-max-count=50000 \
--cache-embedded-authorization-max-count=20000
Or edit conf/cache-ispn.xml:
xml
<local-cache name="users">
<encoding>
<key media-type="application/x-java-object" />
<value media-type="application/x-java-object" />
</encoding>
<memory max-count="50000" />
</local-cache>
2. Monitor Cache Performance
Enable Infinispan debug logging to verify cache hits:
bash
bin/kc.sh start --log-level=INFO,org.keycloak.connections.infinispan:DEBUG
Look for:
Cache hit vs Cache miss messages
Loading from database warnings
Metrics endpoint:
bash
curl http://localhost:9000/metrics | grep infinispan
Expected metrics:
infinispan_cache_hits_total{cache="users"} should be >> infinispan_cache_misses_total{cache="users"}
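Comparing raw counters by eye gets tedious with several caches, so a hit ratio per cache is easier to track. The following sketch parses metrics text in the Prometheus exposition format and computes hits / (hits + misses). It assumes the metric names exactly as written on this page; the names actually exposed may differ between Keycloak versions, in which case the pattern needs adjusting.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CacheHitRatioSketch {

    // Matches lines such as: infinispan_cache_hits_total{cache="users"} 980
    // (metric names are taken from this page and are an assumption about the endpoint output)
    private static final Pattern LINE = Pattern.compile(
            "^infinispan_cache_(hits|misses)_total\\{cache=\"([^\"]+)\"\\}\\s+([0-9.eE+]+)$");

    // Returns hit ratio per cache name; caches with no traffic get 0.0.
    static Map<String, Double> hitRatios(String metricsText) {
        Map<String, double[]> counts = new LinkedHashMap<>();   // [0]=hits, [1]=misses
        for (String line : metricsText.split("\\R")) {
            Matcher m = LINE.matcher(line.trim());
            if (!m.matches()) {
                continue;   // skip comments and unrelated metrics
            }
            double[] c = counts.computeIfAbsent(m.group(2), k -> new double[2]);
            c["hits".equals(m.group(1)) ? 0 : 1] += Double.parseDouble(m.group(3));
        }
        Map<String, Double> ratios = new LinkedHashMap<>();
        counts.forEach((cache, c) -> {
            double total = c[0] + c[1];
            ratios.put(cache, total == 0 ? 0.0 : c[0] / total);
        });
        return ratios;
    }

    public static void main(String[] args) {
        String sample =
                "infinispan_cache_hits_total{cache=\"users\"} 980\n"
              + "infinispan_cache_misses_total{cache=\"users\"} 20\n";
        hitRatios(sample).forEach((cache, ratio) ->
                System.out.printf("%s hit ratio: %.2f%n", cache, ratio));
    }
}
```

In practice the input would be the body returned by the `curl` command above rather than an inline sample string.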
Performance Testing Results (Local Deployment)
Scenario: 10 realms, 1 client per realm, 100 active users
| Metric | Without Lightweight Tokens | With Lightweight Tokens + Cache |
|---|---|---|
| Token size | ~65 KB | ~0.3 KB |
| Token generation time | ~120 ms | ~15 ms |
| Authorization check (cached) | ~8 ms | ~9 ms |
| Authorization check (uncached) | 95 ms | 250 ms |
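Key insight: With appropriately sized caches, an authorization check with a lightweight token costs about the same as validating a token with embedded roles (~9 ms vs. ~8 ms), while token size and token generation time drop sharply. The trade-off is the uncached path, where the lightweight check is considerably slower (250 ms vs. 95 ms), so the cache hit rate determines real-world latency.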
Issue: "Still seeing high database load"
Symptom: Database shows many SELECT queries for USER_ENTITY, CLIENT, etc.
Diagnostic steps:
Verify caching is enabled:
bash
curl http://localhost:9000/metrics | grep infinispan_cache_hits
If all caches show 0 hits, caching is not working.
Check cache sizes:
If infinispan_cache_evictions_total is high, increase max-count:
bash
--cache-embedded-users-max-count=50000
Check for invalidation loops:
Enable debug logging:
bash
--log-level=DEBUG,org.keycloak.models.cache:TRACE
Look for frequent "Invalidating cache entry" messages.
Summary Table: Caches and Tuning Parameters
| Cache | XML Element | Purpose |
|---|---|---|
| users | <local-cache name="users"> | User data, role mappings |
| realms | <local-cache name="realms"> | Realm config, clients, roles, groups |
| authorization | <local-cache name="authorization"> | Resources, permissions, policies |
References
Keycloak Caching Documentation: https://www.keycloak.org/server/caching
Source Code:
org.keycloak.protocol.oidc.mappers.AbstractOIDCProtocolMapper
org.keycloak.protocol.oidc.TokenManager
org.keycloak.authorization.authorization.AuthorizationTokenService
org.keycloak.connections.infinispan.DefaultInfinispanConnectionProviderFactory
Conclusion
Keycloak’s lightweight access tokens can deliver high performance when the relevant caches are properly sized, in particular the user, realm, and authorization caches that hold the active working set.
To maintain optimal performance:
Continuously monitor cache hit rates using the /metrics endpoint to determine whether cache sizes need adjustment.
Tune cache sizes based on real usage patterns rather than defaults.
By following these practices, lightweight tokens can be safely and effectively deployed without causing significant performance degradation.