Kubernetes Example Deployment
Overview
Before you start, we highly recommend reading the Folio Eureka Platform Overview document
to become familiar with the main concepts of the new platform.
Setting Up the Environment
Prerequisites:
Kubernetes Cluster (system for automating deployment, scaling, and management of containerized applications)
PostgreSQL (RDBMS used by Keycloak, Kong Gateway, Eureka modules)
Apache Kafka (distributed event streaming platform)
HashiCorp Vault (identity-based secret and encryption management system)
Keycloak (Identity and Access Management)
Kong Gateway (API Gateway)
MinIO (enterprise object store built for production environments; OPTIONAL)
Elasticsearch or OpenSearch (enterprise-grade search and observability suite)
MinIO is an implementation of object storage compatible with the AWS S3 service.
It also works the other way around: instead of MinIO, you are free to use the AWS S3 service.
To set up the Eureka Platform you should already have a Kubernetes cluster installed. Then create a new namespace within the cluster to assign and manage resources granularly for your Eureka deployment.
You can host your cluster nodes on premises in a local data center or use whichever cloud provider (e.g. AWS, Azure, GCP) suits you best to meet planned or unplanned resource demand.
The Eureka Platform depends on a number of third-party services (listed above) to operate as expected. Some of these services (PostgreSQL, Apache Kafka, OpenSearch, HashiCorp Vault) can be deployed as standalone services outside the cluster namespace, while the others are almost never deployed outside it.
For an initial Eureka deployment you will need about 30 GB of RAM. Such a setup incorporates all the mentioned third-party services in one Kubernetes namespace.
Extra resources (RAM, CPU, disk space, disk IOPS) may need to be assigned to the destination Kubernetes cluster if the prerequisite services are deployed into the same cluster namespace.
A Consortia deployment also needs extra resources to be assigned.
If you decide to keep everything in one place, pay attention to the disk IOPS required by the PostgreSQL, OpenSearch and Apache Kafka services.
PostgreSQL should be installed into the cluster namespace first, since it is a prerequisite for Kong Gateway and Keycloak.
Apache Kafka is used by Eureka for internal communication between modules, so it is very important to keep it healthy.
HashiCorp Vault stores all secrets used within the platform. AWS SSM parameters are now also supported as secret storage.
Keycloak provides authentication and authorization (granting access) for all kinds of identities (users, roles, endpoints).
Kong Gateway, as the API gateway, routes requests to modules and provides access to the Eureka REST APIs.
MinIO object storage keeps data that some modules use during platform operation.
The Elasticsearch instance holds a huge amount of information and indexes it for fast search. It is very important to maintain an appropriate level of performance for this service. It can also be installed outside the Kubernetes cluster.
Expected Prerequisites deployment order:
HashiCorp Vault
PostgreSQL
Apache Kafka
Elasticsearch
MinIO (Optional)
Kong Gateway
Keycloak Identity Manager
Cluster setup
Let's assume you are going to set up a Eureka Platform development environment on a Kubernetes cluster. To scale resources easily during workload spikes, it is worth using managed cloud services such as EKS (AWS), AKS (Azure) or GKE (GCP).
At the same time, to limit cloud vendor lock-in and cut down expenses, we are going to deploy all prerequisite services into a single cluster namespace, except for the OpenSearch instance :)
To deploy the prerequisite services we recommend the following container (Docker) images and Helm charts:
PostgreSQL container Image: hub.docker.com/bitnami/postgresql, Helm Chart: github.com/bitnami/charts/postgresql
Apache Kafka container Image: hub.docker.com/bitnami/kafka, Helm Chart: github.com/bitnami/charts/kafka
HashiCorp Vault container Image: hub.docker.com/bitnami/vault, Helm Chart: github.com/bitnami/charts/vault
Keycloak container Image: hub.docker.com/folioci/folio-keycloak, Helm Chart: github.com/bitnami/charts/keycloak, values for values.yaml: github.com/folio-org/pipelines-shared-library/…/keycloak.tf, Git Repository github.com/folio-org/folio-keycloak
Kong Gateway container Image: hub.docker.com/folioci/folio-kong, Helm Chart: charts/bitnami/kong, Git Repository github.com/folio-org/folio-kong
MinIO container Image: hub.docker.com/bitnami/minio, Helm Chart: github.com/bitnami/charts/minio
We also need a Module Descriptors Registry in place.
The Module Descriptors Registry (MDR) service is an HTTP server running in a Kubernetes Pod.
Alternatively, this service can be hosted as a static website on Amazon S3.
This HTTP server holds and distributes module descriptors for installing and updating a Eureka instance.
A module descriptor (see Module Descriptor Template) is generated during the Continuous Integration flow and is put into the Module Descriptors Registry when the flow finishes.
These module descriptors are used by the Eureka install and update flows.
Deploying EUREKA on Kubernetes
Once all prerequisites are met, we can proceed with deploying the mgr-* Eureka modules to the cluster namespace:
mgr-applications module:
Github Repository folio-org/mgr-applications
Container Image folioci/mgr-applications
Helm Chart charts/mgr-applications
Helm Chart variable values (./values/mgr-applications.yaml file below):
mgr-tenant-entitlements module:
Github Repository folio-org/mgr-tenant-entitlements
Container Image folioci/mgr-tenant-entitlements
Helm Chart charts/mgr-tenant-entitlements
Helm Chart variable values (./values/mgr-tenant-entitlements.yaml file below):
mgr-tenants module:
Github Repository folio-org/mgr-tenants
Container Image folioci/mgr-tenants
Helm Chart charts/mgr-tenants
Helm Chart variable values (./values/mgr-tenants.yaml file below):
Deploy mgr-* applications to Kubernetes Cluster:
Eureka deployment flow:
Get Master Auth token from Keycloak.
To run administrative REST API requests against the Eureka instance, we first need to get a Master Access Token from Keycloak.
We need to know the request parameters first (consider adapting the following example):
Keycloak FQDN: keycloak.example.org
Token Service Endpoint: /realms/master/protocol/openid-connect/token
Client ID: folio-backend-admin-client (this is the expected value and should not be changed)
Client Secret: SecretPhrase
Grant Type: client_credentials (constant)
We need to save the returned Master Access Token and pass it with every administrative REST API call later.
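The token request described above can be sketched in Python with the standard library only. This is a minimal illustration, not the project's own tooling: the function names are hypothetical, the `https://` scheme is an assumption about how Keycloak is exposed, and the reply is assumed to be the usual OpenID Connect JSON document with an `access_token` field.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(keycloak_fqdn, client_id, client_secret):
    """Build (but do not send) a client_credentials token request."""
    # Assumption: Keycloak is reachable over HTTPS at the given FQDN.
    url = f"https://{keycloak_fqdn}/realms/master/protocol/openid-connect/token"
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def fetch_master_token(request):
    """Send the request and return the access_token field of the JSON reply."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["access_token"]


# Usage with the example parameters from this section:
# token = fetch_master_token(build_token_request(
#     "keycloak.example.org", "folio-backend-admin-client", "SecretPhrase"))
```

Keeping request construction separate from sending makes it possible to inspect the URL and form body before anything goes over the network.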
Register Applications Descriptors:
REST API Docs for "POST /applications" endpoint.
We need to register the application descriptor in the Eureka instance. The application descriptor is created from the github.com/folio-org/app-platform-full/sprint-quesnelia/app-platform-full.template.json file taken from the release branch.
Docs for the registerApplication REST API call: register a new application.
The descriptor is registered with a curl command and the related parameters:
Kong Gateway FQDN (HTTP header): kong.example.org
Auth token (HTTP header): 'Authorization: Bearer...'
Application Descriptor (HTTP request body): JSON data file
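As an alternative to curl, the same registration call can be sketched in Python. This is only an illustration under stated assumptions: the function name is hypothetical, the `https://` scheme is assumed, and the descriptor file is whatever JSON you rendered from the template mentioned above.

```python
import json
import urllib.request
from pathlib import Path


def build_register_application_request(kong_fqdn, token, descriptor_path):
    """Build a POST /applications request whose body is a descriptor file."""
    descriptor = Path(descriptor_path).read_bytes()
    json.loads(descriptor)  # fail early if the file is not valid JSON
    return urllib.request.Request(
        f"https://{kong_fqdn}/applications",  # assumption: Kong is served over HTTPS
        data=descriptor,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Usage (the file name is a placeholder):
# req = build_register_application_request(
#     "kong.example.org", master_token, "app-descriptor.json")
# with urllib.request.urlopen(req, timeout=60) as resp:
#     print(resp.status)
```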
Register Modules
REST API Docs for “GET /modules/discovery” endpoint.
Once the required application descriptors are registered in the instance, we proceed with the Module Discovery flow to register modules in the system.
Docs for the searchModuleDiscovery REST API call: retrieving module discovery information by CQL query and pagination parameters.
Module discovery is started with a curl command and the related parameters:
Kong Gateway FQDN (HTTP header): kong.example.org
Auth token (HTTP header): 'Authorization: Bearer...'
Module Discovery Info (HTTP request body): JSON data file
Deploy Backend Modules
Now we are ready to deploy backend modules to the Kubernetes namespace with the Eureka instance.
Helm charts for modules are taken from the GitHub repository folio-org/folio-helm-v2.
Variable values for the Helm charts are stored in the dedicated repository folder folio-org/pipelines-shared-library/resources/helm.
For example:
Create tenant
REST API Docs for “POST /tenants” endpoint.
At this point we are ready to create an application tenant in the Eureka instance.
First we need to take a look at the docs for the createTenant REST API call, which creates a new tenant.
Once we are sure about the required parameters, we send a POST HTTP request to create a new tenant.
In our example we create a tenant with the name “diku” and the description “magic happens here”:
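A request of this kind can be sketched in Python as follows. The function name is hypothetical and the `https://` scheme is an assumption; the `name` and `description` fields come from the example above, and any further fields should be checked against the createTenant docs.

```python
import json
import urllib.request


def build_create_tenant_request(kong_fqdn, token, name, description):
    """Build a POST /tenants request for a new tenant."""
    body = {"name": name, "description": description}
    return urllib.request.Request(
        f"https://{kong_fqdn}/tenants",  # assumption: Kong is served over HTTPS
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Usage:
# req = build_create_tenant_request(
#     "kong.example.org", master_token, "diku", "magic happens here")
# with urllib.request.urlopen(req, timeout=60) as resp:
#     tenant = json.load(resp)  # keep the reply; later calls need the tenant's identifier
```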
Set entitlement
REST API Docs for “POST /entitlements” endpoint.
Now that the application tenant is created, we can entitle registered applications to our tenant.
In other words, we enable the application(s) for the tenant.
As usual, we take a look at the docs for the create REST API call, which installs/enables an application for a tenant.
These docs describe the parameters to pass and the value returned.
The following example shows how to enable an application for our tenant:
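The entitlement call can be sketched in Python along these lines. Treat it as an assumption-laden illustration: the function name is hypothetical, and the body field names `tenantId` and `applications` are assumed here and should be verified against the REST API docs for “POST /entitlements” before use.

```python
import json
import urllib.request


def build_entitlement_request(kong_fqdn, token, tenant_id, application_ids):
    """Build a POST /entitlements request enabling applications for a tenant."""
    # Assumption: the body carries the tenant identifier and a list of
    # application identifiers under these field names.
    body = {"tenantId": tenant_id, "applications": list(application_ids)}
    return urllib.request.Request(
        f"https://{kong_fqdn}/entitlements",  # assumption: Kong is served over HTTPS
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Usage (identifiers are placeholders):
# req = build_entitlement_request(
#     "kong.example.org", master_token, tenant_id, [application_id])
# urllib.request.urlopen(req, timeout=600)
```

Entitlement can take a long time because every module of the application is enabled in turn, so a generous client timeout is a reasonable choice.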
Add User
REST API Docs for “POST /users-keycloak/users” endpoint.
At this stage we are ready to add the first user to the Eureka instance; this user will receive administrative privileges later.
Check the parameters in the docs for the createUser REST API call, which creates a new user.
Then use a curl command to run a POST HTTP request against the Eureka instance:
Set User Password
REST API Docs for “POST /authn/credentials” endpoint.
With our user created, we can assign them a secret password to use at login.
Look carefully through the docs for the createCredentials REST API call, which adds a new login to the system.
Create Role
REST API Docs for “POST /roles” endpoint.
We need to create a role to bundle the Eureka administrative capabilities for our admin user.
According to the docs for the createRole REST API call, which creates a new role, we need to run the following POST HTTP request:
Assign Capabilities to Role
REST API Docs for “POST /roles/capabilities” endpoint.
Then we attach the required Eureka application capabilities to our admin role.
Using the docs for the createRoleCapabilities REST API call, we create a new record associating one or more capabilities with the already created role.
To get a list of existing capabilities, we use the findCapabilities REST API call.
Add Roles to User
REST API Docs for “POST /roles/users” endpoint.
The last step in the sequence is assigning the admin role to the admin user, giving them the super power to rule the Eureka world.
According to the docs for the assignRolesToUser REST API call, which creates a record associating a role with a user, we should run a curl command like the following:
Deploy Edge modules
Render Ephemeral Properties
At this step we populate the Ephemeral Properties template file for every edge-* module found in the github.com/folio-org/platform-complete/snapshot/install.json file.
As a rendering example, we have a properties file that binds a module to a tenant and to its admin credentials with the respective capabilities.
Create config map for every edge-* module
The completed Ephemeral Properties files have to be stored in the cluster namespace as ConfigMaps:
Deploy edge-* modules to cluster namespace
At this point we deploy the set of edge-* modules (see the install.json file) to the cluster namespace:
Perform Consortia Deployment (if required)
REST API Docs for “POST /consortia” endpoint.
Set up a Consortia Deployment with the given tenants
Create the consortia deployment instance according to the docs for the consortia REST API call, which saves the consortium configuration.
Add Consortia Central Tenant
Add Consortia Institutional Tenant
Perform indexing on Eureka resources
There is a comprehensive piece of documentation on Search Indexing that we highly recommend walking through to understand the process in more detail.
Re-create search index for authority resource
Have a look at the existing docs for the Resource reindex REST API call to initiate a reindex of the authority records (/search/index/inventory/reindex endpoint).
It is possible to monitor the indexing process with the getReindexJob REST API call. To check how many records have been published to the Kafka topic, we can use the following command
where reindex_job_id is the ID returned by the /search/index/inventory/reindex endpoint in the previous step.
Indexing of instance resources
First check the related docs for the Full Inventory Records reindex REST API call to initiate the full reindex of the inventory instance records (/search/index/instance-records/reindex/full endpoint).
Configure Edge modules
Create Eureka Users for Eureka UI
UI modules expect the respective users to be created in the Eureka instance. Sufficient system capabilities have to be assigned to the UI users to provide the desired level of access.
To get an idea of how UI modules are mapped to Eureka accounts with the required capabilities, take a look at the folio-org/pipelines-shared-library/resources/edge/config_eureka.yaml file.
So we need to create extra Eureka accounts to be used by the UI modules. For example:
Create User Account:
Set Password for User:
Assign Capabilities to User Account:
Assign Capabilities Set to User Account
Build FOLIO Eureka UI
Get the Eureka UI source code from the GitHub repository folio-org/platform-complete at the snapshot branch.
Adjust the platform-complete/eureka-tpl/stripes.config.js configuration template file according to your existing values.
To build the Eureka UI (Stripes Platform), we run a couple of yarn commands
To build the Eureka UI (Stripes Platform) in a container image, it is enough to use the following github.com/folio-org/platform-complete/docker/Dockerfile
Once the Docker image with the Eureka UI is created, it should be pushed to a container image repository to be available for deployment to the Kubernetes cluster.
Deploy Eureka UI
Deploy 'ui-bundle' module
At this point we are free to deploy the ui-bundle module into the cluster namespace using the folio-helm-v2/platform-complete Helm chart.
The variable module properties can also be seen in github.com/folio-org/pipelines-shared-library/resources/helm/testing.yaml and are given in the values/ui-bundle-diku.yaml file.
Configure Eureka UI parameters
Get Tenant Realm Name from Keycloak
Put extra Tenant Configuration for Eureka UI (Stripes Platform)
Kong fine-tuning
You can customize Kong's default behavior using environment variables. When the application starts, it uses environment variables to configure both its embedded Nginx web server and Kong itself. To set Nginx parameters, use environment variables with the prefix KONG_NGINX_. For Kong-specific configuration, define variables with the prefix KONG_.
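The naming rule can be illustrated with a small Python helper that turns kong.conf-style keys into environment variable names, which can then be copied into the container's environment (for example in Helm values). The helper name and the values below are illustrative assumptions, not recommended settings.

```python
def to_kong_env(settings):
    """Map kong.conf-style keys to KONG_-prefixed environment variables."""
    # Kong reads any configuration property from an upper-cased environment
    # variable with the KONG_ prefix; injected Nginx directives use keys that
    # start with nginx_, which therefore end up with the KONG_NGINX_ prefix.
    return {f"KONG_{key.upper()}": str(value) for key, value in settings.items()}


example = to_kong_env({
    "nginx_http_keepalive_timeout": "75s",   # illustrative value
    "upstream_keepalive_idle_timeout": 60,   # illustrative value
})
```

With this input, `example` contains the keys KONG_NGINX_HTTP_KEEPALIVE_TIMEOUT and KONG_UPSTREAM_KEEPALIVE_IDLE_TIMEOUT.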
Additional resources:
Kong configuration files
How to work with Kong environment variables
Configuration variable list
Post-Deployment Tasks
Monitoring and logging
Scaling and updates
Troubleshooting and Common Issues
InternalServerErrorException error 500. Connection refused - lack of resources.
Preamble: Working with complex operations such as application tenant entitlement may pose challenges due to the need for all modules to be available, direct requests between platform modules, the loosely coupled nature of K8S, and the resulting temporary unavailability of some modules.
Issue: Various errors like the following may occur during these complex operations due to incomplete execution within the required timeframe or unavailability of modules:
Enabling application for tenant failed: [errors:[[message:Flow 'd62cbd2c-9261-47df-bffd-e6a13871c59f' finished with status: FAILED, type:FlowExecutionException, code:service_error, parameters:[[key:mod-<some_module_name>-folioModuleInstaller, value:FAILED: [IntegrationException] Failed to perform doPostTenant call, parameters: [{key: cause, value: 500: {"errors":[{"type":"InternalServerErrorException","code":"service_error","message":"Failed to proxy request","parameters":[{"key":"cause","value":"Connection refused: localhost/127.0.0.1:8081"}]}],"total_records":1}}]]
Cause: The issue could be caused by resource throttling or module unavailability. If the allocated CPU or RAM limit is reached, the time needed to perform these operations significantly increases and exceeds the expected time limit. In other cases, in a self-rebalancing K8S cluster, pods for some modules may be evicted and moved to other nodes, leading to the inaccessibility of core modules or modules that play significant roles in these complex operations processes. If a core module like Kong, Keycloak, mgr-* modules, or very important modules like mod-roles-keycloak and mod-users-keycloak are affected, it could lead to breaking the invocation chain.
Resolution: To address this issue, you can use one of the following approaches, or both together:
- Size the nodes to fit the total CPU and RAM requested by all modules.
- Set resource limits for each module so that the cluster does not move them during heavyweight operations.
InternalServerErrorException error 500. Connection refused - deployment timing.
Issue: During the entitlement process the following error message could appear:
Enabling application for tenant failed: [errors:[[message:Flow 'd62cbd2c-9261-47df-bffd-e6a13871c59f' finished with status: FAILED, type:FlowExecutionException, code:service_error, parameters:[[key:mod-<some_module_name>-folioModuleInstaller, value:FAILED: [IntegrationException] Failed to perform doPostTenant call, parameters: [{key: cause, value: 500: {"errors":[{"type":"InternalServerErrorException","code":"service_error","message":"Failed to proxy request","parameters":[{"key":"cause","value":"Connection refused: localhost/127.0.0.1:8081"}]}],"total_records":1}}]]
This error may appear even if enough resources are available for the environment, as described in the "InternalServerErrorException error 500. Connection refused - lack of resources." topic, and module availability has been ensured.
Cause: In some cases, this error occurs due to inappropriate deployment timing, especially when an automated deployment process is used. Even if enough resources have been provided, modules need time to become available after they start. Some heavyweight modules, such as mod-oa or mod-agreement, may need up to 5 minutes to start. Therefore, it is important to check module availability after deployment before starting any operations, such as entitlement of modules on the instance. Additionally, ensure the correct order of deployment: Kong and Keycloak first, then the mgr-* components, then the modules.
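An automated pipeline can guard against this by polling for readiness before it starts the entitlement. The following Python sketch shows one way to do that; the function name is hypothetical, and what the `check` callable actually probes (a health endpoint, a Kubernetes readiness status, and so on) is left as an assumption for your environment. The default timeout of 300 seconds mirrors the "up to 5 minutes" start time mentioned above.

```python
import time


def wait_until_ready(check, timeout_s=300.0, interval_s=10.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call check() repeatedly until it returns True or the timeout expires.

    Returns True if the check succeeded, False if the timeout was reached.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Usage (module_is_up is a placeholder for your own probe):
# if not wait_until_ready(lambda: module_is_up("mod-agreements")):
#     raise RuntimeError("module did not become ready in time")
```

The `sleep` and `clock` parameters exist only so the helper can be exercised without real waiting.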
The application is not entitled on tenant - sidecar vs Kafka
Issue: The error "The module is not entitled on tenant ..." may occur during certain operations, especially during the entitlement process. You can find the full log of this issue in the related module's sidecar.
Cause: This error happens due to communication issues between mgr-tenant-entitlement, which notifies the module about the end of the entitlement process via Kafka, and the corresponding module. In some cases, the sidecar consumer connection can be marked as dead. The main reason is that the Kafka heartbeat request and sidecar poll request settings are not aligned with each other, or that networking issues cause poll requests to be absent for a certain period. The "The module is not entitled on tenant ..." errors appear when the next portion of modules in the entitlement process sends requests to previously entitled modules whose sidecars did not receive the Kafka message about the completion of the related module's entitlement.
Resolution: If a sidecar loses its connection with Kafka during the entitlement process but the entitlement process itself has not been affected, simply restart the affected module; its sidecar will then get the entitlement information from the mgr-tenant-entitlement module. If the entitlement process fails, repeat it. In any case, ensure a stable connection between Kafka and the sidecars, and align the Kafka heartbeat request and sidecar poll request periods.
The application is not entitled on tenant - sidecar vs mod-tenant-entitlement
Issue: The error "The module is not entitled on tenant ..." may occur during certain operations.
Cause: This error can happen when the mgr-tenant-entitlement pod and some module pods are redeployed simultaneously and the module's sidecar becomes ready before mgr-tenant-entitlement. The sidecar is then unable to obtain information about the application entitlement from the MTE module and may return an error response to other modules upon request.
Resolution: To fix this issue, ensure the correct module redeployment order. If the issue occurs unexpectedly, simply restart the affected module.
The upstream server is timing out - Kong fine-tuning
Issue: Some API requests may result in a 504 error code with the error message "upstream server is timing out". This problem is primarily caused by Kong and typically occurs during long operations, such as assigning a capability to a role or user.
Cause: When a request reaches a specific module, it always goes through Kong, which has two potential points of failure: Kong's Nginx and Kong itself.
Resolution: To address this issue, you should adjust the upstream timeout of Kong's Nginx using the KONG_NGINX_HTTP_KEEPALIVE_TIMEOUT, KONG_NGINX_UPSTREAM_KEEPALIVE, and KONG_NGINX_HTTP_KEEPALIVE_REQUESTS environment variables. Additionally, consider modifying the following Kong variables: KONG_UPSTREAM_KEEPALIVE_IDLE_TIMEOUT, KONG_UPSTREAM_KEEPALIVE_POOL_SIZE, KONG_UPSTREAM_KEEPALIVE_MAX_REQUESTS, KONG_UPSTREAM_CONNECT_TIMEOUT, and KONG_RETRIES.
More information about how to perform that.
Some capabilities/capability sets are absent - Kafka messages processing period
Issue: If you try to assign capabilities or capability sets to a role or user immediately after the entitlement process, you may find that some of them are missing.
Cause: The predefined capabilities (also known as permissions) and capability sets are created just after the application entitlement process. This process takes time to complete. Here's how it works:
The mgr-tenant-entitlement module sends messages with a list of roles to mod-roles-keycloak via Kafka. This continues throughout the entitlement process as each module is enabled on a tenant.
mod-roles-keycloak starts processing the messages only after it has itself been entitled, which may be at the end of the entitlement process.
mod-roles-keycloak proceeds through the message queue in Kafka until it reaches the end.
Resolution: Before starting to assign capabilities or capability sets, it is important to check the Kafka module consumer message queue offset to ensure that the process has been completed. Alternatively, you should determine the appropriate amount of time to allow for the process to finish based on the performance of your environment.
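The offset check boils down to comparing, per partition, the latest offset in the topic with the offset the consumer group has committed. The Python sketch below shows only that arithmetic; the function name is hypothetical, and obtaining the offsets (for example from the CURRENT-OFFSET and LOG-END-OFFSET columns printed by Kafka's kafka-consumer-groups.sh --describe) is assumed to happen elsewhere.

```python
def total_lag(end_offsets, committed_offsets):
    """Return the number of messages the consumer group still has to process.

    Both arguments map a partition number to an offset. A partition with no
    committed offset is treated as not yet consumed at all.
    """
    lag = 0
    for partition, end in end_offsets.items():
        committed = committed_offsets.get(partition, 0)
        lag += max(end - committed, 0)
    return lag


# Usage: capabilities can be assigned once the lag has dropped to zero.
# ready = total_lag(end_offsets, committed_offsets) == 0
```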