Deployment Environments (general overview)
Overview
FOLIO is a system built on microservices and designed for a multi-tenant cloud environment. That said, some institutions will choose to deploy FOLIO on premise, and the community can expect a wide range of deployment environments and mechanisms. This page describes some of the issues that need to be understood and the choices around them. Where possible, it also identifies what are considered best practices.
FOLIO Platform Architectural Diagram
Hosting Environment Choices
Generally, hosting environments will include some set of the following alternatives:
Public Cloud vs On Premise
Many institutions continue to maintain on premise computing infrastructure while also managing cloud resources. Institutions will have to decide where and how to deploy FOLIO, and the public cloud vs on premise choice will define the available alternatives for technology and tooling. A site that deploys in AWS has tools and technologies that are not available on premise. Some organizations are planning an AWS deployment, and they will have further choices to make, such as whether or not to use native (proprietary) AWS technologies like RDS.
For some, "cloud" means a hosted data center or colocation facility: basically, anything that is not on premise.
A site that chooses an on premise deployment also has many options and choices to make, although most of them will likely be decided by its current systems, tools, and processes.
Virtual vs Physical
Many workloads have been virtualized, and sites are comfortable managing virtual environments. VMware is very popular, but other choices exist. Some workloads, however, are better served on physical machines for various reasons.
Note that the Virtual vs Physical choice need not be dictated by the Cloud vs On Premise choice: you can run VMware in AWS, for example.
Database Considerations
FOLIO has been designed and developed with PostgreSQL as the default database engine, although using Postgres isn't mandatory. Postgres is not the most widely adopted database; it is less common than products such as Oracle, SQL Server or even MySQL. For this reason, sites may consider which database they want to use under the hood of FOLIO. However, replacing Postgres is not a task to be taken lightly.
Orchestration Tools
Orchestration tools automatically deploy, scale and manage containerized applications.
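As a rough illustration of what "automatically deploy, scale and manage" means in practice, the sketch below compares a desired state with the running state and lists the corrective actions. Kubernetes and Docker Swarm both work from a declared desired state, but this is a simplified model, not the algorithm of any particular tool. The service names, replica counts and the `reconcile` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str      # "start" or "stop"
    service: str   # name of the containerized service
    count: int     # how many containers to start or stop


def reconcile(desired: dict[str, int], running: dict[str, int]) -> list[Action]:
    """Compare desired replica counts with running ones and list the fixes."""
    actions = []
    for service in sorted(set(desired) | set(running)):
        want = desired.get(service, 0)
        have = running.get(service, 0)
        if want > have:
            actions.append(Action("start", service, want - have))
        elif have > want:
            actions.append(Action("stop", service, have - want))
    return actions


if __name__ == "__main__":
    # Hypothetical services and counts, for illustration only.
    desired = {"service-a": 2, "service-b": 3}
    running = {"service-a": 2, "service-b": 1, "service-old": 1}
    for action in reconcile(desired, running):
        print(action)
```

An orchestrator repeats a comparison of this kind continuously, so a crashed container or a changed replica count leads to a corrective action without operator intervention.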
Orchestration Tools: On-Premise
Kubernetes
Kubernetes is an open source orchestration tool. It groups the containers that make up an application into logical units for easy management and discovery. These groups are called Pods.
Kubernetes is complex to install and maintain. It is designed for medium to large deployments. Some might consider it too heavyweight for a single library, but it might be an appropriate tool for a consortium hosting many tenants in a computing center.
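To make the idea of a Pod concrete, the following sketch builds a minimal Pod manifest that groups two containers and prints it as JSON, a format `kubectl` accepts alongside YAML. The Pod name, container names and image references are hypothetical placeholders, and a real deployment would normally wrap Pods in a higher-level object such as a Deployment.

```python
import json


def pod_manifest(name: str, containers: dict[str, str]) -> dict:
    """Build a minimal Kubernetes Pod manifest.

    `containers` maps a container name to its image reference.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            "containers": [
                {"name": cname, "image": image}
                for cname, image in containers.items()
            ]
        },
    }


if __name__ == "__main__":
    # Hypothetical names and images, for illustration only.
    manifest = pod_manifest(
        "example-module",
        {
            "app": "example.org/example-module:1.0.0",
            "log-forwarder": "example.org/log-forwarder:1.0.0",
        },
    )
    print(json.dumps(manifest, indent=2))
```

Both containers in the manifest are scheduled together on the same node and share the Pod's network namespace, which is what makes the Pod the unit of management.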
Docker Swarm
Docker Swarm was released in July 2016, initially shipped as part of Docker Engine version 1.12. It is free to use in the Docker Community Edition, and commercial support is available as part of the Docker Enterprise Edition offered by Docker, Inc.
Docker Swarm is relatively straightforward to set up and may be the best option if simplicity is a primary requirement.
Docker Swarm services are defined by Docker Compose files. A compose file brings up a group of containers on a single machine and can also be run across many machines. A compose file is specified in YAML.
Production swarm deployment is typically done across multiple physical or virtual machines or cloud instances [1]. The tutorial at https://docs.docker.com/engine/swarm/swarm-tutorial/ requires three Linux hosts that communicate over a network. In a production environment, it is only meaningful to run a swarm across multiple Docker nodes. A Docker node is an instance of the Docker engine.
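The sketch below generates a small Compose file for a single replicated service. It is written as JSON, which is a subset of YAML, so it should be readable wherever a YAML Compose file is expected; that assumption is worth verifying against the Docker version in use. The service name, image and port are hypothetical. The `deploy` section is the part that a swarm uses to spread replicas across nodes.

```python
import json


def stack_file(service: str, image: str, replicas: int, port: int) -> dict:
    """Build a Compose (version 3) structure for one replicated service."""
    return {
        "version": "3.8",
        "services": {
            service: {
                "image": image,
                "ports": [f"{port}:{port}"],
                # The deploy section is used when the file is deployed to a swarm.
                "deploy": {
                    "replicas": replicas,
                    "restart_policy": {"condition": "on-failure"},
                },
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical service, image and port, for illustration only.
    compose = stack_file("example-module", "example.org/example-module:1.0.0", 3, 8081)
    print(json.dumps(compose, indent=2))
```

Saved to a file, output like this would typically be passed to `docker stack deploy` on a swarm manager node, which then schedules the three replicas across the available nodes.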
Apache Mesos
Apache Mesos is designed for data center management and for installing complex applications such as Kubernetes on top of data center resources.
The Apache Mesos kernel runs on every machine and provides resources to applications that run on top of it, such as Hadoop, Spark, Kafka, Elasticsearch and Kubernetes.
One can run containers directly in Mesos, but using an application like Kubernetes on top of Mesos provides better workflows.
Apache Mesos is best suited for data centers where multiple complicated applications will need to be set up and configured.
The framework for orchestrating Docker containers in Mesos is called Marathon.
SaltStack
SaltStack is open source software for the automatic configuration of server systems. SaltStack runs a master process on a central server. The master sends configuration orders to the servers it administers, called Minions. The system administrator defines the configuration of those servers in so-called SLS (SaLt State) files, which are written in YAML. Clients and groups of clients can be selected in Salt by so-called Grains, which describe properties of each Minion. Sensitive information is stored on the master in so-called Pillars.
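SLS files are usually YAML, as described above. To stay within one language, the sketch below instead uses Salt's pure-Python renderer (selected by the `#!py` first line), in which a `run()` function returns the same data structure a YAML state file would describe. It assumes Salt exposes `__grains__` and `__pillar__` as module-level names during rendering. The state IDs, package names, file path and the `db_password` pillar key are hypothetical.

```python
#!py
# Sketch of a Salt state (SLS) file using Salt's pure-Python renderer.


def run():
    """Return state data; Salt calls run() when rendering a '#!py' SLS file."""
    # Salt is assumed to inject __grains__ and __pillar__ when rendering.
    # The fallbacks exist only so this sketch also runs outside Salt.
    grains = globals().get("__grains__", {"os_family": "Debian"})
    pillar = globals().get("__pillar__", {})

    # Grains describe the Minion; here one is used to pick a package name.
    # Both package names are hypothetical examples.
    if grains.get("os_family") == "Debian":
        package = "postgresql-client"
    else:
        package = "postgresql"

    return {
        # State ID -> state function -> list of arguments.
        "db_client": {"pkg.installed": [{"name": package}]},
        "/etc/example/app.conf": {
            "file.managed": [
                # Sensitive values come from a Pillar, not from the state file.
                {"contents": "db_password=" + pillar.get("db_password", "")},
                {"mode": "0600"},
            ]
        },
    }


if __name__ == "__main__":
    import json
    print(json.dumps(run(), indent=2))
```

The example shows the division of labour: the state describes what should be true on a Minion, a Grain adapts it to the Minion's properties, and a Pillar supplies the secret.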
Orchestrate Runner
Orchestration Tools: Public Cloud
Note that all tools that are available as on-premise are also available in public cloud environments, since one can use the public cloud as a blank canvas of raw computing.
Public Cloud Solutions
Public Cloud providers will vary in their products and what technologies they support in their proprietary offerings. As mentioned above, it can be assumed that you could treat public cloud as "bare metal", but more likely a site will take advantage of packaged capabilities that improve the value proposition for using public cloud infrastructure. This packaging along with proprietary toolsets and capabilities will differentiate public cloud providers from each other as well as from on-premise capabilities.
Amazon Web Services (AWS)
AWS is the public cloud services giant. It supports many technologies and has perhaps the largest number of proprietary services. The list below outlines some of the choices a site would have within AWS.
- Container Orchestration:
  - AWS offers a proprietary container orchestration solution called ECS (Elastic Container Service), which is not based on Kubernetes, Swarm or Mesos.
  - It also offers a separate managed Kubernetes service called Amazon EKS.
  - As mentioned above, sites can choose to use any container orchestration solution if they want to configure and run it themselves within AWS.
- Database:
  - PostgreSQL on either Aurora or RDS
  - Postgres on AWS EC2 instances (essentially running it standalone)
- Compute:
- Networking:
  - VPC with public and private subnets
  - Application Load Balancers (ALB) routing traffic to ECS services
  - Route 53 for DNS
- Frontend:
  - Tenant-specific Webpack bundles hosted on S3
  - Distributed via CloudFront
- Logging/Monitoring:
  - CloudWatch, S3 + Glacier (archival)
IBM Software
Microsoft Azure
Google Cloud Platform
Google Cloud Platform (GCP) offers a managed container solution called GKE (Google Kubernetes Engine). It is based on Kubernetes.
OpenStack
OpenStack is an open source cloud computing platform. It is written in Python. It has a large number of components and is known to be very complex. The component for Docker orchestration in OpenStack is called Magnum.
OpenShift
OpenShift is the open source "enterprise Kubernetes platform" by Red Hat. It is built on top of Kubernetes running on RHEL. It manages applications in production hybrid-cloud environments. Red Hat OpenShift Container Platform is a validated and certified Kubernetes solution. It can be run on-premise or in a public cloud.
Continuous Delivery
Continuous delivery solutions help avoid downtime during upgrades and patch installation. In a microservice environment, replacing a container can be a frequent task. Continuous delivery helps you manage such repeatable deployments without downtime.
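One common way to replace containers without downtime is a rolling update, in which instances are replaced in small batches while the rest keep serving. The sketch below simulates that idea only. It is a simplified model under the assumption that a batch is out of service while it is replaced, and the `rolling_update` function and version labels are hypothetical.

```python
def rolling_update(fleet: list[str], new_version: str, batch_size: int = 1):
    """Replace instances batch by batch.

    Returns one (serving, fleet) tuple per batch, where `serving` is the
    number of instances still taking traffic while that batch was replaced.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    fleet = list(fleet)  # work on a copy so the caller's list is unchanged
    steps = []
    for start in range(0, len(fleet), batch_size):
        batch = range(start, min(start + batch_size, len(fleet)))
        serving = len(fleet) - len(batch)
        for i in batch:
            fleet[i] = new_version
        steps.append((serving, list(fleet)))
    return steps


if __name__ == "__main__":
    # Hypothetical version labels, for illustration only.
    for serving, fleet in rolling_update(["v1", "v1", "v1"], "v2"):
        print(f"serving during batch: {serving}, fleet after batch: {fleet}")
```

As long as the batch size is smaller than the fleet, at least one instance is serving at every step, which is the property that lets an upgrade proceed without downtime.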
Spinnaker
Spinnaker is an open source, multi-cloud continuous delivery platform. It helps you release software changes with high velocity and confidence.
It does so by using Pipelines as its deployment management construct. A pipeline is a sequence of stages. A stage can be a function that manipulates the infrastructure, such as deploying, resizing (scaling up or down), cloning or disabling a cluster. A Cluster in Spinnaker is a logical grouping of server groups. A server group identifies the deployable artifact (VM image, Docker image etc.) in the cloud and a basic configuration setting. When deployed, a server group is a collection of instances of the running software (VM instances, Kubernetes pods etc.).
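The relationship between pipelines, stages, clusters and server groups can be sketched as a small in-memory model. This is a conceptual illustration only and does not use or mirror Spinnaker's actual API. All class and function names, artifact references and instance counts are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ServerGroup:
    artifact: str        # deployable artifact, e.g. a Docker or VM image
    instances: int = 0   # number of running instances
    enabled: bool = True # whether the group receives traffic


@dataclass
class Cluster:
    name: str
    server_groups: list[ServerGroup] = field(default_factory=list)


# A stage is modelled as a function that manipulates a cluster.
Stage = Callable[[Cluster], None]


def deploy(artifact: str, instances: int) -> Stage:
    def stage(cluster: Cluster) -> None:
        cluster.server_groups.append(ServerGroup(artifact, instances))
    return stage


def resize(instances: int) -> Stage:
    def stage(cluster: Cluster) -> None:
        cluster.server_groups[-1].instances = instances
    return stage


def disable_previous() -> Stage:
    def stage(cluster: Cluster) -> None:
        for group in cluster.server_groups[:-1]:
            group.enabled = False
    return stage


def run_pipeline(cluster: Cluster, stages: list[Stage]) -> None:
    """A pipeline is a sequence of stages, applied in order."""
    for stage in stages:
        stage(cluster)


if __name__ == "__main__":
    cluster = Cluster("example", [ServerGroup("example-image:1.0", 3)])
    run_pipeline(cluster, [deploy("example-image:1.1", 1), resize(3), disable_previous()])
    for group in cluster.server_groups:
        print(group)
```

Running the example leaves two server groups in the cluster: the old one disabled and the new one scaled to three instances, which is the shape of a typical deploy-then-cut-over pipeline.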
Spinnaker is installed on a Kubernetes cluster or, for smaller deployments, on a single machine using Debian packages. Spinnaker uses Halyard, which can be installed on Ubuntu, Debian or macOS (virtual) machines or in a Docker container. Spinnaker is designed to be built on top of a cloud provider (AWS, Azure, DC/OS, Google Cloud, Kubernetes cluster, OpenStack). Spinnaker deploys your applications via an account enabled by a cloud provider.
Reference Deployment Environments
Below are descriptions of reference deployments:
FOLIO-Stable
- Container Orchestration:
- Database:
- Compute:
- Networking:
- Frontend:
- Logging/Monitoring:
FOLIO-Performance
- Container Orchestration:
- Database:
- Compute:
- Networking:
- Frontend:
- Logging/Monitoring:
Sources:
- https://codefresh.io/kubernetes-tutorial/kubernetes-vs-docker-swarm-vs-apache-mesos/
- 1. https://docs.docker.com/engine/swarm/swarm-tutorial/