Overview
FOLIO is a system built on microservices and designed for a multi-tenant cloud environment. Some institutions will nevertheless choose to deploy FOLIO on premise, and the community can expect a wide range of deployment environments and mechanisms. This page describes some of the issues that need to be understood and the choices around those issues. Where possible, it also labels what are felt to be best practices.
NOTE: Changes in progress as of April 26, 2018.
Hosting Environment Choices
Generally, hosting environments will include some set of the following alternatives:
Public Cloud vs On Premise
Many institutions continue to maintain on premise computing infrastructure while also managing cloud resources. Institutions will have to decide where and how to deploy FOLIO, and the public cloud vs on premise choice will determine which technologies and tools are available. A site that chooses to deploy in AWS, for example, has tools and technology that are not available on premise. Some organizations are known to be planning an AWS deployment, and they will also have choices to make, such as whether or not to use native (proprietary) AWS technologies like RDS.
For some, "cloud" means a hosted data center or colocation facility: basically, anything that is not on premise.
A site that chooses an on premise deployment has many options and choices to make, although most of them will likely be decided by its current systems, tools, and processes.
Virtual vs Physical
Many workloads have been virtualized, and sites are comfortable with managing virtual environments. VMware is very popular, but other choices exist. Some workloads, however, are better served by physical machines for various reasons.
Note the Virtual vs Physical choice need not be dictated by the Cloud vs On Premise choice - you can run VMware in AWS, for example.
Database Considerations
FOLIO has been designed and developed with PostgreSQL as the default database engine, although using PostgreSQL isn't mandatory. PostgreSQL is not the most widely adopted database; it is less common than products such as Oracle, SQL Server or even MySQL. For this reason, sites will have to consider which database they want to use under the hood of FOLIO.
Orchestration Tools
Orchestration tools automatically deploy, scale and manage containerized applications.
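As a rough illustration of what "automatically deploy, scale and manage" means in practice, the toy sketch below compares a desired number of containers per service with the number currently running and reports what would have to be started or stopped. This is not how any particular tool is implemented; the function name and the service names are assumptions chosen for illustration.

```python
from typing import Dict


def reconcile(desired: Dict[str, int], running: Dict[str, int]) -> Dict[str, int]:
    """Return the change in container count needed per service.

    A positive number means "start this many containers", a negative
    number means "stop this many". Services already at the desired
    count are omitted from the result.
    """
    actions = {}
    for service in sorted(set(desired) | set(running)):
        delta = desired.get(service, 0) - running.get(service, 0)
        if delta != 0:
            actions[service] = delta
    return actions


if __name__ == "__main__":
    # Hypothetical service names and counts, for illustration only.
    desired = {"okapi": 2, "mod-users": 3}
    running = {"okapi": 2, "mod-users": 1, "mod-old": 1}
    print(reconcile(desired, running))  # {'mod-old': -1, 'mod-users': 2}
```

A real orchestrator repeats this kind of comparison continuously and also handles placement, health checks and networking, which the sketch leaves out.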
Orchestration Tools: On-Premise
Kubernetes
Kubernetes is an orchestration tool. It is open source. It groups containers that make up an application into logical units for easy management and discovery. The groups are called Pods.
Kubernetes is very complex to install and maintain. It is designed for medium to large deployments. We consider it too heavyweight for a single library, but it might be an appropriate tool for a consortium hosting many tenants in a computer center.
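To make the Pod grouping described above more concrete, here is a minimal sketch that builds a Pod manifest as a Python dictionary and prints it as JSON, a format that `kubectl apply -f` accepts alongside YAML. The Pod name, container names and image references are hypothetical placeholders, not actual FOLIO artifacts.

```python
import json


def pod_manifest(name, containers):
    """Build a Kubernetes Pod manifest that groups the given containers.

    `containers` maps a container name to its image reference.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            "containers": [
                {"name": cname, "image": image}
                for cname, image in containers.items()
            ]
        },
    }


if __name__ == "__main__":
    # Hypothetical names and images, for illustration only.
    manifest = pod_manifest(
        "example-module",
        {
            "module": "example.org/folio-module:1.0",
            "log-shipper": "example.org/log-shipper:1.0",
        },
    )
    print(json.dumps(manifest, indent=2))
```

In practice Pods are rarely created directly; higher-level objects that manage Pods are more common, which is part of the complexity noted above.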
Docker Swarm
Docker Swarm was released in July 2016 and initially shipped as part of Docker Engine version 1.12. It is free to use in the Docker Community Edition, and commercial support is available as part of the Docker Enterprise Edition offered by Docker, Inc.
Docker Swarm is simple to set up, which makes it a great option if simplicity is a requirement.
Docker Swarm services are defined by Docker Compose files. A compose file brings up a group of containers on a single machine and can also be run across many machines. A compose file is specified in YAML.
Production swarm deployment is typically done across multiple physical or virtual machines or cloud instances [1]. The tutorial at https://docs.docker.com/engine/swarm/swarm-tutorial/ requires three Linux hosts that communicate over a network. In a production environment, it is only meaningful to run a swarm across multiple Docker nodes. A Docker node is an instance of the Docker engine.
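The Compose file mentioned above can be sketched as follows. The example builds a version 3 Compose document in Python, with a `deploy.replicas` setting per service, and prints it as JSON. Since YAML is a superset of JSON, the output should be usable wherever a Compose file is expected, although this has not been verified against a FOLIO deployment. The service names and image references are hypothetical.

```python
import json


def compose_document(services):
    """Build a Compose-format document for the given services.

    `services` maps a service name to an (image, replicas) tuple.
    """
    return {
        "version": "3",
        "services": {
            name: {"image": image, "deploy": {"replicas": replicas}}
            for name, (image, replicas) in services.items()
        },
    }


if __name__ == "__main__":
    # Hypothetical services and images, for illustration only.
    doc = compose_document(
        {
            "gateway": ("example.org/gateway:1.0", 2),
            "module": ("example.org/folio-module:1.0", 3),
        }
    )
    print(json.dumps(doc, indent=2))
```

Saved to a file, a document like this would be handed to the swarm with `docker stack deploy --compose-file <file> <stack-name>`, run on a manager node.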
Apache Mesos
Apache Mesos is designed for data center management and for installing complex applications such as Kubernetes on top of data center resources.
The Apache Mesos kernel runs on every machine and provides resources to applications that run on top of it, such as Hadoop, Spark, Kafka, Elasticsearch and Kubernetes.
One can run containers directly in Mesos, but using an application like Kubernetes on top of Mesos provides better workflows.
Apache Mesos is best suited for data centers where multiple complicated applications need to be set up and configured.
The framework for orchestrating Docker containers in Mesos is called Marathon.
SaltStack
Orchestrate Runner
Orchestration Tools: Public Cloud
Amazon Web Services (AWS)
AWS is the cloud services giant. It offers a container orchestration solution, Amazon ECS (Elastic Container Service), which is not based on Kubernetes, Swarm or Mesos.
One possible approach
- Database:
  - PostgreSQL on either Aurora or RDS
- Compute:
- Networking:
  - VPC w/ public and private subnets
  - Application Load Balancers (ALB) routing traffic to ECS services
  - Route 53 for DNS
- Frontend:
  - Tenant-specific Webpack bundles hosted on S3
  - Distributed via CloudFront
- Logging/Monitoring:
  - CloudWatch, S3 + Glacier (archival)
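One way to picture the front-end part of this approach is a per-tenant key prefix in the S3 bucket, with the CDN domain in front of it. The sketch below is only an assumption about how such a layout could look; the `tenants/<id>/` prefix, the function names and the domain are hypothetical and are not prescribed by FOLIO or AWS.

```python
def bundle_key(tenant_id, asset="index.html"):
    """Return the S3 object key for one asset of a tenant's bundle.

    Assumes a hypothetical layout of `tenants/<tenant_id>/<asset>`.
    """
    if not tenant_id or "/" in tenant_id:
        raise ValueError("tenant_id must be a non-empty name without '/'")
    return f"tenants/{tenant_id}/{asset}"


def bundle_url(cdn_domain, tenant_id, asset="index.html"):
    """Return the public URL of a tenant asset behind the CDN domain."""
    return f"https://{cdn_domain}/{bundle_key(tenant_id, asset)}"


if __name__ == "__main__":
    # Hypothetical domain and tenant name, for illustration only.
    print(bundle_url("cdn.example.org", "library-a"))
    print(bundle_url("cdn.example.org", "library-a", "bundle.js"))
```

Keeping each tenant under its own prefix means a tenant's bundle can be rebuilt and uploaded without touching the others.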
IBM SoftLayer
Microsoft Azure
Google Compute Engine (GCE)
Alongside GCE, Google offers a managed container solution called Google Kubernetes Engine (GKE). It is based on Kubernetes.
OpenStack
OpenStack is an open source cloud computing platform. It is written in Python. It has a large number of components and is known to be very complex. The component for Docker orchestration in OpenStack is called Magnum.
Sources:
- https://codefresh.io/kubernetes-tutorial/kubernetes-vs-docker-swarm-vs-apache-mesos/
- [1] https://docs.docker.com/engine/swarm/swarm-tutorial/