Texas A&M University Libraries
Current production-esque environment
Kubernetes node specs:
4 core CPU
16GB memory
40GB drive per node
Provisioned on vSphere infrastructure
Database configuration:
Crunchy-Postgres Kubernetes StatefulSet
Deployed via Helm chart in Rancher 2.1
Postgres volumes provisioned with a vSphere storage class via the vSphere cloud provider config in Kubernetes/Rancher 2.1 (see the StorageClass sketch after this list)
XFS file system
One primary and one replica for each Folio instance (running three instances of Folio)
max_connections set to 250
Max connection pool size of 10 for Postgres
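For illustration, a minimal sketch of a vSphere-backed StorageClass along these lines; the class name, datastore, and disk format are placeholders/assumptions, not our actual values:
    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: vsphere-xfs                  # placeholder class name
    provisioner: kubernetes.io/vsphere-volume
    parameters:
      diskformat: thin                   # assumed thin-provisioned VMDKs
      fstype: xfs                        # XFS file system on the Postgres volumes
      datastore: ExampleDatastore        # placeholder datastore name
A PersistentVolumeClaim that names this class then gets its volume carved out of vSphere storage for the primary and replica pods.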
Test cluster:
7 pre-provisioned Oracle Linux VMs on VMware infrastructure (4 Worker and 3 etcd/Control Plane nodes)
RancherOS cluster:
8 node-template-provisioned RancherOS VMs on VMware infrastructure (5 Worker and 3 etcd/Control Plane nodes)
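For illustration, the worker vs. etcd/control plane split maps to node roles like the following RKE-style cluster.yml sketch; the addresses and user are hypothetical, and in practice the roles were assigned through Rancher node registration and node templates rather than a hand-written file:
    nodes:
      - address: 10.0.0.11
        user: rancher
        role: [controlplane, etcd]
      - address: 10.0.0.12
        user: rancher
        role: [controlplane, etcd]
      - address: 10.0.0.13
        user: rancher
        role: [controlplane, etcd]
      - address: 10.0.0.21
        user: rancher
        role: [worker]
      # ...remaining worker nodes follow the same pattern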
Folio environment:
(Three instances of Folio hosted across two clusters)
RancherOS cluster hosting Folio Q4 2018
Test cluster hosting Folio Q3 2018 and Folio Q4 2018
12k users in one instance, over 120k users in another
Inventory records, loan types, address types, patron groups, etc. loaded
One instance hosting two tenants sharing a single DB (diku and tamu)
Preliminary Findings
Pod monitoring via Prometheus and Grafana, deployed via Helm chart in both clusters
Gathered a list of the worst-offender modules for cluster resource usage:
mod-agreements
mod-licenses
mod-permissions
Set resource reservations and limits for module Workloads to prevent runaway or failed clusters when upgrading, rescheduling, or during node downtime (see the sketch after this list)
Set batch sizes of 1 in the Workload when performing rolling upgrades or deployments
These limits have slowed Folio response times somewhat; I don't yet have a clear view of which modules should be given higher resource limits
Java is very memory-hungry
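A hedged sketch of what those Workload settings look like on a module Deployment; the module, image tag, and the CPU/memory numbers are illustrative assumptions, not the values we settled on:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: mod-permissions                          # example module only
    spec:
      replicas: 1
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxSurge: 1            # upgrade one pod at a time (batch size of 1)
          maxUnavailable: 0      # keep the old pod until the new one is ready
      selector:
        matchLabels:
          app: mod-permissions
      template:
        metadata:
          labels:
            app: mod-permissions
        spec:
          containers:
            - name: mod-permissions
              image: folioorg/mod-permissions:latest   # placeholder tag
              resources:
                requests:        # the "reservation" side in the Rancher UI
                  cpu: 250m
                  memory: 512Mi
                limits:          # hard cap, keeps one module from starving a node
                  cpu: "1"
                  memory: 1Gi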
UI user queries and resource utilization
Doing some user look-up queries in the Folio UI:
A query of all of the users in the UI put a 6 GHz load on the K8s cluster node hosting the DB:
htop on the node hosting the DB, after it has calmed down a little; the query is still taking a huge amount of resources:
Grafana's history of the event: you can see the pgset-0 pod in the list using 2.371 cores, even with a limit I set in the Rancher UI to consume a max of only 2 cores!
Data loading resource utilization
User load of 12k users, along with the needed reference data:
Loaded on an instance of Folio with the Okapi DB split from the rest of the modules' DB
The UI is MUCH more responsive during the load than before
12k users in all took about 20 minutes to load
Not dropping the index before doing the load; there's no way to really do that using mod-user-import, as far as I'm aware (a sketch of the import request is at the end of this section)
The total load event on my cluster. Take note of the network pressure, pod CPU usage, and pod memory usage graphs:
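For context, the user load goes through mod-user-import's POST /user-import endpoint; a minimal sketch of the request body, shown here in YAML form (the actual body is JSON, sent through Okapi with X-Okapi-Tenant/X-Okapi-Token headers, and the names and values below are made up):
    users:
      - username: jdoe                       # hypothetical user
        externalSystemId: jdoe@example.edu
        active: true
        patronGroup: staff
        personal:
          firstName: Jane
          lastName: Doe
          email: jdoe@example.edu
    totalRecords: 1
    deactivateMissingUsers: false
    updateOnlyPresentFields: false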