RANCHER-582 Investigate memory optimised instances usage for Rancher clusters
Difference between instance types
The table below compares the technical characteristics of the EC2 instance types that can be applied to our infrastructure.
It also shows the result of the investigation: I tested four instance types on our infrastructure to compare how many nodes are created for the test application at different scale-down utilization thresholds.
Currently in use |
m5ad.xlarge, m5.xlarge, m5a.xlarge, m5d.xlarge |
Class | Instance type | vCPU | Memory (GiB) | Network bandwidth (Gbps) | Storage (GB) | Processor | On-Demand price per instance per hour (USD) | Node count at scale-down-utilization-threshold=0.5 | Node count at scale-down-utilization-threshold=0.75 |
---|---|---|---|---|---|---|---|---|---|
m5 | m5ad.xlarge | 4 | 16 | 10 | 1 x 150 NVMe SSD | AMD EPYC 7000 | 0.206 | | |
m5 | m5.xlarge | 4 | 16 | 10 | EBS only | Intel Xeon® Platinum 8175M | 0.192 | 5 | 5 |
m5 | m5a.xlarge | 4 | 16 | 10 | EBS only | AMD EPYC 7000 | 0.172 | | |
m5 | m5d.xlarge | 4 | 16 | 10 | 1 x 150 NVMe SSD | Intel Xeon® Platinum 8175M | 0.226 | | |
r5 | r5.large | 2 | 16 | 10 | EBS only | Intel Xeon® Platinum 8000 (Skylake 8175M or Cascade Lake 8259CL) | 0.126 | 7 | 6 |
r5 | r5.xlarge | 4 | 32 | 10 | EBS only | Intel Xeon® Platinum 8000 (Skylake 8175M or Cascade Lake 8259CL) | 0.252 | | |
r5 | r5a.xlarge | 4 | 32 | 10 | EBS only | AMD EPYC 7000 (AMD EPYC 7571) | 0.226 | 3 | 3 |
r5 | r5ad.xlarge | 4 | 32 | 10 | 1 x 150 NVMe SSD | AMD EPYC 7000 (AMD EPYC 7571) | 0.262 | | |
r5 | r5d.xlarge | 4 | 32 | 10 | 1 x 150 NVMe SSD | Intel Xeon® Platinum 8000 (Skylake 8175M or Cascade Lake 8259CL) | 0.288 | | |
r6 | r6a.large | 2 | 16 | 12.5 | EBS only | AMD EPYC (AMD EPYC 7R13) | 0.113 | | |
r6 | r6a.xlarge | 4 | 32 | 12.5 | EBS only | AMD EPYC (AMD EPYC 7R13) | 0.226 | | |
r6 | r6i.xlarge | 4 | 32 | 12.5 | EBS only | Intel Xeon Scalable (Ice Lake 8375C) | 0.252 | 3 | 3 |
r6 | r6i.large | 2 | 16 | 12.5 | EBS only | Intel Xeon Scalable (Ice Lake 8375C) | 0.126 | | |
r6 | r6in.large | 2 | 16 | 25 | EBS only | Intel Xeon Scalable (Ice Lake 8375C) | 0.174 | | |
r6 | r6in.xlarge | 4 | 32 | 30 | EBS only | Intel Xeon Scalable (Ice Lake 8375C) | 0.348 | | |
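For context on the two node-count columns: assuming `scale-down-utilization-threshold` refers to the Kubernetes Cluster Autoscaler option of that name, a node becomes a candidate for removal when its requested CPU and memory, relative to its allocatable capacity, both fall below the threshold. The sketch below illustrates only that idea and is not the autoscaler's actual code. The `NodeUsage` type and the sample numbers are hypothetical, and the real autoscaler applies further checks (for example whether the pods can be moved to other nodes) that are omitted here.

```python
from dataclasses import dataclass


@dataclass
class NodeUsage:
    """Hypothetical per-node summary; all field names are illustrative."""
    cpu_requested: float        # sum of pod CPU requests, in cores
    cpu_allocatable: float      # allocatable CPU, in cores
    mem_requested_gib: float    # sum of pod memory requests, in GiB
    mem_allocatable_gib: float  # allocatable memory, in GiB


def utilization(node: NodeUsage) -> float:
    """Return the larger of the CPU and memory request ratios."""
    cpu = node.cpu_requested / node.cpu_allocatable
    mem = node.mem_requested_gib / node.mem_allocatable_gib
    return max(cpu, mem)


def is_scale_down_candidate(node: NodeUsage, threshold: float) -> bool:
    """A node is a candidate only if its utilization is below the threshold."""
    return utilization(node) < threshold


# Made-up example: 1 of 4 cores and 10 of 16 GiB requested (utilization 0.625).
node = NodeUsage(1.0, 4.0, 10.0, 16.0)
print(is_scale_down_candidate(node, threshold=0.5))   # False
print(is_scale_down_candidate(node, threshold=0.75))  # True
```

A higher threshold makes more nodes eligible for removal, which is why it can lower the node count, as it did for r5.large (7 nodes at 0.5, 6 nodes at 0.75).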
Features of the different instance classes
The information below is taken from the official AWS documentation: https://aws.amazon.com/ec2/instance-types/?nc1=h_ls
m5
- Up to 3.1 GHz Intel Xeon Scalable processor (Skylake 8175M or Cascade Lake 8259CL) with new Intel Advanced Vector Extension (AVX-512) instruction set
- Up to 25 Gbps network bandwidth using Enhanced Networking
- Instance storage offered via EBS or NVMe SSDs that are physically attached to the host server
m5a
- AMD EPYC 7000 series processors (AMD EPYC 7571) with an all core turbo clock speed of 2.5 GHz
- Up to 20 Gbps network bandwidth using Enhanced Networking
- Instance storage offered via EBS or NVMe SSDs that are physically attached to the host server
r5
- Up to 3.1 GHz Intel Xeon® Platinum 8000 series processors (Skylake 8175M or Cascade Lake 8259CL) with new Intel Advanced Vector Extension (AVX-512) instruction set
- With R5d instances, local NVMe-based SSDs are physically connected to the host server and provide block-level storage that is coupled to the lifetime of the R5 instance
r5a
- AMD EPYC 7000 series processors (AMD EPYC 7571) with an all core turbo clock speed of 2.5 GHz
- Up to 20 Gbps network bandwidth using Enhanced Networking
- Instance storage offered via EBS or NVMe SSDs that are physically attached to the host server
- With R5ad instances, local NVMe-based SSDs are physically connected to the host server and provide block-level storage that is coupled to the lifetime of the R5a instance
r6a
- Up to 3.6 GHz 3rd generation AMD EPYC processors (AMD EPYC 7R13)
- Up to 35% better compute price performance over R5a instances
- Up to 50 Gbps of networking speed
r6i
- Up to 3.5 GHz 3rd generation Intel Xeon Scalable processors (Ice Lake 8375C)
- Up to 15% better compute price performance over R5 instances
- Up to 20% higher memory bandwidth per vCPU compared to R5 instances
- Up to 50 Gbps of networking speed
- With R6id instances, up to 7.6 TB of local NVMe-based SSDs are physically connected to the host server and provide block-level storage that is coupled to the lifetime of the R6i instance
r6in
- Up to 3.5 GHz 3rd Generation Intel Xeon Scalable processors (Ice Lake 8375C)
- Up to 20% higher memory bandwidth per vCPU compared to R5n and R5dn instances
- Up to 200 Gbps of networking speed, which is up to 2x compared to R5n and R5dn instances
- Up to 80 Gbps of EBS bandwidth, which is up to 1.3x more than R5b instances
- With R6idn instances, up to 7.6 TB of local NVMe-based SSDs are physically connected to the host server and provide block-level storage that is coupled to the R6idn instance lifetime
Total price comparison for the tested instance types
Total price for 1 hour
The table below shows the total hourly price of running one test application on the tmp cluster for each instance type. The total for m5.xlarge is the baseline: it is what we are paying now.
Currently in use |
---|
m5.xlarge |
Instance type | On-Demand price per instance per hour (USD) | Total node count | Total price per hour (USD) |
---|---|---|---|
m5.xlarge | 0.192 | 5 | 0.96 |
r5.large | 0.126 | 7 | 0.882 |
r5a.xlarge | 0.226 | 3 | 0.678 |
r6i.xlarge | 0.252 | 3 | 0.756 |
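The totals in this table are the hourly On-Demand price multiplied by the node count. A minimal sketch of that arithmetic, with the values copied from the table (the dictionary and function names are illustrative only):

```python
# Hourly On-Demand price (USD) and node count per instance type,
# copied from the table above.
PRICE_PER_HOUR = {
    "m5.xlarge": 0.192,
    "r5.large": 0.126,
    "r5a.xlarge": 0.226,
    "r6i.xlarge": 0.252,
}
NODE_COUNT = {"m5.xlarge": 5, "r5.large": 7, "r5a.xlarge": 3, "r6i.xlarge": 3}


def total_hourly_price(instance_type: str) -> float:
    """Return the hourly price multiplied by the number of nodes."""
    total = PRICE_PER_HOUR[instance_type] * NODE_COUNT[instance_type]
    return round(total, 3)  # round away floating-point noise


for name in PRICE_PER_HOUR:
    print(f"{name}: {total_hourly_price(name)} USD per hour")
```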
Total discount
The table below shows the savings from running the test application on a different instance type. For r5a.xlarge we would pay about 30% less, because AMD instances are about 10% cheaper than their Intel equivalents and we need fewer nodes than with m5.xlarge.
Instance type | Savings compared with m5.xlarge |
---|---|
r5.large | ~8% |
r5a.xlarge | ~29% |
r6i.xlarge | ~21% |
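The savings follow directly from the total hourly prices: savings = 1 - (total price / m5.xlarge total price). The table gives rounded values; the sketch below recomputes them from the totals in the "Total price for 1 hour" table. The names `TOTAL_PRICE` and `savings_percent` are illustrative only.

```python
# Total hourly price (USD) of the test cluster per instance type,
# copied from the "Total price for 1 hour" table.
TOTAL_PRICE = {
    "m5.xlarge": 0.96,
    "r5.large": 0.882,
    "r5a.xlarge": 0.678,
    "r6i.xlarge": 0.756,
}
BASELINE = "m5.xlarge"  # the instance type currently in use


def savings_percent(instance_type: str, baseline: str = BASELINE) -> float:
    """Return the savings relative to the baseline, in percent."""
    return (1 - TOTAL_PRICE[instance_type] / TOTAL_PRICE[baseline]) * 100


for name in TOTAL_PRICE:
    if name != BASELINE:
        print(f"{name}: {savings_percent(name):.1f}% cheaper than {BASELINE}")
```

This gives roughly 8% for r5.large, 29% for r5a.xlarge and 21% for r6i.xlarge.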
Utilization comparison in Rancher
Files with screenshots of the CPU/RAM utilization shown in Rancher.
Calculate total cost for 3 years for different EC2 plans
Calculate total cost for 3 years for 1 instance
The table below shows the total cost of 1 instance over 3 years under the different EC2 payment options.
Instance type | On-Demand | Standard RI, no upfront | Standard RI, partial upfront | Standard RI, all upfront | Savings Plan, no upfront | Savings Plan, partial upfront | Savings Plan, all upfront |
---|---|---|---|---|---|---|---|
m5.xlarge | 4976.64 | 2181 | 2007.64 | 1897 | 2549.16 | 2365.2 | 2312.64 |
r5.large | 3265.92 | 1419 | 1319 | 1245 | 1734.48 | 1603.26 | 1576.8 |
r5a.xlarge | 5857.92 | 2575 | 2370.6 | 2233 | 3101.04 | 2864.44 | 2811.96 |
r6i.xlarge | 6531.84 | 3004 | 2781.68 | 2615 | 3454.56 | 3198.48 | 3134.68 |
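The On-Demand column is consistent with the hourly price multiplied by 25,920 hours, that is 36 months of 30 days each. That hour count is inferred from the numbers and is not stated in the source, so treat it as an assumption. A minimal sketch of the calculation (the function name is illustrative):

```python
# Assumption inferred from the table: 3 years = 36 months x 30 days x 24 hours.
HOURS_IN_THREE_YEARS = 36 * 30 * 24  # 25,920 hours


def on_demand_three_year_cost(price_per_hour: float) -> float:
    """Return the 3-year On-Demand cost of one instance in USD."""
    return round(price_per_hour * HOURS_IN_THREE_YEARS, 2)


# Hourly On-Demand prices (USD) copied from the comparison table.
for name, price in [
    ("m5.xlarge", 0.192),
    ("r5.large", 0.126),
    ("r5a.xlarge", 0.226),
    ("r6i.xlarge", 0.252),
]:
    print(f"{name}: {on_demand_three_year_cost(price)} USD")
```

The Reserved Instance and Savings Plan columns depend on AWS's published rates for each option and cannot be derived from the hourly On-Demand price alone.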
Calculate total cost for 3 years for the required number of instances
The node counts are taken from the investigation, in which we ran our test cluster on the different instance types.
Instance type | Node count |
---|---|
m5.xlarge | 5 |
r5.large | 7 |
r5a.xlarge | 3 |
r6i.xlarge | 3 |
The table below shows the total cost over 3 years of running the test application on the folio-tmp cluster under the different EC2 payment options.
Instance type | On-Demand | Standard RI, no upfront | Standard RI, partial upfront | Standard RI, all upfront | Savings Plan, no upfront | Savings Plan, partial upfront | Savings Plan, all upfront |
---|---|---|---|---|---|---|---|
m5.xlarge | 24883.2 | 10906 | 10038.2 | 9485 | 12745.8 | 11826 | 11563.2 |
r5.large | 22861.44 | 9934 | 9233 | 8715 | 12141.36 | 11222.82 | 11037.6 |
r5a.xlarge | 17573.76 | 7726 | 7111.8 | 6699 | 9303.12 | 8593.32 | 8435.88 |
r6i.xlarge | 19595.52 | 9013 | 8345.04 | 7845 | 10363.68 | 9595.44 | 9404.04 |
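Each cell in this table is the per-instance 3-year cost multiplied by the node count. The sketch below reproduces two of the columns (On-Demand and Standard RI with all upfront payment) and picks the cheapest instance type. The values are copied from the tables above; the names are illustrative only.

```python
# Three-year cost per instance in USD, copied from the per-instance table:
# (On-Demand, Standard RI with all upfront payment).
COST_PER_INSTANCE = {
    "m5.xlarge": (4976.64, 1897),
    "r5.large": (3265.92, 1245),
    "r5a.xlarge": (5857.92, 2233),
    "r6i.xlarge": (6531.84, 2615),
}
NODE_COUNT = {"m5.xlarge": 5, "r5.large": 7, "r5a.xlarge": 3, "r6i.xlarge": 3}


def cluster_cost(instance_type: str) -> tuple:
    """Return (On-Demand, RI all upfront) 3-year cost for the whole cluster."""
    on_demand, ri_all_upfront = COST_PER_INSTANCE[instance_type]
    nodes = NODE_COUNT[instance_type]
    return round(on_demand * nodes, 2), round(ri_all_upfront * nodes, 2)


# Pick the instance type with the lowest RI all-upfront cluster cost.
cheapest = min(COST_PER_INSTANCE, key=lambda t: cluster_cost(t)[1])
print(cheapest, cluster_cost(cheapest))
```

With these inputs r5a.xlarge is the cheapest option in both columns.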
Conclusion
The investigation shows that, to save money, we should move to the memory-optimized instance classes r5 or r6: they provide twice as much memory per vCPU (8 GiB instead of 4 GiB), which lets us run fewer EC2 instances than we do now.
Another recommendation is to use AMD processors, because they are about 10% cheaper than Intel. As for the number of CPU cores, I think we should use at least 4: although 2-core instances are cheaper than 4-core ones, we would need more EC2 instances to handle the load on our application, and it would run slightly slower than on 4 cores. Moreover, I found that raising the scale-down utilization threshold from 0.5 to 0.75 gives no visible advantage.
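One way to illustrate the memory argument is the hourly On-Demand price per GiB of memory. The sketch below derives it from the comparison table for the four tested types; the metric itself is not part of the original investigation, and the names are illustrative only.

```python
# (vCPU, memory in GiB, On-Demand USD per hour) copied from the comparison table.
INSTANCES = {
    "m5.xlarge": (4, 16, 0.192),
    "r5.large": (2, 16, 0.126),
    "r5a.xlarge": (4, 32, 0.226),
    "r6i.xlarge": (4, 32, 0.252),
}


def price_per_gib_hour(instance_type: str) -> float:
    """Return the hourly On-Demand price divided by memory in GiB."""
    _, memory_gib, price = INSTANCES[instance_type]
    return price / memory_gib


# List the instance types from cheapest to most expensive per GiB.
for name in sorted(INSTANCES, key=price_per_gib_hour):
    print(f"{name}: {price_per_gib_hour(name):.4f} USD per GiB-hour")
```

m5.xlarge comes out at 0.012 USD per GiB-hour, while the memory-optimized types are all below 0.008, with r5a.xlarge the lowest.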
Main files containing the results of the investigation: