cluster-autoscaler/proposals/scalability_tests.md
As part of Cluster Autoscaler's graduation to GA, we want to guarantee the scalability limits that Cluster Autoscaler supports. We declare that Cluster Autoscaler scales to 1000 nodes with 30 pods per node. This document defines what it means for CA to scale to 1000 nodes, describes the test scenarios and test setup used to measure CA's scalability, and outlines the performance measured at this scale.
Cluster Autoscaler scales up to a certain number of nodes if it stays responsive at that size, that is, if it performs scale-up and scale-down operations on the cluster within a reasonable time frame. If CA is not responsive, it can be killed by the liveness probe, or it can fail to provide or release computational resources in the cluster when needed. The result is a cluster that cannot handle additional workload, or higher cloud provider bills.
Cluster Autoscaler needs to be responsive from the user's perspective. This means that changes in cluster state have to be picked up by Cluster Autoscaler as soon as possible, so that its reaction time is short. To make that possible, every iteration (analyzing the cluster state to see whether the cluster size needs to change, then adjusting the cluster size accordingly) needs to finish relatively quickly. Thus, we set the upper bound for iteration duration to 30 seconds.
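The 30-second bound can be illustrated with a minimal Python sketch that times one iteration and compares it with the limit. This is not Cluster Autoscaler code: `timed_iteration`, `run_once`, and the injectable `clock` are hypothetical names introduced here for illustration only.

```python
import time
from typing import Callable, Tuple

MAX_ITERATION_SECONDS = 30.0  # upper bound for one iteration, as stated in this document


def timed_iteration(run_once: Callable[[], None],
                    clock: Callable[[], float] = time.monotonic) -> Tuple[float, bool]:
    """Run one iteration and return (duration_seconds, within_bound).

    run_once is a hypothetical stand-in for one analyze-and-adjust pass.
    clock is injectable so the function can be exercised without waiting.
    """
    start = clock()
    run_once()
    duration = clock() - start
    return duration, duration <= MAX_ITERATION_SECONDS
```

A monotonic clock is used so that wall-clock adjustments during a long iteration do not distort the measurement.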
Using Kubernetes and kubemark on GCP, we created the following 1000-node cluster setup:
We ran multiple test scenarios with a general setup targeting a load of ~1000 nodes and ~30 pods per node. During each test scenario we collected an iteration duration histogram.
[Scale-up] Scales up at all
[Scale-up] Scales up while handling previous load
[Scale-down] Scales down empty nodes
[Scale-down] Scales down underutilized nodes
[Scale-down] Doesn't scale down with underutilized but unremovable nodes
[Scale-up] Ignores unschedulable pods while continuing to schedule schedulable pods
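As a rough illustration of what collecting an iteration duration histogram for a scenario could look like, the Python sketch below sorts per-iteration durations into buckets. The bucket boundaries and the function name `duration_histogram` are assumptions made for this example; they are not the buckets used in the actual tests.

```python
from bisect import bisect_left
from typing import Dict, Iterable, Sequence

# Hypothetical bucket upper bounds in seconds; not taken from the original tests.
BUCKETS = (1.0, 2.0, 5.0, 10.0, 30.0)


def duration_histogram(durations: Iterable[float],
                       bounds: Sequence[float] = BUCKETS) -> Dict[str, int]:
    """Count durations per bucket; each bucket includes its upper bound."""
    counts = {f"<= {b:g}s": 0 for b in bounds}
    counts[f"> {bounds[-1]:g}s"] = 0  # overflow bucket for anything above the last bound
    labels = list(counts)
    for d in durations:
        # bisect_left returns the index of the first bound that is >= d
        counts[labels[bisect_left(bounds, d)]] += 1
    return counts
```

Any count in the overflow bucket would mean that an iteration exceeded the 30-second bound.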
Cluster Autoscaler in its GA version meets the expected results of all the listed test scenarios. Furthermore, the maximum measured iteration duration across all these tests is below 10s, which satisfies the initial requirement that iteration duration stay below 30s. Based on these tests, we conclude that Cluster Autoscaler in GA scales up to 1000 nodes with an average of 30 pods per node.
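The pass criterion used in this conclusion can be written down as a small Python sketch: take the largest iteration duration seen in any scenario and compare it with the bound. The function name `check_scenarios` and the input shape are hypothetical, and any durations passed to it in an example are illustrative values rather than measured results.

```python
from typing import Dict, List, Tuple

MAX_ITERATION_SECONDS = 30.0  # bound stated in this document


def check_scenarios(durations_by_scenario: Dict[str, List[float]],
                    bound: float = MAX_ITERATION_SECONDS) -> Tuple[float, bool]:
    """Return (largest duration across all scenarios, whether it is below the bound).

    Raises ValueError if no durations are supplied at all.
    """
    worst = max(d for ds in durations_by_scenario.values() for d in ds)
    return worst, worst < bound
```

With the measurements reported here, the first element would be under 10 seconds and the second would be `True`.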