DEPRECATED

This Helm chart has been moved to the kubernetes/autoscaler repository.

cluster-autoscaler

The cluster autoscaler scales worker nodes within an AWS autoscaling group (ASG) or Spotinst Elastigroup.

Cluster Autoscaler version: v1.17.1

TL;DR:

```console
$ helm install stable/cluster-autoscaler --name my-release --set "autoscalingGroups[0].name=your-asg-name,autoscalingGroups[0].maxSize=10,autoscalingGroups[0].minSize=1"
```

Introduction

This chart bootstraps a cluster-autoscaler deployment on a Kubernetes cluster using the Helm package manager.

Prerequisites

  • Kubernetes 1.8+

Older versions may work if you override the image, but the cluster-autoscaler internally simulates the scheduler, so bugs between mismatched versions may be subtle.

  • Azure AKS specific Prerequisites:
    • Kubernetes 1.10+ with RBAC enabled

Upgrading from <2.X

To upgrade to chart version 2.X from 1.X or 0.X, you must first delete the old Helm release.

```console
$ helm del --purge my-release
```

Once the old release is deleted, the new 2.X release can be installed using the standard instructions. Note that autoscaling will not occur during the time between deletion and installation.

Upgrading from 4.X to 5.X

To upgrade to chart version 5.X from <=4.X, you must first delete the old Helm release.

```console
$ helm del --purge my-release
```

Once the old release is deleted, the new 5.X release can be installed using the standard instructions. Note that autoscaling will not occur during the time between deletion and installation.

Installing the Chart

By default, no deployment is created and nothing will autoscale.

You must provide some minimal configuration, either to specify instance groups or enable auto-discovery. It is not recommended to do both.

Either:

  • set autoDiscovery.clusterName and tag your autoscaling groups appropriately (--cloud-provider=aws only) or
  • set at least one ASG as an element in the autoscalingGroups array with its three values: name, minSize and maxSize.

To install the chart with the release name my-release:

Using auto-discovery of tagged instance groups

AWS

Auto-discovery finds ASGs tagged as described below and automatically manages them based on the minimum and maximum sizes specified in the ASG. cloudProvider=aws only.

  1. tag the ASGs with keys to match .Values.autoDiscovery.tags, by default: k8s.io/cluster-autoscaler/enabled and k8s.io/cluster-autoscaler/<YOUR CLUSTER NAME>
  2. verify the IAM Permissions
  3. set autoDiscovery.clusterName=<YOUR CLUSTER NAME>
  4. set awsRegion=<YOUR AWS REGION>
  5. set awsAccessKeyID=<YOUR AWS KEY ID> and awsSecretAccessKey=<YOUR AWS SECRET KEY> if you want to use AWS credentials directly instead of an instance role
```console
$ helm install stable/cluster-autoscaler --name my-release --set autoDiscovery.clusterName=<CLUSTER NAME>
```
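Equivalently, steps 3-5 above can be expressed in a values file passed with `-f` (all values here are placeholders; the access keys are needed only when not using an instance role):

```yaml
# aws-values.yaml -- illustrative values for AWS auto-discovery
autoDiscovery:
  clusterName: your-cluster-name
awsRegion: us-east-1
# Only when using AWS user credentials instead of an instance role:
awsAccessKeyID: "your-key-id"
awsSecretAccessKey: "your-secret-key"
```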

The Auto-discovery section below provides more details and examples.

GCE

Required parameters
  • autoDiscovery.clusterName=any-name
  • cloudProvider=gce
  • autoscalingGroupsnamePrefix[0].name=your-ig-prefix,autoscalingGroupsnamePrefix[0].maxSize=10,autoscalingGroupsnamePrefix[0].minSize=1

To use Managed Instance Group (MIG) auto-discovery, provide a YAML file setting autoscalingGroupsnamePrefix (see values.yaml) or use --set when installing the Chart - e.g.

```console
$ helm install stable/cluster-autoscaler \
    --name my-release \
    --set autoDiscovery.clusterName=<CLUSTER NAME> \
    --set cloudProvider=gce \
    --set "autoscalingGroupsnamePrefix[0].name=your-ig-prefix,autoscalingGroupsnamePrefix[0].maxSize=10,autoscalingGroupsnamePrefix[0].minSize=1"
```

Note that your-ig-prefix should be a prefix matching one or more MIGs, and not the full name of the MIG. For example, to match multiple instance groups - k8s-node-group-a-standard, k8s-node-group-b-gpu, you would use a prefix of k8s-node-group-.
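The same settings can come from a values file instead of `--set`; a sketch using the illustrative prefix from above:

```yaml
# gce-values.yaml -- illustrative values for GCE MIG auto-discovery
cloudProvider: gce
autoDiscovery:
  clusterName: any-name   # must be set for GCE, but no MIG tagging is required
autoscalingGroupsnamePrefix:
  - name: k8s-node-group-   # prefix matching one or more MIGs, not a full MIG name
    maxSize: 10
    minSize: 1
```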

In the event you want to explicitly specify MIGs instead of using auto-discovery, set members of the autoscalingGroups array directly - e.g.

```console
# where 'n' is the index, starting at 0
--set autoscalingGroups[n].name=https://content.googleapis.com/compute/v1/projects/$PROJECTID/zones/$ZONENAME/instanceGroupManagers/$FULL-MIG-NAME,autoscalingGroups[n].maxSize=$MAXSIZE,autoscalingGroups[n].minSize=$MINSIZE
```

Azure AKS

Required Parameters
  • cloudProvider=azure
  • autoscalingGroups[0].name=your-agent-pool,autoscalingGroups[0].maxSize=10,autoscalingGroups[0].minSize=1
  • azureClientID: "your-service-principal-app-id"
  • azureClientSecret: "your-service-principal-client-secret"
  • azureSubscriptionID: "your-azure-subscription-id"
  • azureTenantID: "your-azure-tenant-id"
  • azureClusterName: "your-aks-cluster-name"
  • azureResourceGroup: "your-aks-cluster-resource-group-name"
  • azureVMType: "AKS"
  • azureNodeResourceGroup: "your-aks-cluster-node-resource-group"
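For reference, the required parameters above could be collected into a values file; every value here is a placeholder to replace with your own:

```yaml
# azure-values.yaml -- placeholder values for Azure AKS
cloudProvider: azure
autoscalingGroups:
  - name: your-agent-pool
    maxSize: 10
    minSize: 1
azureClientID: "your-service-principal-app-id"
azureClientSecret: "your-service-principal-client-secret"
azureSubscriptionID: "your-azure-subscription-id"
azureTenantID: "your-azure-tenant-id"
azureClusterName: "your-aks-cluster-name"
azureResourceGroup: "your-aks-cluster-resource-group-name"
azureVMType: "AKS"
azureNodeResourceGroup: "your-aks-cluster-node-resource-group"
```

Install with `helm install stable/cluster-autoscaler --name my-release -f azure-values.yaml`.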

Specifying groups manually (AWS only)

Without auto-discovery, specify an array of elements, each containing an ASG name, minimum size, and maximum size. The sizes specified here will be applied to the ASG, assuming IAM permissions are correctly configured.

  1. verify the IAM Permissions
  2. Either provide a yaml file setting autoscalingGroups (see values.yaml) or use --set e.g.:
```console
$ helm install stable/cluster-autoscaler --name my-release --set "autoscalingGroups[0].name=your-asg-name,autoscalingGroups[0].maxSize=10,autoscalingGroups[0].minSize=1"
```
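The equivalent values-file form of the `--set` example above (the ASG name is a placeholder):

```yaml
# asg-values.yaml -- illustrative manual ASG configuration
autoscalingGroups:
  - name: your-asg-name
    maxSize: 10
    minSize: 1
```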

Uninstalling the Chart

To uninstall my-release:

```console
$ helm delete my-release
```

The command removes all the Kubernetes components associated with the chart and deletes the release.

Tip: List all releases using helm list, or start clean with helm delete --purge my-release.

Configuration

The following table lists the configurable parameters of the cluster-autoscaler chart and their default values.

| Parameter | Description | Default |
|-----------|-------------|---------|
| `affinity` | node/pod affinities | None |
| `autoDiscovery.clusterName` | enable autodiscovery for name in ASG tag (only `cloudProvider=aws`). Must be set for `cloudProvider=gce`, but no MIG tagging is required. | `""` (required unless `autoscalingGroups[]` is provided) |
| `autoDiscovery.tags` | ASG tags to match, run through `tpl` | `["k8s.io/cluster-autoscaler/enabled", "k8s.io/cluster-autoscaler/{{ .Values.autoDiscovery.clusterName }}"]` |
| `autoscalingGroups[].name` | autoscaling group name | None. Required unless `autoDiscovery.enabled=true` |
| `autoscalingGroups[].maxSize` | maximum autoscaling group size | None. Required unless `autoDiscovery.enabled=true` |
| `autoscalingGroups[].minSize` | minimum autoscaling group size | None. Required unless `autoDiscovery.enabled=true` |
| `awsRegion` | AWS region (required if `cloudProvider=aws`) | `us-east-1` |
| `awsAccessKeyID` | AWS access key ID (if AWS user keys are used) | `""` |
| `awsSecretAccessKey` | AWS secret access key (if AWS user keys are used) | `""` |
| `autoscalingGroupsnamePrefix[].name` | GCE MIG name prefix (the full name is invalid) | None. Required for `cloudProvider=gce` |
| `autoscalingGroupsnamePrefix[].maxSize` | maximum MIG size | None. Required for `cloudProvider=gce` |
| `autoscalingGroupsnamePrefix[].minSize` | minimum MIG size | None. Required for `cloudProvider=gce` |
| `cloudProvider` | `aws` or `spotinst` for AWS, `gce` for GCE, `azure` for Azure AKS | `aws` |
| `image.repository` | Image repository | `k8s.gcr.io/cluster-autoscaler` |
| `image.tag` | Image tag | `v1.17.1` |
| `image.pullPolicy` | Image pull policy | `IfNotPresent` |
| `image.pullSecrets` | Image pull secrets | `[]` |
| `extraArgs` | additional container arguments | `{}` |
| `podDisruptionBudget` | Pod disruption budget | `maxUnavailable: 1` |
| `extraEnv` | additional container environment variables | `{}` |
| `envFromConfigMap` | additional container environment variables from a ConfigMap | `{}` |
| `envFromSecret` | name of a Secret whose keys will be exposed as environment variables | `nil` |
| `extraEnvSecrets` | additional container environment variables from a Secret | `{}` |
| `fullnameOverride` | String to fully override the `cluster-autoscaler.fullname` template | `""` |
| `nameOverride` | String to partially override the `cluster-autoscaler.fullname` template (will keep the release name) | `""` |
| `nodeSelector` | node labels for pod assignment | `{}` |
| `podAnnotations` | annotations to add to each pod | `{}` |
| `rbac.create` | If true, create & use RBAC resources | `true` |
| `rbac.serviceAccount.create` | If true and `rbac.create` is also true, a service account will be created | `true` |
| `rbac.serviceAccount.name` | The name of the ServiceAccount to use. If not set and `create` is true, a name is generated using the fullname template | `nil` |
| `rbac.serviceAccountAnnotations` | Additional ServiceAccount annotations | `{}` |
| `rbac.pspEnabled` | Must be used with `rbac.create=true`. If true, creates & uses the RBAC resources required in a cluster with Pod Security Policies enabled. | `false` |
| `replicaCount` | desired number of pods | `1` |
| `priorityClassName` | priorityClassName | `nil` |
| `dnsPolicy` | dnsPolicy | `nil` |
| `securityContext` | Security context for the pod | `nil` |
| `containerSecurityContext` | Security context for the container | `nil` |
| `resources` | pod resource requests & limits | `{}` |
| `updateStrategy` | Deployment update strategy | `nil` |
| `service.annotations` | annotations to add to the service | None |
| `service.externalIPs` | service external IP addresses | `[]` |
| `service.loadBalancerIP` | IP address to assign to the load balancer (if supported) | `""` |
| `service.loadBalancerSourceRanges` | list of IP CIDRs allowed access to the load balancer (if supported) | `[]` |
| `service.servicePort` | service port to expose | `8085` |
| `service.portName` | name for the service port | `http` |
| `service.type` | type of service to create | `ClusterIP` |
| `spotinst.account` | Spotinst account ID (required if `cloudProvider=spotinst`) | `""` |
| `spotinst.token` | Spotinst API token (required if `cloudProvider=spotinst`) | `""` |
| `spotinst.image.repository` | Image (used if `cloudProvider=spotinst`) | `spotinst/kubernetes-cluster-autoscaler` |
| `spotinst.image.tag` | Image tag (used if `cloudProvider=spotinst`) | `v0.6.0` |
| `spotinst.image.pullPolicy` | Image pull policy (used if `cloudProvider=spotinst`) | `IfNotPresent` |
| `tolerations` | List of node taints to tolerate (requires Kubernetes >= 1.6) | `[]` |
| `serviceMonitor.enabled` | If true, creates a Prometheus Operator ServiceMonitor | `false` |
| `serviceMonitor.interval` | Interval at which Prometheus scrapes Cluster Autoscaler metrics | `10s` |
| `serviceMonitor.namespace` | Namespace in which Prometheus is running | `monitoring` |
| `serviceMonitor.path` | The path to scrape for metrics | `/metrics` |
| `serviceMonitor.selector` | Defaults to a kube-prometheus install (CoreOS recommended), but should be set according to your Prometheus install | `{ prometheus: kube-prometheus }` |
| `azureClientID` | Service Principal ClientID with contributor permission to the cluster and node resource group | None |
| `azureClientSecret` | Service Principal ClientSecret with contributor permission to the cluster and node resource group | None |
| `azureSubscriptionID` | Azure subscription where the resources are located | None |
| `azureTenantID` | Azure tenant where the resources are located | None |
| `azureClusterName` | Azure AKS cluster name | None |
| `azureResourceGroup` | Azure resource group in which the cluster is located | None |
| `azureVMType` | Azure VM type | `AKS` |
| `azureNodeResourceGroup` | Azure resource group where the cluster's nodes are located, typically `MC_<cluster-resource-group-name>_<cluster-name>_<location>` | None |
| `azureUseManagedIdentityExtension` | Whether to use Azure's managed identity extension for credentials | `false` |
| `kubeTargetVersionOverride` | Override the `.Capabilities.KubeVersion.GitVersion` | `""` |
| `expanderPriorities` | Used if `extraArgs.expander` is set to `priority`; defines the priorities to use | `{}` |
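The expanderPriorities value is easiest to see by example. A sketch, assuming the priority-to-node-group-regex mapping used by the cluster autoscaler's priority expander (the group-name patterns below are placeholders; see values.yaml for the exact shape expected by your chart version):

```yaml
extraArgs:
  expander: priority
expanderPriorities:
  10:
    - .*               # lowest priority: match everything else
  50:
    - .*spot-group.*   # prefer node groups matching this pattern
```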

Specify each parameter you'd like to override using a YAML file as described above in the installation section or by using the --set key=value[,key=value] argument to helm install. For example, to change the region and expander:

```console
$ helm install stable/cluster-autoscaler --name my-release \
    --set extraArgs.expander=most-pods \
    --set awsRegion=us-west-1
```

IAM

The worker running the cluster autoscaler will need access to certain resources and actions:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "autoscaling:DescribeAutoScalingGroups",
                "autoscaling:DescribeAutoScalingInstances",
                "autoscaling:DescribeLaunchConfigurations",
                "autoscaling:DescribeTags",
                "autoscaling:SetDesiredCapacity",
                "autoscaling:TerminateInstanceInAutoScalingGroup"
            ],
            "Resource": "*"
        }
    ]
}
```
  • DescribeTags is required for autodiscovery.
  • DescribeLaunchConfigurations is required to scale up an ASG from 0.

Unfortunately, AWS does not yet support ARNs for autoscaling groups, so you must use "*" as the resource.

IAM Roles for Service Accounts (IRSA)

For Kubernetes clusters that use Amazon EKS, the service account can be configured with an IAM role using IAM Roles for Service Accounts to avoid needing to grant access to the worker nodes for AWS resources.

To accomplish this, first create a new IAM role with the above-mentioned policy. Take care to configure the trust relationship so that access is restricted to the service account used by the cluster autoscaler.

Once you have the IAM role configured, you would then need to --set rbac.serviceAccountAnnotations."eks\.amazonaws\.com/role-arn"=arn:aws:iam::123456789012:role/MyRoleName when installing.
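In a values file, the same annotation (using the example role ARN from above) would look like:

```yaml
rbac:
  serviceAccountAnnotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/MyRoleName
```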

Auto-discovery

For auto-discovery of instances to work, they must be tagged with the keys in .Values.autoDiscovery.tags, which by default are k8s.io/cluster-autoscaler/enabled and k8s.io/cluster-autoscaler/<ClusterName>

The value of the tag does not matter, only the key.

An example kops spec excerpt:

```yaml
apiVersion: kops/v1alpha2
kind: Cluster
metadata:
  name: my.cluster.internal
spec:
  additionalPolicies:
    node: |
      [
        {"Effect":"Allow","Action":["autoscaling:DescribeAutoScalingGroups","autoscaling:DescribeAutoScalingInstances","autoscaling:DescribeLaunchConfigurations","autoscaling:DescribeTags","autoscaling:SetDesiredCapacity","autoscaling:TerminateInstanceInAutoScalingGroup"],"Resource":"*"}
      ]
      ...
---
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: my.cluster.internal
  name: my-instances
spec:
  cloudLabels:
    k8s.io/cluster-autoscaler/enabled: ""
    k8s.io/cluster-autoscaler/my.cluster.internal: ""
  image: kope.io/k8s-1.8-debian-jessie-amd64-hvm-ebs-2018-01-14
  machineType: r4.large
  maxSize: 4
  minSize: 0
```

In this example you would need to --set autoDiscovery.clusterName=my.cluster.internal when installing.

Mixing this with explicitly set autoscalingGroups is not recommended.

See the autoscaler AWS documentation for a fuller discussion of the setup.

Troubleshooting

The chart will install successfully even if the container arguments are incorrect. A few minutes after starting, kubectl logs -l "app=aws-cluster-autoscaler" --tail=50 should loop through something like:

```
polling_autoscaler.go:111] Poll finished
static_autoscaler.go:97] Starting main loop
utils.go:435] No pod using affinity / antiaffinity found in cluster, disabling affinity predicate for this loop
static_autoscaler.go:230] Filtering out schedulables
```

If not, find a pod that the deployment created and describe it, paying close attention to the arguments under Command. e.g.:

```
Containers:
  cluster-autoscaler:
    Command:
      ./cluster-autoscaler
      --cloud-provider=aws
# if specifying ASGs manually
      --nodes=1:10:your-scaling-group-name
# if using autodiscovery
      --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<ClusterName>
      --v=4
```

PodSecurityPolicy

Though enough for the majority of installations, the default PodSecurityPolicy could be too restrictive depending on the specifics of your release. Please make sure to check that the template fits with any customizations made or disable it by setting rbac.pspEnabled to false.