vertical-pod-autoscaler/docs/quickstart.md
After installation the system is ready to recommend and set resource requests for your pods. In order to use it, you need to insert a Vertical Pod Autoscaler resource for each controller that you want to have automatically computed resource requirements. This will be most commonly a Deployment. There are five modes in which VPAs operate:
"Auto" [deprecated]: VPA assigns resource requests on pod creation as well as updates
them on existing pods using the preferred update mechanism. Currently, this is
equivalent to "Recreate" (see below). This mode is deprecated and will be removed in a future API version.
Use explicit modes like "Recreate", "Initial", or "InPlaceOrRecreate" instead."Recreate" [default]: VPA assigns resource requests on pod creation as well as updates
them on existing pods by evicting them when the requested resources differ significantly
from the new recommendation (respecting the Pod Disruption Budget, if defined).
This mode should be used rarely, only if you need to ensure that the pods are restarted
whenever the resource request changes."InPlaceOrRecreate": VPA assigns resource requests on pod creation as well as updates
them on existing pods by leveraging Kubernetes in-place update capability.
If in-place update fails, it falls back to evicting the pods, performing a recreation.
For more details, see the In-Place Updates documentation."Initial": VPA only assigns resource requests on pod creation and never changes them
later."Off": VPA does not automatically change the resource requirements of the pods.
The recommendations are calculated and can be inspected in the VPA object.A simple way to check if Vertical Pod Autoscaler is fully operational in your cluster is to create a sample deployment and a corresponding VPA config:
kubectl create -f examples/hamster.yaml
The above command creates a deployment with two pods, each running a single container that requests 100 millicores and tries to utilize slightly above 500 millicores. The command also creates a VPA config pointing at the deployment. VPA will observe the behaviour of the pods, and after about 5 minutes, they should get updated with a higher CPU request (note that VPA does not modify the template in the deployment, but the actual requests of the pods are updated). To see VPA config and current recommended resource requests run:
kubectl describe vpa
Note: if your cluster has little free capacity these pods may be unable to schedule. You may need to add more nodes or adjust examples/hamster.yaml to use less CPU.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: my-app-vpa
spec:
targetRef:
apiVersion: "apps/v1"
kind: Deployment
name: my-app
updatePolicy:
updateMode: "Recreate" # Use explicit mode instead of deprecated "Auto"
To diagnose problems with a VPA installation, perform the following steps:
kubectl --namespace=kube-system get pods|grep vpa
The above command should list 3 pods (recommender, updater and admission-controller) all in state Running.
kubectl --namespace=kube-system logs [pod name] | grep -e '^E[0-9]\{4\}'
kubectl get customresourcedefinition | grep verticalpodautoscalers