vertical-pod-autoscaler/docs/features.md
When setting limits, VPA conforms to resource policies. It maintains the limit-to-request ratio specified for each container.
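The ratio rule above can be illustrated with a small calculation. This is a minimal sketch, not VPA's actual code: the function name `scaled_limit` and the choice to round up are assumptions made for illustration.

```python
import math
from fractions import Fraction


def scaled_limit(orig_request_m: int, orig_limit_m: int, new_request_m: int) -> int:
    """Return a CPU limit (millicores) that keeps the original limit:request ratio.

    Hypothetical helper for illustration only.
    """
    if orig_request_m <= 0:
        raise ValueError("original request must be positive")
    # Exact ratio between the limit and the request in the pod spec.
    ratio = Fraction(orig_limit_m, orig_request_m)
    # Rounding up is an assumption of this sketch, not documented VPA behavior.
    return math.ceil(ratio * new_request_m)


# A container with request 100m and limit 200m has a ratio of 2.
# If the request becomes 150m, the limit that preserves the ratio is 300m.
print(scaled_limit(100, 200, 150))  # 300
```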
VPA will try to cap recommendations between min and max of limit ranges. If limit range conflicts with VPA resource policy, VPA will follow VPA policy (and set values outside the limit range).
To disable getting VPA recommendations for an individual container, set mode to "Off" in containerPolicies.
[!WARNING] DEPRECATED: This feature is deprecated as of VPA v1.5.0 and will be removed in a future version. Use --round-memory-bytes instead for memory recommendation formatting.
[!NOTE] This feature was added in v1.3.0.
VPA can present memory recommendations in human-readable binary units (KiB, MiB, GiB, TiB) instead of raw bytes, making resource recommendations easier to understand. This feature is controlled by the --humanize-memory flag in the recommender component.
When enabled, memory values in recommendations are expressed in binary units rather than raw bytes. For example, instead of a memory recommendation of 262144000 bytes, you would see 250.00Mi.
Note: Due to the conversion to binary units and rounding to two decimal places, the humanized values may differ slightly from the raw byte recommendations. For example, 1537 bytes would be shown as "1.50Ki" (1536 bytes). Consider this small difference when doing precise capacity planning.
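The conversion can be sketched as follows. This is an illustrative approximation that reproduces the two examples above, not the recommender's implementation. The function name `humanize_memory` and the handling of values below 1 KiB (left as a plain byte count) are assumptions.

```python
def humanize_memory(num_bytes: int) -> str:
    """Format a byte count using the largest binary unit that fits.

    Hypothetical helper; values below 1 KiB are returned unchanged (assumption).
    """
    units = [("Ti", 1024 ** 4), ("Gi", 1024 ** 3), ("Mi", 1024 ** 2), ("Ki", 1024)]
    for suffix, size in units:
        if num_bytes >= size:
            # Two decimal places, matching the "250.00Mi" example.
            return f"{num_bytes / size:.2f}{suffix}"
    return str(num_bytes)


print(humanize_memory(262144000))  # 250.00Mi
print(humanize_memory(1537))       # 1.50Ki
```

Reading "1.50Ki" back as bytes gives 1.5 × 1024 = 1536, one byte away from the original 1537, which is the small difference the note describes.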
To enable this feature, set the --humanize-memory flag to true when running the VPA recommender:
--humanize-memory=true
VPA can provide CPU recommendations rounded up to user-specified values, making it easier to interpret and configure resources. This feature is controlled by the --round-cpu-millicores flag in the recommender component.
When enabled, CPU recommendations are rounded up to the next multiple of the configured millicore value. For example, with --round-cpu-millicores=50, a CPU recommendation of 79m would be rounded up to 100m, and a recommendation of 34m would be rounded up to 50m.
To enable this feature, set the --round-cpu-millicores flag when running the VPA recommender:
--round-cpu-millicores=50
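The rounding arithmetic behind the example can be sketched in a few lines. The helper name `round_up_to_multiple` is hypothetical, and leaving exact multiples unchanged is an assumption of this sketch rather than a statement about the recommender's code.

```python
def round_up_to_multiple(value_m: int, step_m: int) -> int:
    """Round a millicore value up to the next multiple of step_m.

    Hypothetical helper; exact multiples are returned unchanged (assumption).
    """
    if step_m <= 0:
        raise ValueError("step must be positive")
    # Ceiling division using integers only: -(-a // b) == ceil(a / b).
    return -(-value_m // step_m) * step_m


print(round_up_to_multiple(79, 50))  # 100
print(round_up_to_multiple(34, 50))  # 50
```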
VPA can provide memory recommendations rounded up to user-specified values, making it easier to interpret and configure resources. This feature is controlled by the --round-memory-bytes flag in the recommender component.
When enabled, memory recommendations are rounded up to the next multiple of the configured byte value. For example, with --round-memory-bytes=134217728 (128Mi), a memory recommendation of 200Mi would be rounded up to 256Mi, and a recommendation of 80Mi would be rounded up to 128Mi.
To enable this feature, set the --round-memory-bytes flag when running the VPA recommender:
--round-memory-bytes=134217728
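Because the flag takes a raw byte count, it helps to check the unit conversion. The sketch below derives 134217728 from 128Mi and reproduces the example values. The names `MIB` and `round_memory_up` are illustrative, not part of VPA.

```python
MIB = 1024 ** 2  # bytes in one mebibyte


def round_memory_up(num_bytes: int, step_bytes: int) -> int:
    """Round a byte count up to the next multiple of step_bytes (hypothetical helper)."""
    if step_bytes <= 0:
        raise ValueError("step must be positive")
    whole_steps, remainder = divmod(num_bytes, step_bytes)
    # Add one more step only when the value is not already an exact multiple.
    return (whole_steps + (1 if remainder else 0)) * step_bytes


step = 128 * MIB
print(step)                                     # 134217728
print(round_memory_up(200 * MIB, step) // MIB)  # 256
print(round_memory_up(80 * MIB, step) // MIB)   # 128
```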
In-place updates (InPlaceOrRecreate)
[!NOTE] FEATURE STATE:
- VPA v1.4.0 [alpha]
- VPA v1.5.0 [beta]
- VPA v1.6.0 [ga]
VPA supports in-place updates to reduce disruption when applying resource recommendations. This feature leverages Kubernetes' in-place update capabilities (which are in beta as of Kubernetes 1.33) to modify container resources without requiring pod recreation. For more information, see AEP-4016: Support for in place updates in VPA.
To use in-place updates, set the VPA's updateMode to InPlaceOrRecreate:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-vpa
spec:
  updatePolicy:
    updateMode: "InPlaceOrRecreate"
When using InPlaceOrRecreate mode, VPA first attempts to apply updates in place. If the in-place update fails, VPA falls back to pod recreation.
Updates are attempted when:
Important Notes
Disruption Possibility: While in-place updates aim to minimize disruption, they cannot guarantee zero disruption as the container runtime is responsible for the actual resize operation.
Memory Limit Downscaling: In the beta version, memory limit downscaling is not supported for pods with resizePolicy: PreferNoRestart. In such cases, VPA will fall back to pod recreation.
By default, VPA respects disruption budgets (eviction tolerance, min replicas) even for in-place updates. However, when an in-place update doesn't require container restarts, it's truly non-disruptive and these checks may be unnecessarily restrictive.
The --in-place-skip-disruption-budget flag (default: false) allows VPA to skip disruption budget checks for in-place updates when all containers in the pod have NotRequired resize policy for both CPU and memory or no resize policy is defined.
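The rule described above can be expressed as a small predicate. This is a sketch of the documented condition only, not the updater's code. The `Container` class and the function name are hypothetical, and treating a missing per-resource entry like NotRequired is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    # Maps a resource name ("cpu", "memory") to its resize restart policy.
    # An empty mapping models a container with no resize policy defined.
    resize_policy: dict = field(default_factory=dict)


def can_skip_disruption_budget(flag_enabled: bool, containers: list) -> bool:
    """Return True when budget checks may be skipped for an in-place update."""
    if not flag_enabled:
        # Default behavior: disruption budgets are always respected.
        return False
    for container in containers:
        for resource in ("cpu", "memory"):
            # Assumption: a missing entry behaves like NotRequired.
            policy = container.resize_policy.get(resource, "NotRequired")
            if policy != "NotRequired":
                # For example RestartContainer: the resize may restart the container.
                return False
    return True


pod = [Container(), Container({"cpu": "NotRequired", "memory": "NotRequired"})]
print(can_skip_disruption_budget(True, pod))  # True

pod_with_restart = [Container({"memory": "RestartContainer"})]
print(can_skip_disruption_budget(True, pod_with_restart))  # False
```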
Even with this flag enabled, disruption budgets are enforced when:
- Any container in the pod has a RestartContainer resize policy for any resource

In-place updates require:
- Kubernetes with the InPlacePodVerticalScaling feature gate enabled
- VPA with the InPlaceOrRecreate feature gate enabled

VPA will fall back to pod recreation in the following scenarios:
VPA provides metrics to track in-place update operations:
- vpa_updater_in_place_updatable_pods_total: Number of pods matching in-place update criteria
- vpa_updater_in_place_updated_pods_total: Number of pods successfully updated in-place
- vpa_updater_vpas_with_in_place_updatable_pods_total: Number of VPAs with pods eligible for in-place updates
- vpa_updater_vpas_with_in_place_updated_pods_total: Number of VPAs with successfully in-place updated pods
- vpa_updater_failed_in_place_update_attempts_total: Number of failed attempts to update pods in-place

[!WARNING] FEATURE STATE: VPA v1.7.0 [alpha]
The CPU Startup Boost feature allows VPA to temporarily increase CPU requests and limits for containers during pod startup. This can help workloads that have high CPU demands during their initialization phase, such as Java applications, to start faster. Once the pod is considered Ready and an optional duration has passed, VPA scales the CPU resources back down to their normal levels using an in-place resize.
For more details, see AEP-7862: CPU Startup Boost.
CPU Startup Boost is configured via the startupBoost field in the VerticalPodAutoscalerSpec or within the per-container containerPolicies. This allows for both global and per-container boost configurations.
This example enables a startup boost for all containers in the targeted deployment. The CPU will be multiplied by a factor of 3 for 10 seconds after the pod becomes ready.
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: example-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: example
  updatePolicy:
    updateMode: "Recreate"
  startupBoost:
    cpu:
      type: "Factor"
      factor: 3
      durationSeconds: 10
Once the pod is Ready and the startupBoost.cpu.durationSeconds has elapsed, VPA scales the CPU resources down in place.

CPU Startup Boost requires:
- Kubernetes with the InPlacePodVerticalScaling feature gate enabled.
- VPA with the CPUStartupBoost feature gate enabled.

Enable the feature by setting the CPUStartupBoost feature gate in the VPA admission-controller and updater components:
--feature-gates=CPUStartupBoost=true
The startupBoost field contains a cpu field with the following sub-fields:
- type: (Required) The type of boost. Can be Factor to multiply the CPU, or Quantity to add a specific CPU value.
- factor: (Optional) The multiplier to apply if type is Factor (e.g., 2 for 2x CPU). Required if type is Factor.
- quantity: (Optional) The amount of CPU to add if type is Quantity (e.g., "500m"). Required if type is Quantity.
- durationSeconds: (Optional) How long to keep the boost active after the pod becomes Ready. Defaults to 0.
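To make the two boost types concrete, the sketch below computes a boosted CPU value from a baseline. It only mirrors the field descriptions above (Factor multiplies, Quantity adds). The function name, the millicore integer representation, and the validation messages are assumptions, not VPA's API.

```python
from typing import Optional


def boosted_cpu_millicores(
    base_m: int,
    boost_type: str,
    factor: Optional[int] = None,
    quantity_m: Optional[int] = None,
) -> int:
    """Return the CPU value (millicores) during startup boost (hypothetical helper)."""
    if boost_type == "Factor":
        if factor is None:
            raise ValueError("factor is required when type is Factor")
        # Factor multiplies the baseline CPU.
        return base_m * factor
    if boost_type == "Quantity":
        if quantity_m is None:
            raise ValueError("quantity is required when type is Quantity")
        # Quantity adds a fixed amount of CPU to the baseline.
        return base_m + quantity_m
    raise ValueError(f"unknown boost type: {boost_type}")


# A 500m baseline with factor 3, as in the example configuration.
print(boosted_cpu_millicores(500, "Factor", factor=3))            # 1500
# The same baseline with an additional "500m".
print(boosted_cpu_millicores(500, "Quantity", quantity_m=500))    # 1000
```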