vertical-pod-autoscaler/enhancements/7862-cpu-startup-boost/README.md
Long application start time is a known problem for more traditional workloads running in containerized applications, especially Java workloads. This delay can negatively impact the user experience and overall application performance. One potential solution is to provide additional CPU resources to pods during their startup phase, but this can lead to waste if the extra CPU resources are not set back to their original values after the pods have started up.
This proposal allows VPA to boost the CPU request and limit of containers during
the pod startup and to scale the CPU resources back down when the pod is
Ready or after a certain time has elapsed, leveraging the
in-place pod resize Kubernetes feature.
- To allow VPA to scale a pod's CPU resources back down once its Ready
  condition is true and StartupBoost.CPU.DurationSeconds has elapsed.
- To extend VerticalPodAutoscalerSpec
  with a new StartupBoost field to allow users to configure the CPU startup
  boost.
- To extend ContainerResourcePolicy
  with a new StartupBoost field to allow users to optionally customize the
  startup boost behavior for individual containers.
- To enable only startup boost (if the StartupBoost config is present in the
  VPA object) without having to also use the traditional VPA functionality.
The user first configures the CPU startup boost on their VPA object.
When a pod targeted by that VPA is created, the kube-apiserver invokes the VPA Admission Controller.
During pod creation, the VPA Admission Controller modifies the CPU requests and
limits of the pod's containers to align with the StartupBoost policy, if one is
specified. The base value for the boost calculation is the VPA recommended CPU
request. If the VPA recommendation is not available or is zero, the container's
original CPU request from the Pod spec is used as the base.
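The base selection described above can be sketched as follows. This is a minimal illustration, not the VPA implementation: the helper name `boostBaseMilliCPU`, the use of millicores as plain integers, and the nil pointer standing for "no recommendation available" are all assumptions made for the example.

```go
package main

import "fmt"

// boostBaseMilliCPU (hypothetical helper) returns the CPU value, in
// millicores, that the boost is applied to: the VPA recommendation when one
// is available and non-zero, otherwise the container's original CPU request
// from the Pod spec. A nil recommendation models "not available", which is
// an assumption of this sketch.
func boostBaseMilliCPU(recommended *int64, podSpecRequest int64) int64 {
	if recommended != nil && *recommended > 0 {
		return *recommended
	}
	return podSpecRequest
}

func main() {
	rec := int64(300)
	zero := int64(0)
	fmt.Println(boostBaseMilliCPU(&rec, 100))  // 300: recommendation is used
	fmt.Println(boostBaseMilliCPU(nil, 100))   // 100: no recommendation
	fmt.Println(boostBaseMilliCPU(&zero, 100)) // 100: zero recommendation
}
```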
The behavior for CPU limits depends on the ControlledValues setting in the
ContainerResourcePolicy:
- If ControlledValues is RequestsOnly, the boosted CPU request
  will be capped just below the container's original CPU limit (to preserve pod QoS), if one is set.
- If ControlledValues is RequestsAndLimits (the default), the CPU limit is also boosted.
  The new limit is calculated to maintain the container's original limit-to-request ratio, applied to the new boosted CPU request. In cases where this ratio cannot be established (e.g., if the original CPU limit was unspecified), the limit will not be changed by the boost.
The VPA Updater monitors pods targeted by the VPA object and when the pod
condition is Ready and StartupBoost.CPU.DurationSeconds has elapsed, it scales
down the CPU resources to the appropriate non-boosted value. This "unboosting"
resizes the pod to whatever the recommendation is at that moment. The specific
behavior is determined by the VPA updatePolicy:
- If updatePolicy is Auto, Recreate or InPlaceOrRecreate, the VPA
  Updater will apply the current VPA recommendation, even if it's higher than
  the boosted value.
- If updatePolicy is Off for the VPA object, or mode is Off in a
  container policy, the VPA Updater will revert the CPU resources to the
  values specified in the pod spec.
The new StartupBoost parameter will be added to both:
- VerticalPodAutoscalerSpec:
  Will allow users to specify the default CPU startup boost for all containers of the pod targeted by the VPA object.
- ContainerResourcePolicy:
  Will allow users to optionally customize the startup boost behavior for individual containers.
Here is the Go struct definition for CPUStartupBoost:
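// CPUBoostType selects how the CPU startup boost is computed.
// +enum
type CPUBoostType string

const (
	// FactorType multiplies the base CPU request by Factor.
	FactorType CPUBoostType = "Factor"
	// QuantityType adds Quantity to the base CPU request.
	QuantityType CPUBoostType = "Quantity"
)

// CPUStartupBoost configures the CPU boost applied during pod startup.
// resource.Quantity comes from k8s.io/apimachinery/pkg/api/resource.
type CPUStartupBoost struct {
	// Type selects which union member (Factor or Quantity) is used.
	// +unionDiscriminator
	// +required
	Type CPUBoostType `json:"type"`
	// Factor is the multiplier applied when Type is Factor.
	// +unionMember=Factor
	// +optional
	Factor *int32 `json:"factor,omitempty"`
	// Quantity is the additional CPU added when Type is Quantity.
	// +unionMember=Quantity
	// +optional
	Quantity *resource.Quantity `json:"quantity,omitempty"`
	// DurationSeconds is how long to stay boosted after the pod is Ready.
	// +optional
	DurationSeconds *int32 `json:"durationSeconds,omitempty"`
}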
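The union markers on the struct imply that only the member selected by Type should be populated. A minimal sketch of such a check is shown below. It is illustrative only: `cpuStartupBoost` is a simplified stand-in (Quantity is a plain string rather than `*resource.Quantity`, to keep the example free of external imports), and `validateUnion` is a hypothetical helper, not part of the VPA code base.

```go
package main

import (
	"errors"
	"fmt"
)

type CPUBoostType string

const (
	FactorType   CPUBoostType = "Factor"
	QuantityType CPUBoostType = "Quantity"
)

// cpuStartupBoost is a simplified stand-in for CPUStartupBoost.
type cpuStartupBoost struct {
	Type     CPUBoostType
	Factor   *int32
	Quantity *string // assumption: string instead of *resource.Quantity
}

// validateUnion (hypothetical) checks that Type is set and that only the
// union member matching Type is populated.
func validateUnion(b cpuStartupBoost) error {
	switch b.Type {
	case "":
		return errors.New("type is required")
	case FactorType:
		if b.Factor == nil {
			return errors.New("factor is required when type is Factor")
		}
		if b.Quantity != nil {
			return errors.New("quantity is not allowed when type is Factor")
		}
	case QuantityType:
		if b.Quantity == nil {
			return errors.New("quantity is required when type is Quantity")
		}
		if b.Factor != nil {
			return errors.New("factor is not allowed when type is Quantity")
		}
	}
	// Any other type is accepted by this sketch; the proposal treats an
	// unrecognized type as "no boost" for forward compatibility.
	return nil
}

func main() {
	two := int32(2)
	q := "500m"
	fmt.Println(validateUnion(cpuStartupBoost{Type: FactorType, Factor: &two}))
	fmt.Println(validateUnion(cpuStartupBoost{Type: FactorType, Quantity: &q}))
}
```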
StartupBoost will contain the following fields:
- [Required] StartupBoost.CPU.Type (type: string): A string that specifies
  the kind of boost to apply. Supported values are:
  - Factor: The StartupBoost.CPU.Factor field will be interpreted as a
    multiplier for the recommended CPU request. For example, a value of 2 will
    double the CPU request.
  - Quantity: The StartupBoost.CPU.Quantity field will be interpreted as an
    additional CPU resource quantity (e.g., "500m", "1") to be added to the existing CPU
    request or limit during the boost phase.
[!NOTE] For forward compatibility, an unrecognized
StartupBoost.CPU.Type value will be treated as no boost.
- [Optional] StartupBoost.CPU.Factor (type: integer): The factor to apply to the CPU request. Defaults to 1 if not specified.
  - If StartupBoost.CPU.Type is Factor, this field is required.
  - If StartupBoost.CPU.Type is Quantity, this field is not allowed.
- [Optional] StartupBoost.CPU.Quantity (type: resource.Quantity): The additional CPU resource quantity.
  - If StartupBoost.CPU.Type is Quantity, this field is required.
  - If StartupBoost.CPU.Type is Factor, this field is not allowed.
- [Optional] StartupBoost.CPU.DurationSeconds (type: integer): If specified, it
  indicates how long to keep the pod boosted after it becomes Ready.
  Defaults to 0 if not specified.
[!IMPORTANT] The boosted CPU value will be capped by the
--max-allowed-cpu-boost flag value, if set.
[!NOTE] To ensure that containers are unboosted only after their applications are started and ready, it is recommended to configure a Readiness or a Startup probe for the containers that will be CPU boosted. Check the Test Plan section for more details on this feature's behavior for different combinations of probes and
StartupBoost.CPU.DurationSeconds.
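To make the field semantics concrete, here is a hedged sketch of how a boosted CPU request and limit might be derived. It is not the VPA implementation. The `boost` struct, the function names, the use of integer millicores, and the convention that a cap of 0 means the `--max-allowed-cpu-boost` flag is not set are all assumptions made for illustration. Only the RequestsAndLimits case (limit scaled by the original limit-to-request ratio) is covered.

```go
package main

import "fmt"

type CPUBoostType string

const (
	FactorType   CPUBoostType = "Factor"
	QuantityType CPUBoostType = "Quantity"
)

// boost is a simplified stand-in for CPUStartupBoost using millicores.
type boost struct {
	Type          CPUBoostType
	Factor        int64 // used when Type is Factor
	QuantityMilli int64 // used when Type is Quantity
}

// boostedRequestMilli (hypothetical) applies the boost to base and caps the
// result at maxBoostMilli when that cap is positive (0 models "flag not set").
func boostedRequestMilli(base int64, b boost, maxBoostMilli int64) int64 {
	boosted := base // an unrecognized Type falls through as "no boost"
	switch b.Type {
	case FactorType:
		boosted = base * b.Factor
	case QuantityType:
		boosted = base + b.QuantityMilli
	}
	if maxBoostMilli > 0 && boosted > maxBoostMilli {
		boosted = maxBoostMilli
	}
	return boosted
}

// boostedLimitMilli (hypothetical) keeps the original limit-to-request ratio.
// If the ratio cannot be established (no limit or no request), the original
// limit is returned unchanged.
func boostedLimitMilli(origRequest, origLimit, newRequest int64) int64 {
	if origRequest <= 0 || origLimit <= 0 {
		return origLimit
	}
	return newRequest * origLimit / origRequest
}

func main() {
	req := boostedRequestMilli(500, boost{Type: FactorType, Factor: 3}, 0)
	fmt.Println(req)                               // 1500
	fmt.Println(boostedLimitMilli(500, 1000, req)) // 3000
	// 500m + 2000m = 2500m, capped at 2000m.
	fmt.Println(boostedRequestMilli(500, boost{Type: QuantityType, QuantityMilli: 2000}, 2000)) // 2000
}
```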
The new StartupBoost field will take precedence over the rest of the fields
in VerticalPodAutoscalerSpec
and ContainerResourcePolicy,
except for:
- VerticalPodAutoscalerSpec.TargetRef
- ContainerResourcePolicy.ContainerName
- ContainerResourcePolicy.ControlledValues
This means that a container's CPU request/limit can be boosted during startup
beyond MaxAllowed,
for example, or that it can be boosted even if CPU is explicitly
excluded from ControlledResources.
VPA will check that the startupBoost configuration is valid when VPA objects
are created/updated:
- If Type is set to Factor and Value is set to a value that can't be
  parsed as a float64 (e.g., 500m), the API must reject the startupBoost
  configuration as invalid.
The VPA Updater will not evict a pod if it attempted to scale the pod down in place (to unboost its CPU resources) and the update failed (see the scenarios where the VPA updater will consider that the update failed). This is to avoid an eviction loop:
CPUStartupBoost
Enabling the CPUStartupBoost feature gate will cause the following to happen:
StartupBoost configured.
Disabling the CPUStartupBoost feature gate will cause the following to happen:
StartupBoost configured.
StartupBoost config.
StartupBoost config.
Similarly to AEP-4016,
StartupBoost configuration is built assuming that VPA will be running on
Kubernetes 1.33+ with the beta version of
KEP-1287: In-Place Update of Pod Resources
enabled. If this is not the case, VPA's attempt to unboost pods may fail and the
pods may remain boosted for their whole lifecycle.
Other than comprehensive unit tests, we will also add the following scenarios to our e2e tests:
- CPU Startup Boost recommendation is applied to a pod controlled by VPA until it
  becomes Ready and StartupBoost.CPU.DurationSeconds has elapsed. Then, the pod is
  scaled back down in-place. We'll also test the following sub-cases, covering
  different combinations of probes and StartupBoost.CPU.DurationSeconds:
  - No probes and no StartupBoost.CPU.DurationSeconds specified: unboost will
    likely happen immediately.
  - No probes and a 60s StartupBoost.CPU.DurationSeconds: unboost will likely
    happen after 60s.
  - A probe and no StartupBoost.CPU.DurationSeconds specified:
    unboost will likely happen as soon as the pod becomes Ready.
  - A probe and a 60s StartupBoost.CPU.DurationSeconds
    specified: unboost will likely happen 60s after the pod becomes Ready.
- Pod is not evicted if the in-place update fails when scaling the pod back down.
Here are some examples of the VPA CR incorporating CPU boosting for different scenarios.
(startupBoost configured in VerticalPodAutoscalerSpec)
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: example-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: example
  updatePolicy:
    # This only disables VPA actuations. It doesn't disable
    # startup boost configurations.
    updateMode: "Off"
  startupBoost:
    cpu:
      type: "Factor"
      factor: 3
      durationSeconds: 10
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: example-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: example
  updatePolicy:
    updateMode: "Auto"
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: example-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: example
  updatePolicy:
    updateMode: "Auto"
  startupBoost:
    cpu:
      type: "Factor"
      factor: 3
      durationSeconds: 10
(startupBoost configured in ContainerPolicies)
All containers under the example deployment will receive "regular" VPA updates
(VPA is in "Auto" mode in this example), except for
boosted-container-name. boosted-container-name will only be CPU
boosted/unboosted (StartupBoost is enabled and VPA Mode is set to Off).
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: example-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: example
  resourcePolicy:
    containerPolicies:
      - containerName: "boosted-container-name"
        # VPA mode is set to Off, so it never changes pod resources for this
        # container. This setting is independent from the startup boost mode.
        # CPU startup boost changes will still be applied.
        mode: "Off"
        startupBoost:
          cpu:
            type: "Quantity"
            quantity: "2"
All containers under example deployment will receive "regular" VPA updates
and be CPU boosted/unboosted, except for disable-cpu-boost-for-this-container.
It has a containerPolicy startupBoost overriding the global VPA config that
sets the boost factor to 1.
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: example-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: example
  startupBoost:
    cpu:
      type: "Factor"
      factor: 2
  resourcePolicy:
    containerPolicies:
      - containerName: "disable-cpu-boost-for-this-container"
        startupBoost:
          cpu:
            type: "Factor"
            factor: 1
All containers under example deployment will receive "regular" VPA updates,
including boosted-container-name. Additionally, boosted-container-name
will be CPU boosted/unboosted, because it has a StartupBoost config in its
container policy.
apiVersion: "autoscaling.k8s.io/v1"
kind: VerticalPodAutoscaler
metadata:
  name: example-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: example
  resourcePolicy:
    containerPolicies:
      - containerName: "boosted-container-name"
        minAllowed:
          cpu: "250m"
          memory: "100Mi"
        maxAllowed:
          cpu: "500m"
          memory: "600Mi"
        # The CPU boosted resources can go beyond maxAllowed.
        startupBoost:
          cpu:
            type: "Quantity"
            quantity: "4"
- Renamed startupBoost.cpu.duration to startupBoost.cpu.durationSeconds and changed its type from string to int32 (seconds).
- Updated the startupBoost.cpu.type field to correctly indicate it is a required field, not optional. The field has no default value and must be explicitly set to either "Factor" or "Quantity".
- Used the same startupBoost config in VerticalPodAutoscalerSpec and in
  ContainerPolicies to make the API simpler, and added more yaml examples.