Scale container resource requests and limits

This page explains how you can analyze and adjust the CPU requests and memory requests of a container in a Google Kubernetes Engine (GKE) cluster using vertical Pod autoscaling.

You can scale container resources manually through the Google Cloud console, analyze resources using a VerticalPodAutoscaler object, or configure automatic scaling using vertical Pod autoscaling.

Before you begin

Note: Vertical Pod autoscaling is compatible only with workloads managed by a controller, such as Deployments, StatefulSets, ReplicaSets, and ReplicationControllers. You can't use vertical Pod autoscaling with standalone Pods.

Before you start, make sure that you have performed the following tasks: enable the Google Kubernetes Engine API, and, if you want to use the gcloud CLI examples on this page, install and initialize the Google Cloud CLI.

Analyze resource requests

The Vertical Pod Autoscaler automatically analyzes your containers and provides suggested resource requests. You can view these resource requests using the Google Cloud console, Cloud Monitoring, or the Google Cloud CLI.

Caution: In some cases, when you access the vertical Pod autoscaling recommendations, GKE initializes the Vertical Pod Autoscaler controller. This might cause a control plane recreation.

Console

To view suggested resource requests in the Google Cloud console, you must have an existing workload deployed that is at least 24 hours old. Some suggestions might not be available or relevant for certain workloads, such as those created within the last 24 hours, standalone Pods, and apps written in Java.

  1. Go to the Workloads page in the Google Cloud console.

    Go to Workloads

  2. In the workloads list, click the name of the workload you want to scale.

  3. Click Actions > Scale > Edit resource requests.

    The Analyze resource utilization data section shows historic usage data that the Vertical Pod Autoscaler controller analyzed to create the suggested resource requests in the Adjust resource requests and limits section.

Cloud Monitoring

To view suggested resource requests in Cloud Monitoring, you must have an existing workload deployed.

  1. Go to the Metrics Explorer page in the Google Cloud console.

    Go to Metrics Explorer

  2. Click Configuration.

  3. Expand the Select a Metric menu.

  4. In the Resource menu, select Kubernetes Scale.

  5. In the Metric category menu, select Autoscaler.

  6. In the Metric menu, select Recommended per replica request bytes and Recommended per replica request cores.

  7. Click Apply.

gcloud CLI

To view suggested resource requests, you must create a VerticalPodAutoscaler object and a Deployment.

  1. For Standard clusters, enable vertical Pod autoscaling for your cluster. For Autopilot clusters, vertical Pod autoscaling is enabled by default.

    gcloud container clusters update CLUSTER_NAME --enable-vertical-pod-autoscaling

    Replace CLUSTER_NAME with the name of your cluster.

  2. Save the following manifest as my-rec-deployment.yaml:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-rec-deployment
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: my-rec-deployment
      template:
        metadata:
          labels:
            app: my-rec-deployment
        spec:
          containers:
          - name: my-rec-container
            image: nginx

    This manifest describes a Deployment that does not have CPU or memory requests. The VerticalPodAutoscaler object that you create in the following steps targets this Deployment by name, so all Pods in the Deployment belong to that VerticalPodAutoscaler.

  3. Apply the manifest to the cluster:

    kubectl create -f my-rec-deployment.yaml
  4. Save the following manifest as my-rec-vpa.yaml:

    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: my-rec-vpa
    spec:
      targetRef:
        apiVersion: "apps/v1"
        kind: Deployment
        name: my-rec-deployment
      updatePolicy:
        updateMode: "Off"

    This manifest describes a VerticalPodAutoscaler. The updateMode value of Off means that when Pods are created, the Vertical Pod Autoscaler controller analyzes the CPU and memory needs of the containers and records those recommendations in the status field of the resource. The Vertical Pod Autoscaler controller does not automatically update the resource requests for running containers.

  5. Apply the manifest to the cluster:

    kubectl create -f my-rec-vpa.yaml
  6. After some time, view the VerticalPodAutoscaler:

    kubectl get vpa my-rec-vpa --output yaml

    The output is similar to the following:

    ...
      recommendation:
        containerRecommendations:
        - containerName: my-rec-container
          lowerBound:
            cpu: 25m
            memory: 262144k
          target:
            cpu: 25m
            memory: 262144k
          upperBound:
            cpu: 7931m
            memory: 8291500k
    ...

    This output shows recommendations for CPU and memory requests.

Set Pod resource requests manually

You can set Pod resource requests manually using the Google Cloud console or kubectl. Use the following best practices for setting your container resource requests and limits:

  • Memory: Set the same amount of memory for the request and limit.
  • CPU: For the request, specify the minimum CPU needed to ensure correct operation, according to your own SLOs. Set an unbounded CPU limit.
Caution: Applying vertical Pod autoscaling recommendations causes workload disruptions.
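
For example, a container spec that follows these practices might look like the following sketch (the names and values here are illustrative, not generated by GKE):

# Hypothetical Pod spec: the memory request equals the memory limit,
# the CPU request reflects the minimum needed for your SLOs, and no
# CPU limit is set, which leaves CPU unbounded.
apiVersion: v1
kind: Pod
metadata:
  name: example-app
spec:
  containers:
  - name: example-container
    image: nginx
    resources:
      requests:
        cpu: 250m        # minimum CPU needed to meet your SLOs
        memory: 256Mi    # same value as the memory limit
      limits:
        memory: 256Mi    # matches the memory request
        # no cpu limit, so CPU remains unbounded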

Console

  1. Go to the Workloads page in the Google Cloud console.

    Go to Workloads

  2. In the workloads list, click the name of the workload you want to scale.

  3. Click Actions > Scale > Edit resource requests.

    The Adjust resource requests and limits section shows the current CPU and memory requests for each container, as well as the suggested CPU and memory requests.
  4. Click Apply Latest Suggestions to view suggested requests for each container.

  5. Click Save Changes.

  6. Click Confirm.

kubectl

Vertically scale your workload with minimal disruption

Preview

This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

Starting in Kubernetes version 1.33, you can use the kubectl patch command to vertically scale your workload by updating the resources that are assigned to a container, without recreating the Pod. For more information, including limitations, see the Kubernetes documentation for resizing CPU and memory resources.

To use the kubectl patch command, specify the updated resource request under the --patch flag. For example, to scale my-app to 800 mCPU, run the following command:

kubectl patch pod my-app --subresource resize --patch \
  '{"spec":{"containers":[{"name":"pause", "resources":{"requests":{"cpu":"800m"}, "limits":{"cpu":"800m"}}}]}}'
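
After the resize, you can confirm the container's assigned resources by inspecting the Pod spec. For example, the following command (a sketch that assumes the same my-app Pod) prints the resources of the first container:

kubectl get pod my-app --output jsonpath='{.spec.containers[0].resources}'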

Vertically scale your workload

To set resource requests for a Pod, set the cpu and memory values under resources.requests in your Deployment manifest. In this example, you manually modify the Deployment created in Analyze resource requests to use the suggested resource requests.

  1. Save the following example manifest as my-adjusted-deployment.yaml:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-rec-deployment
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: my-rec-deployment
      template:
        metadata:
          labels:
            app: my-rec-deployment
        spec:
          containers:
          - name: my-rec-container
            image: nginx
            resources:
              requests:
                cpu: 25m
                memory: 256Mi

    This manifest describes a Deployment that has two Pods. Each Pod has one container that requests 25 milliCPU and 256 MiB of memory.

  2. Apply the manifest to the cluster:

    kubectl apply -f my-adjusted-deployment.yaml

Set Pod resource requests automatically

Vertical Pod autoscaling uses the VerticalPodAutoscaler object to automatically set resource requests on Pods when the updateMode is Auto. You can configure a VerticalPodAutoscaler using the gcloud CLI or the Google Cloud console.

Caution: Enabling or disabling vertical Pod autoscaling can cause workload disruptions by re-creating Pods.

Console

To set resource requests automatically, you must have a cluster with the vertical Pod autoscaling feature enabled. Autopilot clusters have the vertical Pod autoscaling feature enabled by default.

Enable Vertical Pod Autoscaling

  1. Go to the Google Kubernetes Engine page in the Google Cloud console.

    Go to Google Kubernetes Engine

  2. In the cluster list, click the name of the cluster you want to modify.

  3. In the Automation section, click Edit for the Vertical Pod Autoscaling option.

  4. Select the Enable Vertical Pod Autoscaling checkbox.

  5. Click Save changes.

Configure Vertical Pod Autoscaling

  1. Go to the Workloads page in the Google Cloud console.

    Go to Workloads

  2. In the workloads list, click the name of the Deployment you want to configure vertical Pod autoscaling for.

  3. Click Actions > Autoscale > Vertical pod autoscaling.

  4. Choose an autoscaling mode:

    • Auto mode: Vertical Pod autoscaling updates CPU and memory requests during the life of a Pod.
    • Initial mode: Vertical Pod autoscaling assigns resource requests only at Pod creation and never changes them later.
  5. (Optional) Set container policies. This option lets you ensure that the recommendation is never set above or below a specified resource request.

    1. Click Add Policy.
    2. Select Auto for Edit container mode.
    3. In Controlled resources, select which resources you want to autoscale the container on.
    4. Click Add Rule to set one or more minimum or maximum ranges for the container's resource requests:
      • Min. allowed Memory: the minimum amount of memory that the container should always have, in MiB.
      • Min. allowed CPU: the minimum amount of CPU that the container should always have, in mCPU.
      • Max allowed Memory: the maximum amount of memory that the container should always have, in MiB.
      • Max allowed CPU: the maximum amount of CPU that the container should always have, in mCPU.
  6. Click Done.

  7. Click Save.
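
The container policy settings in the console correspond to fields in a VerticalPodAutoscaler manifest. The following minimal sketch shows the equivalent configuration, assuming a hypothetical Deployment named my-app-deployment with a container named my-app-container:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa               # hypothetical name
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-app-deployment      # hypothetical Deployment
  updatePolicy:
    updateMode: "Auto"           # corresponds to the Auto autoscaling mode
  resourcePolicy:
    containerPolicies:
    - containerName: my-app-container
      mode: "Auto"                             # Edit container mode
      controlledResources: ["cpu", "memory"]   # Controlled resources
      minAllowed:                              # Min. allowed rules
        cpu: 100m
        memory: 128Mi
      maxAllowed:                              # Max allowed rules
        cpu: 1000m
        memory: 1Gi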

gcloud

To set resource requests automatically, you must use a cluster that has the vertical Pod autoscaling feature enabled. Autopilot clusters have the feature enabled by default.

  1. For Standard clusters, enable vertical Pod autoscaling for your cluster:

    gcloud container clusters update CLUSTER_NAME --enable-vertical-pod-autoscaling

    Replace CLUSTER_NAME with the name of your cluster.

  2. Save the following manifest as my-auto-deployment.yaml:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-auto-deployment
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: my-auto-deployment
      template:
        metadata:
          labels:
            app: my-auto-deployment
        spec:
          containers:
          - name: my-container
            image: registry.k8s.io/ubuntu-slim:0.14
            resources:
              requests:
                cpu: 100m
                memory: 50Mi
            command: ["/bin/sh"]
            args: ["-c", "while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done"]

    This manifest describes a Deployment that has two Pods. Each Pod has one container that requests 100 milliCPU and 50 MiB of memory.

  3. Apply the manifest to the cluster:

    kubectl create -f my-auto-deployment.yaml
  4. (Optional) Configure resizePolicy for in-place updates.

    If you plan to use InPlaceOrRecreate mode, you can explicitly define how containers should handle resource changes in your Deployment manifest.

    Add the resizePolicy field to your container spec:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-auto-deployment
    spec:
      ...
      template:
        ...
        spec:
          containers:
          - name: my-container
            image: registry.k8s.io/ubuntu-slim:0.14
            resizePolicy:
            - resourceName: cpu
              restartPolicy: NotRequired
            - resourceName: memory
              restartPolicy: RestartContainer
            resources:
              requests:
                cpu: 100m
                memory: 50Mi
            ...

    In this example, CPU resizing is permitted without restarting the container, whereas a container restart is required for memory resizes.

  5. List the running Pods:

    kubectl get pods

    The output shows the names of the Pods in my-auto-deployment:

    NAME                                 READY   STATUS    RESTARTS   AGE
    my-auto-deployment-cbcdd49fb-d6bf9   1/1     Running   0          8s
    my-auto-deployment-cbcdd49fb-th288   1/1     Running   0          8s
  6. Save the following manifest as my-vpa.yaml:

    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: my-vpa
    spec:
      targetRef:
        apiVersion: "apps/v1"
        kind: Deployment
        name: my-auto-deployment
      updatePolicy:
        updateMode: "Recreate"

    This manifest describes a VerticalPodAutoscaler with the following properties:

    • targetRef.name: specifies that any Pod that is controlled by a Deployment named my-auto-deployment belongs to this VerticalPodAutoscaler.
    • updateMode: "Recreate": specifies that the Vertical Pod Autoscaler controller can delete a Pod, adjust the CPU and memory requests, and then start a new Pod. This is the default behavior if no mode is specified (also referred to as Auto mode). You can also change the update mode to one of the following values:
      • updateMode: "Initial": vertical Pod autoscaling assigns resource requests only at Pod creation time.
      • updateMode: "InPlaceOrRecreate" (Preview): vertical Pod autoscaling attempts to update resources without re-creating the Pod, falling back on re-creation if necessary. This works best when your workloads are configured with resizePolicy: NotRequired, as shown in the sketch that follows.
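
    For example, to opt in to in-place updates, you could set the update policy in your manifest like this (a sketch; InPlaceOrRecreate is a Preview feature and requires a supported GKE version):

    updatePolicy:
      updateMode: "InPlaceOrRecreate"
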
  7. Apply the manifest to the cluster:

    kubectl create -f my-vpa.yaml
  8. Wait a few minutes, and view the running Pods again:

    kubectl get pods

    The output shows that the Pod names have changed:

    NAME                                 READY   STATUS    RESTARTS   AGE
    my-auto-deployment-89dc45f48-5bzqp   1/1     Running   0          8s
    my-auto-deployment-89dc45f48-scm66   1/1     Running   0          8s

    If the Pod names have not changed, wait a bit longer, and then view the running Pods again.

Set minimum and maximum resource values

You can set minimum and maximum constraints for the recommendations generated by vertical Pod autoscaling.

Console

  1. Go to the Workloads page in the Google Cloud console.

    Go to Workloads

  2. In the workloads list, click the name of the Deployment you want to configure.

  3. Click Actions > Autoscale > Vertical pod autoscaling.

  4. Click Add Policy.

  5. Select the container you want to configure.

  6. Click Add Rule.

  7. Enter values for Min. allowed and Max allowed for CPU and Memory.

  8. Click Done.

  9. Click Save.

gcloud

To set minimum and maximum resource constraints, include the resourcePolicy section in your VerticalPodAutoscaler manifest.

  1. Save the following manifest as my-constrained-vpa.yaml:

    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: my-constrained-vpa
    spec:
      targetRef:
        apiVersion: "apps/v1"
        kind: Deployment
        name: my-auto-deployment
      updatePolicy:
        updateMode: "Auto"
      resourcePolicy:
        containerPolicies:
        - containerName: my-container
          minAllowed:
            cpu: 100m
            memory: 50Mi
          maxAllowed:
            cpu: 1000m
            memory: 1024Mi

    This configuration helps to ensure that the recommendation for my-container won't be lower than 100 mCPU / 50 MiB or higher than 1000 mCPU / 1024 MiB.

  2. Apply the manifest to the cluster:

    kubectl apply -f my-constrained-vpa.yaml

View information about a Vertical Pod Autoscaler

To view details about a Vertical Pod Autoscaler, do the following:

  1. Get detailed information about one of your running Pods:

    kubectl get pod POD_NAME --output yaml

    Replace POD_NAME with the name of one of your running Pods, such as one of the Pods that you listed in the previous section.

    The output is similar to the following:

    apiVersion: v1
    kind: Pod
    metadata:
      annotations:
        vpaUpdates: 'Pod resources updated by my-vpa: container 0: cpu capped to node capacity, memory capped to node capacity, cpu request, memory request'
    ...
    spec:
      containers:
      ...
        resources:
          requests:
            cpu: 510m
            memory: 262144k
        ...

    This output shows that the Vertical Pod Autoscaler controller has set a memory request of 262144k and a CPU request of 510 milliCPU on the container.

  2. Get detailed information about the VerticalPodAutoscaler:

    kubectl get vpa my-vpa --output yaml

    The output is similar to the following:

    ...
      recommendation:
        containerRecommendations:
        - containerName: my-container
          lowerBound:
            cpu: 536m
            memory: 262144k
          target:
            cpu: 587m
            memory: 262144k
          upperBound:
            cpu: 27854m
            memory: "545693548"

    This output shows recommendations for CPU and memory requests and includes the following properties:

    • target: specifies that for the container to run optimally, it should request 587 milliCPU and 262144 kilobytes of memory.
    • lowerBound and upperBound: vertical Pod autoscaling uses these properties to decide whether to delete a Pod and replace it with a new Pod. If a Pod has requests less than the lower bound or greater than the upper bound, the Vertical Pod Autoscaler deletes the Pod and replaces it with a Pod that meets the target attribute.
  3. Check for resize events. If you are using InPlaceOrRecreate mode, you can verify whether an update happened in-place by checking the events for your Pods:

    kubectl get events --field-selector involvedObject.kind=Pod

    Look for ResizeCompleted events, which indicate a successful in-place update:

    Reason            Message
    ------            -------
    ResizeCompleted   Pod resize completed: {"containers":[...]}

Opt out specific containers

You can opt out specific containers from vertical Pod autoscaling using the gcloud CLI or the Google Cloud console.

Console

To opt out specific containers from vertical Pod autoscaling, you must have a cluster with the vertical Pod autoscaling feature enabled. Autopilot clusters have the vertical Pod autoscaling feature enabled by default.

Enable Vertical Pod Autoscaling

  1. Go to the Google Kubernetes Engine page in the Google Cloud console.

    Go to Google Kubernetes Engine

  2. In the cluster list, click the name of the cluster you want to modify.

  3. In the Automation section, click Edit for the Vertical Pod Autoscaling option.

  4. Select the Enable Vertical Pod Autoscaling checkbox.

  5. Click Save changes.

Configure Vertical Pod Autoscaling

  1. Go to the Workloads page in the Google Cloud console.

    Go to Workloads

  2. In the workloads list, click the name of the Deployment you want to configure vertical Pod autoscaling for.

  3. Click Actions > Autoscale > Vertical pod autoscaling.

  4. Choose an autoscaling mode:

    • Auto mode: Vertical Pod autoscaling updates CPU and memory requests during the life of a Pod.
    • Initial mode: Vertical Pod autoscaling assigns resource requests only at Pod creation and never changes them later.
  5. Click Add Policy.

  6. Select the container you want to opt out.

  7. For Edit container mode, select Off.

  8. Click Done.

  9. Click Save.

gcloud

To opt out specific containers from vertical Pod autoscaling, perform the following steps:

  1. Save the following manifest as my-opt-vpa.yaml:

    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: my-opt-vpa
    spec:
      targetRef:
        apiVersion: "apps/v1"
        kind: Deployment
        name: my-opt-deployment
      updatePolicy:
        updateMode: "Recreate"
      resourcePolicy:
        containerPolicies:
        - containerName: my-opt-sidecar
          mode: "Off"

    This manifest describes a VerticalPodAutoscaler. The mode: "Off" value turns off recommendations for the container my-opt-sidecar.

  2. Apply the manifest to the cluster:

    kubectl apply -f my-opt-vpa.yaml
  3. Save the following manifest as my-opt-deployment.yaml:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-opt-deployment
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: my-opt-deployment
      template:
        metadata:
          labels:
            app: my-opt-deployment
        spec:
          containers:
          - name: my-opt-container
            image: nginx
          - name: my-opt-sidecar
            image: busybox
            command: ["sh", "-c", "while true; do echo Doing sidecar stuff!; sleep 60; done"]
  4. Apply the manifest to the cluster:

    kubectl apply -f my-opt-deployment.yaml
  5. After some time, view the Vertical Pod Autoscaler:

    kubectl get vpa my-opt-vpa --output yaml

    The output shows recommendations for CPU and memory requests:

    ...
      recommendation:
        containerRecommendations:
        - containerName: my-opt-container
    ...

    In this output, there are only recommendations for one container. There are no recommendations for my-opt-sidecar.

    The Vertical Pod Autoscaler never updates resources on opted-out containers. If you wait a few minutes, the Pod is recreated, but only one container has updated resource requests.

Identify workloads without resource requests or limits

You might want to identify workloads without configured resource requests and limits. GKE recommends setting resource requests and limits for all workloads as a best practice to avoid abrupt Pod termination under node resource pressure and to improve the accuracy of cost allocation. Defining BestEffort Pods or Pods with Burstable memory might lead to reliability issues when a node experiences memory pressure. Use the following best practices for setting your container resource requests and limits:

  • Memory: Set the same amount of memory for the request and limit.
  • CPU: For the request, specify the minimum CPU needed to ensure correct operation, according to your own SLOs. Set an unbounded CPU limit.
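
As a quick spot check from the command line, you can print the requests and limits of every container and look for empty (<none>) entries. This sketch uses standard kubectl output options:

kubectl get pods --all-namespaces \
  --output custom-columns='NAMESPACE:.metadata.namespace,POD:.metadata.name,REQUESTS:.spec.containers[*].resources.requests,LIMITS:.spec.containers[*].resources.limits'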

GKE generates insights and recommendations for workloads runningwithout resource requests and limits.

The following table describes the resource configuration scenarios thatGKE detects and the criteria for each scenario.

Insight subtype: REQUEST_OR_LIMIT_NOT_SET
Missing settings scenario: No configured memory request and limit (MEMORY_REQUEST_AND_LIMIT_NOT_SET)
Details: Pods are running without memory requests and limits set for their containers. GKE cannot throttle memory usage and might abruptly terminate such Pods if a node experiences memory pressure, which might cause reliability issues.

Insight subtype: REQUEST_OR_LIMIT_NOT_SET
Missing settings scenario: No configured memory limits (MEMORY_LIMIT_NOT_SET)
Details: Pods are running without memory limits set for their containers. GKE cannot throttle memory usage and might abruptly terminate such Pods if a node experiences memory pressure and the Pods' memory usage exceeds requests, which might cause reliability issues. You should set the same amount of memory for requests and limits to avoid Pods using more memory than requested.

Insight subtype: REQUEST_OR_LIMIT_NOT_SET
Missing settings scenario: No configured CPU request and limit (CPU_REQUEST_AND_LIMIT_NOT_SET)
Details: Pods are running without CPU requests and limits set for their containers. This increases the chances of node resource exhaustion, makes Pods more likely to be throttled when node CPU utilization is close to its limit, and might cause performance issues.

For more information about these insights, follow the instructions to view insights and recommendations.

Manually check resource requests and limits

You might want to manually review which resource requests and limits are missing and need to be specified for a given workload so that you can update the configuration as recommended.

To review or update the resource requests and limits configuration for a specified workload, do the following:

  1. Go to the Workloads page in the Google Cloud console.

    Go to Workloads

  2. In the workloads list, click the name of the workload you want to inspect.

  3. Click Actions > Scale > Edit resource requests.

    The Adjust resource requests and limits section shows the current CPU and memory requests for each container.

What's next
