About GKE cluster autoscaling
This page explains how Google Kubernetes Engine (GKE) automatically resizes your Standard cluster's node pools based on the demands of your workloads. When demand is high, the cluster autoscaler adds nodes to the node pool. To learn how to configure the cluster autoscaler, see Autoscaling a cluster.
This page is for Admins, Architects, and Operators who plan capacity and infrastructure needs, and optimize systems architecture and resources to achieve the lowest total cost of ownership for their company or business unit. To learn more about common roles and example tasks that we reference in Google Cloud content, see Common GKE user roles and tasks.
With Autopilot clusters, you don't need to worry about provisioning nodes or managing node pools because node pools are automatically provisioned through node auto-provisioning, and are automatically scaled to meet the requirements of your workloads.
Before reading this page, ensure that you're familiar with basic Kubernetes concepts, and how resource requests and limits work.
Plan and design your cluster configuration with your organization's Admins and Architects, Developers, or other teams responsible for implementing and maintaining your application.
Why use cluster autoscaler
GKE's cluster autoscaler automatically resizes the number of nodes in a given node pool, based on the demands of your workloads. When demand is low, the cluster autoscaler scales back down to a minimum size that you designate. This can increase the availability of your workloads when you need it, while controlling costs. You don't need to manually add or remove nodes or over-provision your node pools. Instead, you specify a minimum and maximum size for the node pool, and the rest is automatic.
If resources are deleted or moved when autoscaling your cluster, your workloads might experience transient disruption. For example, if your workload consists of a controller with a single replica, that replica's Pod might be rescheduled onto a different node if its current node is deleted. Before enabling cluster autoscaler, design your workloads to tolerate potential disruption or ensure that critical Pods are not interrupted.
To increase your workload's tolerance to interruption, deploy your workload using a controller with multiple replicas, such as a Deployment.
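For example, here's a minimal Deployment sketch with multiple replicas (the name, labels, and image are illustrative placeholders, not from this page):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web # hypothetical workload name
spec:
  replicas: 3 # multiple replicas tolerate the loss of any single node
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.27 # placeholder image
```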
You can increase the cluster autoscaler performance with Image streaming, which remotely streams required image data from eligible container images while simultaneously caching the image locally to allow workloads on new nodes to start faster.
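As a sketch, you can enable Image streaming when creating a cluster with the --enable-image-streaming flag; the cluster name and zone below are placeholders, and Image streaming requires a containerd-based Container-Optimized OS node image:

```
gcloud container clusters create example-cluster \
    --zone=us-central1-a \
    --image-type="COS_CONTAINERD" \
    --enable-image-streaming
```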
How cluster autoscaler works
Cluster autoscaler works per node pool. When you configure a node pool with cluster autoscaler, you specify a minimum and maximum size for the node pool.
Cluster autoscaler increases or decreases the size of the node pool automatically by adding or removing virtual machine (VM) instances in the underlying Compute Engine Managed Instance Group (MIG) for the node pool. Cluster autoscaler makes these scaling decisions based on the resource requests (rather than actual resource utilization) of Pods running on that node pool's nodes. It periodically checks the status of Pods and nodes, and takes action:
- If Pods fail to be scheduled on any of the current nodes, the cluster autoscaler adds nodes, up to the maximum size of the node pool. For more information about when cluster autoscaler changes the size of a cluster, see When does Cluster Autoscaler change the size of a cluster?
- If GKE decides to add new nodes into the node pool, cluster autoscaler adds as many nodes as needed, up to per-node pool or per-cluster limits.
- Cluster autoscaler doesn't wait for one node to come up before creating the next one. Once GKE decides how many nodes to create, node creation happens in parallel. The objective is to minimize the time needed for unschedulable Pods to become active.
- If some nodes aren't created due to quota exhaustion, cluster autoscaler waits until the resources can be successfully scheduled.
- If nodes are underutilized, and all Pods could be scheduled even with fewer nodes in the node pool, cluster autoscaler removes nodes, down to the minimum size of the node pool.
- If there are Pods on a node that cannot move to other nodes in the cluster, cluster autoscaler does not attempt to scale down that node.
- If Pods can be moved to other nodes, but the node cannot be drained gracefully after a timeout period, the node is forcibly terminated. This timeout period is one hour for GKE versions 1.32.7-gke.1079000 or later, and 10 minutes for earlier GKE versions. The maximum honored grace period is not configurable for GKE clusters. For more information about how scale down works, see How does scale-down work? in the cluster autoscaler FAQ in the open source documentation.
The frequency at which cluster autoscaler inspects a cluster for unschedulable Pods largely depends on the cluster's size. In small clusters, the inspection might happen every few seconds. It is not possible to define an exact time frame required for this inspection.
If your nodes are experiencing shortages because your Pods have requested or defaulted to insufficient resources, the cluster autoscaler does not correct the situation. You can help ensure cluster autoscaler works as accurately as possible by making explicit resource requests for all of your workloads.
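For example, a minimal Pod sketch with explicit requests (the name, image, and values are illustrative placeholders); the cluster autoscaler scales on these requests, not on observed utilization:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: request-example # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.27 # placeholder image
    resources:
      requests:
        cpu: 500m     # used by the cluster autoscaler for scaling decisions
        memory: 256Mi
```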
Don't enable Compute Engine autoscaling for managed instance groups for your cluster nodes. GKE's cluster autoscaler is separate from Compute Engine autoscaling. Enabling both can lead to node pools failing to scale up or scale down, because the Compute Engine autoscaler will be in conflict with GKE's cluster autoscaler.
Operating criteria
When resizing a node pool, the cluster autoscaler makes the following assumptions:
- All replicated Pods can be restarted on some other node, possibly causing a brief disruption.
- Users or administrators are not manually managing nodes. Cluster autoscaler can override any manual node management operations you perform.
- All nodes in a single node pool have the same set of labels.
- Cluster autoscaler considers the relative cost of the instance types in the various pools, and attempts to expand the least expensive possible node pool. However, the following conditions apply to this behavior of cluster autoscaler:
- The cluster autoscaler takes into account the reduced cost of node pools that contain Spot VMs, which are preemptible. However, cluster autoscaler also considers the availability of resources in each zone, and might choose the more expensive, but available, resource.
- When multiple node pools use Spot VMs, the cluster autoscaler does not automatically select the lowest-cost option. To optimize cost-effective Spot VM usage and prevent this scenario, we recommend that you use custom compute classes.
- Cluster autoscaler considers the init container requests before scheduling Pods. Init container requests can use any unallocated resources available on the nodes, which might prevent Pods from being scheduled. Cluster autoscaler follows the same request calculation rules that Kubernetes uses. To learn more, see the Kubernetes documentation for using init containers.
- Labels that are manually added after initial cluster or node pool creation are not tracked. Nodes that are created by the cluster autoscaler are assigned labels specified with --node-labels at the time of node pool creation (see the sketch after this list).
- In GKE version 1.21 or earlier, cluster autoscaler considers the taint information of the existing nodes from a node pool to represent the whole node pool. Starting in GKE version 1.22, cluster autoscaler combines information from existing nodes in the cluster and the node pool. Cluster autoscaler also detects the manual changes you make to the node and node pool.
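For example, a sketch of creating an autoscaling node pool whose nodes get a label that cluster autoscaler tracks (the pool name, cluster name, and label are placeholders):

```
gcloud container node-pools create labeled-pool \
    --cluster=example-cluster \
    --node-labels=environment=production \
    --enable-autoscaling --min-nodes=1 --max-nodes=5
```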
Don't enable the cluster autoscaler if your applications are not disruption-tolerant.
Balancing across zones
If your node pool contains multiple managed instance groups with the same instance type, the cluster autoscaler attempts to keep these managed instance group sizes balanced when scaling up. This helps prevent an uneven distribution of nodes among managed instance groups in multiple zones of a node pool. GKE does not consider the autoscaling policy when scaling down.
Cluster autoscaler only balances across zones during a scale-up event. Cluster autoscaler scales down underutilized nodes regardless of the relative sizes of underlying managed instance groups in a node pool, which can cause the nodes to be distributed unevenly across zones.
Location policy
Starting in GKE version 1.24.1-gke.800, you can change the location policy of the cluster autoscaler. You can control the cluster autoscaler distribution policy by specifying the location_policy flag with any of the following values:
- BALANCED: This policy instructs the cluster autoscaler to distribute node pool resources across selected zones as equally as possible, in a best-effort manner, while considering Pod requirements (such as affinity) and the availability of resources. This is the default location policy for node pools using reservations or on-demand nodes, but you can also use it for Spot VMs. BALANCED is not supported for flex-start provisioning mode node pools.
- ANY: This policy instructs the cluster autoscaler to search for requested capacity across all specified zones. The cluster autoscaler prioritizes unused reservations and zones with enough capacity, which can lead to a concentration of node pool resources. It is the default location policy for flex-start provisioning mode and node pools that use Spot VMs, but you can also use it for node pools using reservations or on-demand nodes. For this policy to work, autoscaling must be enabled and the initial number of nodes must be set to 0, so that the autoscaler is responsible for provisioning all nodes.
Use the BALANCED policy if your workloads use only easily obtainable accelerator resources and benefit from being distributed across zones (for example, for better fault tolerance). Use the ANY policy to prioritize utilization of unused reservations and higher obtainability of scarce compute resources (such as accelerators).
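For example, a sketch of creating a Spot VM node pool with the ANY policy (the names and limits are placeholders; as noted earlier, ANY requires autoscaling with the initial node count set to 0):

```
gcloud container node-pools create spot-pool \
    --cluster=example-cluster \
    --spot \
    --num-nodes=0 \
    --enable-autoscaling \
    --total-min-nodes=0 --total-max-nodes=10 \
    --location-policy=ANY
```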
Reservations
Starting in GKE version 1.27, the cluster autoscaler always considers reservations when making scale-up decisions. Node pools with matching unused reservations are prioritized when choosing the node pool to scale up, even when the node pool is not the most efficient one. Additionally, unused reservations are always prioritized when balancing multi-zonal scale-ups.
However, the cluster autoscaler checks for reservations only in its own project. As a result, if a less expensive node option is available within the cluster's own project, the autoscaler might select that option instead of the shared reservation. If you need to share reservations across projects, consider using custom compute classes, which let you configure the priority that the cluster autoscaler uses to scale nodes, including shared reservations.
Default values
For Spot VM node pools, the default cluster autoscaler distribution policy is ANY. With this policy, Spot VMs have a lower risk of being preempted.
For non-preemptible node pools, the default cluster autoscaler distribution policy is BALANCED.
Minimum and maximum node pool size
When creating a new node pool, you can specify the minimum and maximum size for each node pool in your cluster, and the cluster autoscaler makes rescaling decisions within these scaling constraints. To update the minimum size, manually resize the cluster to a size within the new constraints after specifying the new minimum value. The cluster autoscaler then makes rescaling decisions based on the new constraints.
| Current node pool size | Cluster autoscaler action | Scaling constraints |
|---|---|---|
| Lower than the minimum you specified | Cluster autoscaler scales up to provision pending Pods. Scaling down is disabled. | The node pool does not scale down below the value you specified. |
| Within the minimum and maximum size you specified | Cluster autoscaler scales up or down according to demand. | The node pool stays within the size limits you specified. |
| Greater than the maximum you specified | Cluster autoscaler scales down only the nodes that can be safely removed. Scaling up is disabled. | The node pool does not scale above the value you specified. |
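For example, a sketch of updating the autoscaling limits of an existing node pool (the cluster and pool names and the limits are placeholders):

```
gcloud container clusters update example-cluster \
    --node-pool=default-pool \
    --enable-autoscaling \
    --min-nodes=1 --max-nodes=10
```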
On Standard clusters, the cluster autoscaler never automatically scales down a cluster to zero nodes. One or more nodes must always be available in the cluster to run system Pods. Additionally, if the current number of nodes is zero due to manual removal of nodes, cluster autoscaler and node auto-provisioning can scale up from zero-node clusters.
To learn more about autoscaler decisions, see cluster autoscaler limitations.
Autoscaling limits
You can set the minimum and maximum number of nodes for the cluster autoscaler to use when scaling a node pool. Use the --min-nodes and --max-nodes flags to set the minimum and maximum number of nodes per zone.
Starting in GKE version 1.24, you can use the --total-min-nodes and --total-max-nodes flags for new clusters. These flags set the minimum and maximum total number of nodes in the node pool across all zones.
Min and max nodes example
The following command creates an autoscaling multi-zonal cluster with six nodes across three zones initially, with a minimum of one node per zone and a maximum of four nodes per zone:
```
gcloud container clusters create example-cluster \
    --num-nodes=2 \
    --location=us-central1-a \
    --node-locations=us-central1-a,us-central1-b,us-central1-f \
    --enable-autoscaling --min-nodes=1 --max-nodes=4
```

In this example, the total size of the cluster can be between three and twelve nodes, spread across the three zones. If one of the zones fails, the total size of the cluster can be between two and eight nodes.
Total nodes example
The following command, available in GKE version 1.24 or later, creates an autoscaling multi-zonal cluster with six nodes across three zones initially, with a minimum of three nodes and a maximum of twelve nodes in the node pool across all zones:
```
gcloud container clusters create example-cluster \
    --num-nodes=2 \
    --location=us-central1-a \
    --node-locations=us-central1-a,us-central1-b,us-central1-f \
    --enable-autoscaling --total-min-nodes=3 --total-max-nodes=12
```

In this example, the total size of the cluster can be between three and twelve nodes, regardless of spreading between zones.
Autoscaling profiles
The decision of when to remove a node is a trade-off between optimizing for utilization or the availability of resources. Removing underutilized nodes improves cluster utilization, but new workloads might have to wait for resources to be provisioned again before they can run.
You can specify which autoscaling profile to use when making such decisions. The available profiles are:
- balanced: The default profile for Standard clusters that prioritizes keeping more resources readily available for incoming Pods, and thus reducing the time needed for having them active. The balanced profile isn't available for Autopilot clusters.
- optimize-utilization: Prioritize optimizing utilization over keeping spare resources in the cluster. When you enable this profile, the cluster autoscaler scales down the cluster more aggressively. GKE can remove more nodes, and remove nodes faster. GKE prefers to schedule Pods on nodes that already have high allocation of CPU, memory, or GPUs. However, other factors influence scheduling, such as the spread of Pods belonging to the same Deployment, StatefulSet, or Service across nodes.
The optimize-utilization autoscaling profile helps the cluster autoscaler to identify and remove underutilized nodes. To achieve this optimization, GKE sets the scheduler name in the Pod spec to gke.io/optimize-utilization-scheduler. Pods that specify a custom scheduler are not affected.
The following command enables the optimize-utilization autoscaling profile in an existing cluster:
```
gcloud container clusters update CLUSTER_NAME \
    --autoscaling-profile optimize-utilization
```

Considering Pod scheduling and disruption
When scaling down, the cluster autoscaler respects scheduling and eviction rules set on Pods. These restrictions can prevent a node from being deleted by the autoscaler. A node's deletion could be prevented if it contains a Pod with any of these conditions:
- The Pod's affinity or anti-affinity rules prevent rescheduling.
- The Pod is not managed by a controller such as a Deployment, StatefulSet, Job, or ReplicaSet.
- The Pod has local storage and the GKE control plane version is lower than 1.22. In GKE clusters with control plane version 1.22 or later, Pods with local storage no longer block scaling down.
- The Pod has the "cluster-autoscaler.kubernetes.io/safe-to-evict": "false" annotation (see the sketch after this list).
- The node's deletion would exceed the configured PodDisruptionBudget.
For more information about cluster autoscaler and preventing disruptions, see the following questions in the Cluster autoscaler FAQ:
- How does scale-down work?
- Does Cluster autoscaler work with PodDisruptionBudget in scale-down?
- What types of Pods can prevent Cluster autoscaler from removing a node?
- How to tune cluster autoscaler in GKE?
Autoscaling TPUs in GKE
GKE supports Tensor Processing Units (TPUs) to accelerate machine learning workloads. Both single-host TPU slice node pools and multi-host TPU slice node pools support autoscaling and auto-provisioning.
With the --enable-autoprovisioning flag on a GKE cluster, GKE creates or deletes single-host or multi-host TPU slice node pools with a TPU version and topology that meets the requirements of pending workloads.
When you use --enable-autoscaling, GKE scales the node pool based on its type, as follows:
- Single-host TPU slice node pool: GKE adds or removes TPU nodes in the existing node pool. The node pool may contain any number of TPU nodes between zero and the maximum size of the node pool as determined by the --max-nodes and --total-max-nodes flags. When the node pool scales, all the TPU nodes in the node pool have the same machine type and topology. To learn more about how to create a single-host TPU slice node pool, see Create a node pool.
- Multi-host TPU slice node pool: GKE atomically scales up the node pool from zero to the number of nodes required to satisfy the TPU topology. For example, for a TPU node pool with a machine type of ct5lp-hightpu-4t and a topology of 16x16, the node pool contains 64 nodes. The GKE autoscaler ensures that this node pool has exactly 0 or 64 nodes. When scaling back down, GKE evicts all scheduled Pods, and drains the entire node pool to zero. To learn more about how to create a multi-host TPU slice node pool, see Create a node pool.
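As a sketch, creating a multi-host TPU slice node pool like the preceding example might look as follows; the cluster name and limits are placeholders, and the exact flags and requirements (such as location and GKE version) depend on your environment:

```
gcloud container node-pools create tpu-pool \
    --cluster=example-cluster \
    --machine-type=ct5lp-hightpu-4t \
    --tpu-topology=16x16 \
    --enable-autoscaling \
    --max-nodes=64
```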
Spot VMs and cluster autoscaler
Because cluster autoscaler prefers expanding the least expensive node pools, when your workloads and resource availability allow it, cluster autoscaler adds Spot VMs when scaling up.
However, even though cluster autoscaler prefers adding Spot VMs, this preference doesn't guarantee that the majority of your Pods will run on these types of VMs. Spot VMs can be preempted. Because of this preemption, Pods on Spot VMs are more likely to be evicted. When they're evicted, they only have 15 seconds to terminate.
For example, imagine a scenario where you have 10 Pods and a mixture of on-demand and Spot VMs:
- You begin with 10 Pods running on on-demand VMs because the Spot VMs weren't available.
- You don't need all 10 Pods, so cluster autoscaler removes two Pods and shuts down the extra on-demand VMs.
- When you need 10 Pods again, cluster autoscaler adds Spot VMs (because they're cheaper) and schedules two Pods on them. The other eight Pods remain on the on-demand VMs.
- If cluster autoscaler needs to scale down again, Spot VMs are likely to be preempted first, leaving the majority of your Pods running on on-demand VMs.
To prioritize Spot VMs, and avoid the preceding scenario, we recommend that you use custom compute classes. Custom compute classes let you create priority rules that favor Spot VMs during scale-up by giving them higher priority than on-demand nodes. To further maximize the likelihood of your Pods running on nodes backed by Spot VMs, configure active migration.
The following example shows you one way to use custom compute classes to prioritize Spot VMs. To learn more about ComputeClass parameters, see the ComputeClass CRD documentation:
```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: prefer-l4-spot
spec:
  # Defines a prioritized list of machine types and configurations for node provisioning.
  priorities:
  - machineType: g2-standard-24
    # Specifically requests Spot VMs for this configuration. GKE will try to provision these VMs first.
    spot: true
    gpu:
      type: nvidia-l4
      count: 2
  # If GKE can't satisfy the preceding rule, request on-demand nodes with the same configuration.
  - machineType: g2-standard-24
    spot: false
    gpu:
      type: nvidia-l4
      count: 2
  nodePoolAutoCreation:
    enabled: true
  # Configures active migration behavior for workloads using this ComputeClass.
  # Enables cluster autoscaler to attempt to migrate workloads to Spot VMs
  # if Spot capacity becomes available and the workload is currently
  # running on an on-demand VM (based on the priority rules in this example).
  activeMigration:
    optimizeRulePriority: true
```

In the preceding example, the priority rule declares a preference for creating nodes with the g2-standard-24 machine type and Spot VMs. If Spot VMs aren't available, then GKE uses on-demand VMs as a fallback option. This compute class also enables activeMigration, enabling cluster autoscaler to migrate workloads to Spot VMs when the capacity becomes available.
If you can't use custom compute classes, add a node affinity, taint, or toleration. For example, the following node affinity rule declares a preference for scheduling Pods on nodes that are backed by Spot VMs (GKE automatically adds the cloud.google.com/gke-spot=true label to these types of nodes):
```yaml
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 1
      preference:
        matchExpressions:
        # Prefer nodes where the cloud.google.com/gke-spot label is
        # set to "true". GKE automatically applies this label to Spot VMs.
        - key: cloud.google.com/gke-spot
          operator: In
          values:
          - "true"
```

To learn more about using node affinities, taints, and tolerations to schedule Spot VMs, see the Running a GKE application on spot nodes with on-demand nodes as fallback blog.
ProvisioningRequest CRD
A ProvisioningRequest is a namespaced custom resource that lets users request capacity for a group of Pods from the cluster autoscaler. This is particularly useful for applications with interconnected Pods that must be scheduled together as a single unit.
Supported Provisioning Classes
There are three supported ProvisioningClasses:
- queued-provisioning.gke.io: This GKE-specific class integrates with Dynamic Workload Scheduler and lets you queue requests and have them fulfilled when resources become available. This is ideal for batch jobs or delay-tolerant workloads. See Deploy GPUs for batch and AI workloads with Dynamic Workload Scheduler to learn how to use queued provisioning in GKE. Supported from GKE version 1.28.3-gke.1098000 in Standard clusters and from GKE version 1.30.3-gke.1451000 in Autopilot clusters.
- check-capacity.autoscaling.x-k8s.io: This open-source class verifies the availability of resources before it attempts to schedule Pods. Supported from GKE version 1.30.2-gke.1468000.
- best-effort-atomic.autoscaling.x-k8s.io: This open-source class attempts to provision resources for all Pods in the request together. If it is impossible to provision enough resources for all Pods, no resources are provisioned and the entire request fails. Supported from GKE version 1.31.27.
To learn more about the CheckCapacity and BestEffortAtomicScaleUp classes, refer to the open-source documentation.
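For example, a minimal ProvisioningRequest sketch that uses queued provisioning; the names are placeholders, the referenced PodTemplate must already exist, and the apiVersion may vary by GKE version:

```yaml
apiVersion: autoscaling.x-k8s.io/v1
kind: ProvisioningRequest
metadata:
  name: batch-capacity # hypothetical name
  namespace: default
spec:
  provisioningClassName: queued-provisioning.gke.io
  podSets:
  - count: 4 # request capacity for four Pods as a single unit
    podTemplateRef:
      name: batch-pod-template # must reference an existing PodTemplate
```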
Limitations when using ProvisioningRequest
- GKE cluster autoscaler supports only one PodTemplate per ProvisioningRequest.
- GKE cluster autoscaler can scale up only one node pool at a time. If your ProvisioningRequest requires resources from multiple node pools, you must create a separate ProvisioningRequest for each node pool.
Best practices when using ProvisioningRequest
- Use total-max-nodes: Instead of limiting the maximum number of nodes (--max-nodes), use --total-max-nodes to constrain the total resources that are consumed by your application.
- Use location-policy=ANY: This setting allows your Pods to be scheduled in any available location, which can expedite provisioning and optimize resource utilization (see the sketch after this list).
- (Optional) Integrate with Kueue: Kueue can automate the creation of ProvisioningRequests, streamlining your workflow. For more information, see the Kueue documentation.
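As a sketch, the first two best practices might be applied to an existing node pool as follows (the names and limits are placeholders):

```
gcloud container node-pools update example-pool \
    --cluster=example-cluster \
    --enable-autoscaling \
    --total-min-nodes=0 --total-max-nodes=50 \
    --location-policy=ANY
```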
Backoff periods
A scale-up operation can fail due to node creation errors such as insufficient quota or IP address exhaustion. When these errors occur, the underlying Managed Instance Group (MIG) retries the operation after an initial five-minute backoff. If errors continue, this backoff period increases exponentially to a maximum of 30 minutes. During this time, the cluster autoscaler can still scale up other node pools in the cluster that aren't experiencing errors.
Additional information
You can find more information about cluster autoscaler in the Autoscaling FAQ in the open-source Kubernetes project.
Limitations
Cluster autoscaler has the following limitations:
- Local PersistentVolumes are not supported by the cluster autoscaler.
- In GKE control plane versions earlier than 1.24.5-gke.600, when Pods request ephemeral storage, the cluster autoscaler does not support scaling up a node pool with zero nodes that uses Local SSDs as ephemeral storage.
- Cluster size limitations: up to 15,000 nodes. Account for other cluster limits and our best practices when running clusters of this size.
- When scaling down, the cluster autoscaler honors a graceful termination period of one hour for rescheduling the node's Pods onto a different node before forcibly terminating the node.
- Occasionally, the cluster autoscaler cannot scale down completely and an extra node exists after scaling down. This can occur when required system Pods are scheduled onto different nodes, because there is no trigger for any of those Pods to be moved to a different node. See I have a couple of nodes with low utilization, but they are not scaled down. Why?. To work around this limitation, you can configure a Pod disruption budget.
- Custom scheduling with altered Filters is not supported.
- Cluster Autoscaler considers default kube-scheduler behavior when deciding to provision new nodes for pending Pods. Using custom schedulers is not supported and might result in unexpected scaling behavior.
- Nodes won't scale up if Pods have a PriorityClass value below -10. Learn more in How does Cluster Autoscaler work with Pod Priority and Preemption?
- Cluster autoscaler might not have enough unallocated IP address space to use to add new nodes or Pods, resulting in scale-up failures, which are indicated by eventResult events with the reason scale.up.error.ip.space.exhausted. You can add more IP addresses for nodes by expanding the primary subnet, or add new IP addresses for Pods using discontiguous multi-Pod CIDR. For more information, see Not enough free IP space for Pods.
- GKE cluster autoscaler is different from the cluster autoscaler of the open source Kubernetes project. The parameters of the GKE cluster autoscaler depend on the cluster configuration and are subject to change. If you need more control over the autoscaling behavior, disable the GKE cluster autoscaler and run the cluster autoscaler of open source Kubernetes. However, open source Kubernetes has no Google Cloud support.
- When you delete a GKE node pool that has autoscaling enabled, the nodes get the NoSchedule taint set, and any Pods on those nodes are immediately evicted. To mitigate the sudden decrease in available resources, the autoscaler of the node pool might provision new nodes within the same node pool. These newly created nodes become available for scheduling, and evicted Pods are scheduled back onto them. Eventually, the entire node pool, including the newly provisioned nodes and their Pods, is deleted, which can lead to potential service interruptions. As a workaround, to prevent the autoscaler from provisioning new nodes during deletion, disable autoscaling on the node pool before you initiate deletion.
- Cluster autoscaler needs to predict the amount of available resources on new nodes in order to make scaling decisions. DaemonSet Pods are included, which decreases the available resources. The predictions are not 100% accurate, and the amount of available resources can change between GKE versions. Because of this, we don't recommend sizing and constraining workloads to fit a particular instance type. Consider using custom compute classes instead. If a workload needs to target a particular instance type, make sure to size it so that it leaves a buffer of allocatable resources on the nodes. In that case, you also need to ensure that all relevant DaemonSet Pods can fit on the nodes together with your workload Pods.
- The cluster autoscaler does not support strict Pod topology spread constraints when the whenUnsatisfiable field is set to the DoNotSchedule value. You can soften the spread requirements by setting the whenUnsatisfiable field to the ScheduleAnyway value (see the sketch after this list).
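For example, a minimal sketch of a softened topology spread constraint in a Pod spec (the label selector is a placeholder):

```yaml
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: ScheduleAnyway # DoNotSchedule is not supported by the cluster autoscaler
  labelSelector:
    matchLabels:
      app: my-app # hypothetical label
```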
Known issues
- In GKE control plane versions prior to 1.22, GKE cluster autoscaler stops scaling up all node pools on empty (zero-node) clusters. This behavior doesn't occur in GKE version 1.22 and later.
Troubleshooting
For troubleshooting advice, see the following pages:
What's next
- Learn how to autoscale your nodes.
- Learn how to auto-upgrade your nodes.
- Learn how to auto-repair your nodes.
- Learn how to reduce image pull times on new nodes.