About Balanced and Scale-Out ComputeClasses in Autopilot clusters
You can use the Balanced and Scale-Out ComputeClasses in Google Kubernetes Engine (GKE) Autopilot clusters to run workloads that require extra compute capacity or specialized CPU configurations. This page is intended for cluster administrators who want more flexible compute options than the default Autopilot cluster configuration provides.
Overview of Balanced and Scale-Out ComputeClasses
By default, Pods in GKE Autopilot clusters run on a container-optimized compute platform. This platform is ideal for general-purpose workloads such as web servers and medium-intensity batch jobs. The container-optimized compute platform provides a reliable, scalable, cost-optimized hardware configuration that can handle the requirements of most workloads.
If you have workloads with unique hardware requirements (such as performing machine learning or AI tasks, running real-time high-traffic databases, or needing specific CPU platforms and architectures), you can use ComputeClasses to provision that hardware.
In Autopilot clusters only, GKE provides the following curated ComputeClasses that let you run Pods that need more flexibility than the default container-optimized compute platform:
- Balanced: provides higher maximum CPU and memory capacity than the container-optimized compute platform.
- Scale-Out: disables simultaneous multi-threading (SMT) and is optimized for scaling out.
These ComputeClasses are available only in Autopilot clusters. Similar to the default container-optimized compute platform, Autopilot manages node sizing and resource allocation based on your running Pods.
Custom ComputeClasses for additional flexibility
If the Balanced or Scale-Out ComputeClasses in Autopilot clusters don't meet your workload requirements, you can configure your own ComputeClasses. You deploy ComputeClass Kubernetes custom resources to your clusters with sets of node attributes that GKE uses to configure new nodes in the cluster. These custom ComputeClasses can, for example, let you deploy workloads on the same hardware as the Balanced or Scale-Out ComputeClasses in any GKE Autopilot or Standard cluster. For more information, see About Autopilot mode workloads in GKE Standard.
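As a sketch of what such a custom resource can look like, the following manifest defines a ComputeClass that prefers a specific machine family. The class name and the n2 machine family here are illustrative assumptions, not values from this page; see the custom ComputeClass documentation for the supported fields.

```yaml
# Illustrative sketch of a custom ComputeClass resource.
# The name and machine family below are hypothetical examples.
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: my-custom-class   # hypothetical class name
spec:
  # Ordered list of node configurations that GKE tries when
  # provisioning nodes for Pods that select this class.
  priorities:
  - machineFamily: n2     # illustrative machine family
    minCores: 4
  nodePoolAutoCreation:
    enabled: true
```

Workloads would then select this class with the cloud.google.com/compute-class: my-custom-class node selector, the same way they select the curated classes described on this page.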
Pricing
Pods that use the Balanced or Scale-Out ComputeClasses are billed based on the following SKUs:
For more information, see GKE pricing.
Balanced and Scale-Out technical details
This section describes the machine types and use cases for the Balanced and Scale-Out classes. If you don't request a ComputeClass in your Pods, Autopilot places the Pods on the container-optimized compute platform by default. You might sometimes see ek as the node machine series in your Autopilot nodes that use the container-optimized compute platform. EK machines are E2 machine types that are exclusive to Autopilot.
The following table provides a technical overview of the Balanced and Scale-Out ComputeClasses.
| ComputeClass | Description |
|---|---|
| Balanced | Provides more CPU capacity and memory capacity than the container-optimized compute platform maximums. Provides additional CPU platforms and the ability to set minimum CPU platforms for Pods, such as Intel Ice Lake or later. Use the Balanced class for workloads that need these capabilities. |
| Scale-Out | Provides single-thread-per-core computing and horizontal scaling. Use the Scale-Out class for workloads that benefit from this configuration. |
ComputeClass selection in workloads
To use a ComputeClass for a GKE workload, you select the ComputeClass in the workload manifest by using a node selector for the cloud.google.com/compute-class label.
The following example Deployment manifest selects a ComputeClass:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloweb
  labels:
    app: hello
spec:
  selector:
    matchLabels:
      app: hello
  template:
    metadata:
      labels:
        app: hello
    spec:
      nodeSelector:
        # Replace with the name of a compute class
        cloud.google.com/compute-class: COMPUTE_CLASS
      containers:
      - name: hello-app
        image: us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: "250m"
            memory: "1Gi"
```

Replace COMPUTE_CLASS with the name of a ComputeClass, such as Balanced or Scale-Out. You can select a maximum of one ComputeClass in a workload.
When you deploy the workload, GKE does the following:
- Automatically provisions nodes backed by the specified configuration to runyour Pods.
- Automatically adds node labels andtaints to the new nodes to prevent other Pods from scheduling on those nodes. Thetaints are unique to each ComputeClass. If you also select a CPUarchitecture, GKE adds a separate taint unique to thatarchitecture.
- Automatically adds tolerations corresponding to the applied taints to yourdeployed Pods, which lets GKE place those Pods on the newnodes.
For example, if you request the Scale-Out ComputeClass for a Pod:

- Autopilot adds a taint specific to Scale-Out for those nodes.
- Autopilot adds a toleration for that taint to the Scale-Out Pods.

Pods that don't request Scale-Out won't get the toleration. As a result, GKE won't schedule those Pods on the Scale-Out nodes.
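As a rough sketch of this mechanism, the taint and toleration pair might look like the following. The exact taint key and value are managed by Autopilot and aren't documented on this page; this example assumes the cloud.google.com/compute-class key for illustration.

```yaml
# Illustrative sketch only: the exact taint key and value that
# Autopilot applies are managed by GKE and may differ.

# Taint that Autopilot might add to Scale-Out nodes:
#   cloud.google.com/compute-class=Scale-Out:NoSchedule

# Matching toleration that Autopilot would add to Pods that
# request the Scale-Out ComputeClass:
tolerations:
- key: cloud.google.com/compute-class
  operator: Equal
  value: Scale-Out
  effect: NoSchedule
```

Because Pods without this toleration can't tolerate the node taint, the scheduler keeps general-purpose Pods off the Scale-Out nodes.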
If you don't explicitly request a ComputeClass in your workload specification,Autopilot schedules Pods on nodes that use the defaultcontainer-optimized compute platform. Most general-purpose workloads can runwith no issues on this platform.
How to request a CPU architecture
In some cases, your workloads might be built for a specific architecture, such as Arm. The Scale-Out ComputeClass supports multiple CPU architectures. You can request a specific architecture alongside your ComputeClass request by specifying a label in your node selector or node affinity rule, such as in the following example:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-arm
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-arm
  template:
    metadata:
      labels:
        app: nginx-arm
    spec:
      nodeSelector:
        cloud.google.com/compute-class: COMPUTE_CLASS
        kubernetes.io/arch: ARCHITECTURE
      containers:
      - name: nginx-arm
        image: nginx
        resources:
          requests:
            cpu: 2000m
            memory: 2Gi
```

Replace ARCHITECTURE with the CPU architecture that you want, such as arm64 or amd64. You can select a maximum of one architecture in your workload. The ComputeClass that you select must support your specified architecture.
If you don't explicitly request an architecture, Autopilot uses thedefault architecture of the ComputeClass.
Arm architecture on Autopilot
Autopilot supports requests for nodes that use the Arm CPU architecture. Arm nodes are more cost-efficient than similar x86 nodes while delivering performance improvements. For instructions to request Arm nodes, refer to Deploy Autopilot workloads on Arm architecture.
Ensure that you're using the correct images in your deployments. If yourPods use Arm images and you don't request Arm nodes, Autopilotschedules the Pods on x86 nodes and the Pods will crash. Similarly, if youaccidentally use x86 images but request Arm nodes for the Pods, the Pods willcrash.
Default, minimum, and maximum resource requests
When choosing a ComputeClass for your Autopilot workloads, make sure that you specify resource requests that meet the minimum and maximum requests for that ComputeClass. For information about the default requests, as well as the minimum and maximum requests for each ComputeClass, refer to Resource requests and limits in GKE Autopilot.
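For example, a container that targets one of these ComputeClasses declares its requests in the standard Kubernetes resources block. The values below are purely illustrative; the actual minimums and maximums for each class are listed on the linked page.

```yaml
# Illustrative values only; check Resource requests and limits in
# GKE Autopilot for the real per-class minimums and maximums.
resources:
  requests:
    cpu: "500m"
    memory: "2Gi"
```

If a Pod's requests fall outside the allowed range for its ComputeClass, Autopilot adjusts them to fit, so it's worth checking the limits before you deploy.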
What's next
- Learn how to select specific ComputeClasses in your Autopilot workloads.
- Read about the default, minimum, and maximum resource requests for eachplatform.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-02-18 UTC.