GPU machine types

This document describes the GPU machine series that AI Hypercomputer supports. You can create instances and clusters that use these machine series to run your artificial intelligence (AI), machine learning (ML), and high performance computing (HPC) workloads.

To use GPUs on AI Hypercomputer, you can use most of the machine series from the accelerator-optimized machine family. Each machine series in the accelerator-optimized machine family uses a specific GPU model. For more information about the accelerator-optimized machine family, see Accelerator-optimized machine family.

The following sections describe the accelerator-optimized machine series that AI Hypercomputer supports.

A4X series

Caution: The Compute Engine Service Level Agreement (SLA) doesn't apply to the A4X machine series.

This section outlines the available configurations for the A4X machine series. For more information about this machine series, see A4X accelerator-optimized machine series in the Compute Engine documentation.

A4X

A4X machine types use NVIDIA GB200 Grace Blackwell Superchips (nvidia-gb200) and are ideal for foundation model training and serving.

A4X is an exascale platform based on NVIDIA GB200 NVL72. Each machine has two sockets with NVIDIA Grace CPUs with Arm Neoverse V2 cores. These CPUs are connected to four NVIDIA B200 Blackwell GPUs with fast chip-to-chip (NVLink-C2C) communication.

Tip: When provisioning A4X instances, you must reserve capacity to create instances and clusters. You can then create instances that use the features and services available from AI Hypercomputer. For more information, see Deployment options overview.

						Attached NVIDIA GB200 Grace Blackwell Superchips
Machine type	vCPU count¹	Instance memory (GB)	Attached Local SSD (GiB)	Physical NIC count	Maximum network bandwidth (Gbps)²	GPU count	GPU memory³ (GB HBM3e)
`a4x-highgpu-4g`	140	884	12,000	6	2,000	4	744

¹A vCPU is implemented as a single hardware hyper-thread on one of the available CPU platforms.
²Maximum egress bandwidth cannot exceed the number given. Actual egress bandwidth depends on the destination IP address and other factors. For more information about network bandwidth, see Network bandwidth.
³GPU memory is the memory on a GPU device that can be used for temporary storage of data. It is separate from the instance's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

A4 series

Note: You can use the Cluster Health Scanner (CHS) tool to troubleshoot your A4 machine series GPU clusters. For more information, see Troubleshoot GPU clusters.

This section outlines the available configurations for the A4 machine series. For more information about this machine series, see A4 accelerator-optimized machine series in the Compute Engine documentation.

A4

A4 machine types have NVIDIA B200 Blackwell GPUs (nvidia-b200) attached and are ideal for foundation model training and serving.

Tip: When provisioning A4 machine types, you must reserve capacity to create instances or clusters, use Spot VMs, use Flex-start VMs, or create a resize request in a MIG. For instructions on how to create A4 instances, see Create VMs and clusters overview.

						Attached NVIDIA B200 Blackwell GPUs
Machine type	vCPU count¹	Instance memory (GB)	Attached Local SSD (GiB)	Physical NIC count	Maximum network bandwidth (Gbps)²	GPU count	GPU memory³ (GB HBM3e)
`a4-highgpu-8g`	224	3,968	12,000	10	3,600	8	1,440

¹A vCPU is implemented as a single hardware hyper-thread on one of the available CPU platforms.
²Maximum egress bandwidth cannot exceed the number given. Actual egress bandwidth depends on the destination IP address and other factors. For more information about network bandwidth, see Network bandwidth.
³GPU memory is the memory on a GPU device that can be used for temporary storage of data. It is separate from the instance's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.
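The GPU memory column in the A4X and A4 tables is a total across all attached GPUs. As a rough illustration, the following sketch divides the table values by the GPU count to get per-GPU figures. The dictionary and the `per_gpu` helper are hypothetical names introduced here, and the per-GPU bandwidth is only an even split of the instance's maximum egress bandwidth, not a guaranteed rate.

```python
# Values copied from the A4X and A4 tables in this document.
# Each entry: (GPU count, total GPU memory in GB, maximum network bandwidth in Gbps)
MACHINE_TYPES = {
    "a4x-highgpu-4g": (4, 744, 2000),
    "a4-highgpu-8g": (8, 1440, 3600),
}


def per_gpu(machine_type):
    """Return (GPU memory per GPU in GB, bandwidth per GPU in Gbps)."""
    gpu_count, gpu_memory_gb, bandwidth_gbps = MACHINE_TYPES[machine_type]
    return gpu_memory_gb / gpu_count, bandwidth_gbps / gpu_count


for name in MACHINE_TYPES:
    memory_gb, bandwidth_gbps = per_gpu(name)
    print(f"{name}: {memory_gb:.0f} GB HBM3e and {bandwidth_gbps:.0f} Gbps per GPU")
```

Dividing totals this way is useful for sizing a model shard against a single GPU, since the table reports memory for the whole instance.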

A3 series

Note: You can use the CHS tool to troubleshoot your A3 Ultra, A3 Mega, and A3 High GPU clusters. For more information, see Troubleshoot GPU clusters.

This section outlines the available configurations for the A3 machine series. For more information about this machine series, see A3 accelerator-optimized machine series in the Compute Engine documentation.

A3 Ultra

A3 Ultra machine types have NVIDIA H200 SXM GPUs (nvidia-h200-141gb) attached and provide the highest network performance in the A3 series. A3 Ultra machine types are ideal for foundation model training and serving.

Tip: When provisioning A3 Ultra machine types, you must reserve capacity to create instances or clusters, use Spot VMs, use Flex-start VMs, or create a resize request in a MIG. For more information about the parameters to set when creating an A3 Ultra instance, see Create VMs and clusters overview.

						Attached NVIDIA H200 GPUs
Machine type	vCPU count¹	Instance memory (GB)	Attached Local SSD (GiB)	Physical NIC count	Maximum network bandwidth (Gbps)²	GPU count	GPU memory³ (GB HBM3e)
`a3-ultragpu-8g`	224	2,952	12,000	10	3,600	8	1,128

A3 Mega

A3 Mega machine types have NVIDIA H100 SXM GPUs and are ideal for large model training and multi-host inference.

Tip: When provisioning a3-megagpu-8g machine types, we recommend using a cluster of these instances and deploying with a scheduler such as Google Kubernetes Engine (GKE) or Slurm. For detailed instructions on either of these options, review the following:

To create a Google Kubernetes Engine cluster, see Deploy an A3 Mega cluster with GKE.
To create a Slurm cluster, see Deploy an A3 Mega Slurm cluster.

						Attached NVIDIA H100 GPUs
Machine type	vCPU count¹	Instance memory (GB)	Attached Local SSD (GiB)	Physical NIC count	Maximum network bandwidth (Gbps)²	GPU count	GPU memory³ (GB HBM3)
`a3-megagpu-8g`	208	1,872	6,000	9	1,800	8	640

A3 High

A3 High machine types have NVIDIA H100 SXM GPUs and are well-suited for both large model inference and model fine tuning.

Tip: When provisioning a3-highgpu-1g, a3-highgpu-2g, or a3-highgpu-4g machine types, you must create instances by using Spot VMs or Flex-start VMs. For detailed instructions on these options, review the following:

To create Spot VMs, set the provisioning model to SPOT when you create an accelerator-optimized VM.
To create Flex-start VMs, you can use one of the following methods:
- Create a standalone VM and set the provisioning model to FLEX_START when you create an accelerator-optimized VM.
- Create a resize request in a managed instance group (MIG). For instructions, see Create a MIG with GPU VMs.
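The provisioning-model requirement above can be sketched as a small check that runs before a create request is sent. This is a minimal illustration, not the documented procedure. The `build_instance_body` helper and the `example-zone` placeholder are hypothetical, and the `machineType` and `scheduling.provisioningModel` field names are assumed to follow the Compute Engine REST API instance resource, so verify them against the API reference before use.

```python
# A3 High machine types that, per this document, must use Spot VMs or Flex-start VMs.
SPOT_OR_FLEX_ONLY = {"a3-highgpu-1g", "a3-highgpu-2g", "a3-highgpu-4g"}
ALLOWED_MODELS = {"SPOT", "FLEX_START"}


def build_instance_body(name, zone, machine_type, provisioning_model):
    """Build a partial instance request body (hypothetical helper).

    Field names are an assumption based on the Compute Engine REST API;
    a real request also needs disks, network interfaces, and possibly
    further scheduling settings that are omitted here.
    """
    if machine_type in SPOT_OR_FLEX_ONLY and provisioning_model not in ALLOWED_MODELS:
        raise ValueError(
            f"{machine_type} requires provisioning model SPOT or FLEX_START, "
            f"got {provisioning_model}"
        )
    return {
        "name": name,
        "machineType": f"zones/{zone}/machineTypes/{machine_type}",
        "scheduling": {"provisioningModel": provisioning_model},
    }


# "example-zone" is a placeholder, not a real zone name.
body = build_instance_body("demo-vm", "example-zone", "a3-highgpu-1g", "SPOT")
print(body["scheduling"])
```

Failing early in client code gives a clearer error than waiting for the API to reject the request, although the service remains the source of truth for what is allowed.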

						Attached NVIDIA H100 GPUs
Machine type	vCPU count¹	Instance memory (GB)	Attached Local SSD (GiB)	Physical NIC count	Maximum network bandwidth (Gbps)²	GPU count	GPU memory³ (GB HBM3)
`a3-highgpu-1g`	26	234	750	1	25	1	80
`a3-highgpu-2g`	52	468	1,500	1	50	2	160
`a3-highgpu-4g`	104	936	3,000	1	100	4	320
`a3-highgpu-8g`	208	1,872	6,000	5	1,000	8	640
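As an illustration of how the A3 High table might be used for sizing, the following sketch picks the smallest A3 High machine type that provides a required number of GPUs. The `MachineType` tuple and the `smallest_a3_high` function are hypothetical names. The values are copied from the table above.

```python
from typing import NamedTuple, Optional


class MachineType(NamedTuple):
    name: str
    gpus: int
    vcpus: int
    memory_gb: int
    local_ssd_gib: int
    max_bandwidth_gbps: int


# Rows copied from the A3 High table, ordered from smallest to largest.
A3_HIGH = [
    MachineType("a3-highgpu-1g", 1, 26, 234, 750, 25),
    MachineType("a3-highgpu-2g", 2, 52, 468, 1500, 50),
    MachineType("a3-highgpu-4g", 4, 104, 936, 3000, 100),
    MachineType("a3-highgpu-8g", 8, 208, 1872, 6000, 1000),
]


def smallest_a3_high(gpus_needed: int) -> Optional[MachineType]:
    """Return the smallest A3 High type with at least gpus_needed GPUs, or None."""
    for machine_type in A3_HIGH:
        if machine_type.gpus >= gpus_needed:
            return machine_type
    return None


choice = smallest_a3_high(3)
if choice is not None:
    print(choice.name, choice.max_bandwidth_gbps / choice.gpus, "Gbps per GPU")
```

In the table, vCPUs, instance memory, and Local SSD scale linearly at 26 vCPUs, 234 GB, and 750 GiB per GPU. Maximum network bandwidth does not: it is 25 Gbps per GPU for the 1g, 2g, and 4g types and 125 Gbps per GPU for a3-highgpu-8g.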

A3 Edge

A3 Edge machine types have NVIDIA H100 SXM GPUs, are designed specifically for serving, and are available in a limited set of regions.

Tip: To get started with A3 Edge instances, see Create an A3 VM with GPUDirect-TCPX enabled.

						Attached NVIDIA H100 GPUs
Machine type	vCPU count¹	Instance memory (GB)	Attached Local SSD (GiB)	Physical NIC count	Maximum network bandwidth (Gbps)²	GPU count	GPU memory³ (GB HBM3)
`a3-edgegpu-8g`	208	1,872	6,000	5	800 for asia-south1 and northamerica-northeast2; 400 for all other A3 Edge regions	8	640
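Because the maximum network bandwidth for a3-edgegpu-8g depends on the region, capacity planning code may need a lookup. The following is a minimal sketch of the rule stated in the table. The function name is hypothetical, and the function does not check whether A3 Edge is actually offered in the region that you pass in.

```python
# Regions that the A3 Edge table lists with the higher bandwidth limit.
HIGHER_BANDWIDTH_REGIONS = {"asia-south1", "northamerica-northeast2"}


def a3_edge_max_bandwidth_gbps(region: str) -> int:
    """Return the maximum network bandwidth (Gbps) for a3-edgegpu-8g.

    Assumes the caller has already confirmed that A3 Edge is available
    in the given region; availability is not validated here.
    """
    return 800 if region in HIGHER_BANDWIDTH_REGIONS else 400


print(a3_edge_max_bandwidth_gbps("asia-south1"))
```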

What's next?

For more information about GPUs, see the following pages in the Compute Engine documentation:

Learn about GPUs on Compute Engine.
Review the GPU regions and zones availability.
Learn about GPU pricing.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.
