GPU machine types

This document describes the GPU machine series that AI Hypercomputer supports. You can create instances and clusters that use these machine series to run your artificial intelligence (AI), machine learning (ML), and high performance computing (HPC) workloads.

To use GPUs on AI Hypercomputer, you can use most of the machine series from the accelerator-optimized machine family. Each machine series in the accelerator-optimized machine family uses a specific GPU model. For more information about the accelerator-optimized machine family, see Accelerator-optimized machine family.

The following sections describe the accelerator-optimized machine series that AI Hypercomputer supports.

A4X series

Caution: The Compute Engine Service Level Agreement (SLA) doesn't apply to the A4X machine series.

This section outlines the available configurations for the A4X machine series. For more information about this machine series, see A4X accelerator-optimized machine series in the Compute Engine documentation.

A4X

A4X machine types use NVIDIA GB200 Grace Blackwell Superchips (nvidia-gb200) and are ideal for foundation model training and serving.

A4X is an exascale platform based on NVIDIA GB200 NVL72. Each machine has two sockets with NVIDIA Grace CPUs with Arm Neoverse V2 cores. These CPUs are connected to four NVIDIA B200 Blackwell GPUs with fast chip-to-chip (NVLink-C2C) communication.

Tip: When provisioning A4X instances, you must reserve capacity to create instances and clusters. You can then create instances that use the features and services available from AI Hypercomputer. For more information, see Deployment options overview.
Attached NVIDIA GB200 Grace Blackwell Superchips

| Machine type | vCPU count¹ | Instance memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)² | GPU count | GPU memory³ (GB HBM3e) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| a4x-highgpu-4g | 140 | 884 | 12,000 | 6 | 2,000 | 4 | 744 |

¹ A vCPU is implemented as a single hardware hyper-thread on one of the available CPU platforms.
² Maximum egress bandwidth cannot exceed the number given. Actual egress bandwidth depends on the destination IP address and other factors. For more information about network bandwidth, see Network bandwidth.
³ GPU memory is the memory on a GPU device that can be used for temporary storage of data. It is separate from the instance's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

A4 series

Note: You can use the Cluster Health Scanner (CHS) tool to troubleshoot your A4 machine series GPU clusters. For more information, see Troubleshoot GPU clusters.

This section outlines the available configurations for the A4 machine series. For more information about this machine series, see A4 accelerator-optimized machine series in the Compute Engine documentation.

A4

A4 machine types have NVIDIA B200 Blackwell GPUs (nvidia-b200) attached and are ideal for foundation model training and serving.

Tip: When provisioning A4 machine types, you must reserve capacity to create instances or clusters, use Spot VMs, use Flex-start VMs, or create a resize request in a MIG. For instructions on how to create A4 instances, see Create VMs and clusters overview.
Attached NVIDIA B200 Blackwell GPUs

| Machine type | vCPU count¹ | Instance memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)² | GPU count | GPU memory³ (GB HBM3e) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| a4-highgpu-8g | 224 | 3,968 | 12,000 | 10 | 3,600 | 8 | 1,440 |

¹ A vCPU is implemented as a single hardware hyper-thread on one of the available CPU platforms.
² Maximum egress bandwidth cannot exceed the number given. Actual egress bandwidth depends on the destination IP address and other factors. For more information about network bandwidth, see Network bandwidth.
³ GPU memory is the memory on a GPU device that can be used for temporary storage of data. It is separate from the instance's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.
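As a quick illustration of how to read the GPU memory column, the following sketch divides the total by the GPU count to get a per-GPU figure for the A4X and A4 machine types. It assumes that the GPU memory column reports the total across all attached GPUs; the dictionary and function names are hypothetical and exist only for this example.

```python
# Figures copied from the A4X and A4 tables: (GPU count, GPU memory in GB).
# Assumption: the GPU memory column is the total across all attached GPUs.
GPU_SPECS = {
    "a4x-highgpu-4g": (4, 744),
    "a4-highgpu-8g": (8, 1440),
}


def gpu_memory_per_gpu(machine_type: str) -> float:
    """Return the HBM3e capacity per GPU, in GB, for a machine type."""
    gpu_count, total_memory_gb = GPU_SPECS[machine_type]
    return total_memory_gb / gpu_count


for name in GPU_SPECS:
    print(f"{name}: {gpu_memory_per_gpu(name):.0f} GB per GPU")
```

Under that assumption, the per-GPU figure is the number to compare against the memory footprint of a single model shard.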

A3 series

Note: You can use the CHS tool to troubleshoot your A3 Ultra, A3 Mega, and A3 High GPU clusters. For more information, see Troubleshoot GPU clusters.

This section outlines the available configurations for the A3 machine series. For more information about this machine series, see A3 accelerator-optimized machine series in the Compute Engine documentation.

A3 Ultra

A3 Ultra machine types have NVIDIA H200 SXM GPUs (nvidia-h200-141gb) attached and provide the highest network performance in the A3 series. A3 Ultra machine types are ideal for foundation model training and serving.

Tip: When provisioning A3 Ultra machine types, you must reserve capacity to create instances or clusters, use Spot VMs, use Flex-start VMs, or create a resize request in a MIG. For more information about the parameters to set when creating an A3 Ultra instance, see Create VMs and clusters overview.
Attached NVIDIA H200 GPUs

| Machine type | vCPU count¹ | Instance memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)² | GPU count | GPU memory³ (GB HBM3e) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| a3-ultragpu-8g | 224 | 2,952 | 12,000 | 10 | 3,600 | 8 | 1128 |

¹ A vCPU is implemented as a single hardware hyper-thread on one of the available CPU platforms.
² Maximum egress bandwidth cannot exceed the number given. Actual egress bandwidth depends on the destination IP address and other factors. For more information about network bandwidth, see Network bandwidth.
³ GPU memory is the memory on a GPU device that can be used for temporary storage of data. It is separate from the instance's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.
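The provisioning tips for the A4X, A4, and A3 Ultra machine types can be summarized as a small lookup table. The sketch below is only a way to encode those tips for a pre-flight check in your own tooling. The option labels and the `can_provision` helper are hypothetical names chosen for this example; they are not Compute Engine API values.

```python
# Hypothetical labels for the consumption options named in the tips.
RESERVATION = "reservation"
SPOT = "spot"
FLEX_START = "flex-start"
MIG_RESIZE_REQUEST = "mig-resize-request"

# Options taken from the Tip for each machine series.
PROVISIONING_OPTIONS = {
    "a4x-highgpu-4g": {RESERVATION},
    "a4-highgpu-8g": {RESERVATION, SPOT, FLEX_START, MIG_RESIZE_REQUEST},
    "a3-ultragpu-8g": {RESERVATION, SPOT, FLEX_START, MIG_RESIZE_REQUEST},
}


def can_provision(machine_type: str, option: str) -> bool:
    """Return True if the tips list this option for the machine type."""
    return option in PROVISIONING_OPTIONS.get(machine_type, set())


print(can_provision("a4x-highgpu-4g", SPOT))   # A4X requires reserved capacity
print(can_provision("a3-ultragpu-8g", SPOT))
```

Machine types that are missing from the table return False rather than raising an error, which keeps the check conservative.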

A3 Mega

A3 Mega machine types have NVIDIA H100 SXM GPUs and are ideal for large model training and multi-host inference.

Tip: When provisioning a3-megagpu-8g machine types, we recommend using a cluster of these instances and deploying with a scheduler such as Google Kubernetes Engine (GKE) or Slurm. For detailed instructions on either of these options, review the following:
Attached NVIDIA H100 GPUs

| Machine type | vCPU count¹ | Instance memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)² | GPU count | GPU memory³ (GB HBM3) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| a3-megagpu-8g | 208 | 1,872 | 6,000 | 9 | 1,800 | 8 | 640 |

¹ A vCPU is implemented as a single hardware hyper-thread on one of the available CPU platforms.
² Maximum egress bandwidth cannot exceed the number given. Actual egress bandwidth depends on the destination IP address and other factors. For more information about network bandwidth, see Network bandwidth.
³ GPU memory is the memory on a GPU device that can be used for temporary storage of data. It is separate from the instance's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

A3 High

A3 High machine types have NVIDIA H100 SXM GPUs and are well-suited for both large model inference and model fine tuning.

Tip: When provisioning a3-highgpu-1g, a3-highgpu-2g, or a3-highgpu-4g machine types, you must create instances by using Spot VMs or Flex-start VMs. For detailed instructions on these options, review the following:
Attached NVIDIA H100 GPUs

| Machine type | vCPU count¹ | Instance memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)² | GPU count | GPU memory³ (GB HBM3) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| a3-highgpu-1g | 26 | 234 | 750 | 1 | 25 | 1 | 80 |
| a3-highgpu-2g | 52 | 468 | 1,500 | 1 | 50 | 2 | 160 |
| a3-highgpu-4g | 104 | 936 | 3,000 | 1 | 100 | 4 | 320 |
| a3-highgpu-8g | 208 | 1,872 | 6,000 | 5 | 1,000 | 8 | 640 |

¹ A vCPU is implemented as a single hardware hyper-thread on one of the available CPU platforms.
² Maximum egress bandwidth cannot exceed the number given. Actual egress bandwidth depends on the destination IP address and other factors. For more information about network bandwidth, see Network bandwidth.
³ GPU memory is the memory on a GPU device that can be used for temporary storage of data. It is separate from the instance's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.
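Because A3 High is the only series on this page with several machine sizes, a common task is picking the smallest size that provides enough GPUs. The following is a minimal sketch of that selection using values from the A3 High table. The `MachineType` tuple and both helper functions are hypothetical names used only for illustration.

```python
from typing import NamedTuple, Optional


class MachineType(NamedTuple):
    name: str
    vcpus: int
    memory_gb: int
    gpu_count: int
    gpu_memory_gb: int


# Rows from the A3 High table, ordered from smallest to largest.
A3_HIGH = [
    MachineType("a3-highgpu-1g", 26, 234, 1, 80),
    MachineType("a3-highgpu-2g", 52, 468, 2, 160),
    MachineType("a3-highgpu-4g", 104, 936, 4, 320),
    MachineType("a3-highgpu-8g", 208, 1872, 8, 640),
]


def smallest_a3_high(min_gpus: int) -> Optional[MachineType]:
    """Return the smallest A3 High type with at least min_gpus GPUs."""
    for machine_type in A3_HIGH:
        if machine_type.gpu_count >= min_gpus:
            return machine_type
    return None  # No single A3 High instance has that many GPUs.


def requires_spot_or_flex_start(machine_type: MachineType) -> bool:
    """Per the Tip, the 1g, 2g, and 4g sizes need Spot or Flex-start VMs."""
    return machine_type.gpu_count < 8


choice = smallest_a3_high(3)
if choice is not None:
    print(choice.name, requires_spot_or_flex_start(choice))
```

A request for more than eight GPUs returns `None`, which signals that the workload needs multiple instances or a cluster rather than a larger machine type.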

A3 Edge

A3 Edge machine types have NVIDIA H100 SXM GPUs, are designed specifically for serving, and are available in a limited set of regions.

Tip: To get started with A3 Edge instances, see Create an A3 VM with GPUDirect-TCPX enabled.
Attached NVIDIA H100 GPUs

| Machine type | vCPU count¹ | Instance memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)² | GPU count | GPU memory³ (GB HBM3) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| a3-edgegpu-8g | 208 | 1,872 | 6,000 | 5 | 800 for asia-south1 and northamerica-northeast2; 400 for all other A3 Edge regions | 8 | 640 |

¹ A vCPU is implemented as a single hardware hyper-thread on one of the available CPU platforms.
² Maximum egress bandwidth cannot exceed the number given. Actual egress bandwidth depends on the destination IP address and other factors. For more information about network bandwidth, see Network bandwidth.
³ GPU memory is the memory on a GPU device that can be used for temporary storage of data. It is separate from the instance's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.
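The region-dependent bandwidth limit for a3-edgegpu-8g can be expressed as a simple lookup. The sketch below is illustrative only: the function name is hypothetical, and it assumes the caller has already confirmed that the region offers A3 Edge, because the set of A3 Edge regions is not listed on this page.

```python
# Regions that the table lists with the higher 800 Gbps limit.
HIGH_BANDWIDTH_REGIONS = {"asia-south1", "northamerica-northeast2"}


def a3_edge_max_bandwidth_gbps(region: str) -> int:
    """Return the maximum network bandwidth for a3-edgegpu-8g in a region.

    Assumes the region is one where A3 Edge is available; this function
    does not validate availability.
    """
    return 800 if region in HIGH_BANDWIDTH_REGIONS else 400


print(a3_edge_max_bandwidth_gbps("asia-south1"))
```

As footnote 2 states, this value is an upper bound on egress bandwidth, not a guaranteed throughput.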

What's next?

For more information about GPUs, see the following pages in the Compute Engine documentation:

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.