GPU machine types
This document describes the GPU machine series that AI Hypercomputer supports. You can create instances and clusters that use these machine series for running your artificial intelligence (AI), machine learning (ML), and high performance computing (HPC) workloads.
To use GPUs on AI Hypercomputer, you can use most of the machine series from the accelerator-optimized machine family. Each machine series in this family uses a specific GPU model. For more information, see Accelerator-optimized machine family.
The following sections describe the accelerator-optimized machine series that AI Hypercomputer supports.
A4X series
Caution: The Compute Engine Service Level Agreement (SLA) doesn't apply to the A4X machine series.

This section outlines the available configurations for the A4X machine series. For more information about this machine series, see A4X accelerator-optimized machine series in the Compute Engine documentation.
A4X
A4X machine types use NVIDIA GB200 Grace Blackwell Superchips (nvidia-gb200) and are ideal for foundation model training and serving.
A4X is an exascale platform based on NVIDIA GB200 NVL72. Each machine has two sockets with NVIDIA Grace CPUs, which are built on Arm Neoverse V2 cores. These CPUs are connected to four NVIDIA B200 Blackwell GPUs with fast chip-to-chip (NVLink-C2C) communication.
Tip: When provisioning A4X instances, you must reserve capacity to create instances and clusters. You can then create instances that use the features and services available from AI Hypercomputer. For more information, see Deployment options overview.

Attached NVIDIA GB200 Grace Blackwell Superchips

| Machine type | vCPU count1 | Instance memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)2 | GPU count | GPU memory3 (GB HBM3e) |
|---|---|---|---|---|---|---|---|
| a4x-highgpu-4g | 140 | 884 | 12,000 | 6 | 2,000 | 4 | 744 |
1 A vCPU is implemented as a single hardware hyper-thread on one of the available CPU platforms.
2 Maximum egress bandwidth cannot exceed the number given. Actual egress bandwidth depends on the destination IP address and other factors. For more information about network bandwidth, see Network bandwidth.
3 GPU memory is the memory on a GPU device that can be used for temporary storage of data. It is separate from the instance's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.
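As a quick illustration of how the table values relate, the following sketch encodes the A4X row from the table above and derives the HBM3e available per GPU. The dictionary and helper names are hypothetical, not an official API; the numbers are copied directly from the table.

```python
# Illustrative only: values copied from the a4x-highgpu-4g row above.
# The dictionary layout and function name are hypothetical, not an API.
A4X_SPECS = {
    "machine_type": "a4x-highgpu-4g",
    "vcpus": 140,
    "memory_gb": 884,
    "local_ssd_gib": 12_000,
    "nic_count": 6,
    "max_network_gbps": 2_000,
    "gpu_count": 4,
    "gpu_memory_gb": 744,  # total HBM3e across all attached GPUs
}

def hbm_per_gpu(specs: dict) -> float:
    """Return the HBM available per GPU, assuming an even split."""
    return specs["gpu_memory_gb"] / specs["gpu_count"]

print(hbm_per_gpu(A4X_SPECS))  # 186.0 GB of HBM3e per GPU
```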
A4 series
Note: You can use the Cluster Health Scanner (CHS) tool to troubleshoot your A4 machine series GPU clusters. For more information, see Troubleshoot GPU clusters.

This section outlines the available configurations for the A4 machine series. For more information about this machine series, see A4 accelerator-optimized machine series in the Compute Engine documentation.
A4
A4 machine types have NVIDIA B200 Blackwell GPUs (nvidia-b200) attached and are ideal for foundation model training and serving.
Attached NVIDIA B200 Blackwell GPUs

| Machine type | vCPU count1 | Instance memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)2 | GPU count | GPU memory3 (GB HBM3e) |
|---|---|---|---|---|---|---|---|
| a4-highgpu-8g | 224 | 3,968 | 12,000 | 10 | 3,600 | 8 | 1,440 |
A3 series
Note: You can use the CHS tool to troubleshoot your A3 Ultra, A3 Mega, and A3 High GPU clusters. For more information, see Troubleshoot GPU clusters.

This section outlines the available configurations for the A3 machine series. For more information about this machine series, see A3 accelerator-optimized machine series in the Compute Engine documentation.
A3 Ultra
A3 Ultra machine types have NVIDIA H200 SXM GPUs (nvidia-h200-141gb) attached and provide the highest network performance in the A3 series. A3 Ultra machine types are ideal for foundation model training and serving.
Attached NVIDIA H200 GPUs

| Machine type | vCPU count1 | Instance memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)2 | GPU count | GPU memory3 (GB HBM3e) |
|---|---|---|---|---|---|---|---|
| a3-ultragpu-8g | 224 | 2,952 | 12,000 | 10 | 3,600 | 8 | 1,128 |
A3 Mega
A3 Mega machine types have NVIDIA H100 SXM GPUs attached and are ideal for large model training and multi-host inference.

Tip: When provisioning a3-megagpu-8g machine types, we recommend using a cluster of these instances and deploying with a scheduler such as Google Kubernetes Engine (GKE) or Slurm. For detailed instructions on either of these options, review the following:
- To create a GKE cluster, see Deploy an A3 Mega cluster with GKE.
- To create a Slurm cluster, see Deploy an A3 Mega Slurm cluster.
Attached NVIDIA H100 GPUs

| Machine type | vCPU count1 | Instance memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)2 | GPU count | GPU memory3 (GB HBM3) |
|---|---|---|---|---|---|---|---|
| a3-megagpu-8g | 208 | 1,872 | 6,000 | 9 | 1,800 | 8 | 640 |
A3 High
A3 High machine types have NVIDIA H100 SXM GPUs attached and are well suited for both large model inference and model fine-tuning.

Tip: When provisioning a3-highgpu-1g, a3-highgpu-2g, or a3-highgpu-4g machine types, you must create instances by using Spot VMs or Flex-start VMs. For detailed instructions on these options, review the following:
- To create Spot VMs, set the provisioning model to SPOT when you create an accelerator-optimized VM.
- To create Flex-start VMs, use one of the following methods:
  - Create a standalone VM and set the provisioning model to FLEX_START when you create an accelerator-optimized VM.
  - Create a resize request in a managed instance group (MIG). For instructions, see Create a MIG with GPU VMs.
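The Spot option above can be sketched as a single gcloud command. This is a minimal illustration, not a complete recipe: the instance name, project, and zone are placeholders, and your image, disk, and network settings will differ.

```shell
# Hypothetical instance name, PROJECT_ID, and ZONE: replace with your values.
# Creates an a3-highgpu-1g VM with the Spot provisioning model; the VM is
# deleted (rather than stopped) when Compute Engine preempts it.
gcloud compute instances create my-a3-spot-vm \
    --project=PROJECT_ID \
    --zone=ZONE \
    --machine-type=a3-highgpu-1g \
    --provisioning-model=SPOT \
    --instance-termination-action=DELETE
```

For a Flex-start VM, the equivalent sketch would set `--provisioning-model=FLEX_START` instead.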
Attached NVIDIA H100 GPUs

| Machine type | vCPU count1 | Instance memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)2 | GPU count | GPU memory3 (GB HBM3) |
|---|---|---|---|---|---|---|---|
| a3-highgpu-1g | 26 | 234 | 750 | 1 | 25 | 1 | 80 |
| a3-highgpu-2g | 52 | 468 | 1,500 | 1 | 50 | 2 | 160 |
| a3-highgpu-4g | 104 | 936 | 3,000 | 1 | 100 | 4 | 320 |
| a3-highgpu-8g | 208 | 1,872 | 6,000 | 5 | 1,000 | 8 | 640 |
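A pattern worth noting in the A3 High table is that vCPUs, instance memory, Local SSD, and GPU memory all scale linearly with GPU count, while NIC count and network bandwidth do not. The following sketch encodes the table rows and picks the smallest machine type for a given GPU requirement; the data structure and function names are hypothetical, not an official API.

```python
# Illustrative sketch: values copied from the A3 High table above.
# NIC count and network bandwidth are omitted because they do not
# scale linearly with GPU count.
A3_HIGH = {
    # machine type: (vCPUs, memory GB, Local SSD GiB, GPU count, GPU memory GB)
    "a3-highgpu-1g": (26, 234, 750, 1, 80),
    "a3-highgpu-2g": (52, 468, 1_500, 2, 160),
    "a3-highgpu-4g": (104, 936, 3_000, 4, 320),
    "a3-highgpu-8g": (208, 1_872, 6_000, 8, 640),
}

def smallest_type_for(gpus_needed: int) -> str:
    """Return the smallest A3 High machine type with at least gpus_needed GPUs."""
    for name, (_, _, _, gpu_count, _) in A3_HIGH.items():
        if gpu_count >= gpus_needed:
            return name
    raise ValueError(f"no A3 High machine type offers {gpus_needed} GPUs")

# Every row is an exact multiple of the 1-GPU row:
base = A3_HIGH["a3-highgpu-1g"]
for name, spec in A3_HIGH.items():
    gpus = spec[3]
    assert all(v == b * gpus for v, b in zip(spec, base)), name

print(smallest_type_for(3))  # a3-highgpu-4g
```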
A3 Edge
A3 Edge machine types have NVIDIA H100 SXM GPUs attached, are designed specifically for serving, and are available in a limited set of regions.

Tip: To get started with A3 Edge instances, see Create an A3 VM with GPUDirect-TCPX enabled.

Attached NVIDIA H100 GPUs

| Machine type | vCPU count1 | Instance memory (GB) | Attached Local SSD (GiB) | Physical NIC count | Maximum network bandwidth (Gbps)2 | GPU count | GPU memory3 (GB HBM3) |
|---|---|---|---|---|---|---|---|
| a3-edgegpu-8g | 208 | 1,872 | 6,000 | 5 | 800 | 8 | 640 |
What's next?
For more information about GPUs, see the following pages in the Compute Engine documentation:
- Learn about GPUs on Compute Engine.
- Review the GPU regions and zones availability.
- Learn about GPU pricing.
Last updated 2025-12-15 UTC.