Networking and GPU machines

This document outlines the network bandwidth capabilities and configurations for Compute Engine instances with attached GPUs. Learn about the maximum network bandwidth, Network Interface Card (NIC) arrangements, and recommended VPC network setups for various GPU machine types, including the A4X Max, A4X, A4, A3, A2, G4, G2, and N1 series. Understanding these configurations can help you optimize performance for your distributed workloads on Compute Engine.

Overview

The following table provides a general comparison of networking capabilities across GPU machine types.

Machine type	GPU model	Max total bandwidth	GPU to GPU networking technology
A4X Max	NVIDIA GB300 Ultra Superchips	3,600 Gbps	GPUDirect RDMA
A4X	NVIDIA GB200 Superchips	2,000 Gbps	GPUDirect RDMA
A4	NVIDIA B200	3,600 Gbps	GPUDirect RDMA
A3 Ultra	NVIDIA H200	3,600 Gbps	GPUDirect RDMA
A3 Mega	NVIDIA H100 80GB	1,800 Gbps	GPUDirect-TCPXO
A3 High	NVIDIA H100 80GB	1,000 Gbps	GPUDirect-TCPX
A3 Edge	NVIDIA H100 80GB	600 Gbps	GPUDirect-TCPX
G4	NVIDIA RTX PRO 6000	400 Gbps	N/A
A2 Standard and A2 Ultra	NVIDIA A100 40GB, NVIDIA A100 80GB	100 Gbps	N/A
G2	NVIDIA L4	100 Gbps	N/A
N1	NVIDIA T4, NVIDIA V100	100 Gbps	N/A
N1	NVIDIA P100, NVIDIA P4	32 Gbps	N/A

GPUDirect RDMA and MRDMA functions

On certain accelerator-optimized machine types, Google Cloud uses MRDMA as the network interface implementation for GPU-to-GPU networking that supports GPUDirect RDMA.

GPUDirect RDMA is an NVIDIA technology that enables a network interface card (NIC) to directly access GPU memory over PCIe, bypassing host CPU and system memory. This peer-to-peer communication between NIC and GPU significantly reduces latency for inter-node GPU-to-GPU communication.

MRDMA is the network interface implementation used on A4X Max, A4X, A4, and A3 Ultra machine types to provide GPUDirect RDMA capabilities. MRDMA is based on NVIDIA ConnectX NICs and is deployed in one of the following ways:

MRDMA Virtual Functions (VFs): used in A3 Ultra, A4, and A4X series.
MRDMA Physical Functions (PFs): used in the A4X Max series.

MRDMA functions and network monitoring tools

The A4X, A4, and A3 Ultra machine types implement high-performance GPU-to-GPU networking by using MRDMA Virtual Functions (VFs). As these are virtualized entities, certain hardware-level monitoring capabilities are restricted compared to Physical Functions (PFs).

With MRDMA VFs, standard physical port counters (such as those ending in `_phy`) appear in `ethtool -S` output but won't update during network activity. This is a characteristic of the MRDMA VF architecture. To accurately track network performance on these interfaces, review the entries for the vPort Counter Table instead of the Physical Port Counter Table.

The A4X Max machine type uses MRDMA PFs. Unlike the MRDMA VF-based machine types, A4X Max supports the full range of physical port counters for GPU networking.
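As a rough illustration of the counter distinction described in this section, the following Python sketch parses text in the `name: value` layout that `ethtool -S` prints and separates counters whose names end in `_phy` from all other counters. The function name `split_counters` is hypothetical, and the sample text and counter names are made up for illustration; the counters that you see depend on the driver and machine type.

```python
def split_counters(ethtool_output: str) -> tuple[dict[str, int], dict[str, int]]:
    """Split `ethtool -S` style text into (_phy counters, all other counters)."""
    phy: dict[str, int] = {}
    other: dict[str, int] = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.strip().partition(":")
        value = value.strip()
        # Skip the "NIC statistics:" header and any line without a numeric value.
        if not sep or not value.isdigit():
            continue
        target = phy if name.endswith("_phy") else other
        target[name] = int(value)
    return phy, other


# Illustrative sample only; not captured from a real instance.
SAMPLE = """NIC statistics:
     rx_packets_phy: 0
     tx_packets_phy: 0
     rx_vport_rdma_unicast_packets: 1200
     tx_vport_rdma_unicast_packets: 1180
"""

phy_counters, other_counters = split_counters(SAMPLE)
print("physical port counters:", phy_counters)
print("other counters:", other_counters)
```

On a VF-based machine type, physical port counters that stay flat in this kind of split are expected behavior rather than a sign that no traffic is flowing.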

Review networking concepts for GPU machine types

Use the following section to review the network arrangement and bandwidth speed for each GPU machine type.

A4X Max and A4X machine types

The A4X Max and A4X machine series, which are both based on the NVIDIA Blackwell architecture, are designed for demanding, large-scale, distributed AI workloads. The primary differentiator between the two is their attached accelerators and networking hardware, as outlined in the following table:

	A4X Max machine series	A4X machine series
Attached hardware	NVIDIA GB300 Ultra Superchips	NVIDIA GB200 Superchips
GPU-to-GPU networking	4 NVIDIA ConnectX-8 (CX-8) SuperNICs that provide 3,200 Gbps bandwidth in an 8-way rail-aligned topology	4 NVIDIA ConnectX-7 (CX-7) NICs that provide 1,600 Gbps bandwidth in a 4-way rail-aligned topology
GPU-to-GPU networking implementation	MRDMA Physical Functions (PFs)	MRDMA Virtual Functions (VFs)
General purpose networking	2 Titanium smart NICs that provide 400 Gbps bandwidth	2 Titanium smart NICs that provide 400 Gbps bandwidth
Total maximum network bandwidth	3,600 Gbps	2,000 Gbps

Multi-layered networking architecture

A4X Max and A4X compute instances use a multi-layered, hierarchical networking architecture with a rail-aligned design to optimize performance for various communication types. In this topology, instances connect across multiple independent network planes, called rails.

A4X Max instances use an 8-way rail-aligned topology where each of the four 800 Gbps ConnectX-8 NICs connects to two separate 400 Gbps rails.
A4X instances use a 4-way rail-aligned topology where each of the four ConnectX-7 NICs connects to a separate rail.
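The rail counts and bandwidth totals quoted for these two series follow from simple arithmetic, which the following Python sketch reproduces. The `NetworkLayout` class and its field names are hypothetical. The 400 Gbps per ConnectX-7 NIC figure is derived here by dividing the stated 1,600 Gbps across four NICs; it is not stated per NIC in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkLayout:
    gpu_nics: int          # ConnectX NICs used for GPU-to-GPU traffic
    gbps_per_gpu_nic: int  # bandwidth of each ConnectX NIC
    rails_per_nic: int     # independent network planes each NIC connects to
    general_gbps: int      # combined bandwidth of the two Titanium smart NICs

    @property
    def rails(self) -> int:
        return self.gpu_nics * self.rails_per_nic

    @property
    def gpu_gbps(self) -> int:
        return self.gpu_nics * self.gbps_per_gpu_nic

    @property
    def total_gbps(self) -> int:
        return self.gpu_gbps + self.general_gbps


A4X_MAX = NetworkLayout(gpu_nics=4, gbps_per_gpu_nic=800, rails_per_nic=2, general_gbps=400)
A4X = NetworkLayout(gpu_nics=4, gbps_per_gpu_nic=400, rails_per_nic=1, general_gbps=400)

for name, layout in (("A4X Max", A4X_MAX), ("A4X", A4X)):
    print(name, layout.rails, "rails,", layout.gpu_gbps, "+", layout.general_gbps,
          "=", layout.total_gbps, "Gbps")
```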

The networking layers for these machine types are as follows:

Intra-node and intra-subblock communication (NVLink): a high-speed NVLink fabric interconnects GPUs for high-bandwidth, low-latency communication. This fabric connects all the GPUs within a single instance and extends across a subblock, which consists of 18 A4X Max or A4X instances (a total of 72 GPUs). This allows all 72 GPUs in a subblock to communicate as if they were in a single, large-scale GPU server.
Inter-subblock communication (ConnectX NICs with RoCE): to scale workloads beyond a single subblock, these machines use NVIDIA ConnectX NICs. These NICs use RDMA over Converged Ethernet (RoCE) to provide high-bandwidth, low-latency communication between subblocks, to let you build large-scale training clusters with thousands of GPUs.
General-purpose networking (Titanium smart NICs): in addition to the specialized GPU networks, each instance has two Titanium smart NICs, providing a combined 400 Gbps of bandwidth for general networking tasks. This includes traffic for storage, management, and connecting to other Google Cloud services or the public internet.
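The layering above can be summarized as a small decision rule: GPU traffic between instances in the same subblock stays on NVLink, and GPU traffic between subblocks travels over the ConnectX NICs with RoCE. The following Python sketch is only a model of that rule. The function name `gpu_fabric` and the subblock identifiers are hypothetical and don't correspond to any API.

```python
INSTANCES_PER_SUBBLOCK = 18
GPUS_PER_INSTANCE = 4
GPUS_PER_SUBBLOCK = INSTANCES_PER_SUBBLOCK * GPUS_PER_INSTANCE  # 72 GPUs


def gpu_fabric(subblock_a: str, subblock_b: str) -> str:
    """Return the layer that carries GPU-to-GPU traffic between two instances.

    Each argument is the (hypothetical) identifier of the subblock that the
    instance belongs to.
    """
    if subblock_a == subblock_b:
        return "NVLink"
    return "RoCE over ConnectX NICs"


print(GPUS_PER_SUBBLOCK)
print(gpu_fabric("subblock-1", "subblock-1"))
print(gpu_fabric("subblock-1", "subblock-2"))
```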

A4X Max architecture

The A4X Max architecture is built around NVIDIA GB300 Ultra Superchips. A key feature of this design is the direct connection of the four 800 Gbps NVIDIA ConnectX-8 (CX-8) SuperNICs to the GPUs. These NICs are part of an 8-way rail-aligned network topology where each NIC connects to two separate 400 Gbps rails. This direct path enables RDMA, providing high bandwidth and low latency for GPU-to-GPU communication across different subblocks. These Compute Engine instances also include high-performance local SSDs that are attached to the ConnectX-8 NICs, bypassing the PCIe bus for faster data access.

Network architecture for A4X Max showing four NICs for GPU communication and two Titanium NICs for general networking. — Figure 1. Network architecture for a single A4X Max host

A4X architecture

The A4X architecture uses NVIDIA GB200 Superchips. In this configuration, the four NVIDIA ConnectX-7 (CX-7) NICs are connected to the host CPU. This setup provides high-performance networking for GPU-to-GPU communication between subblocks.

Network architecture for A4X showing four NICs for GPU communication and two Titanium NICs for general networking. — Figure 2. Network architecture for a single A4X host

A4X Max and A4X Virtual Private Cloud (VPC) network configuration

To use the full networking capabilities of these machine types, you need to create and attach VPC networks to your instances. To use all available NICs, you must create VPC networks as follows:

Two regular VPC networks for the Titanium smart NICs.
- For A4X Max, these VPC networks use the Intel IDPF LAN PF device driver.
- For A4X, these VPC networks use the Google Virtual NIC (gVNIC) network interface.
One VPC network with the RoCE network profile is required for the ConnectX NICs when you create clusters of multiple A4X Max or A4X subblocks. The RoCE VPC network must have one subnet for each network rail. This means eight subnets for A4X Max instances and four subnets for A4X instances. If you use a single subblock, you can omit this VPC network because the multi-node NVLink fabric handles direct GPU-to-GPU communication.

To set up these networks, see Create VPC networks in the AI Hypercomputer documentation.
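To make the network and subnet counts concrete, the following Python sketch computes how many VPC networks and RoCE subnets the rules above call for. It is a planning aid only: the function name `plan_vpc_networks` and the dictionary keys are hypothetical, and the sketch doesn't call any Google Cloud API. It also treats the RoCE network as omitted for a single subblock, which this document describes as optional rather than mandatory.

```python
RAILS_PER_SERIES = {"A4X Max": 8, "A4X": 4}


def plan_vpc_networks(series: str, subblocks: int) -> dict[str, int]:
    """Return the number of VPC networks and RoCE subnets to create."""
    if series not in RAILS_PER_SERIES:
        raise ValueError(f"unsupported machine series: {series}")
    if subblocks < 1:
        raise ValueError("subblocks must be at least 1")

    # Two regular VPC networks are always needed for the Titanium smart NICs.
    plan = {"regular_networks": 2, "roce_networks": 0, "roce_subnets": 0}
    if subblocks > 1:
        # One RoCE VPC network with one subnet per network rail.
        plan["roce_networks"] = 1
        plan["roce_subnets"] = RAILS_PER_SERIES[series]
    return plan


print(plan_vpc_networks("A4X Max", subblocks=4))
print(plan_vpc_networks("A4X", subblocks=1))
```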

A4X Max and A4X machine types

A4X Max

Note: To get started with A4X Max machine types, contact your account team.

						Attached NVIDIA GB300 Grace Blackwell Ultra Superchips
Machine type	vCPU count¹	Instance memory (GB)	Attached Local SSD (GiB)	Physical NIC count	Maximum network bandwidth (Gbps)²	GPU count	GPU memory³ (GB HBM3e)
`a4x-maxgpu-4g-metal`	144	960	12,000	6	3,600	4	1,116

¹A vCPU is implemented as a single hardware hyper-thread on one of the available CPU platforms.
²Maximum egress bandwidth cannot exceed the number given. Actual egress bandwidth depends on the destination IP address and other factors. For more information about network bandwidth, see Network bandwidth.
³GPU memory is the memory on a GPU device that can be used for temporary storage of data. It is separate from the instance's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

A4X

Note: When provisioning A4X instances, you must reserve capacity to create instances and clusters. You can then create instances that use the features and services available from AI Hypercomputer. For more information, see Deployment options overview in the AI Hypercomputer documentation.

						Attached NVIDIA GB200 Grace Blackwell Superchips
Machine type	vCPU count¹	Instance memory (GB)	Attached Local SSD (GiB)	Physical NIC count	Maximum network bandwidth (Gbps)²	GPU count	GPU memory³ (GB HBM3e)
`a4x-highgpu-4g`	140	884	12,000	6	2,000	4	744

¹A vCPU is implemented as a single hardware hyper-thread on one of the available CPU platforms.
²Maximum egress bandwidth cannot exceed the number given. Actual egress bandwidth depends on the destination IP address and other factors. For more information about network bandwidth, see Network bandwidth.
³GPU memory is the memory on a GPU device that can be used for temporary storage of data. It is separate from the instance's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

A4 and A3 Ultra machine types

The A4 machine types have NVIDIA B200 GPUs attached and A3 Ultra machine types have NVIDIA H200 GPUs attached.

These machine types provide eight NVIDIA ConnectX-7 (CX-7) network interface cards (NICs) and two Google virtual NICs (gVNIC). The eight CX-7 NICs deliver a total network bandwidth of 3,200 Gbps. These NICs are dedicated only to high-bandwidth GPU-to-GPU communication and can't be used for other networking needs such as public internet access. As outlined in the following diagram, each CX-7 NIC is aligned with one GPU to optimize non-uniform memory access (NUMA). All eight GPUs can rapidly communicate with each other by using the all-to-all NVLink bridge that connects them. The two gVNIC network interfaces are smart NICs that provide an additional 400 Gbps of network bandwidth for general purpose networking requirements. Combined, the network interface cards provide a total maximum network bandwidth of 3,600 Gbps for these machines.

The high-performance GPU-to-GPU networking on A4 and A3 Ultra instances is implemented by using MRDMA Virtual Functions (VFs) for each of the eight ConnectX-7 NICs.

Network architecture for A4 and A3 Ultra showing eight CX-7 NICs for GPU communication and two gVNICs for general networking. — Figure 3. Network architecture for a single A4 or A3 Ultra host

To use these multiple NICs, you need to create three Virtual Private Cloud networks as follows:

Two regular VPC networks: each gVNIC must attach to a different VPC network.
One RoCE VPC network: all eight CX-7 NICs share the same RoCE VPC network.

To set up these networks, see Create VPC networks in the AI Hypercomputer documentation.

A4

Note: When provisioning A4 machine types, you must reserve capacity to create instances or clusters, use Spot VMs, use Flex-start VMs, or create a resize request in a MIG. For instructions on how to create A4 instances, see Create an A3 Ultra or A4 instance.

						Attached NVIDIA B200 Blackwell GPUs
Machine type	vCPU count¹	Instance memory (GB)	Attached Local SSD (GiB)	Physical NIC count	Maximum network bandwidth (Gbps)²	GPU count	GPU memory³ (GB HBM3e)
`a4-highgpu-8g`	224	3,968	12,000	10	3,600	8	1,440

¹A vCPU is implemented as a single hardware hyper-thread on one of the available CPU platforms.
²Maximum egress bandwidth cannot exceed the number given. Actual egress bandwidth depends on the destination IP address and other factors. For more information about network bandwidth, see Network bandwidth.
³GPU memory is the memory on a GPU device that can be used for temporary storage of data. It is separate from the instance's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

A3 Ultra

Note: When provisioning A3 Ultra machine types, you must reserve capacity to create instances or clusters, use Spot VMs, use Flex-start VMs, or create a resize request in a MIG. For more information about the parameters to set when creating an A3 Ultra instance, see Create an A3 Ultra or A4 instance.

						Attached NVIDIA H200 GPUs
Machine type	vCPU count¹	Instance memory (GB)	Attached Local SSD (GiB)	Physical NIC count	Maximum network bandwidth (Gbps)²	GPU count	GPU memory³ (GB HBM3e)
`a3-ultragpu-8g`	224	2,952	12,000	10	3,600	8	1,128

¹A vCPU is implemented as a single hardware hyper-thread on one of the available CPU platforms.
²Maximum egress bandwidth cannot exceed the number given. Actual egress bandwidth depends on the destination IP address and other factors. For more information about network bandwidth, see Network bandwidth.
³GPU memory is the memory on a GPU device that can be used for temporary storage of data. It is separate from the instance's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

A3 Mega, High, and Edge machine types

These machine types have H100 GPUs attached. Each of these machine types has a fixed GPU count, vCPU count, and memory size.

Single NIC A3 VMs: For A3 VMs with 1 to 4 GPUs attached, only a single physical network interface card (NIC) is available.
Multi-NIC A3 VMs: For A3 VMs with 8 GPUs attached, multiple physical NICs are available. For these A3 machine types, the NICs are arranged as follows on a Peripheral Component Interconnect Express (PCIe) bus:
- For the A3 Mega machine type: a NIC arrangement of 8+1 is available. With this arrangement, 8 NICs share the same PCIe bus, and 1 NIC resides on a separate PCIe bus.
- For the A3 High machine type: a NIC arrangement of 4+1 is available. With this arrangement, 4 NICs share the same PCIe bus, and 1 NIC resides on a separate PCIe bus.
- For the A3 Edge machine type: a NIC arrangement of 4+1 is available. With this arrangement, 4 NICs share the same PCIe bus, and 1 NIC resides on a separate PCIe bus. These 5 NICs provide a total network bandwidth of 400 Gbps for each VM.
NICs that share the same PCIe bus have a non-uniform memory access (NUMA) alignment of one NIC per two NVIDIA H100 GPUs. These NICs are ideal for dedicated high-bandwidth GPU-to-GPU communication. The physical NIC that resides on a separate PCIe bus is ideal for other networking needs. For instructions on how to set up networking for A3 High and A3 Edge VMs, see set up jumbo frame MTU networks.
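The NIC arrangements above can be captured in a small lookup, which also makes it easy to cross-check them against the physical NIC counts in the tables that follow. This Python sketch is illustrative only. The GPU and NIC index numbering in `shared_bus_nic_for_gpu` is hypothetical, and the pairing it models (one NIC per two GPUs) is applied here only to the 4+1 arrangement.

```python
# (NICs that share one PCIe bus, NICs on a separate PCIe bus)
NIC_ARRANGEMENT = {
    "a3-megagpu-8g": (8, 1),
    "a3-highgpu-8g": (4, 1),
    "a3-edgegpu-8g": (4, 1),
}


def physical_nic_count(machine_type: str) -> int:
    """Total physical NICs for an 8-GPU A3 machine type."""
    shared_bus, separate_bus = NIC_ARRANGEMENT[machine_type]
    return shared_bus + separate_bus


def shared_bus_nic_for_gpu(gpu_index: int) -> int:
    """Map a GPU (0-7) to a shared-bus NIC (0-3) in a 4+1 arrangement.

    Assumes, for illustration, that consecutive GPU pairs share a NIC.
    """
    if not 0 <= gpu_index < 8:
        raise ValueError("gpu_index must be between 0 and 7")
    return gpu_index // 2


for machine_type in NIC_ARRANGEMENT:
    print(machine_type, physical_nic_count(machine_type))
```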

A3 Mega

Note: When provisioning `a3-megagpu-8g` machine types, we recommend using a cluster of these instances and deploying with a scheduler such as Google Kubernetes Engine (GKE) or Slurm. For detailed instructions on either of these options, review the following:

To create a Google Kubernetes Engine cluster, see Deploy an A3 Mega cluster with GKE.
To create a Slurm cluster, see Deploy an A3 Mega Slurm cluster.

						Attached NVIDIA H100 GPUs
Machine type	vCPU count¹	Instance memory (GB)	Attached Local SSD (GiB)	Physical NIC count	Maximum network bandwidth (Gbps)²	GPU count	GPU memory³ (GB HBM3)
`a3-megagpu-8g`	208	1,872	6,000	9	1,800	8	640

A3 High

Note: When provisioning `a3-highgpu-1g`, `a3-highgpu-2g`, or `a3-highgpu-4g` machine types, you must create instances by using Spot VMs or Flex-start VMs. For detailed instructions on these options, review the following:

To create Spot VMs, set the provisioning model to `SPOT` when you create an accelerator-optimized VM.
To create Flex-start VMs, you can use one of the following methods:
- Create a standalone VM and set the provisioning model to `FLEX_START` when you create an accelerator-optimized VM.
- Create a resize request in a managed instance group (MIG). For instructions, see Create a MIG with GPU VMs.

						Attached NVIDIA H100 GPUs
Machine type	vCPU count¹	Instance memory (GB)	Attached Local SSD (GiB)	Physical NIC count	Maximum network bandwidth (Gbps)²	GPU count	GPU memory³ (GB HBM3)
`a3-highgpu-1g`	26	234	750	1	25	1	80
`a3-highgpu-2g`	52	468	1,500	1	50	2	160
`a3-highgpu-4g`	104	936	3,000	1	100	4	320
`a3-highgpu-8g`	208	1,872	6,000	5	1,000	8	640

A3 Edge

Note: To get started with A3 Edge instances, see Create an A3 VM with GPUDirect-TCPX enabled.

						Attached NVIDIA H100 GPUs
Machine type	vCPU count¹	Instance memory (GB)	Attached Local SSD (GiB)	Physical NIC count	Maximum network bandwidth (Gbps)²	GPU count	GPU memory³ (GB HBM3)
`a3-edgegpu-8g`	208	1,872	6,000	5	600 for asia-south1 and northamerica-northeast2; 400 for all other A3 Edge regions	8	640

¹A vCPU is implemented as a single hardware hyper-thread on one of the available CPU platforms.
²Maximum egress bandwidth cannot exceed the number given. Actual egress bandwidth depends on the destination IP address and other factors. For more information about network bandwidth, see Network bandwidth.
³GPU memory is the memory on a GPU device that can be used for temporary storage of data. It is separate from the instance's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

A2 machine types

Each A2 machine type has a fixed number of NVIDIA A100 40GB or NVIDIA A100 80GB GPUs attached. Each machine type also has a fixed vCPU count and memory size.

The A2 machine series is available in two types:

A2 Ultra: these machine types have A100 80GB GPUs and Local SSD disks attached.
A2 Standard: these machine types have A100 40GB GPUs attached.

A2 Ultra

					Attached NVIDIA A100 80GB GPUs
Machine type	vCPU count¹	Instance memory (GB)	Attached Local SSD (GiB)	Maximum network bandwidth (Gbps)²	GPU count	GPU memory³ (GB HBM2e)
`a2-ultragpu-1g`	12	170	375	24	1	80
`a2-ultragpu-2g`	24	340	750	32	2	160
`a2-ultragpu-4g`	48	680	1,500	50	4	320
`a2-ultragpu-8g`	96	1,360	3,000	100	8	640

A2 Standard

					Attached NVIDIA A100 40GB GPUs
Machine type	vCPU count¹	Instance memory (GB)	Local SSD supported	Maximum network bandwidth (Gbps)²	GPU count	GPU memory³ (GB HBM2)
`a2-highgpu-1g`	12	85	Yes	24	1	40
`a2-highgpu-2g`	24	170	Yes	32	2	80
`a2-highgpu-4g`	48	340	Yes	50	4	160
`a2-highgpu-8g`	96	680	Yes	100	8	320
`a2-megagpu-16g`	96	1,360	Yes	100	16	640

¹A vCPU is implemented as a single hardware hyper-thread on one of the available CPU platforms.
²Maximum egress bandwidth cannot exceed the number given. Actual egress bandwidth depends on the destination IP address and other factors. For more information about network bandwidth, see Network bandwidth.
³GPU memory is the memory on a GPU device that can be used for temporary storage of data. It is separate from the instance's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

G4 machine types

Preview

This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

G4 accelerator-optimized machine types use NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs (nvidia-rtx-pro-6000) and are suitable for NVIDIA Omniverse simulation workloads, graphics-intensive applications, video transcoding, and virtual desktops. G4 machine types also provide a low-cost solution for performing single host inference and model tuning compared with A series machine types.

Note: To get started with G4 instances, see Create a G4 instance.

						Attached NVIDIA RTX PRO 6000 GPUs
Machine type	vCPU count¹	Instance memory (GB)	Maximum Titanium SSD supported (GiB)²	Physical NIC count	Maximum network bandwidth (Gbps)³	GPU count	GPU memory⁴ (GB GDDR7)
`g4-standard-48`	48	180	1,500	1	50	1	96
`g4-standard-96`	96	360	3,000	1	100	2	192
`g4-standard-192`	192	720	6,000	1	200	4	384
`g4-standard-384`	384	1,440	12,000	2	400	8	768

¹A vCPU is implemented as a single hardware hyper-thread on one of the available CPU platforms.
²You can add Titanium SSD disks when creating a G4 instance. For the number of disks you can attach, see Machine types that require you to choose a number of Local SSD disks.
³Maximum egress bandwidth cannot exceed the number given. Actual egress bandwidth depends on the destination IP address and other factors. See Network bandwidth.
⁴GPU memory is the memory on a GPU device that can be used for temporary storage of data. It is separate from the instance's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

G2 machine types

G2 accelerator-optimized machine types have NVIDIA L4 GPUs attached and are ideal for cost-optimized inference, graphics-intensive, and high performance computing workloads.

Each G2 machine type also has a default memory and a custom memory range. The custom memory range defines the amount of memory that you can allocate to your instance for each machine type. You can also add Local SSD disks when creating a G2 instance. For the number of disks you can attach, see Machine types that require you to choose a number of Local SSD disks.

To get the higher network bandwidth rates (50 Gbps or higher) applied to most GPU instances, we recommend that you use Google Virtual NIC (gVNIC). For more information about creating GPU instances that use gVNIC, see Creating GPU instances that use higher bandwidths.

						Attached NVIDIA L4 GPUs
Machine type	vCPU count¹	Default instance memory (GB)	Custom instance memory range (GB)	Max Local SSD supported (GiB)	Maximum network bandwidth (Gbps)²	GPU count	GPU memory³ (GB GDDR6)
`g2-standard-4`	4	16	16 to 32	375	10	1	24
`g2-standard-8`	8	32	32 to 54	375	16	1	24
`g2-standard-12`	12	48	48 to 54	375	16	1	24
`g2-standard-16`	16	64	54 to 64	375	32	1	24
`g2-standard-24`	24	96	96 to 108	750	32	2	48
`g2-standard-32`	32	128	96 to 128	375	32	1	24
`g2-standard-48`	48	192	192 to 216	1,500	50	4	96
`g2-standard-96`	96	384	384 to 432	3,000	100	8	192

¹A vCPU is implemented as a single hardware hyper-thread on one of the available CPU platforms.
²Maximum egress bandwidth cannot exceed the number given. Actual egress bandwidth depends on the destination IP address and other factors. For more information about network bandwidth, see Network bandwidth.
³GPU memory is the memory on a GPU device that can be used for temporary storage of data. It is separate from the instance's memory and is specifically designed to handle the higher bandwidth demands of your graphics-intensive workloads.

N1 + GPU machine types

For N1 general-purpose virtual machine (VM) instances that have T4 and V100 GPUs attached, you can get a maximum network bandwidth of up to 100 Gbps, based on the combination of GPU and vCPU count. For all other N1 GPU instances, see Overview.

Review the following section to calculate the maximum network bandwidth that is available for your T4 and V100 instances based on the GPU model, vCPU, and GPU count.

5 vCPUs or fewer

For T4 and V100 instances that have 5 vCPUs or fewer, a maximum network bandwidth of 10 Gbps is available.

More than 5 vCPUs

For T4 and V100 instances that have more than 5 vCPUs, maximum network bandwidth is calculated based on the number of vCPUs and GPUs for that VM.

GPU model	Number of GPUs	Maximum network bandwidth calculation
NVIDIA V100	1	`min(vcpu_count * 2, 32)`
	2	`min(vcpu_count * 2, 32)`
	4	`min(vcpu_count * 2, 50)`
	8	`min(vcpu_count * 2, 100)`
NVIDIA T4	1	`min(vcpu_count * 2, 32)`
	2	`min(vcpu_count * 2, 50)`
	4	`min(vcpu_count * 2, 100)`
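The two rules for N1 instances with T4 or V100 GPUs (a flat 10 Gbps at 5 vCPUs or fewer, and the `min(...)` formulas in the table otherwise) can be combined into one small calculator. The following Python sketch is a direct transcription of those rules. The function name `n1_max_bandwidth_gbps` and the short model keys `"V100"` and `"T4"` are hypothetical.

```python
# Bandwidth caps in Gbps, keyed by (GPU model, number of GPUs), from the table above.
BANDWIDTH_CAP_GBPS = {
    ("V100", 1): 32,
    ("V100", 2): 32,
    ("V100", 4): 50,
    ("V100", 8): 100,
    ("T4", 1): 32,
    ("T4", 2): 50,
    ("T4", 4): 100,
}


def n1_max_bandwidth_gbps(gpu_model: str, gpu_count: int, vcpu_count: int) -> int:
    """Maximum network bandwidth for an N1 instance with T4 or V100 GPUs."""
    key = (gpu_model, gpu_count)
    if key not in BANDWIDTH_CAP_GBPS:
        raise ValueError(f"no bandwidth rule for {gpu_count} x {gpu_model}")
    if vcpu_count <= 5:
        return 10
    return min(vcpu_count * 2, BANDWIDTH_CAP_GBPS[key])


print(n1_max_bandwidth_gbps("V100", 1, 8))   # min(16, 32)
print(n1_max_bandwidth_gbps("T4", 4, 64))    # min(128, 100)
print(n1_max_bandwidth_gbps("T4", 1, 4))     # 5 vCPUs or fewer
```

For example, a single V100 with 8 vCPUs is limited by its vCPU count (16 Gbps), while four T4 GPUs with 64 vCPUs reach the 100 Gbps cap.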

MTU settings and GPU machine types

To increase network throughput, set a higher maximum transmission unit (MTU) value for your VPC networks. Higher MTU values increase the packet size and reduce the packet-header overhead, which in turn increases payload data throughput.
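To see why a larger MTU reduces header overhead, the following Python sketch estimates the fraction of bytes on the wire that carry payload. It is a simplified model, not a measurement: it assumes 40 bytes of IPv4 and TCP headers without options inside each packet and 38 bytes of Ethernet framing around it, and it ignores any encapsulation that the virtual network adds. The function name `payload_fraction` is hypothetical, and 1460 is used only as a smaller MTU for comparison.

```python
IP_TCP_HEADER_BYTES = 40  # assumption: 20-byte IPv4 header + 20-byte TCP header
ETHERNET_WIRE_BYTES = 38  # assumption: 14 header + 4 FCS + 8 preamble + 12 inter-frame gap


def payload_fraction(mtu: int) -> float:
    """Approximate share of on-the-wire bytes that are application payload."""
    payload = mtu - IP_TCP_HEADER_BYTES
    on_wire = mtu + ETHERNET_WIRE_BYTES
    return payload / on_wire


for mtu in (1460, 8244, 8896):
    print(f"MTU {mtu}: {payload_fraction(mtu):.1%} payload")
```

Under these assumptions, the payload share rises from roughly 95% at an MTU of 1460 to roughly 99% at 8896, because the fixed per-packet cost is spread over more data.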

For GPU machine types, we recommend the following MTU settings for your VPC networks.

GPU machine type	Recommended MTU for regular VPC network (bytes)	Recommended MTU for RoCE VPC network (bytes)
A4X Max, A4X, A4, A3 Ultra	8896	8896
A3 Mega, A3 High, A3 Edge	8244	N/A
A2 Standard, A2 Ultra, G4, G2, N1 machine types that support GPUs	8896	N/A

When setting the MTU value, note the following:

8192 is two 4 KB pages.
8244 is recommended in A3 Mega, A3 High, and A3 Edge VMs for GPU NICs that have header split enabled.
Use a value of 8896 unless otherwise indicated in the table.
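The relationship between the 8192 and 8244 values in the notes above can be checked with simple arithmetic. The following Python sketch splits an MTU into a page-aligned part and the remainder. Reading the remaining 52 bytes as room for packet headers that header split places outside the payload pages is our interpretation; this document states only the two values. The function name `split_mtu` is hypothetical.

```python
PAGE_BYTES = 4096  # one 4 KB page


def split_mtu(mtu: int, payload_pages: int = 2) -> tuple[int, int]:
    """Return (page-aligned bytes, remaining bytes) for an MTU value."""
    page_aligned = payload_pages * PAGE_BYTES
    if mtu < page_aligned:
        raise ValueError("MTU is smaller than the requested number of pages")
    return page_aligned, mtu - page_aligned


page_aligned, remainder = split_mtu(8244)
print(page_aligned, remainder)
```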

Create high bandwidth GPU machines

To create GPU instances that use higher network bandwidths, use one of the following methods based on the machine type:

To create A2, G2, and N1 instances that use higher network bandwidths, see Use higher network bandwidth for A2, G2, and N1 instances. To test or verify the bandwidth speed for these machines, you can use the benchmarking test. For more information, see Checking network bandwidth.
To create A3 Mega instances that use higher network bandwidths, see Deploy an A3 Mega Slurm cluster for ML training. To test or verify the bandwidth speed for these machines, use a benchmarking test by following the steps in Checking network bandwidth.
For A3 High and A3 Edge instances that use higher network bandwidths, see Create an A3 VM with GPUDirect-TCPX enabled. To test or verify the bandwidth speed for these machines, you can use the benchmarking test. For more information, see Checking network bandwidth.
For other accelerator-optimized machine types, no action is required to use higher network bandwidth; creating an instance as documented already uses high network bandwidth. To learn how to create instances for other accelerator-optimized machine types, see Create a VM that has attached GPUs.

What's next?

Learn more about GPU platforms.
Learn how to create instances with attached GPUs.
Learn how to use higher network bandwidth.
Learn about GPU pricing.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.
