About GPUs on Google Cloud
Google Cloud is focused on delivering world-class artificial intelligence (AI) infrastructure to power your most demanding GPU-accelerated workloads across a wide range of segments. You can use GPUs on Google Cloud to run AI, machine learning (ML), scientific, analytics, engineering, consumer, and enterprise applications.
Through our partnership with NVIDIA, Google Cloud delivers the latest GPUs while optimizing the software stack with a wide array of storage and networking options. For a full list of available GPUs, see GPU platforms.
The following sections outline the benefits of GPUs on Google Cloud.
GPU-accelerated VMs
On Google Cloud, you can access and provision GPUs in the way that best suits your needs. A specialized accelerator-optimized machine family is available, with pre-attached GPUs and networking capabilities that are ideal for maximizing performance. These machine types are available in the A4X Max, A4X, A4, A3, A2, G4, and G2 machine series.
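If you want to check which GPU models a given zone offers before provisioning, the Compute Engine API exposes this directly. The following is a minimal sketch using the google-cloud-compute Python client; the project ID and zone are placeholder values, not ones from this page.

```python
# Minimal sketch: list the GPU models available in one zone.
# Assumes google-cloud-compute is installed and credentials are configured.
from google.cloud import compute_v1

PROJECT_ID = "my-project"  # placeholder: your project ID
ZONE = "us-central1-a"     # placeholder: any zone you plan to deploy in

client = compute_v1.AcceleratorTypesClient()

# Each AcceleratorType describes one GPU model offered in the zone.
for accelerator in client.list(project=PROJECT_ID, zone=ZONE):
    print(accelerator.name, "-", accelerator.description)
```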
Multiple provisioning options
You can provision clusters by using the accelerator-optimized machine family with any of the following open-source or Google Cloud products.
Vertex AI
Vertex AI is a fully managed machine learning (ML) platform that you can use to train and deploy ML models and AI applications. In Vertex AI applications, you can use GPU-accelerated VMs to improve performance in the following ways (a minimal training example follows this list):
- Use GPU-enabled VMs in custom training worker pools.
- Use open source LLM models from the Vertex AI Model Garden.
- Reduce prediction latency.
- Improve the performance of Vertex AI Workbench notebook code.
- Improve the performance of a Colab Enterprise runtime.
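As an illustration of the first item, here is a minimal sketch of a GPU-accelerated custom training job using the google-cloud-aiplatform SDK. The project, bucket, and container image are placeholders; the accelerator settings show the general pattern rather than a recommended configuration.

```python
# Minimal sketch: run a custom training job on a GPU-accelerated worker pool.
from google.cloud import aiplatform

aiplatform.init(
    project="my-project",                    # placeholder
    location="us-central1",
    staging_bucket="gs://my-staging-bucket", # placeholder
)

job = aiplatform.CustomContainerTrainingJob(
    display_name="gpu-training-example",
    # Placeholder: a training image you have pushed to Artifact Registry.
    container_uri="us-docker.pkg.dev/my-project/my-repo/trainer:latest",
)

# Request an A2 machine with one NVIDIA A100 attached for the worker pool.
job.run(
    replica_count=1,
    machine_type="a2-highgpu-1g",
    accelerator_type="NVIDIA_TESLA_A100",
    accelerator_count=1,
)
```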
AI Hypercomputer
AI Hypercomputer is a supercomputing system that is optimized to support your artificial intelligence (AI) and machine learning (ML) workloads. It's an integrated system of performance-optimized hardware, open software, ML frameworks, and flexible consumption models. AI Hypercomputer provides features and services that are designed to let you deploy and manage large numbers, up to tens of thousands, of accelerator and networking resources that function as a single homogeneous unit. This option is ideal for creating a densely allocated, performance-optimized infrastructure that has integrations for Google Kubernetes Engine (GKE) and Slurm schedulers. For more information, see the AI Hypercomputer overview.
To get started with Cluster Director, see Choose a deployment strategy.
Compute Engine
You can also create and manage individual VMs or small clusters of VMs with attached GPUs on Compute Engine. This method is mostly used for running graphics-intensive workloads, simulation workloads, or small-scale ML model training.
You can use any of the following methods to create VMs that have GPUs attached (a minimal example of the single-VM path follows the list):
- Create a VM for serving and single-node workloads
- Create managed instance groups (MIGs)
- Create VMs in bulk
- Create a single VM
- Create virtual workstations
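As a sketch of the single-VM path, the following example creates one accelerator-optimized VM with the google-cloud-compute Python client. All names are placeholders, and it assumes your project has quota for A2 machine types in the chosen zone; installing GPU drivers on the resulting VM is out of scope here.

```python
# Minimal sketch: create a single VM whose machine type includes a GPU.
from google.cloud import compute_v1

PROJECT_ID = "my-project"  # placeholder
ZONE = "us-central1-a"     # placeholder

instance = compute_v1.Instance(
    name="gpu-vm-example",
    # a2-highgpu-1g is an accelerator-optimized type with one A100 pre-attached,
    # so no separate guest accelerator configuration is needed.
    machine_type=f"zones/{ZONE}/machineTypes/a2-highgpu-1g",
    disks=[
        compute_v1.AttachedDisk(
            boot=True,
            auto_delete=True,
            initialize_params=compute_v1.AttachedDiskInitializeParams(
                source_image="projects/debian-cloud/global/images/family/debian-12",
                disk_size_gb=100,
            ),
        )
    ],
    network_interfaces=[
        compute_v1.NetworkInterface(network="global/networks/default")
    ],
    # GPU VMs can't live-migrate, so host maintenance must terminate the VM.
    scheduling=compute_v1.Scheduling(
        on_host_maintenance="TERMINATE",
        automatic_restart=True,
    ),
)

operation = compute_v1.InstancesClient().insert(
    project=PROJECT_ID, zone=ZONE, instance_resource=instance
)
operation.result()  # block until the create operation completes
```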
Cloud Run
You can configure GPUs for your Cloud Run instances. GPUs are ideal for running AI inference workloads that use large language models on Cloud Run.
To run AI workloads on GPUs in Cloud Run, consult these resources:
- Configure GPUs for a Cloud Run service
- Load large ML models on Cloud Run with GPUs
- Tutorial: Run LLM inference on Cloud Run GPUs with Ollama
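As a sketch of what the Ollama tutorial's result looks like from the client side, the following assumes you've already deployed a GPU-enabled Cloud Run service running Ollama; the service URL and model name are placeholders.

```python
# Minimal sketch: send one inference request to an Ollama server that is
# running on a GPU-enabled Cloud Run service.
import requests

SERVICE_URL = "https://my-ollama-service-abc123-uc.a.run.app"  # placeholder

response = requests.post(
    f"{SERVICE_URL}/api/generate",  # Ollama's generate endpoint
    json={
        "model": "gemma3",  # placeholder: any model pulled into the image
        "prompt": "Why use GPUs for LLM inference?",
        "stream": False,    # return one JSON object instead of a stream
    },
    timeout=300,
)
response.raise_for_status()
print(response.json()["response"])
```

If the service requires authentication, you would also need to send an identity token in the Authorization header, which this sketch omits.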