AI Zones

AI zones are specialized zones used for Artificial Intelligence and Machine Learning (AI and ML) training and inference workloads. They provide significant ML accelerator (GPU and TPU) capacity.

Within a region, AI zones are geographically located away from standard (non-AI) zones. The following figure shows an example of an AI zone (us-central1-ai1a) that is located farther away, relative to the standard zones in the us-central1 region.

Note: AI zones meet the security and data residency requirements of their region, similar to the standard zones. For more information, see Geographic management of data.

Parent zone

Each AI zone is associated with a standard zone in the region, referred to as its parent zone. A parent zone is a standard zone with the same suffix as the AI zone. For example, in the diagram, us-central1-a is the parent zone of us-central1-ai1a. An AI zone and its parent zone share software update schedules and sometimes infrastructure. This means that any software or infrastructure issues affecting a parent zone could also affect the AI zone. When designing your high availability solutions, review the High availability (HA) considerations to account for the dependency on the parent zone.
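
The suffix rule can be sketched in a few lines of Python. This is an illustration only: the pattern `<region>-ai<number><suffix>` is an assumption inferred from the two AI zone names listed on this page, and `parent_zone` is a hypothetical helper, not part of any Google Cloud library.

```python
import re
from typing import Optional

# Assumed naming pattern, inferred only from the documented examples
# (us-central1-ai1a, us-south1-ai1b): <region>-ai<number><suffix letter>.
_AI_ZONE = re.compile(r"^(?P<region>[a-z]+-[a-z]+\d+)-ai\d+(?P<suffix>[a-z])$")


def parent_zone(zone: str) -> Optional[str]:
    """Return the parent zone for an AI zone name, or None for any other name."""
    match = _AI_ZONE.match(zone)
    if match is None:
        return None
    # The parent is the standard zone in the same region with the same suffix.
    return f"{match['region']}-{match['suffix']}"


print(parent_zone("us-central1-ai1a"))  # us-central1-a
print(parent_zone("us-south1-ai1b"))    # us-south1-b
print(parent_zone("us-central1-a"))     # None (a standard zone)
```

The results for the two documented AI zones agree with the Parent zone column in the Locations table.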

When to use AI zones

AI zones are optimized for AI and ML workloads. Use the following guidance to determine which of your workloads are best suited for AI zones and which are better served by standard zones.

Recommended for:

Large-scale training: Ideal for large-scale training workloads, such as Large Language Model (LLM) and foundational model training, because of the availability of a large number of accelerators.
Small-scale training, fine-tuning, bulk inference, and retraining: AI zones perform well for workloads that require substantial accelerator capacity.
Real-time ML inference: AI zones support real-time inference workloads. Performance depends on the application design and model latency requirements, especially if the workload requires round-trip requests to the parent region.

Not recommended for:

Non-ML workloads: Because AI zones do not offer all Google Cloud services locally, we recommend running your non-ML workloads in the standard zones.

Access services from an AI zone

You can access all Google Cloud products in a Google Cloud region from its AI zone. However, accessing services in a Google Cloud region from an AI zone can add network latency, as the AI zone is physically separate from the locations of the region's standard zones.

Specific products support creating or accessing zonal resources locally in an AI zone. For more information about these services, see the following table:

Product	Description	More information
Google Kubernetes Engine (GKE)	Setup for using AI zones in GKE clusters, including configuration using ComputeClasses, node auto-provisioning, and GKE Standard node pools.	Using AI zones in GKE
Cloud Storage	Configuration of object storage for workloads in AI zones, including zonal storage to maximize performance during active jobs and persistent storage for datasets and model checkpoints.	Use AI zones with Cloud Storage
Compute Engine	Methods to identify available AI zones using the console, Google Cloud CLI, and REST API, including how to filter by naming convention, accelerator type, or machine type.	Find available AI zones
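
As a rough illustration of filtering by naming convention, the following Python sketch picks the AI zones out of a list of zone names and groups them by region. The pattern is an assumption inferred from the AI zone names on this page, and `ai_zones_by_region` is a hypothetical helper. The input list is hard-coded here; in practice it could come from any zone listing, for example the output of `gcloud compute zones list --format="value(name)"`.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Assumed naming convention for AI zones: <region>-ai<number><suffix letter>.
AI_ZONE = re.compile(r"^(?P<region>[a-z]+-[a-z]+\d+)-ai\d+[a-z]$")


def ai_zones_by_region(zone_names: Iterable[str]) -> Dict[str, List[str]]:
    """Keep only names that look like AI zones and group them by region."""
    grouped = defaultdict(list)
    for name in zone_names:
        match = AI_ZONE.match(name)
        if match:
            grouped[match["region"]].append(name)
    return dict(grouped)


# Sample input mixing standard zones and the AI zones listed on this page.
zones = [
    "us-central1-a",
    "us-central1-b",
    "us-central1-ai1a",
    "us-south1-b",
    "us-south1-ai1b",
]
print(ai_zones_by_region(zones))
# {'us-central1': ['us-central1-ai1a'], 'us-south1': ['us-south1-ai1b']}
```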

Locations

Important: AI zones are currently restricted by an allowlist.

AI zones are available in the following locations:

AI zone	AI zone location	Google Cloud region	Google Cloud region location	Parent zone
`us-south1-ai1b`	Austin, Texas, North America	`us-south1`	Dallas, Texas, North America	`us-south1-b`
`us-central1-ai1a`	Lincoln, Nebraska, North America	`us-central1`	Council Bluffs, Iowa, North America	`us-central1-a`

Using AI zones

AI zones are accessible through the Google Cloud console, Google Cloud CLI, or REST. However, when using the Google Cloud console to create your VMs, you must manually select an AI zone. Unlike a standard zone, an AI zone isn't selected for you automatically.

To use AI zones with the following features, you must explicitly select an AI zone while you are setting up these resources.

Certain Compute Engine and GKE features: AI zones are not automatically selected in certain Compute Engine and GKE regional features (for example, Regional Managed Instance Groups, Regional GKE clusters). For more details about GKE, refer to the GKE documentation.
Non-accelerator workload restrictions: When you run CPU-only VMs in AI zones, be aware of Compute Engine-enforced restrictions. These might include requirements for GPU:CPU ratios and reservations.
Vertex AI: GKE-based Vertex AI regional products must configure GKE to include AI zones in regional clusters. You don't need to opt in to Vertex AI. Vertex AI manages this configuration.
Google Cloud Service Metadata Locations API: You must enable the --extraLocationTypes flag when using the locations.list API for AI zones to be listed. This ensures that AI zones appear only to those who intend to use them.

Using AI zones in GKE

By default, GKE doesn't deploy your workloads in AI zones. To use an AI zone, you configure one of the following options:

ComputeClasses: Set your highest priority to request on-demand TPUs in an AI zone. ComputeClasses help you define a prioritized list of hardware configurations for your workloads. For an example, see About ComputeClasses.
Node auto-provisioning: Use a nodeSelector or nodeAffinity in your Pod specification to instruct node auto-provisioning to create a node pool in the AI zone. If your workload doesn't explicitly target an AI zone, node auto-provisioning considers only standard zones when creating new node pools. This configuration ensures that workloads that don't run AI/ML models remain in standard zones unless you explicitly configure otherwise. For an example of a manifest that uses a nodeSelector, see Set the default zones for auto-created nodes.
GKE Standard: If you directly manage your node pools, use an AI zone in the --node-locations flag when you create a node pool. For an example, see Deploy TPU workloads in GKE Standard.
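
To make the nodeSelector option more concrete, the following Python sketch builds a minimal Pod manifest that targets an AI zone and prints it as JSON. Treat it as an assumption-laden illustration: it uses the well-known Kubernetes label `topology.kubernetes.io/zone`, which this page does not confirm as the label that node auto-provisioning expects, and the Pod name, image, and helper function are hypothetical. A real accelerator workload also needs resource requests for its GPUs or TPUs, which are omitted here.

```python
import json
from typing import Any, Dict


def pod_manifest_for_zone(name: str, image: str, zone: str) -> Dict[str, Any]:
    """Build a minimal Pod manifest whose nodeSelector pins it to one zone."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Assumption: the standard Kubernetes zone label is used to
            # target the AI zone explicitly.
            "nodeSelector": {"topology.kubernetes.io/zone": zone},
            "containers": [{"name": name, "image": image}],
        },
    }


# Hypothetical Pod name and image; the zone is one of the documented AI zones.
manifest = pod_manifest_for_zone(
    name="trainer",
    image="example.com/trainer:latest",
    zone="us-central1-ai1a",
)
print(json.dumps(manifest, indent=2))
```

Because the zone is passed in explicitly, a workload built this way never lands in an AI zone by accident, which matches the default behavior described above.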

Limitations

The following are not available in AI zones:

Design considerations with AI zones

Consider the following when designing your applications to use AI zones.

High availability (HA) considerations

AI zones share software rollouts and infrastructure with their parent zones. To ensure high availability for your workloads, avoid these deployment patterns when you select zones, whether automatically or manually:

Avoid deploying HA workloads across an AI zone and its parent zone.
Avoid deploying HA workloads across two AI zones that share the same parent zone.
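
Both patterns come down to the same check: no two zones in an HA deployment should resolve to the same parent zone. The following Python sketch shows one way to flag such pairs. It assumes the naming pattern `<region>-ai<number><suffix>` inferred from the AI zone names on this page; `failure_domain` and `shared_parent_pairs` are hypothetical helpers, and `us-central1-ai2a` is a made-up zone name used only to illustrate the second pattern.

```python
import re
from itertools import combinations
from typing import Iterable, List, Tuple

# Assumed naming pattern for AI zones: <region>-ai<number><suffix letter>.
_AI_ZONE = re.compile(r"^(?P<region>[a-z]+-[a-z]+\d+)-ai\d+(?P<suffix>[a-z])$")


def failure_domain(zone: str) -> str:
    """Map an AI zone to its parent zone; any other zone maps to itself."""
    match = _AI_ZONE.match(zone)
    return f"{match['region']}-{match['suffix']}" if match else zone


def shared_parent_pairs(zones: Iterable[str]) -> List[Tuple[str, str]]:
    """Return every pair of distinct zones that depends on the same parent zone."""
    unique = sorted(set(zones))
    return [
        (a, b)
        for a, b in combinations(unique, 2)
        if failure_domain(a) == failure_domain(b)
    ]


# Pattern 1: an AI zone deployed together with its parent zone.
print(shared_parent_pairs(["us-central1-a", "us-central1-ai1a", "us-central1-b"]))
# [('us-central1-a', 'us-central1-ai1a')]

# Pattern 2: two AI zones with the same parent (us-central1-ai2a is hypothetical).
print(shared_parent_pairs(["us-central1-ai1a", "us-central1-ai2a"]))
# [('us-central1-ai1a', 'us-central1-ai2a')]

# A zone set with no shared parent produces no warnings.
print(shared_parent_pairs(["us-central1-ai1a", "us-central1-b"]))
# []
```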

Storage best practices

We recommend a tiered storage architecture to balance cost, durability, and performance:

Cold storage layer: Use regional Cloud Storage buckets in standard zones for persistent, highly durable storage of your training datasets and model checkpoints.
Performance layer: Use specialized zonal storage services to act as a high-speed cache or temporary scratch space. This approach eliminates inter-zonal latency and maximizes goodput during active jobs.
To help ensure that GPUs and TPUs remain fully saturated, maximizing goodput, provision your performance layer in the same AI zone as your compute resources.

The following storage solutions are recommended for optimizing AI and ML system performance with AI zones:

Storage service	Description	Use cases
Anywhere Cache feature of Cloud Storage	A fully managed, SSD-backed zonal read cache that brings frequently read data from a bucket into the AI zone.	Recommended for: read-heavy workloads; low-latency model training and serving. Not recommended for: applications that require full POSIX compliance.