Run a pipeline with TPUs
This page explains how to run an Apache Beam pipeline on Dataflow with TPUs. Jobs that use TPUs incur charges as specified in the Dataflow pricing page.
For more information about using TPUs with Dataflow, see Dataflow support for TPUs.
Optional: Make a specific reservation to use accelerators
While you can use TPUs on-demand, we strongly recommend that you use Dataflow TPUs with specifically targeted Google Cloud reservations. This helps to ensure that you have access to available accelerators and quick worker startup times. Pipelines that consume a TPU reservation don't require additional TPU quota.
If you don't make a reservation and choose to use TPUs on-demand, provision TPU quota before you run your pipeline.
Optional: Provision TPU quota
You can use TPUs on-demand or through a reservation. If you want to use TPUs on-demand, you must first provision TPU quota. If you use a specifically targeted reservation, you can skip this section.
To use TPUs on-demand without a reservation, check the limit and current usage of your Compute Engine API quota for TPUs as follows:
Console
Go to the Quotas page in the Google Cloud console.

In the Filter box, do the following:

Use the following table to select and copy the property of the quota based on the TPU version and machine type. For example, if you plan to create on-demand TPU v5e nodes whose machine type begins with ct5lp-, enter Name: TPU v5 Lite PodSlice chips.

| TPU version, machine type begins with | Property and name of the quota for on-demand instances |
|---|---|
| TPU v5e, ct5lp- | Name: TPU v5 Lite PodSlice chips |
| TPU v5p, ct5p- | Name: TPU v5p chips |
| TPU v6e, ct6e- | Dimensions (e.g. location): tpu_family:CT6E |

Select the Dimensions (e.g. location) property and enter region: followed by the name of the region in which you plan to start your pipeline. For example, enter region:us-west4 if you plan to use the zone us-west4-a. TPU quota is regional, so all zones within the same region consume the same TPU quota.
Configure a custom container image
To interact with TPUs in Dataflow pipelines, you need to provide software that can operate on XLA devices in your pipeline runtime environment. This requires installing TPU libraries based on your pipeline needs and configuring environment variables based on the TPU device you use.
To customize the container image, install Apache Beam into an off-the-shelf base image that has the necessary TPU libraries. Alternatively, install the TPU software into the images published with Apache Beam SDK releases.
To provide a custom container image, use the sdk_container_image pipeline option. For more information, see Use custom containers in Dataflow.
When you use a TPU accelerator, you need to set the following environment variables in the container image.
```dockerfile
ENV TPU_SKIP_MDS_QUERY=1            # Don't query metadata
ENV TPU_HOST_BOUNDS=1,1,1           # There's only one host
ENV TPU_WORKER_HOSTNAMES=localhost
ENV TPU_WORKER_ID=0                 # Always 0 for single-host TPUs
```

Depending on the accelerator you use, the variables in the following table also need to be set.
| TPU type | Topology | Required Dataflow worker_machine_type | Additional environment variables |
|---|---|---|---|
| tpu-v5-lite-podslice | 1x1 | ct5lp-hightpu-1t | TPU_ACCELERATOR_TYPE=v5litepod-1 |
| tpu-v5-lite-podslice | 2x2 | ct5lp-hightpu-4t | TPU_ACCELERATOR_TYPE=v5litepod-4 |
| tpu-v5-lite-podslice | 2x4 | ct5lp-hightpu-8t | TPU_ACCELERATOR_TYPE=v5litepod-8 |
| tpu-v6e-slice | 1x1 | ct6e-standard-1t | TPU_ACCELERATOR_TYPE=v6e-1 |
| tpu-v6e-slice | 2x2 | ct6e-standard-4t | TPU_ACCELERATOR_TYPE=v6e-4 |
| tpu-v6e-slice | 2x4 | ct6e-standard-8t | TPU_ACCELERATOR_TYPE=v6e-8 |
| tpu-v5p-slice | 2x2x1 | ct5p-hightpu-4t | TPU_ACCELERATOR_TYPE=v5p-8 |
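For example, combining the common variables with the row for a v6e 1x1 slice from the table gives the following Dockerfile fragment. This is a sketch for one accelerator choice; pick the TPU_ACCELERATOR_TYPE value that matches your own row in the table.

```dockerfile
# Common single-host TPU settings (the same for every accelerator).
ENV TPU_SKIP_MDS_QUERY=1
ENV TPU_HOST_BOUNDS=1,1,1
ENV TPU_WORKER_HOSTNAMES=localhost
ENV TPU_WORKER_ID=0

# Accelerator-specific setting for tpu-v6e-slice, topology 1x1
# (ct6e-standard-1t workers), taken from the table above.
ENV TPU_ACCELERATOR_TYPE=v6e-1
```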
A sample Dockerfile for the custom container image might look like the following example:
```dockerfile
FROM python:3.11-slim

COPY --from=apache/beam_python3.11_sdk:2.66.0 /opt/apache/beam /opt/apache/beam

# Configure the environment to access the TPU device.
ENV TPU_SKIP_MDS_QUERY=1
ENV TPU_HOST_BOUNDS=1,1,1
ENV TPU_WORKER_HOSTNAMES=localhost
ENV TPU_WORKER_ID=0

# Configure the environment for the chosen accelerator.
# Adjust according to the accelerator you use.
ENV TPU_ACCELERATOR_TYPE=v5litepod-1
ENV TPU_CHIPS_PER_HOST_BOUNDS=1,1,1

# Install the TPU software stack.
RUN pip install jax[tpu] apache-beam[gcp]==2.66.0 -f https://storage.googleapis.com/jax-releases/libtpu_releases.html

ENTRYPOINT ["/opt/apache/beam/boot"]
```

Run your job with TPUs
The considerations for running a Dataflow job with TPUs include the following:
- Because TPU containers can be large, increase the default boot disk size to 50 gigabytes, or to an appropriate size for your container image, by using the --disk_size_gb pipeline option. This helps you avoid running out of disk space.
- Limit intra-worker parallelism.
TPUs and worker parallelism
In the default configuration, Dataflow Python pipelines launch one Apache Beam SDK process per VM core. TPU machine types have a large number of vCPU cores, but only one process may perform computations on a TPU device. Additionally, a TPU device might be reserved by a process for the lifetime of the process. Therefore, you must limit intra-worker parallelism when running a Dataflow TPU pipeline. To limit worker parallelism, use the following guidance:
- If your use case involves running inferences on a model, use the Beam RunInference API. For more information, see Large Language Model Inference in Beam.
- If you cannot use the Beam RunInference API, use Beam's multi-process shared objects to restrict certain operations to a single process.
- If you cannot use the preceding recommendations and prefer to launch only one Python process per worker, set the --experiments=no_use_multiple_sdk_containers pipeline option.
- You can further reduce the number of threads by using the --number_of_worker_harness_threads pipeline option if that achieves better performance.
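If you take the pipeline-option route, it can help to assemble the parallelism-related flags in one place. The following is a minimal sketch; the helper name and the image path are illustrative, not part of the Beam API, and the flags themselves are the documented Dataflow options discussed above.

```python
def tpu_worker_parallelism_args(machine_type: str, image: str) -> list[str]:
    """Build Dataflow pipeline flags that limit intra-worker parallelism
    on a TPU worker VM. (Illustrative helper, not a Beam API.)"""
    return [
        "--worker_machine_type=" + machine_type,
        "--sdk_container_image=" + image,
        # Launch only one Python SDK process per worker VM.
        "--experiments=no_use_multiple_sdk_containers",
        # A single TPU device serves the whole VM, so one harness thread
        # is a conservative starting point; tune upward if it helps.
        "--number_of_worker_harness_threads=1",
    ]

# Example: a v5e 1x1 worker with a placeholder Artifact Registry image path.
flags = tpu_worker_parallelism_args(
    "ct5lp-hightpu-1t",
    "us-docker.pkg.dev/my-project/my-repo/beam-tpu:latest",
)
```

You can append these flags to the rest of your pipeline arguments before passing them to your Beam pipeline's option parser.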
The following table lists the total compute resources per worker for each TPUconfiguration.
| TPU type | topology | machine type | TPU chips | vCPU | RAM (GB) |
|---|---|---|---|---|---|
| tpu-v5-lite-podslice | 1x1 | ct5lp-hightpu-1t | 1 | 24 | 48 |
| tpu-v5-lite-podslice | 2x2 | ct5lp-hightpu-4t | 4 | 112 | 192 |
| tpu-v5-lite-podslice | 2x4 | ct5lp-hightpu-8t | 8 | 224 | 384 |
| tpu-v6e-slice | 1x1 | ct6e-standard-1t | 1 | 44 | 176 |
| tpu-v6e-slice | 2x2 | ct6e-standard-4t | 4 | 180 | 720 |
| tpu-v6e-slice | 2x4 | ct6e-standard-8t | 8 | 360 | 1440 |
| tpu-v5p-slice | 2x2x1 | ct5p-hightpu-4t | 4 | 208 | 448 |
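Because the accelerator type, topology, and worker machine type must agree, it can be useful to validate the combination before submitting a job. The following lookup is transcribed from the table above; it is an illustrative helper, not an official API.

```python
# (TPU type, topology) -> required Dataflow worker machine type,
# transcribed from the table above.
TPU_MACHINE_TYPES = {
    ("tpu-v5-lite-podslice", "1x1"): "ct5lp-hightpu-1t",
    ("tpu-v5-lite-podslice", "2x2"): "ct5lp-hightpu-4t",
    ("tpu-v5-lite-podslice", "2x4"): "ct5lp-hightpu-8t",
    ("tpu-v6e-slice", "1x1"): "ct6e-standard-1t",
    ("tpu-v6e-slice", "2x2"): "ct6e-standard-4t",
    ("tpu-v6e-slice", "2x4"): "ct6e-standard-8t",
    ("tpu-v5p-slice", "2x2x1"): "ct5p-hightpu-4t",
}

def machine_type_for(tpu_type: str, topology: str) -> str:
    """Return the worker machine type for a TPU configuration,
    or raise ValueError for unsupported combinations."""
    try:
        return TPU_MACHINE_TYPES[(tpu_type, topology)]
    except KeyError:
        raise ValueError(
            f"Unsupported TPU configuration: {tpu_type} {topology}"
        )
```

Calling the helper with your pipeline's worker_accelerator values lets you fail fast locally instead of discovering a mismatch after the job launches.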
Run a pipeline with TPUs
To run a Dataflow job with TPUs, use the following command.
```shell
python PIPELINE \
  --runner "DataflowRunner" \
  --project "PROJECT" \
  --temp_location "gs://BUCKET/tmp" \
  --region "REGION" \
  --dataflow_service_options "worker_accelerator=type:TPU_TYPE;topology:TPU_TOPOLOGY" \
  --worker_machine_type "MACHINE_TYPE" \
  --disk_size_gb "DISK_SIZE_GB" \
  --sdk_container_image "IMAGE" \
  --number_of_worker_harness_threads NUMBER_OF_THREADS
```

Replace the following:
- PIPELINE: Your pipeline source code file.
- PROJECT: The Google Cloud project name.
- BUCKET: The Cloud Storage bucket.
- REGION: A Dataflow region, for example, us-central1.
- TPU_TYPE: A supported TPU type, for example, tpu-v5-lite-podslice. For a full list of types and topologies, see Supported TPU accelerators.
- TPU_TOPOLOGY: The TPU topology, for example, 1x1.
- MACHINE_TYPE: The corresponding machine type, for example, ct5lp-hightpu-1t.
- DISK_SIZE_GB: The size of the boot disk for each worker VM, for example, 100.
- IMAGE: The Artifact Registry path for your Docker image.
- NUMBER_OF_THREADS: Optional. The number of worker harness threads.
Verify your Dataflow job
To confirm that the job uses worker VMs with TPUs, follow these steps:
In the Google Cloud console, go to the Dataflow > Jobs page.
Select a job.
Click theJob metrics tab.
In the Autoscaling section, confirm that there's at least one Current workers VM.
In the side Job info pane, check that the machine_type starts with ct, for example, ct6e-standard-1t. This indicates TPU usage.
Troubleshoot your Dataflow job
If you run into problems running your Dataflow job with TPUs, see Troubleshoot your Dataflow TPU job.
What's next
- Try the Quickstart examples: Running Dataflow on TPUs.
- Learn more about TPU support on Dataflow.
- Learn about Large model inference in Beam.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-02-19 UTC.