Run a pipeline with TPUs

This page explains how to run an Apache Beam pipeline on Dataflow with TPUs. Jobs that use TPUs incur charges as specified in the Dataflow pricing page.

For more information about using TPUs with Dataflow, see Dataflow support for TPUs.

Optional: Make a specific reservation to use accelerators

While you can use TPUs on-demand, we strongly recommend that you use Dataflow TPUs with specifically targeted Google Cloud reservations. Using a reservation helps to ensure that you have access to available accelerators and quick worker startup times. Pipelines that consume a TPU reservation don't require additional TPU quota.

If you don't make a reservation and choose to use TPUs on-demand, provision TPU quota before you run your pipeline.

Optional: Provision TPU quota

You can use TPUs in an on-demand capacity or using a reservation. If you want to use TPUs on-demand, you must provision TPU quota before you do. If you use a specifically targeted reservation, you can skip this section.

To use TPUs on-demand without a reservation, check the limit and current usage of your Compute Engine API quota for TPUs as follows:

Console

  1. Go to the Quotas page in the Google Cloud console:

    Go to Quotas

  2. In the Filter box, do the following:

    1. Use the following table to select and copy the property of the quota based on the TPU version and machine type. For example, if you plan to create on-demand TPU v5e nodes whose machine type begins with ct5lp-, enter Name: TPU v5 Lite PodSlice chips.

      | TPU version | Machine type begins with | Property and name of the quota for on-demand instances |
      |---|---|---|
      | TPU v5e | ct5lp- | Name: TPU v5 Lite PodSlice chips |
      | TPU v5p | ct5p- | Name: TPU v5p chips |
      | TPU v6e | ct6e- | Dimensions (e.g. location): tpu_family:CT6E |
    2. Select the Dimensions (e.g. location) property and enter region: followed by the name of the region in which you plan to start your pipeline. For example, enter region:us-west4 if you plan to use the zone us-west4-a. TPU quota is regional, so all zones within the same region consume the same TPU quota.
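Because TPU quota is regional, the region to enter in the quota filter follows directly from the zone you plan to use. The helper below is an illustrative sketch (not part of any Google Cloud SDK) that derives the quota region from a zone name:

```python
def region_from_zone(zone: str) -> str:
    """Derive the quota region from a Compute Engine zone name.

    Zones are named REGION-SUFFIX, for example "us-west4-a" belongs to
    region "us-west4", so dropping the last dash-separated component
    yields the region whose TPU quota the zone consumes.
    """
    return zone.rsplit("-", 1)[0]

print(region_from_zone("us-west4-a"))    # us-west4
print(region_from_zone("us-central1-b")) # us-central1
```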

Configure a custom container image

To interact with TPUs in Dataflow pipelines, you need to provide software that can operate on XLA devices in your pipeline runtime environment. This requires installing TPU libraries based on your pipeline needs and configuring environment variables based on the TPU device you use.

To customize the container image, install Apache Beam into an off-the-shelf base image that has the necessary TPU libraries. Alternatively, install the TPU software into the images published with Apache Beam SDK releases.

To provide a custom container image, use the sdk_container_image pipeline option. For more information, see Use custom containers in Dataflow.

When you use a TPU accelerator, you need to set the following environment variables in the container image.

```dockerfile
ENV TPU_SKIP_MDS_QUERY=1           # Don't query metadata
ENV TPU_HOST_BOUNDS=1,1,1          # There's only one host
ENV TPU_WORKER_HOSTNAMES=localhost
ENV TPU_WORKER_ID=0                # Always 0 for single-host TPUs
```

Depending on the accelerator you use, you must also set the variables in the following table.

| type | topology | Required Dataflow worker_machine_type | Additional environment variables |
|---|---|---|---|
| tpu-v5-lite-podslice | 1x1 | ct5lp-hightpu-1t | TPU_ACCELERATOR_TYPE=v5litepod-1 TPU_CHIPS_PER_HOST_BOUNDS=1,1,1 |
| tpu-v5-lite-podslice | 2x2 | ct5lp-hightpu-4t | TPU_ACCELERATOR_TYPE=v5litepod-4 TPU_CHIPS_PER_HOST_BOUNDS=2,2,1 |
| tpu-v5-lite-podslice | 2x4 | ct5lp-hightpu-8t | TPU_ACCELERATOR_TYPE=v5litepod-8 TPU_CHIPS_PER_HOST_BOUNDS=2,4,1 |
| tpu-v6e-slice | 1x1 | ct6e-standard-1t | TPU_ACCELERATOR_TYPE=v6e-1 TPU_CHIPS_PER_HOST_BOUNDS=1,1,1 |
| tpu-v6e-slice | 2x2 | ct6e-standard-4t | TPU_ACCELERATOR_TYPE=v6e-4 TPU_CHIPS_PER_HOST_BOUNDS=2,2,1 |
| tpu-v6e-slice | 2x4 | ct6e-standard-8t | TPU_ACCELERATOR_TYPE=v6e-8 TPU_CHIPS_PER_HOST_BOUNDS=2,4,1 |
| tpu-v5p-slice | 2x2x1 | ct5p-hightpu-4t | TPU_ACCELERATOR_TYPE=v5p-8 TPU_CHIPS_PER_HOST_BOUNDS=2,2,1 |
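If you build images for several accelerator configurations, the mapping in the table above can be captured in a small lookup, for example when generating the ENV lines of a Dockerfile. This is an illustrative sketch based on the table values, not an official API:

```python
# Required worker machine type and TPU environment variables per
# (accelerator type, topology), taken from the table above.
TPU_CONFIG = {
    ("tpu-v5-lite-podslice", "1x1"): ("ct5lp-hightpu-1t", "v5litepod-1", "1,1,1"),
    ("tpu-v5-lite-podslice", "2x2"): ("ct5lp-hightpu-4t", "v5litepod-4", "2,2,1"),
    ("tpu-v5-lite-podslice", "2x4"): ("ct5lp-hightpu-8t", "v5litepod-8", "2,4,1"),
    ("tpu-v6e-slice", "1x1"): ("ct6e-standard-1t", "v6e-1", "1,1,1"),
    ("tpu-v6e-slice", "2x2"): ("ct6e-standard-4t", "v6e-4", "2,2,1"),
    ("tpu-v6e-slice", "2x4"): ("ct6e-standard-8t", "v6e-8", "2,4,1"),
    ("tpu-v5p-slice", "2x2x1"): ("ct5p-hightpu-4t", "v5p-8", "2,2,1"),
}

def env_lines(tpu_type: str, topology: str) -> list:
    """Return the accelerator-specific Dockerfile ENV lines for a configuration."""
    machine, accel, bounds = TPU_CONFIG[(tpu_type, topology)]
    return [
        f"ENV TPU_ACCELERATOR_TYPE={accel}",
        f"ENV TPU_CHIPS_PER_HOST_BOUNDS={bounds}",
    ]

print(env_lines("tpu-v5-lite-podslice", "1x1"))
```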

A sample Dockerfile for the custom container image might look like the following example:

```dockerfile
FROM python:3.11-slim

COPY --from=apache/beam_python3.11_sdk:2.66.0 /opt/apache/beam /opt/apache/beam

# Configure the environment to access the TPU device.
ENV TPU_SKIP_MDS_QUERY=1
ENV TPU_HOST_BOUNDS=1,1,1
ENV TPU_WORKER_HOSTNAMES=localhost
ENV TPU_WORKER_ID=0

# Configure the environment for the chosen accelerator.
# Adjust according to the accelerator you use.
ENV TPU_ACCELERATOR_TYPE=v5litepod-1
ENV TPU_CHIPS_PER_HOST_BOUNDS=1,1,1

# Install the TPU software stack.
RUN pip install jax[tpu] apache-beam[gcp]==2.66.0 -f https://storage.googleapis.com/jax-releases/libtpu_releases.html

ENTRYPOINT ["/opt/apache/beam/boot"]
```

Run your job with TPUs

The considerations for running a Dataflow job with TPUs include the following:

  • Because TPU containers can be large, increase the default boot disk size to 50 gigabytes, or to an appropriate size as required by your container image, by using the --disk_size_gb pipeline option. This helps you avoid running out of disk space.
  • Limit intra-worker parallelism.

TPUs and worker parallelism

In the default configuration, Dataflow Python pipelines launch one Apache Beam SDK process per VM core. TPU machine types have a large number of vCPU cores, but only one process may perform computations on a TPU device. Additionally, a TPU device might be reserved by a process for the lifetime of the process. Therefore, you must limit intra-worker parallelism when running a Dataflow TPU pipeline. To limit worker parallelism, use the following guidance:

  • If your use case involves running inferences on a model, use the Beam RunInference API. For more information, see Large Language Model Inference in Beam.
  • If you cannot use the Beam RunInference API, use Beam's multi-process shared objects to restrict certain operations to a single process.
  • If you cannot use the preceding recommendations and prefer to launch only one Python process per worker, set the --experiments=no_use_multiple_sdk_containers pipeline option.
  • You can further reduce the number of threads by using the --number_of_worker_harness_threads pipeline option if that achieves better performance.
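As a sketch of the last two recommendations, the two options might be assembled like this when building a pipeline's argument list programmatically. The helper name and default values are illustrative, not part of the Beam API; RunInference or multi-process shared objects remain the preferred approaches:

```python
def parallelism_options(single_process=True, harness_threads=1):
    """Build the pipeline options that limit intra-worker parallelism.

    single_process: launch only one Python SDK process per worker.
    harness_threads: cap the number of worker harness threads
                     (pass None to leave the default).
    """
    opts = []
    if single_process:
        opts.append("--experiments=no_use_multiple_sdk_containers")
    if harness_threads is not None:
        opts.append(f"--number_of_worker_harness_threads={harness_threads}")
    return opts

print(parallelism_options())
# ['--experiments=no_use_multiple_sdk_containers', '--number_of_worker_harness_threads=1']
```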

The following table lists the total compute resources per worker for each TPU configuration.

| TPU type | topology | machine type | TPU chips | vCPU | RAM (GB) |
|---|---|---|---|---|---|
| tpu-v5-lite-podslice | 1x1 | ct5lp-hightpu-1t | 1 | 24 | 48 |
| tpu-v5-lite-podslice | 2x2 | ct5lp-hightpu-4t | 4 | 112 | 192 |
| tpu-v5-lite-podslice | 2x4 | ct5lp-hightpu-8t | 8 | 224 | 384 |
| tpu-v6e-slice | 1x1 | ct6e-standard-1t | 1 | 44 | 176 |
| tpu-v6e-slice | 2x2 | ct6e-standard-4t | 4 | 180 | 720 |
| tpu-v6e-slice | 2x4 | ct6e-standard-8t | 8 | 360 | 1440 |
| tpu-v5p-slice | 2x2x1 | ct5p-hightpu-4t | 4 | 208 | 448 |
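The table shows why the default one-SDK-process-per-vCPU launch model is a poor fit for TPU workers: each worker has many vCPUs per TPU chip. A quick illustrative calculation using rows from the table:

```python
def vcpus_per_chip(vcpus: int, chips: int) -> int:
    """vCPUs that would map onto each TPU chip under the default
    one-SDK-process-per-vCPU launch model."""
    return vcpus // chips

# (machine type, TPU chips, vCPUs) rows taken from the table above.
for machine, chips, vcpus in [
    ("ct5lp-hightpu-1t", 1, 24),
    ("ct5lp-hightpu-4t", 4, 112),
    ("ct6e-standard-8t", 8, 360),
]:
    print(f"{machine}: {vcpus_per_chip(vcpus, chips)} vCPUs per TPU chip")
```

With dozens of processes competing for each chip, and a chip reservable by one process at a time, limiting intra-worker parallelism is essential.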

Run a pipeline with TPUs

To run a Dataflow job with TPUs, use the following command.

```shell
python PIPELINE \
  --runner "DataflowRunner" \
  --project "PROJECT" \
  --temp_location "gs://BUCKET/tmp" \
  --region "REGION" \
  --dataflow_service_options "worker_accelerator=type:TPU_TYPE;topology:TPU_TOPOLOGY" \
  --worker_machine_type "MACHINE_TYPE" \
  --disk_size_gb "DISK_SIZE_GB" \
  --sdk_container_image "IMAGE" \
  --number_of_worker_harness_threads NUMBER_OF_THREADS
```

Replace the following:

  • PIPELINE: Your pipeline source code file.
  • PROJECT: The Google Cloud project name.
  • BUCKET: The Cloud Storage bucket.
  • REGION: A Dataflow region, for example, us-central1.
  • TPU_TYPE: A supported TPU type, for example, tpu-v5-lite-podslice. For a full list of types and topologies, see Supported TPU accelerators.
  • TPU_TOPOLOGY: The TPU topology, for example, 1x1.
  • MACHINE_TYPE: The corresponding machine type, for example, ct5lp-hightpu-1t.
  • DISK_SIZE_GB: The size of the boot disk for each worker VM, for example, 100.
  • IMAGE: The Artifact Registry path for your Docker image.
  • NUMBER_OF_THREADS: Optional. The number of worker harness threads.
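The worker_accelerator service option is a single string of key:value pairs separated by semicolons. If you launch jobs programmatically, a small helper can assemble it; the function below is an illustrative sketch, not part of the Beam or Dataflow APIs:

```python
def worker_accelerator_option(tpu_type: str, topology: str) -> str:
    """Format the Dataflow service option that requests TPU workers,
    matching the shape used in the launch command above."""
    return f"worker_accelerator=type:{tpu_type};topology:{topology}"

opt = worker_accelerator_option("tpu-v5-lite-podslice", "1x1")
print(opt)  # worker_accelerator=type:tpu-v5-lite-podslice;topology:1x1
```

You would pass the resulting string as the value of --dataflow_service_options.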

Verify your Dataflow job

To confirm that the job uses worker VMs with TPUs, follow these steps:

  1. In the Google Cloud console, go to the Dataflow > Jobs page.

    Go to Jobs

  2. Select a job.

  3. Click theJob metrics tab.

  4. In the Autoscaling section, confirm that there's at least one Current workers VM.

  5. In the Job info side pane, check that the machine_type starts with ct, for example, ct6e-standard-1t. This indicates TPU usage.

Troubleshoot your Dataflow job

If you run into problems running your Dataflow job with TPUs, see Troubleshoot your Dataflow TPU job.


Last updated 2026-02-19 UTC.