About Dataflow ML

You can use Dataflow ML's scaled data processing abilities for prediction and inference pipelines and for data preparation for training.

Figure 1. The complete Dataflow ML workflow.

Requirements and limitations

  • Dataflow ML supports batch and streaming pipelines.
  • The RunInference API is supported in Apache Beam 2.40.0 and later versions.
  • The MLTransform API is supported in Apache Beam 2.53.0 and later versions.
  • Model handlers are available for PyTorch, scikit-learn, TensorFlow, ONNX, and TensorRT. For unsupported frameworks, you can use a custom model handler, as sketched after this list.
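
For example, a custom model handler only needs to describe how to load a model and how to run a batch of examples through it. The following is a minimal sketch, assuming a hypothetical my_framework library and model path rather than a real supported framework:

import my_framework  # hypothetical framework without a built-in handler

from apache_beam.ml.inference.base import ModelHandler, PredictionResult


class MyFrameworkModelHandler(ModelHandler):
  """Custom model handler for a framework that RunInference doesn't support."""

  def __init__(self, model_uri):
    self._model_uri = model_uri  # hypothetical path to a saved model

  def load_model(self):
    # Called once per worker; the loaded model is reused across bundles.
    return my_framework.load(self._model_uri)

  def run_inference(self, batch, model, inference_args=None):
    # Run the whole batch through the model and pair each input with its result.
    predictions = model.predict(batch)
    return [PredictionResult(x, y) for x, y in zip(batch, predictions)]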

Data preparation for training

To prepare your data for training ML models, use Apache Beam's MLTransform API, which lets you apply the same preprocessing steps during training and inference.
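
For example, a preparation step might scale a numeric feature and build a vocabulary for a categorical feature. The following is a minimal sketch that uses MLTransform with two of Apache Beam's built-in data processing transforms; the column names and artifact location are hypothetical placeholders:

import apache_beam as beam
from apache_beam.ml.transforms.base import MLTransform
from apache_beam.ml.transforms.tft import ComputeAndApplyVocabulary, ScaleTo01

# Hypothetical location where MLTransform writes preprocessing artifacts
# (vocabularies, scaling parameters) for reuse at inference time.
artifact_location = 'gs://my-bucket/ml-transform-artifacts'

with beam.Pipeline() as pipeline:
  _ = (
      pipeline
      | 'CreateExamples' >> beam.Create([
          {'price': 10.0, 'category': 'books'},
          {'price': 25.0, 'category': 'music'},
      ])
      | 'Preprocess' >> MLTransform(write_artifact_location=artifact_location)
          .with_transform(ScaleTo01(columns=['price']))
          .with_transform(ComputeAndApplyVocabulary(columns=['category']))
      | 'Print' >> beam.Map(print))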

Prediction and inference pipelines

Dataflow ML combines the power of Dataflow with Apache Beam's RunInference API. With the RunInference API, you define the model's characteristics and properties and pass that configuration to the RunInference transform. This feature lets you run the model within your Dataflow pipeline without needing to know the model's implementation details. You can choose the framework that best suits your data, such as TensorFlow or PyTorch.
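
For example, the following is a minimal sketch of a batch pipeline that runs a PyTorch model with RunInference. The model class, state dict path, and input values are hypothetical placeholders for your own model:

import apache_beam as beam
import torch

from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor


class LinearRegression(torch.nn.Module):
  """Hypothetical model; replace with your own model class."""

  def __init__(self, input_dim=1, output_dim=1):
    super().__init__()
    self.linear = torch.nn.Linear(input_dim, output_dim)

  def forward(self, x):
    return self.linear(x)


# The model handler carries the model's characteristics and properties;
# RunInference uses it to load the model and to batch and run predictions.
model_handler = PytorchModelHandlerTensor(
    state_dict_path='gs://my-bucket/models/linear_regression.pt',  # hypothetical
    model_class=LinearRegression,
    model_params={'input_dim': 1, 'output_dim': 1})

with beam.Pipeline() as pipeline:
  _ = (
      pipeline
      | 'CreateExamples' >> beam.Create([torch.tensor([1.0]), torch.tensor([2.0])])
      | 'RunInference' >> RunInference(model_handler)
      | 'PrintPredictions' >> beam.Map(print))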

Run multiple models in a pipeline

Use the RunInference transform to add multiple inference models to your Dataflow pipeline. For more information, including code details, see Multi-model pipelines in the Apache Beam documentation.
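
For example, one common pattern sends the same examples to two models in parallel. The following is a minimal sketch that assumes inputs, model_handler_a, and model_handler_b are defined as in the earlier RunInference example:

import apache_beam as beam
from apache_beam.ml.inference.base import RunInference

# A/B pattern: the same examples are routed to two models, each configured
# with its own model handler.
with beam.Pipeline() as pipeline:
  examples = pipeline | 'CreateExamples' >> beam.Create(inputs)
  predictions_a = examples | 'RunInferenceA' >> RunInference(model_handler_a)
  predictions_b = examples | 'RunInferenceB' >> RunInference(model_handler_b)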

Build a cross-language pipeline

To use RunInference with a Java pipeline, create a cross-language Python transform. The pipeline calls the transform, which does the preprocessing, postprocessing, and inference.

For detailed instructions and a sample pipeline, see Using RunInference from the Java SDK.
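
On the Python side, such a pipeline is typically exposed as a composite transform that chains preprocessing, RunInference, and postprocessing, so that the Java pipeline can call it as one step. The following is a minimal sketch; the preprocess and postprocess helpers and the model handler are hypothetical placeholders:

import apache_beam as beam
from apache_beam.ml.inference.base import RunInference


class InferenceWithPrePostProcessing(beam.PTransform):
  """Composite transform that a Java pipeline can call across languages."""

  def __init__(self, model_handler):
    self._model_handler = model_handler

  def expand(self, pcoll):
    return (
        pcoll
        | 'Preprocess' >> beam.Map(preprocess)     # hypothetical helper
        | 'RunInference' >> RunInference(self._model_handler)
        | 'Postprocess' >> beam.Map(postprocess))  # hypothetical helper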

Use GPUs with Dataflow

For batch or streaming pipelines that require the use of accelerators, you can run Dataflow pipelines on NVIDIA GPU devices. For more information, see Run a Dataflow pipeline with GPUs.
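
For example, you attach GPUs to Dataflow workers through the worker_accelerator service option when you launch the pipeline. The following is a minimal sketch, assuming an NVIDIA T4 GPU; the project, region, and bucket values are hypothetical placeholders:

from apache_beam.options.pipeline_options import PipelineOptions

# The worker_accelerator service option attaches GPUs to Dataflow workers
# and installs the NVIDIA driver.
options = PipelineOptions([
    '--runner=DataflowRunner',
    '--project=my-project',                  # hypothetical
    '--region=us-central1',                  # hypothetical
    '--temp_location=gs://my-bucket/temp',   # hypothetical
    '--dataflow_service_options='
    'worker_accelerator=type:nvidia-tesla-t4;count:1;install-nvidia-driver',
])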

Troubleshoot Dataflow ML

This section provides troubleshooting strategies and links that you might find helpful when using Dataflow ML.

Stack expects each tensor to be equal size

If you provide images of different sizes or word embeddings of different lengths when using the RunInference API, the following error might occur:

File "/beam/sdks/python/apache_beam/ml/inference/pytorch_inference.py", line 232, in run_inference batched_tensors = torch.stack(key_to_tensor_list[key]) RuntimeError: stack expects each tensor to be equal size, but got [12] at entry 0 and [10] at entry 1 [while running 'PyTorchRunInference/ParDo(_RunInferenceDoFn)']

This error occurs because the RunInference API can't batch tensor elements of different sizes. For workarounds, see Unable to batch tensor elements in the Apache Beam documentation.
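
One workaround described there is to disable batching so that tensors of different sizes are never stacked. As a sketch, assuming the PyTorch tensor handler from the earlier example, you can override batch_elements_kwargs in a subclass of your model handler:

from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor


class UnbatchedPytorchModelHandler(PytorchModelHandlerTensor):
  """Limits RunInference batches to one element so tensors are never stacked."""

  def batch_elements_kwargs(self):
    return {'max_batch_size': 1}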

Avoid out-of-memory errors with large models

When you load a medium or large ML model, your machine might run out of memory. Dataflow provides tools to help avoid out-of-memory (OOM) errors when loading ML models. For more information, see RunInference transform best practices.
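
For example, recent Apache Beam model handlers accept a large_model argument that loads one copy of the model in a dedicated process and shares it across the worker's processes. The following is a minimal sketch that assumes the PyTorch handler, model class, and path from the earlier example:

# large_model=True shares a single copy of the model across worker processes,
# instead of loading one copy per process.
model_handler = PytorchModelHandlerTensor(
    state_dict_path='gs://my-bucket/models/large_model.pt',  # hypothetical
    model_class=LinearRegression,
    model_params={'input_dim': 1, 'output_dim': 1},
    large_model=True)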
