Vertical Autoscaling

Vertical Autoscaling is a feature that enables Dataflow Prime to dynamically scale up or scale down the memory available to workers to fit the requirements of the job. The feature is designed to make jobs resilient to out-of-memory (OOM) errors and to maximize pipeline efficiency. Dataflow Prime monitors your pipeline, detects situations where the workers lack or exceed available memory, and then replaces those workers with new workers with more or less memory.

Important: Because Vertical Autoscaling replaces existing workers with new workers, we strongly recommend using custom containers to reduce the latency that might arise from resizing the workers.

Streaming

Vertical Autoscaling is enabled by default for all new streaming jobs that use Dataflow Prime.

If you are launching a job from a template through the command line interface, you can disable Vertical Autoscaling by passing the --additional_experiments=disable_vertical_memory_autoscaling flag.
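As a sketch, appending this experiment to a job's launch arguments might look like the following. The experiment name comes from this page; the job name shown is a hypothetical placeholder, and how you actually pass the arguments depends on your launcher (gcloud, client libraries, and so on).

```python
def with_vertical_autoscaling_disabled(args):
    """Return a copy of the launch args with the disable experiment appended.

    Illustrative helper only; the flag name is from the Dataflow docs, while
    the surrounding launch mechanics vary by how you submit the template job.
    """
    return list(args) + [
        "--additional_experiments=disable_vertical_memory_autoscaling",
    ]

launch_args = with_vertical_autoscaling_disabled([
    "--job_name=my-streaming-job",  # hypothetical job name
])
```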

All Dataflow Prime streaming Java and Python pipelines support Vertical Autoscaling. You can use Dataflow Prime streaming Java pipelines without Streaming Engine. However, for the best experience with Vertical Autoscaling, enabling Streaming Engine is recommended.

Batch

For Dataflow Prime batch jobs, Vertical Autoscaling only scales up after four out-of-memory errors occur.

  • Vertical Autoscaling scales up to prevent job failures and does not scale down.
  • The entire pool scales up for the remainder of the job.
  • If resource hints are used and multiple pools are created, each pool scales up separately.

For batch jobs, Vertical Autoscaling is not enabled by default. To enable Vertical Autoscaling for batch jobs, set the following pipeline options:

  • --experiments=enable_batch_vmr
  • --experiments=enable_vertical_memory_autoscaling

To disable Vertical Autoscaling for batch jobs, do one of the following:

  • Do not set the --experiments=enable_batch_vmr pipeline option.
  • Set the --experiments=disable_vertical_memory_autoscaling pipeline option.
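The enable/disable choice above can be sketched as a small helper that produces the experiment options for a batch job. The flag names are taken from this page; how you feed them to your pipeline (PipelineOptions, gcloud, a CI script) is up to your setup.

```python
def batch_vertical_autoscaling_experiments(enable: bool):
    """Experiment options controlling Vertical Autoscaling for a batch job.

    Sketch only: flag names come from the Dataflow Prime documentation; the
    disable path shows one of the two documented ways to keep it off.
    """
    if enable:
        return [
            "--experiments=enable_batch_vmr",
            "--experiments=enable_vertical_memory_autoscaling",
        ]
    # Alternatively, simply omit enable_batch_vmr to leave the feature off.
    return ["--experiments=disable_vertical_memory_autoscaling"]
```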

Limitations

  • Only the memory of the workers scales vertically.
  • By default, memory scaling has an upper limit of 16 GiB and a lower limit of 6 GiB. When you use GPUs, memory scaling has an upper limit of 26 GiB and a lower limit of 12 GiB. You can change both the upper and lower limits by providing a resource hint.
  • Vertical Autoscaling is not supported for pools using A100 GPUs.
  • For batch jobs, bundles that include a failing item might be retried more than 4 times before the pipeline fails completely.
  • Vertical Autoscaling isn't supported with VPC Service Controls. If you enable Dataflow Prime and launch a new job within a VPC Service Controls perimeter, the job uses Dataflow Prime without Vertical Autoscaling.
  • When you use right fitting with Vertical Autoscaling, only batch pipelines are supported.
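For quick sanity checks against the default limits above, they can be encoded as a tiny helper. This is only a restatement of the documented defaults, not an API; resource hints can override these bounds.

```python
def default_memory_scaling_bounds_gib(uses_gpus: bool):
    """Default per-worker memory scaling bounds in GiB, per the limits above.

    Returns (lower, upper). Illustrative only: resource hints can change
    both bounds, and this helper does not consult any Dataflow API.
    """
    return (12, 26) if uses_gpus else (6, 16)

low, high = default_memory_scaling_bounds_gib(uses_gpus=False)
```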

Monitor Vertical Autoscaling

Vertical Autoscaling operations are published to the job and worker logs. To view these logs, see Dataflow job metrics.

Effect on Horizontal Autoscaling

In Dataflow Prime, Vertical Autoscaling works alongside Horizontal Autoscaling. This combination enables Dataflow Prime to seamlessly scale workers up or down to best fit the needs of your pipeline and maximize the utilization of the compute capacity.

By design, Vertical Autoscaling (which adjusts the worker memory) occurs at a lower frequency than Horizontal Autoscaling (which adjusts the number of workers). Horizontal Autoscaling is deactivated during and for up to 10 minutes after an update is triggered by Vertical Autoscaling. If a significant backlog of input data remains after this 10-minute mark, Horizontal Autoscaling is likely to occur to clear that backlog. To learn about Horizontal Autoscaling for streaming pipelines, see Streaming autoscaling.
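The interaction described above can be modeled conceptually: after a vertical update, horizontal scaling is held off for up to 10 minutes. This is a mental model of the documented behavior only, not how the service is implemented.

```python
HOLD_SECONDS = 10 * 60  # documented hold-off window after a vertical update


def horizontal_autoscaling_allowed(seconds_since_vertical_update: float) -> bool:
    """Conceptual model: Horizontal Autoscaling stays deactivated during and
    for up to 10 minutes after a Vertical Autoscaling update triggers."""
    return seconds_since_vertical_update > HOLD_SECONDS
```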

Troubleshooting

This section provides instructions for troubleshooting common issues related to Vertical Autoscaling.

Vertical Autoscaling does not seem to work

If Vertical Autoscaling isn't working, check the following job details.

  • Check for the following job message to verify that Vertical Autoscaling is active: Vertical Autoscaling is enabled. This pipeline is receiving recommendations for resources allocated per worker.

    The absence of this message indicates that Vertical Autoscaling is not running.

  • For streaming pipelines, verify that the enable_vertical_memory_autoscaling flag is set. For batch pipelines, verify that the enable_vertical_memory_autoscaling and the enable_batch_vmr flags are set.

  • Verify that you enabled the Cloud Autoscaling API for your Google Cloud project.

  • Verify that your job is running Dataflow Prime. For more information, see Enabling Dataflow Prime.
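The flag checks in the list above can be automated with a small validator over the experiments you launched with. This is a local sanity check of your own launch arguments, not a query against the running job.

```python
def vertical_autoscaling_flags_present(experiments, is_batch: bool) -> bool:
    """Check a list of experiment names for the flags the checklist mentions.

    Sketch only: streaming needs enable_vertical_memory_autoscaling; batch
    additionally needs enable_batch_vmr, per the troubleshooting steps above.
    """
    required = {"enable_vertical_memory_autoscaling"}
    if is_batch:
        required.add("enable_batch_vmr")
    return required.issubset(set(experiments))
```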

Job observes high backlog and high watermark

These instructions only apply to streaming jobs. If the vertical reshaping of workers takes longer than a few minutes, your job might exhibit a high backlog of the input data and a high watermark. To address this issue in Python pipelines, we strongly recommend that you use custom containers, because they can reduce the latency that might arise from reshaping the workers. To address this issue in Java pipelines, we strongly recommend that you enable Streaming Engine and Runner v2. If the issue persists after enabling these features, contact Customer Care.

Vertical Autoscaling has reached the memory capacity

By default, if no resource hints are provided, Vertical Autoscaling does not scale memory beyond 16 GiB per worker (26 GiB when using GPUs) or less than 6 GiB per worker (12 GiB when using GPUs). When these limits are reached, one of the following log messages is generated in Cloud Logging.

Streaming jobs:

Vertical Autoscaling has a desire to upscale memory, but we have hit the memory scaling limit of X GiB. This is only a problem if the pipeline continues to see memory throttling and/or OOMs.

Batch jobs:

Vertical Autoscaling has a desire to upscale memory, but we have hit the memory scaling limit of 16.0 GiB. Job will fail because we have upsized to maximum size, and the pipeline is still OOMing.

If your pipeline continues to see out-of-memory errors, you can use right fitting (resource hints) to define memory requirements for your transform by specifying min_ram="numberXB". This setting allows Dataflow to select an initial configuration for your workers that can support a higher memory capacity. However, changing this initial configuration can increase the latent parallelism available to your pipeline. If you have a memory-hungry transform, this might result in your pipeline using more memory than before due to the increased available parallelism. In such cases, it might be necessary to optimize your transform to reduce its memory footprint.
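To make the min_ram="numberXB" format concrete, the following is an illustrative parser for hint strings such as "32GB" or "16GiB". It is a sketch: the authoritative set of accepted formats is defined by Apache Beam resource hints, not by this helper.

```python
import re

# Decimal (KB/MB/GB) and binary (KiB/MiB/GiB) units, plus plain bytes.
_UNITS = {
    "B": 1,
    "KB": 10**3, "MB": 10**6, "GB": 10**9,
    "KiB": 2**10, "MiB": 2**20, "GiB": 2**30,
}


def parse_min_ram(hint: str) -> int:
    """Parse a min_ram-style hint such as '32GB' into a byte count.

    Illustrative only; Beam's own resource-hint parsing is authoritative.
    """
    m = re.fullmatch(r"(\d+(?:\.\d+)?)([KMG]i?B|B)", hint)
    if not m:
        raise ValueError(f"unrecognized min_ram value: {hint!r}")
    number, unit = m.groups()
    return int(float(number) * _UNITS[unit])
```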

Note: Vertical Autoscaling does not prevent OOM errors from appearing in the worker logs. If an OOM error occurs, it is visible in the worker logs, because Vertical Autoscaling finds and tracks the OOM events.

Worker memory limit doesn't stabilize and goes up and down over time despite constant memory use

These instructions only apply to streaming jobs. For Java pipelines, enable Streaming Engine and Runner v2. If the issue persists or if you observe this behavior in Python pipelines, contact Customer Care.

Common log messages

This section describes the common log messages generated when you enable Vertical Autoscaling.

Vertical Autoscaling is enabled. This pipeline is receiving recommendations for resources allocated per worker.

This message indicates that Vertical Autoscaling is active. The absence of this message indicates that Vertical Autoscaling is not operating on the worker pool.

If Vertical Autoscaling is not active, see Vertical Autoscaling does not seem to work for troubleshooting instructions.

Vertical Autoscaling update triggered to change per worker memory limit for pool from X GiB to Y GiB.

This message indicates that Vertical Autoscaling has triggered a resize of the worker pool memory.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.