Monitoring pipeline performance using Cloud Profiler

Cloud Profiler is a statistical, low-overhead profiler that continuously gathers CPU usage and memory allocation information from your production applications. For more details, see Profiling concepts. To troubleshoot or monitor pipeline performance, use the Dataflow integration with Cloud Profiler to identify the parts of the pipeline code that consume the most resources.

For troubleshooting tips and debugging strategies for building or running your Dataflow pipeline, see Troubleshooting and debugging pipelines.

Before you begin

Understand profiling concepts and familiarize yourself with the Profiler interface. For information about how to get started with the Profiler interface, see Select the profiles to analyze.

The Cloud Profiler API must be enabled for your project before your job is started. It is enabled automatically the first time you visit the Profiler page. Alternatively, you can enable the Cloud Profiler API by using the Google Cloud CLI gcloud command-line tool or the Google Cloud console.

To use Cloud Profiler, your project must have enough quota. In addition, the worker service account for the Dataflow job must have the appropriate permissions for Profiler. For example, to create profiles, the worker service account must have the cloudprofiler.profiles.create permission, which is included in the Cloud Profiler Agent (roles/cloudprofiler.agent) IAM role. For more information, see Access control with IAM.

Enable Cloud Profiler for Dataflow pipelines

Cloud Profiler is available for Dataflow pipelines written in the Apache Beam SDK for Java and Python, version 2.33.0 or later. Python pipelines must use Dataflow Runner v2. Cloud Profiler can be enabled at pipeline start time. The amortized CPU and memory overhead is expected to be less than 1% for your pipelines.

Java

To enable CPU profiling, start the pipeline with the following option.

--dataflowServiceOptions=enable_google_cloud_profiler

To enable heap profiling, start the pipeline with the following options. Heap profiling requires Java 11 or higher.

--dataflowServiceOptions=enable_google_cloud_profiler

--dataflowServiceOptions=enable_google_cloud_heap_sampling

Note: The pipeline option --dataflowServiceOptions is the preferred way to enable Dataflow features. Alternatively, you can use --experiments.

Python

To use Cloud Profiler, your Python pipeline must run with Dataflow Runner v2.

To enable CPU profiling, start the pipeline with the following option. Heap profiling is not yet supported for Python.

--dataflow_service_options=enable_google_cloud_profiler

Note: The pipeline option --dataflow_service_options is the preferred way to enable Dataflow features. Alternatively, you can use --experiments.
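As a minimal sketch, the options above can be assembled programmatically before launching a Python pipeline. The project, region, job name, and helper function below are hypothetical placeholders, not values from this page; the Runner v2 experiment flag is shown explicitly because Python pipelines must use Runner v2.

```python
# Sketch: command-line arguments for a Python Dataflow job with
# Cloud Profiler enabled. The project, region, and job name are
# hypothetical placeholder values.
def build_pipeline_args(project: str, region: str, job_name: str) -> list[str]:
    return [
        f"--project={project}",
        f"--region={region}",
        f"--job_name={job_name}",
        "--runner=DataflowRunner",
        # Python pipelines must run on Dataflow Runner v2.
        "--experiments=use_runner_v2",
        # Enables CPU profiling; heap profiling is not yet supported for Python.
        "--dataflow_service_options=enable_google_cloud_profiler",
    ]

args = build_pipeline_args("my-project", "us-central1", "profiled-job")
```

The resulting list can be passed to the Beam pipeline's option parser the same way command-line arguments would be.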

Go

To enable CPU and heap profiling, start the pipeline with the following option.

--dataflow_service_options=enable_google_cloud_profiler

Note: The pipeline option --dataflow_service_options is the preferred way to enable Dataflow features. Alternatively, you can use --experiments.

If you deploy your pipelines from Dataflow templates and want to enable Cloud Profiler, specify the enable_google_cloud_profiler and enable_google_cloud_heap_sampling flags as additional experiments.

Console

If you use a Google-provided template, you can specify the flags on the Dataflow Create job from template page, in the Additional experiments field.

gcloud

If you use the Google Cloud CLI to run templates, either gcloud dataflow jobs run or gcloud dataflow flex-template run, depending on the template type, use the --additional-experiments option to specify the flags.

API

If you use the REST API to run templates, specify the flags using the additionalExperiments field of the runtime environment, either RuntimeEnvironment or FlexTemplateRuntimeEnvironment, depending on the template type.
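As a sketch of where the flags land for a classic template, a launch request body with the profiler experiments in the additionalExperiments field of RuntimeEnvironment might look like the following. The job name and bucket path are hypothetical placeholders.

```python
# Sketch: launch request body for a classic template via the REST API,
# with the Cloud Profiler flags passed through additionalExperiments.
# The jobName and tempLocation values are hypothetical placeholders.
launch_body = {
    "jobName": "profiled-template-job",
    "environment": {  # RuntimeEnvironment for classic templates
        "tempLocation": "gs://my-bucket/temp",
        "additionalExperiments": [
            "enable_google_cloud_profiler",
            "enable_google_cloud_heap_sampling",
        ],
    },
}
```

For Flex Templates, the same experiment strings go in the additionalExperiments field of FlexTemplateRuntimeEnvironment instead.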

View the profiling data

If Cloud Profiler is enabled, a link to the Profiler page is shown on the job page.

The Job page with a link to the Profiler page.

On the Profiler page, you can also find the profiling data for your Dataflow pipeline. The Service is your job name and the Version is your job ID.

Shows the Service and Version values for profiling a Dataflow job.

Use Cloud Profiler

The Profiler page contains a flame graph that displays statistics for each frame running on a worker. In the horizontal direction, you can see how long each frame took to execute in terms of CPU time. In the vertical direction, you can see stack traces and code running in parallel.

The stack traces are dominated by runner infrastructure code. For debugging purposes, user code execution is usually of most interest, and it is typically found near the bottom tips of the graph. User code can be identified by looking for marker frames, which represent runner code that is known to only call into user code.

In the case of the Beam ParDo runner, a dynamic adapter layer is created to invoke the user-supplied DoFn method signature. This layer can be identified as a frame with the invokeProcessElement suffix. The following image shows an example of finding a marker frame.

An example Profiler flame graph showing a marker frame.

After you click an interesting marker frame, the flame graph focuses on that stack trace, giving a good sense of long-running user code. The slowest operations can indicate where bottlenecks have formed and present opportunities for optimization. In the following example, you can see that global windowing is being used with a ByteArrayCoder. In this case, the coder might be a good area for optimization, because it is taking up significant CPU time compared to the ArrayList and HashMap operations.

Note: To quickly find and isolate all marker frames, apply the following filter: Show from frame: invokeProcessElement.

An example marker frame stack trace showing slowest running operations.

Troubleshoot Cloud Profiler

If you enable Cloud Profiler and your pipeline doesn't generate profilingdata, one of the following conditions might be the cause.

The Cloud Profiler agent is installed during Dataflow worker startup. Log messages generated by Cloud Profiler are available in the log type dataflow.googleapis.com/worker-startup.

A page showing the Cloud Profiler logs with an open menu highlighting the navigation path: dataflow.googleapis.com/worker-startup.

Sometimes, profiling data exists but Cloud Profiler does not display any output. The Profiler displays a message similar to: There were profiles collected for the specified time range, but none match the current filters.

To resolve this issue, try the following troubleshooting steps.

  • Make sure that the timespan and end time in the Profiler include the job's elapsed time.

  • Confirm that the correct job is selected in the Profiler. The Service is your job name.

  • Confirm that the job_name pipeline option has the same value as the job name on the Dataflow job page.

  • If you specified a service-name argument when you loaded the Profiler agent, confirm that the service name is configured correctly.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.