Profile Google Cloud Serverless for Apache Spark resource usage

This document describes how to profile Google Cloud Serverless for Apache Spark resource usage. Cloud Profiler continuously gathers and reports application CPU usage and memory allocation information. You can enable profiling when you submit a batch or create a session workload by using the profiling properties listed in the following table. Google Cloud Serverless for Apache Spark appends related JVM options to the spark.driver.extraJavaOptions and spark.executor.extraJavaOptions configurations used for the workload.

  • dataproc.profiling.enabled: Enables profiling of the workload. Value: true or false. Default: false.
  • dataproc.profiling.name: Profile name on the Profiler service. Value: PROFILE_NAME. Default: spark-WORKLOAD_TYPE-WORKLOAD_ID, where WORKLOAD_TYPE is the workload type (batch or session) and WORKLOAD_ID is the workload ID.
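As a minimal sketch, the two properties above can be assembled into the comma-separated value expected by the gcloud CLI --properties flag. The profile name my-profile is a hypothetical example.

```python
# Sketch: assemble the profiling properties for a batch submission.
# "my-profile" is a hypothetical profile name.
properties = {
    "dataproc.profiling.enabled": "true",
    "dataproc.profiling.name": "my-profile",
}

# Join the key=value pairs into the single string passed to --properties.
properties_flag = "--properties=" + ",".join(
    f"{key}={value}" for key, value in properties.items()
)
print(properties_flag)
```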

Notes:

  • Serverless for Apache Spark sets the profiler version to either the batch UUID or the session UUID.
  • Profiler supports the following Spark workload types: Spark, PySpark, SparkSql, and SparkR.
  • A workload must run for more than three minutes to allow Profiler to collect and upload data to a project.
  • You can override profiling options submitted with a workload by constructing a SparkConf and then setting extraJavaOptions in your code. Note that setting extraJavaOptions properties when the workload is submitted doesn't override profiling options submitted with the workload.
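The override described in the last note can be sketched as follows. This is a minimal, hypothetical PySpark example (it assumes the pyspark package and a Spark runtime are available); the JVM option shown is a placeholder, since the actual profiler agent flags are appended by the service.

```python
from pyspark import SparkConf
from pyspark.sql import SparkSession

# Sketch: values set on a SparkConf in code take precedence over the
# extraJavaOptions that Serverless for Apache Spark appended at submit time.
conf = SparkConf()
conf.set(
    "spark.executor.extraJavaOptions",
    # Placeholder JVM option for illustration only; overriding this setting
    # replaces the profiling options submitted with the workload.
    "-XX:+UseG1GC",
)

spark = SparkSession.builder.config(conf=conf).getOrCreate()
```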

For an example of profiler options used with a batch submission, see the PySpark batch workload example.

Enable profiling

Complete the following steps to enable profiling on a workload:

  1. Enable the Profiler.
  2. If you are using a custom VM service account, grant the Cloud Profiler Agent role to the custom VM service account. This role contains required Profiler permissions.
  3. Set profiling properties when you submit a batch workload or create a session template.
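Step 2 above can be sketched with the gcloud CLI. PROJECT_ID and SERVICE_ACCOUNT_EMAIL are placeholders for your project and custom VM service account.

```shell
# Grant the Cloud Profiler Agent role (roles/cloudprofiler.agent) to a
# custom VM service account. PROJECT_ID and SERVICE_ACCOUNT_EMAIL are
# placeholders.
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:SERVICE_ACCOUNT_EMAIL" \
    --role="roles/cloudprofiler.agent"
```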

PySpark batch workload example

The following example uses the gcloud CLI to submit a PySpark batch workload with profiling enabled.

gcloud dataproc batches submit pyspark PYTHON_WORKLOAD_FILE \
    --region=REGION \
    --properties=dataproc.profiling.enabled=true,dataproc.profiling.name=PROFILE_NAME \
    --  other args

Two profiles are created:

  • PROFILE_NAME-driver to profile Spark driver tasks
  • PROFILE_NAME-executor to profile Spark executor tasks

View profiles

You can view profiles from Profiler in the Google Cloud console.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.