Profile Google Cloud Serverless for Apache Spark resource usage
This document describes how to profile Google Cloud Serverless for Apache Spark resource usage.

Cloud Profiler continuously gathers and reports application CPU usage and memory allocation information. You can enable profiling when you submit a batch or create a session workload by using the profiling properties listed in the following table. Google Cloud Serverless for Apache Spark appends related JVM options to the spark.driver.extraJavaOptions and spark.executor.extraJavaOptions configurations used for the workload.
| Option | Description | Value | Default |
|---|---|---|---|
| dataproc.profiling.enabled | Enable profiling of the workload | true or false | false |
| dataproc.profiling.name | Profile name on the Profiler service | PROFILE_NAME | spark-WORKLOAD_TYPE-WORKLOAD_ID, where WORKLOAD_TYPE is batch or session and WORKLOAD_ID is the batch or session ID |
Notes:
- Serverless for Apache Spark sets the profiler version to either the batch UUID or the session UUID.
- Profiler supports the following Spark workload types: Spark, PySpark, SparkSql, and SparkR.
- A workload must run for more than three minutes to allow Profiler to collect and upload data to a project.
- You can override profiling options submitted with a workload by constructing a SparkConf and then setting extraJavaOptions in your code, as shown in the sketch after this list. Note that setting extraJavaOptions properties when the workload is submitted doesn't override profiling options submitted with the workload.
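For example, a PySpark workload can construct its own SparkConf before creating the Spark session. The following is a minimal sketch; the agent path and flags are illustrative placeholders, since the exact JVM options that Serverless for Apache Spark appends are managed by the service:

```python
from pyspark import SparkConf
from pyspark.sql import SparkSession

conf = SparkConf()
# Illustrative Cloud Profiler Java agent flags; the actual options that
# Serverless for Apache Spark appends are managed by the service.
conf.set(
    "spark.driver.extraJavaOptions",
    "-agentpath:/opt/cprof/profiler_java_agent.so=-cprof_service=my-profile-driver",
)
conf.set(
    "spark.executor.extraJavaOptions",
    "-agentpath:/opt/cprof/profiler_java_agent.so=-cprof_service=my-profile-executor",
)

# Build the session with the overriding configuration.
spark = SparkSession.builder.config(conf=conf).getOrCreate()
```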
For an example of profiler options used with a batch submission, see the PySpark batch workload example.
Enable profiling
Complete the following steps to enable profiling on a workload:
- Enable the Profiler.
- If you are using a custom VM service account, grant the Cloud Profiler Agent role to the custom VM service account. This role contains required Profiler permissions. (The gcloud sketch after this list covers these first two steps.)
- Set profiling properties when you submit a batch workload or create a session template.
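The first two steps can be done with the gcloud CLI. A minimal sketch; PROJECT_ID and SERVICE_ACCOUNT_EMAIL are placeholders:

```
# Enable the Cloud Profiler API in the workload's project.
gcloud services enable cloudprofiler.googleapis.com --project=PROJECT_ID

# Grant the Cloud Profiler Agent role (roles/cloudprofiler.agent) to a
# custom VM service account.
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:SERVICE_ACCOUNT_EMAIL" \
    --role="roles/cloudprofiler.agent"
```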
PySpark batch workload example
The following example uses the gcloud CLI to submit a PySpark batch workload with profiling enabled.
```
gcloud dataproc batches submit pyspark PYTHON_WORKLOAD_FILE \
    --region=REGION \
    --properties=dataproc.profiling.enabled=true,dataproc.profiling.name=PROFILE_NAME \
    -- other args
```
Two profiles are created:
- PROFILE_NAME-driver to profile Spark driver tasks
- PROFILE_NAME-executor to profile Spark executor tasks
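Profiling can be enabled for sessions in the same way by setting the properties on a session template. The following is a minimal sketch of a template YAML, assuming the template is imported with gcloud beta dataproc session-templates import; the template contents shown here are illustrative:

```yaml
# template.yaml -- import with:
#   gcloud beta dataproc session-templates import TEMPLATE_NAME \
#       --location=REGION --source=template.yaml
jupyterSession:
  kernel: PYTHON
  displayName: Profiled session
runtimeConfig:
  properties:
    dataproc.profiling.enabled: "true"
    dataproc.profiling.name: PROFILE_NAME
```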
View profiles
You can view profiles from Profiler in the Google Cloud console.