Cluster metadata

Metadata compared to Labels
  • Custom metadata is available to processes running on your cluster, and can be used by initialization actions.
  • Labels are not readily available to processes running on your cluster, but can be used when searching through resources with the Dataproc API.
If you need a piece of data to be available to your cluster and also used as an API search parameter, then add it both as metadata and as a label to your cluster.
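As a rough sketch of the "both metadata and label" advice, the shell function below prints (rather than runs) a pair of gcloud commands that carry the same key and value in both places. The key `team` and value `analytics` are hypothetical, CLUSTER_NAME and REGION are placeholders, and the `--labels` flag and the `labels.KEY=VALUE` filter expression are assumptions drawn from general gcloud usage, not from this page.

```shell
# Print the commands instead of executing them, so the sketch is safe to run anywhere.
# Arguments: $1 = key, $2 = value (both hypothetical, for illustration only).
print_cluster_commands() {
  key="$1"
  value="$2"
  # Same pair in both flags: metadata for on-cluster processes, label for API searches.
  echo "gcloud dataproc clusters create CLUSTER_NAME --region=REGION --metadata=${key}=${value} --labels=${key}=${value}"
  # Assumed filter syntax for finding the cluster later by its label.
  echo "gcloud dataproc clusters list --region=REGION --filter=labels.${key}=${value}"
}

print_cluster_commands team analytics
```

Labels generally follow stricter naming rules than metadata, so a value shared between the two would need to satisfy the label rules.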

Dataproc sets special metadata values for the instances that run in your cluster:

Metadata key | Value
dataproc-bucket | Name of the cluster's staging bucket
dataproc-region | Region of the cluster's endpoint
dataproc-worker-count | Number of worker nodes in the cluster. The value is 0 for single node clusters.
dataproc-cluster-name | Name of the cluster
dataproc-cluster-uuid | UUID of the cluster
dataproc-role | Instance's role, either Master or Worker
dataproc-master | Hostname of the first master node. The value is either [CLUSTER_NAME]-m in a standard or single node cluster, or [CLUSTER_NAME]-m-0 in a high-availability cluster, where [CLUSTER_NAME] is the name of your cluster.
dataproc-master-additional | Comma-separated list of hostnames for the additional master nodes in a high-availability cluster, for example, [CLUSTER_NAME]-m-1,[CLUSTER_NAME]-m-2 in a cluster that has 3 master nodes.
SPARK_BQ_CONNECTOR_VERSION or SPARK_BQ_CONNECTOR_URL | The version or URL that points to a Spark BigQuery connector version to use in Spark applications, for example, 0.42.1 or gs://spark-lib/bigquery/spark-3.5-bigquery-0.42.1.jar. A default Spark BigQuery connector version is pre-installed in Dataproc 2.1 and later image version clusters. For more information, see Use the Spark BigQuery connector.

You can use these values to customize the behavior of initialization actions.
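One way an initialization action might branch on these values is sketched below. It assumes the `get_metadata_value` helper script is present at `/usr/share/google/` on the cluster image. The function names `get_attribute` and `node_setup_plan` are hypothetical, and the echoed messages stand in for real setup steps.

```shell
# Thin wrapper around the helper script (assumed path on Dataproc images).
# Argument: $1 = metadata key, for example dataproc-role.
get_attribute() {
  /usr/share/google/get_metadata_value "attributes/$1"
}

# Print which setup path this node would take, based on its dataproc-role value.
node_setup_plan() {
  role="$(get_attribute dataproc-role)"
  if [ "${role}" = "Master" ]; then
    echo "master-only setup for cluster $(get_attribute dataproc-cluster-name)"
  else
    echo "worker setup; first master is $(get_attribute dataproc-master)"
  fi
}
```

Calling `node_setup_plan` from an initialization action would report the branch taken on that node. Keeping the lookup in its own function makes it easy to replace with a stub when trying the logic off-cluster.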

You can use the --metadata flag in the gcloud dataproc clusters create command to provide your own metadata:

gcloud dataproc clusters create CLUSTER_NAME \
    --region=REGION \
    --metadata=name1=value1,name2=value2... \
    ... other flags ...
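Custom values passed this way can then be read on the cluster. The sketch below assumes the standard Compute Engine metadata server endpoint for instance attributes and falls back to a default when a key was not supplied. The function names `fetch_attribute` and `attribute_or_default` are hypothetical, and `name1` is the placeholder key from the command above.

```shell
# Fetch one instance attribute from the metadata server.
# curl -f makes a missing key (HTTP 404) return a non-zero exit status.
fetch_attribute() {
  curl -fs -H "Metadata-Flavor: Google" \
    "http://metadata.google.internal/computeMetadata/v1/instance/attributes/$1" 2>/dev/null
}

# Print the value of key $1, or the default $2 if the key cannot be read.
attribute_or_default() {
  value="$(fetch_attribute "$1")" || value="$2"
  echo "${value}"
}
```

For example, `attribute_or_default name1 fallback` would print the value given for `name1` at cluster creation, or `fallback` if the cluster was created without it.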

Last updated 2025-12-15 UTC.