gcloud alpha dataproc workflow-templates set-managed-cluster

NAME
gcloud alpha dataproc workflow-templates set-managed-cluster - set a managed cluster for the workflow template
SYNOPSIS
gcloud alpha dataproc workflow-templates set-managed-cluster (TEMPLATE : --region=REGION) [--autoscaling-policy=AUTOSCALING_POLICY] [--bucket=BUCKET] [--cluster-name=CLUSTER_NAME] [--cluster-type=TYPE] [--enable-component-gateway] [--initialization-action-timeout=TIMEOUT; default="10m"] [--initialization-actions=CLOUD_STORAGE_URI,[…]] [--labels=[KEY=VALUE,…]] [--master-accelerator=[type=TYPE,[count=COUNT],…]] [--master-boot-disk-provisioned-iops=MASTER_BOOT_DISK_PROVISIONED_IOPS] [--master-boot-disk-provisioned-throughput=MASTER_BOOT_DISK_PROVISIONED_THROUGHPUT] [--master-boot-disk-size=MASTER_BOOT_DISK_SIZE] [--master-boot-disk-type=MASTER_BOOT_DISK_TYPE] [--master-local-ssd-interface=MASTER_LOCAL_SSD_INTERFACE] [--master-machine-type=MASTER_MACHINE_TYPE] [--master-min-cpu-platform=PLATFORM] [--min-secondary-worker-fraction=MIN_SECONDARY_WORKER_FRACTION] [--node-group=NODE_GROUP] [--num-master-local-ssds=NUM_MASTER_LOCAL_SSDS] [--num-masters=NUM_MASTERS] [--num-secondary-worker-local-ssds=NUM_SECONDARY_WORKER_LOCAL_SSDS] [--num-worker-local-ssds=NUM_WORKER_LOCAL_SSDS] [--optional-components=[COMPONENT,…]] [--private-ipv6-google-access-type=PRIVATE_IPV6_GOOGLE_ACCESS_TYPE] [--properties=[PREFIX:PROPERTY=VALUE,…]] [--secondary-worker-accelerator=[type=TYPE,[count=COUNT],…]] [--secondary-worker-boot-disk-size=SECONDARY_WORKER_BOOT_DISK_SIZE] [--secondary-worker-boot-disk-type=SECONDARY_WORKER_BOOT_DISK_TYPE] [--secondary-worker-local-ssd-interface=SECONDARY_WORKER_LOCAL_SSD_INTERFACE] [--secondary-worker-machine-types=type=MACHINE_TYPE[,type=MACHINE_TYPE…][,rank=RANK]] [--secondary-worker-standard-capacity-base=SECONDARY_WORKER_STANDARD_CAPACITY_BASE] [--secondary-worker-standard-capacity-percent-above-base=SECONDARY_WORKER_STANDARD_CAPACITY_PERCENT_ABOVE_BASE] [--shielded-integrity-monitoring] [--shielded-secure-boot] [--shielded-vtpm] [--temp-bucket=TEMP_BUCKET] [--tier=TIER] [--worker-accelerator=[type=TYPE,[count=COUNT],…]] [--worker-boot-disk-provisioned-iops=WORKER_BOOT_DISK_PROVISIONED_IOPS] [--worker-boot-disk-provisioned-throughput=WORKER_BOOT_DISK_PROVISIONED_THROUGHPUT] [--worker-boot-disk-size=WORKER_BOOT_DISK_SIZE] [--worker-boot-disk-type=WORKER_BOOT_DISK_TYPE] [--worker-local-ssd-interface=WORKER_LOCAL_SSD_INTERFACE] [--worker-min-cpu-platform=PLATFORM] [--zone=ZONE, -z ZONE] [--dataproc-metastore=DATAPROC_METASTORE | --bigquery-metastore --bigquery-metastore-database-location=BIGQUERY_METASTORE_DATABASE_LOCATION --bigquery-metastore-project-id=BIGQUERY_METASTORE_PROJECT_ID] [--image=IMAGE | --image-version=VERSION] [--kerberos-config-file=KERBEROS_CONFIG_FILE | --enable-kerberos --kerberos-root-principal-password-uri=KERBEROS_ROOT_PRINCIPAL_PASSWORD_URI [--kerberos-kms-key=KERBEROS_KMS_KEY : --kerberos-kms-key-keyring=KERBEROS_KMS_KEY_KEYRING --kerberos-kms-key-location=KERBEROS_KMS_KEY_LOCATION --kerberos-kms-key-project=KERBEROS_KMS_KEY_PROJECT]] [--kms-key=KMS_KEY : --kms-keyring=KMS_KEYRING --kms-location=KMS_LOCATION --kms-project=KMS_PROJECT] [--metadata=KEY=VALUE,[KEY=VALUE,…] --resource-manager-tags=KEY=VALUE,[KEY=VALUE,…] --scopes=SCOPE,[SCOPE,…] --service-account=SERVICE_ACCOUNT --tags=TAG,[TAG,…] --network=NETWORK | --subnet=SUBNET --reservation=RESERVATION --reservation-affinity=RESERVATION_AFFINITY; default="any"] [--no-address | --public-ip-address] [--single-node | --min-num-workers=MIN_NUM_WORKERS --num-secondary-workers=NUM_SECONDARY_WORKERS --num-workers=NUM_WORKERS --secondary-worker-type=TYPE; default="preemptible"] [--worker-machine-type=WORKER_MACHINE_TYPE | --worker-machine-types=type=MACHINE_TYPE[,type=MACHINE_TYPE…][,rank=RANK]] [GCLOUD_WIDE_FLAG …]
DESCRIPTION
(ALPHA) Set a managed cluster for the workflow template.
EXAMPLES
To update the managed cluster in a workflow template, run:

gcloud alpha dataproc workflow-templates set-managed-cluster my_template --region=us-central1 --no-address --num-workers=10 --worker-machine-type=custom-6-23040
POSITIONAL ARGUMENTS
Template resource - The name of the workflow template to set the managed cluster on. The arguments in this group can be used to specify the attributes of this resource. (NOTE) Some attributes are not given arguments in this group but can be set in other ways.

To set the project attribute:

  • provide the argument template on the command line with a fully specified name;
  • provide the argument --project on the command line;
  • set the property core/project.

This must be specified.

TEMPLATE
ID of the template or fully qualified identifier for the template.

To set the template attribute:

  • provide the argument template on the command line.

This positional argument must be specified if any of the other arguments in this group are specified.

--region=REGION
Dataproc region for the template. Each Dataproc region constitutes an independent resource namespace constrained to deploying instances into Compute Engine zones inside the region. Overrides the default dataproc/region property value for this command invocation.

To set the region attribute:

  • provide the argument template on the command line with a fully specified name;
  • provide the argument --region on the command line;
  • set the property dataproc/region.
FLAGS
--autoscaling-policy=AUTOSCALING_POLICY
ID of the autoscaling policy or fully qualified identifier for the autoscaling policy.

To set the autoscaling_policy attribute:

  • provide the argument --autoscaling-policy on the command line.
--bucket=BUCKET
The Google Cloud Storage bucket to use by default to stage job dependencies, miscellaneous config files, and job driver console output when using this cluster.
--cluster-name=CLUSTER_NAME
The name of the managed Dataproc cluster. If unspecified, the workflow template ID will be used.
--cluster-type=TYPE
The type of cluster. TYPE must be one of: standard, single-node, zero-scale.
--enable-component-gateway
Enable access to the web UIs of selected components on the cluster through the component gateway.
--initialization-action-timeout=TIMEOUT; default="10m"
The maximum duration of each initialization action. See $ gcloud topic datetimes for information on duration formats.
--initialization-actions=CLOUD_STORAGE_URI,[…]
A list of Google Cloud Storage URIs of executables to run on each node in the cluster.
--labels=[KEY=VALUE,…]
List of label KEY=VALUE pairs to add.

Keys must start with a lowercase character and contain only hyphens (-), underscores (_), lowercase characters, and numbers. Values must contain only hyphens (-), underscores (_), lowercase characters, and numbers.

--master-accelerator=[type=TYPE,[count=COUNT],…]
Attaches accelerators, such as GPUs, to the master instance(s).
type
The specific type of accelerator to attach to the instances, such as nvidia-tesla-t4 for NVIDIA T4. Use gcloud compute accelerator-types list to display available accelerator types.
count
The number of accelerators to attach to each instance. The default value is 1.
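For example, to attach one NVIDIA T4 GPU to each master instance (an illustrative choice; use gcloud compute accelerator-types list to check availability in your zone), you might pass:

--master-accelerator type=nvidia-tesla-t4,count=1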
--master-boot-disk-provisioned-iops=MASTER_BOOT_DISK_PROVISIONED_IOPS
Indicates the IOPS to provision for the disk. This sets the limit for disk I/O operations per second. This is only supported if the boot disk type is hyperdisk-balanced.
--master-boot-disk-provisioned-throughput=MASTER_BOOT_DISK_PROVISIONED_THROUGHPUT
Indicates the throughput to provision for the disk. This sets the limit for throughput in MiB per second. This is only supported if the boot disk type is hyperdisk-balanced.
--master-boot-disk-size=MASTER_BOOT_DISK_SIZE
The size of the boot disk. The value must be a whole number followed by a size unit of KB for kilobyte, MB for megabyte, GB for gigabyte, or TB for terabyte. For example, 10GB will produce a 10 gigabyte disk. The minimum size a boot disk can have is 10 GB. Disk size must be a multiple of 1 GB.
--master-boot-disk-type=MASTER_BOOT_DISK_TYPE
The type of the boot disk. The value must be pd-balanced, pd-ssd, or pd-standard.
--master-local-ssd-interface=MASTER_LOCAL_SSD_INTERFACE
Interface to use to attach local SSDs to master node(s) in a cluster.
--master-machine-type=MASTER_MACHINE_TYPE
The type of machine to use for the master. Defaults to server-specified.
--master-min-cpu-platform=PLATFORM
When specified, the VM is scheduled on the host with a specified CPU architecture or a more recent CPU platform that's available in that zone. To list available CPU platforms in a zone, run:

gcloud compute zones describe ZONE

CPU platform selection may not be available in a zone. Zones that support CPU platform selection provide an availableCpuPlatforms field, which contains the list of available CPU platforms in the zone (see Availability of CPU platforms for more information).

--min-secondary-worker-fraction=MIN_SECONDARY_WORKER_FRACTION
Minimum fraction of secondary worker nodes required to create the cluster. If it is not met, cluster creation will fail. Must be a decimal value between 0 and 1. The number of required secondary workers is calculated by ceil(min-secondary-worker-fraction * num_secondary_workers). Defaults to 0.0001.
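For example, with --num-secondary-workers=10 and --min-secondary-worker-fraction=0.5, ceil(0.5 * 10) = 5 secondary workers must be provisioned for cluster creation to succeed.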
--node-group=NODE_GROUP
The name of the sole-tenant node group to create the cluster on. Can be a short name ("node-group-name") or in the format "projects/{project-id}/zones/{zone}/nodeGroups/{node-group-name}".
--num-master-local-ssds=NUM_MASTER_LOCAL_SSDS
The number of local SSDs to attach to the master in a cluster.
--num-masters=NUM_MASTERS
The number of master nodes in the cluster.
Number of Masters   Cluster Mode
1                   Standard
3                   High Availability
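For example, --num-masters=3 requests a High Availability cluster.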
--num-secondary-worker-local-ssds=NUM_SECONDARY_WORKER_LOCAL_SSDS
The number of local SSDs to attach to each preemptible worker in a cluster.
--num-worker-local-ssds=NUM_WORKER_LOCAL_SSDS
The number of local SSDs to attach to each worker in a cluster.
--optional-components=[COMPONENT,…]
List of optional components to be installed on cluster machines.

The following page documents the optional components that can be installed: https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/optional-components.

--private-ipv6-google-access-type=PRIVATE_IPV6_GOOGLE_ACCESS_TYPE
The private IPv6 Google access type for the cluster. PRIVATE_IPV6_GOOGLE_ACCESS_TYPE must be one of: inherit-subnetwork, outbound, bidirectional.
--properties=[PREFIX:PROPERTY=VALUE,…]
Specifies configuration properties for installed packages, such as Hadoop and Spark.

Properties are mapped to configuration files by specifying a prefix, such as "core:io.serializations". The following are supported prefixes and their mappings:

Prefix               File                     Purpose of file
capacity-scheduler   capacity-scheduler.xml   Hadoop YARN Capacity Scheduler configuration
core                 core-site.xml            Hadoop general configuration
distcp               distcp-default.xml       Hadoop Distributed Copy configuration
hadoop-env           hadoop-env.sh            Hadoop specific environment variables
hdfs                 hdfs-site.xml            Hadoop HDFS configuration
hive                 hive-site.xml            Hive configuration
mapred               mapred-site.xml          Hadoop MapReduce configuration
mapred-env           mapred-env.sh            Hadoop MapReduce specific environment variables
pig                  pig.properties           Pig configuration
spark                spark-defaults.conf      Spark configuration
spark-env            spark-env.sh             Spark specific environment variables
yarn                 yarn-site.xml            Hadoop YARN configuration
yarn-env             yarn-env.sh              Hadoop YARN specific environment variables
See https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/cluster-properties for more information.
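As an illustration (the property names and values below are examples only; see the page above for the properties your image version supports), one flag can set properties in several configuration files at once:

--properties=spark:spark.executor.memory=4g,hdfs:dfs.replication=2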
--secondary-worker-accelerator=[type=TYPE,[count=COUNT],…]
Attaches accelerators, such as GPUs, to the secondary-worker instance(s).
type
The specific type of accelerator to attach to the instances, such as nvidia-tesla-t4 for NVIDIA T4. Use gcloud compute accelerator-types list to display available accelerator types.
count
The number of accelerators to attach to each instance. The default value is 1.
--secondary-worker-boot-disk-size=SECONDARY_WORKER_BOOT_DISK_SIZE
The size of the boot disk. The value must be a whole number followed by a size unit of KB for kilobyte, MB for megabyte, GB for gigabyte, or TB for terabyte. For example, 10GB will produce a 10 gigabyte disk. The minimum size a boot disk can have is 10 GB. Disk size must be a multiple of 1 GB.
--secondary-worker-boot-disk-type=SECONDARY_WORKER_BOOT_DISK_TYPE
The type of the boot disk. The value must be pd-balanced, pd-ssd, or pd-standard.
--secondary-worker-local-ssd-interface=SECONDARY_WORKER_LOCAL_SSD_INTERFACE
Interface to use to attach local SSDs to each secondary worker in a cluster.
--secondary-worker-machine-types=type=MACHINE_TYPE[,type=MACHINE_TYPE…][,rank=RANK]
Types of machines with optional rank for secondary workers to use. Defaults to server-specified. Example use: --secondary-worker-machine-types="type=e2-standard-8,type=t2d-standard-8,rank=0".
--secondary-worker-standard-capacity-base=SECONDARY_WORKER_STANDARD_CAPACITY_BASE
This flag sets the base number of Standard VMs to use for secondary workers. Dataproc will create only Standard VMs until it reaches this number, then it will mix Spot and Standard VMs according to SECONDARY_WORKER_STANDARD_CAPACITY_PERCENT_ABOVE_BASE.
--secondary-worker-standard-capacity-percent-above-base=SECONDARY_WORKER_STANDARD_CAPACITY_PERCENT_ABOVE_BASE
When combining Standard and Spot VMs for secondary workers, once the number of Standard VMs specified by SECONDARY_WORKER_STANDARD_CAPACITY_BASE has been used, this flag specifies the percentage of the additional secondary workers that will be Standard VMs; Spot VMs will be used for the remaining percentage.
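As an illustrative reading of these two flags (all values hypothetical): with --secondary-worker-standard-capacity-base=4, --secondary-worker-standard-capacity-percent-above-base=25, and --num-secondary-workers=12, the first 4 secondary workers are Standard VMs; of the remaining 8, 25% (2) are Standard and 75% (6) are Spot, for 6 Standard and 6 Spot VMs in total.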
--shielded-integrity-monitoring
Enables monitoring and attestation of the boot integrity of the cluster's VMs. vTPM (virtual Trusted Platform Module) must also be enabled. A TPM is a hardware module that can be used for different security operations, such as remote attestation, encryption, and sealing of keys.
--shielded-secure-boot
The cluster's VMs will boot with secure boot enabled.
--shielded-vtpm
The cluster's VMs will boot with the TPM (Trusted Platform Module) enabled. A TPM is a hardware module that can be used for different security operations, such as remote attestation, encryption, and sealing of keys.
--temp-bucket=TEMP_BUCKET
The Google Cloud Storage bucket to use by default to store ephemeral cluster and jobs data, such as Spark and MapReduce history files.
--tier=TIER
Cluster tier. TIER must be one of: premium, standard.
--worker-accelerator=[type=TYPE,[count=COUNT],…]
Attaches accelerators, such as GPUs, to the worker instance(s).
type
The specific type of accelerator to attach to the instances, such as nvidia-tesla-t4 for NVIDIA T4. Use gcloud compute accelerator-types list to display available accelerator types.
count
The number of accelerators to attach to each instance. The default value is 1.
--worker-boot-disk-provisioned-iops=WORKER_BOOT_DISK_PROVISIONED_IOPS
Indicates the IOPS to provision for the disk. This sets the limit for disk I/O operations per second. This is only supported if the boot disk type is hyperdisk-balanced.
--worker-boot-disk-provisioned-throughput=WORKER_BOOT_DISK_PROVISIONED_THROUGHPUT
Indicates the throughput to provision for the disk. This sets the limit for throughput in MiB per second. This is only supported if the boot disk type is hyperdisk-balanced.
--worker-boot-disk-size=WORKER_BOOT_DISK_SIZE
The size of the boot disk. The value must be a whole number followed by a size unit of KB for kilobyte, MB for megabyte, GB for gigabyte, or TB for terabyte. For example, 10GB will produce a 10 gigabyte disk. The minimum size a boot disk can have is 10 GB. Disk size must be a multiple of 1 GB.
--worker-boot-disk-type=WORKER_BOOT_DISK_TYPE
The type of the boot disk. The value must be pd-balanced, pd-ssd, or pd-standard.
--worker-local-ssd-interface=WORKER_LOCAL_SSD_INTERFACE
Interface to use to attach local SSDs to each worker in a cluster.
--worker-min-cpu-platform=PLATFORM
When specified, the VM is scheduled on the host with a specified CPU architecture or a more recent CPU platform that's available in that zone. To list available CPU platforms in a zone, run:

gcloud compute zones describe ZONE

CPU platform selection may not be available in a zone. Zones that support CPU platform selection provide an availableCpuPlatforms field, which contains the list of available CPU platforms in the zone (see Availability of CPU platforms for more information).

--zone=ZONE, -z ZONE
The compute zone (e.g. us-central1-a) for the cluster. If empty and --region is set to a value other than global, the server will pick a zone in the region. Overrides the default compute/zone property value for this command invocation.
At most one of these can be specified:
--dataproc-metastore=DATAPROC_METASTORE
Specify the name of a Dataproc Metastore service to be used as an external metastore in the format: "projects/{project-id}/locations/{region}/services/{service-name}".
BQMS flags
--bigquery-metastore
Indicates that BigQuery metastore is to be used.
--bigquery-metastore-database-location=BIGQUERY_METASTORE_DATABASE_LOCATION
Location of the BigQuery metastore database to be used as an external metastore.
--bigquery-metastore-project-id=BIGQUERY_METASTORE_PROJECT_ID
The project ID of the BigQuery metastore database to be used as an external metastore.
At most one of these can be specified:
--image=IMAGE
The custom image used to create the cluster. It can be the image name, the image URI, or the image family URI, which selects the latest image from the family.
--image-version=VERSION
The image version to use for the cluster. Defaults to the latest version.
Specifying these flags will enable Kerberos for the cluster.

At most one of these can be specified:

--kerberos-config-file=KERBEROS_CONFIG_FILE
Path to a YAML (or JSON) file containing the configuration for Kerberos on the cluster. If you pass - as the value of the flag the file content will be read from stdin.

The YAML file is formatted as follows:

# Optional. Flag to indicate whether to Kerberize the cluster.
# The default value is true.
enable_kerberos: true

# Optional. The Google Cloud Storage URI of a KMS encrypted file
# containing the root principal password.
root_principal_password_uri: gs://bucket/password.encrypted

# Optional. The URI of the Cloud KMS key used to encrypt
# sensitive files.
kms_key_uri: projects/myproject/locations/global/keyRings/mykeyring/cryptoKeys/my-key

# Configuration of SSL encryption. If specified, all sub-fields
# are required. Otherwise, Dataproc will provide a self-signed
# certificate and generate the passwords.
ssl:
  # Optional. The Google Cloud Storage URI of the keystore file.
  keystore_uri: gs://bucket/keystore.jks

  # Optional. The Google Cloud Storage URI of a KMS encrypted
  # file containing the password to the keystore.
  keystore_password_uri: gs://bucket/keystore_password.encrypted

  # Optional. The Google Cloud Storage URI of a KMS encrypted
  # file containing the password to the user provided key.
  key_password_uri: gs://bucket/key_password.encrypted

  # Optional. The Google Cloud Storage URI of the truststore
  # file.
  truststore_uri: gs://bucket/truststore.jks

  # Optional. The Google Cloud Storage URI of a KMS encrypted
  # file containing the password to the user provided
  # truststore.
  truststore_password_uri: gs://bucket/truststore_password.encrypted

# Configuration of cross realm trust.
cross_realm_trust:
  # Optional. The remote realm the Dataproc on-cluster KDC will
  # trust, should the user enable cross realm trust.
  realm: REMOTE.REALM

  # Optional. The KDC (IP or hostname) for the remote trusted
  # realm in a cross realm trust relationship.
  kdc: kdc.remote.realm

  # Optional. The admin server (IP or hostname) for the remote
  # trusted realm in a cross realm trust relationship.
  admin_server: admin-server.remote.realm

  # Optional. The Google Cloud Storage URI of a KMS encrypted
  # file containing the shared password between the on-cluster
  # Kerberos realm and the remote trusted realm, in a cross
  # realm trust relationship.
  shared_password_uri: gs://bucket/cross-realm.password.encrypted

# Optional. The Google Cloud Storage URI of a KMS encrypted file
# containing the master key of the KDC database.
kdc_db_key_uri: gs://bucket/kdc_db_key.encrypted

# Optional. The lifetime of the ticket granting ticket, in
# hours. If not specified, or user specifies 0, then default
# value 10 will be used.
tgt_lifetime_hours: 1

# Optional. The name of the Kerberos realm. If not specified,
# the uppercased domain name of the cluster will be used.
realm: REALM.NAME
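A file in this format (the file name below is hypothetical) can then be supplied when setting the managed cluster:

gcloud alpha dataproc workflow-templates set-managed-cluster my_template --region=us-central1 --kerberos-config-file=kerberos-config.yaml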
--enable-kerberos
Enable Kerberos on the cluster.
--kerberos-root-principal-password-uri=KERBEROS_ROOT_PRINCIPAL_PASSWORD_URI
Google Cloud Storage URI of a KMS encrypted file containing the root principal password. Must be a Cloud Storage URL beginning with 'gs://'.
Key resource - The Cloud KMS (Key Management Service) cryptokey that will be used to protect the password. The 'Compute Engine Service Agent' service account must hold permission 'Cloud KMS CryptoKey Encrypter/Decrypter'. The arguments in this group can be used to specify the attributes of this resource.
--kerberos-kms-key=KERBEROS_KMS_KEY
ID of the key or fully qualified identifier for the key.

To set the kms-key attribute:

  • provide the argument --kerberos-kms-key on the command line.

This flag argument must be specified if any of the other arguments in this group are specified.

--kerberos-kms-key-keyring=KERBEROS_KMS_KEY_KEYRING
The KMS keyring of the key.

To set the kms-keyring attribute:

  • provide the argument --kerberos-kms-key on the command line with a fully specified name;
  • provide the argument --kerberos-kms-key-keyring on the command line.
--kerberos-kms-key-location=KERBEROS_KMS_KEY_LOCATION
The Google Cloud location for the key.

To set the kms-location attribute:

  • provide the argument --kerberos-kms-key on the command line with a fully specified name;
  • provide the argument --kerberos-kms-key-location on the command line.
--kerberos-kms-key-project=KERBEROS_KMS_KEY_PROJECT
The Google Cloud project for the key.

To set the kms-project attribute:

  • provide the argument --kerberos-kms-key on the command line with a fully specified name;
  • provide the argument --kerberos-kms-key-project on the command line;
  • set the property core/project.
Key resource - The Cloud KMS (Key Management Service) cryptokey that will be used to protect the cluster. The 'Compute Engine Service Agent' service account must hold permission 'Cloud KMS CryptoKey Encrypter/Decrypter'. The arguments in this group can be used to specify the attributes of this resource.
--kms-key=KMS_KEY
ID of the key or fully qualified identifier for the key.

To set the kms-key attribute:

  • provide the argument --kms-key on the command line.

This flag argument must be specified if any of the other arguments in this group are specified.

--kms-keyring=KMS_KEYRING
The KMS keyring of the key.

To set the kms-keyring attribute:

  • provide the argument --kms-key on the command line with a fully specified name;
  • provide the argument --kms-keyring on the command line.
--kms-location=KMS_LOCATION
The Google Cloud location for the key.

To set the kms-location attribute:

  • provide the argument --kms-key on the command line with a fully specified name;
  • provide the argument --kms-location on the command line.
--kms-project=KMS_PROJECT
The Google Cloud project for the key.

To set the kms-project attribute:

  • provide the argument --kms-key on the command line with a fully specified name;
  • provide the argument --kms-project on the command line;
  • set the property core/project.
Compute Engine options for Dataproc clusters.
--metadata=KEY=VALUE,[KEY=VALUE,…]
Metadata to be made available to the guest operating system running on the instances.
--resource-manager-tags=KEY=VALUE,[KEY=VALUE,…]
Specifies a list of resource manager tags to apply to each cluster node (master and worker nodes).
--scopes=SCOPE,[SCOPE,…]
Specifies scopes for the node instances. Multiple SCOPEs can be specified, separated by commas. Examples:

gcloud alpha dataproc workflow-templates set-managed-cluster example-cluster --scopes https://www.googleapis.com/auth/bigtable.admin

gcloud alpha dataproc workflow-templates set-managed-cluster example-cluster --scopes sqlservice,bigquery

The following minimum scopes are necessary for the cluster to function properly and are always added, even if not explicitly specified:

https://www.googleapis.com/auth/devstorage.read_write
https://www.googleapis.com/auth/logging.write

If the --scopes flag is not specified, the following default scopes are also included:

https://www.googleapis.com/auth/bigquery
https://www.googleapis.com/auth/bigtable.admin.table
https://www.googleapis.com/auth/bigtable.data
https://www.googleapis.com/auth/devstorage.full_control

If you want to enable all scopes use the 'cloud-platform' scope.

SCOPE can be either the full URI of the scope or an alias. Default scopes are assigned to all instances. Available aliases are:

Alias                   URI
bigquery                https://www.googleapis.com/auth/bigquery
cloud-platform          https://www.googleapis.com/auth/cloud-platform
cloud-source-repos      https://www.googleapis.com/auth/source.full_control
cloud-source-repos-ro   https://www.googleapis.com/auth/source.read_only
compute-ro              https://www.googleapis.com/auth/compute.readonly
compute-rw              https://www.googleapis.com/auth/compute
datastore               https://www.googleapis.com/auth/datastore
default                 https://www.googleapis.com/auth/devstorage.read_only
                        https://www.googleapis.com/auth/logging.write
                        https://www.googleapis.com/auth/monitoring.write
                        https://www.googleapis.com/auth/pubsub
                        https://www.googleapis.com/auth/service.management.readonly
                        https://www.googleapis.com/auth/servicecontrol
                        https://www.googleapis.com/auth/trace.append
gke-default             https://www.googleapis.com/auth/devstorage.read_only
                        https://www.googleapis.com/auth/logging.write
                        https://www.googleapis.com/auth/monitoring
                        https://www.googleapis.com/auth/service.management.readonly
                        https://www.googleapis.com/auth/servicecontrol
                        https://www.googleapis.com/auth/trace.append
logging-write           https://www.googleapis.com/auth/logging.write
monitoring              https://www.googleapis.com/auth/monitoring
monitoring-read         https://www.googleapis.com/auth/monitoring.read
monitoring-write        https://www.googleapis.com/auth/monitoring.write
pubsub                  https://www.googleapis.com/auth/pubsub
service-control         https://www.googleapis.com/auth/servicecontrol
service-management      https://www.googleapis.com/auth/service.management.readonly
sql (deprecated)        https://www.googleapis.com/auth/sqlservice
sql-admin               https://www.googleapis.com/auth/sqlservice.admin
storage-full            https://www.googleapis.com/auth/devstorage.full_control
storage-ro              https://www.googleapis.com/auth/devstorage.read_only
storage-rw              https://www.googleapis.com/auth/devstorage.read_write
taskqueue               https://www.googleapis.com/auth/taskqueue
trace                   https://www.googleapis.com/auth/trace.append
userinfo-email          https://www.googleapis.com/auth/userinfo.email
DEPRECATION WARNING: The https://www.googleapis.com/auth/sqlservice account scope and sql alias do not provide SQL instance management capabilities and have been deprecated. Please use https://www.googleapis.com/auth/sqlservice.admin or sql-admin to manage your Google SQL Service instances.
--service-account=SERVICE_ACCOUNT
The Google Cloud IAM service account to be authenticated as.
--tags=TAG,[TAG,…]
Specifies a list of tags to apply to the instance. These tags allow network firewall rules and routes to be applied to specified VM instances. See gcloud compute firewall-rules create(1) for more details.

To read more about configuring network tags, read this guide: https://cloud.google.com/vpc/docs/add-remove-network-tags

To list instances with their respective status and tags, run:

gcloud compute instances list --format='table(name,status,tags.list())'

To list instances tagged with a specific tag, tag1, run:

gcloud compute instances list --filter='tags:tag1'
At most one of these can be specified:
--network=NETWORK
The Compute Engine network that the VM instances of the cluster will be part of. This is mutually exclusive with --subnet. If neither is specified, this defaults to the "default" network.
--subnet=SUBNET
Specifies the subnet that the cluster will be part of. This is mutually exclusive with --network.
Specifies the reservation for the instance.
--reservation=RESERVATION
The name of the reservation, required when --reservation-affinity=specific.
--reservation-affinity=RESERVATION_AFFINITY; default="any"
The type of reservation for the instance. RESERVATION_AFFINITY must be one of: any, none, specific.
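For example, to target a specific reservation (the reservation name below is hypothetical):

--reservation-affinity=specific --reservation=my-reservation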
At most one of these can be specified:
--no-address
If provided, the instances in the cluster will not be assigned external IP addresses.

If omitted, then the Dataproc service will apply a default policy to determine if each instance in the cluster gets an external IP address or not.

Note: Dataproc VMs need access to the Dataproc API. This can be achieved without external IP addresses using Private Google Access (https://cloud.google.com/compute/docs/private-google-access).

--public-ip-address
If provided, cluster instances are assigned external IP addresses.

If omitted, the Dataproc service applies a default policy to determine whether or not each instance in the cluster gets an external IP address.

Note: Dataproc VMs need access to the Dataproc API. This can be achieved without external IP addresses using Private Google Access (https://cloud.google.com/compute/docs/private-google-access).

At most one of these can be specified:
--single-node
Create a single node cluster.

A single node cluster has all master and worker components. It cannot have any separate worker nodes. If this flag is not specified, a cluster with separate workers is created.
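For example, to configure the managed cluster as a single node cluster (reusing the template name from the EXAMPLES section above):

gcloud alpha dataproc workflow-templates set-managed-cluster my_template --region=us-central1 --single-node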

Multi-node cluster flags
--min-num-workers=MIN_NUM_WORKERS
Minimum number of primary worker nodes to provision for cluster creation to succeed.
--num-secondary-workers=NUM_SECONDARY_WORKERS
The number of secondary worker nodes in the cluster.
--num-workers=NUM_WORKERS
The number of worker nodes in the cluster. Defaults to server-specified.
--secondary-worker-type=TYPE; default="preemptible"
The type of the secondary worker group. TYPE must be one of: preemptible, non-preemptible, spot.
At most one of these can be specified:
--worker-machine-type=WORKER_MACHINE_TYPE
The type of machine to use for primary workers. Defaults to server-specified.
--worker-machine-types=type=MACHINE_TYPE[,type=MACHINE_TYPE…][,rank=RANK]
Machine types for primary worker nodes to use with optional rank. A lower rank number is given higher preference. Based on availability, Dataproc tries to create primary worker VMs using the worker machine type with the lowest rank, and then tries to use machine types with higher ranks as necessary. Machine types with the same rank are given the same preference. Example use: --worker-machine-types="type=e2-standard-8,type=n2-standard-8,rank=0". For more information, see Dataproc Flexible VMs.
GCLOUD WIDE FLAGS
These flags are available to all commands: --access-token-file, --account, --billing-project, --configuration, --flags-file, --flatten, --format, --help, --impersonate-service-account, --log-http, --project, --quiet, --trace-token, --user-output-enabled, --verbosity.

Run $ gcloud help for details.

NOTES
This command is currently in alpha and might change without notice. If this command fails with API permission errors despite specifying the correct project, you might be trying to access an API with an invitation-only early access allowlist. These variants are also available:
gcloud dataproc workflow-templates set-managed-cluster

gcloud beta dataproc workflow-templates set-managed-cluster
