Dataproc service accounts

This page describes service accounts and VM access scopes and how they are used with Dataproc.

Security requirement beginning August 3, 2020: Dataproc users are required to have service account ActAs permission to deploy Dataproc resources, such as creating clusters and submitting jobs. The Service Account User role contains this permission. See Roles for service account authentication for information about required Dataproc roles.

Opt-in for existing Dataproc users: Existing Dataproc users as of August 3, 2020 can opt in to this security requirement (see Securing Dataproc, Dataflow, and Cloud Data Fusion).

What are service accounts?

A service account is a special account that can be used by services and applications running on a Compute Engine virtual machine (VM) instance to interact with other Google Cloud APIs. Applications can use service account credentials to authorize themselves to a set of APIs and perform actions on the VM within the permissions granted to the service account.

Use your identity instead of a VM service account. See Dataproc Personal Cluster Authentication to run interactive workloads on a cluster as your user identity.

Dataproc cluster service accounts

The following service accounts must have the permissions necessary to perform Dataproc actions in the project where your cluster is located.

Dataproc VM service account

The VMs in a Dataproc cluster use a service account for Dataproc data plane operations. The Compute Engine default service account, project_number-compute@developer.gserviceaccount.com, is used as the VM service account unless you specify a custom service account when you create a cluster. The VM service account must have the Dataproc Worker role, which includes the permissions required for Dataproc data plane operations. For more information, see Dataproc roles.

Note: For successful Dataproc cluster operation, the Dataproc VM service account must have access to the Dataproc staging and temp buckets. If you modify default VM service account IAM permissions or specify a custom VM service account for your cluster, you must grant the VM service account the storage.* IAM permissions included in the roles/dataproc.worker role. Also see IAM permissions for Cloud Storage and IAM roles for Cloud Storage.

View VM service account roles

To view the roles granted to the Dataproc VM service account, do the following:

  1. In the Google Cloud console, go to the IAM page.

  2. Click Include Google-provided role grants.

  3. View the roles listed for the VM service account. The required Dataproc Worker role is listed for the Compute Engine default service account (project_number-compute@developer.gserviceaccount.com), which Dataproc uses by default as the VM service account.

  4. You can click the pencil icon displayed on the service account row to grant or remove service account roles.
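
The same check can be sketched from the command line. This is a minimal sketch, assuming the default VM service account is in use: it builds the Compute Engine default service account email from a project number and lists the roles bound to it with `gcloud projects get-iam-policy`. The function names, the `run` helper, and the `DRY_RUN` variable are hypothetical and exist only for illustration. Commands are printed rather than executed unless you set `DRY_RUN=0`.

```shell
#!/usr/bin/env bash
# Sketch: list the roles bound to the default Dataproc VM service account.
# The helper names and DRY_RUN switch are illustrative, not part of gcloud.

# Print the command (default) or execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Build the Compute Engine default service account email from a project number.
default_vm_sa_email() {
  printf '%s-compute@developer.gserviceaccount.com\n' "$1"
}

# List the roles granted on a project to its default VM service account.
# Arguments: PROJECT_ID PROJECT_NUMBER (placeholders that you supply).
list_vm_sa_roles() {
  local project_id="$1" project_number="$2" sa_email
  sa_email="$(default_vm_sa_email "$project_number")"
  run gcloud projects get-iam-policy "$project_id" \
    --flatten="bindings[].members" \
    --filter="bindings.members:serviceAccount:${sa_email}" \
    --format="value(bindings.role)"
}
```

For example, `DRY_RUN=0 list_vm_sa_roles my-project 123456789012` would run the query against a project; `roles/dataproc.worker` should appear in the output when the required role is granted.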

Dataproc Service Agent service account

Dataproc creates the Service Agent service account, service-project_number@dataproc-accounts.iam.gserviceaccount.com, and grants the service account the Dataproc Service Agent role in a Google Cloud project. This service account performs Dataproc control plane operations, such as the creation, update, and deletion of cluster VMs. You can't replace this service account with a custom VM service account when you create a cluster.

To view Dataproc service agent roles on a project, click "Include Google-provided role grants" on the Identity and Access Management page in the Google Cloud console.

Note: The Dataproc service agent service account is not listed on the IAM page if it does not have any roles on the project.

Role grant to Service Agent service account in a Shared VPC network

If a Dataproc cluster uses a Shared VPC network, a Shared VPC Admin must grant the Dataproc Service Agent service account the role of Network User for the Shared VPC host project.
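
A minimal sketch of that grant with the gcloud CLI, assuming the Network User role corresponds to `roles/compute.networkUser` and that the service agent email follows the pattern given earlier on this page. `HOST_PROJECT_ID` and the cluster project number are placeholders, and the function names, `run` helper, and `DRY_RUN` variable are hypothetical. The command is printed rather than executed unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Sketch: grant the Dataproc Service Agent the Network User role on a
# Shared VPC host project. Helper names are illustrative only.

# Print the command (default) or execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Build the Dataproc Service Agent email from the cluster project number.
dataproc_service_agent_email() {
  printf 'service-%s@dataproc-accounts.iam.gserviceaccount.com\n' "$1"
}

# Arguments: HOST_PROJECT_ID CLUSTER_PROJECT_NUMBER
grant_network_user() {
  local host_project_id="$1" cluster_project_number="$2" agent
  agent="$(dataproc_service_agent_email "$cluster_project_number")"
  run gcloud projects add-iam-policy-binding "$host_project_id" \
    --member="serviceAccount:${agent}" \
    --role="roles/compute.networkUser"
}
```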

Create a cluster with a custom VM service account

When you create a cluster, you can specify a custom VM service account that your cluster will use for Dataproc data plane operations instead of the default VM service account. You can't change the VM service account after the cluster is created. Using a VM service account with assigned IAM roles lets you provide your cluster with fine-grained access to project resources.

Objective: This section shows you how to specify a custom VM service account when you create a cluster. See Create a cluster with a custom VM service account from another project to specify a VM service account from a different project.

Preliminary steps

  1. Create the custom VM service account within the project where the cluster will be created.

  2. Grant the custom VM service account the Dataproc Worker role on the project and any additional roles needed by your jobs, such as the BigQuery Reader and Writer roles (see Dataproc roles).

    gcloud CLI example:

    • The following sample command grants the custom VM service account in the cluster project the Dataproc Worker role at the project level:

        gcloud projects add-iam-policy-binding CLUSTER_PROJECT_ID \
            --member=serviceAccount:SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com \
            --role="roles/dataproc.worker"

    • Consider a custom role: Instead of granting the service account the predefined Dataproc Worker role (roles/dataproc.worker), you can grant the service account a custom role that contains Worker role permissions but limits the storage.objects.* permissions.
      • The custom role must at least grant the VM service account storage.objects.create, storage.objects.get, and storage.objects.update permissions on the objects in the Dataproc staging and temp buckets and on any additional buckets needed by jobs that will run on the cluster.
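
One way to derive the permission list for such a custom role is sketched below. It is a minimal sketch under stated assumptions: the filter reads permission names, one per line, keeps everything outside `storage.objects.*`, and keeps only the three `storage.objects` permissions named above. The function name is hypothetical, and whether you want to keep every non-storage-object permission of the Worker role is your decision, not something this page prescribes.

```shell
#!/usr/bin/env bash
# Sketch: narrow a list of permissions to build a custom role that keeps
# only storage.objects.create, storage.objects.get, and storage.objects.update
# from the storage.objects.* family. The function name is illustrative.

# Reads permission names on stdin (one per line), writes the kept ones to stdout.
limit_storage_object_permissions() {
  local perm
  while IFS= read -r perm; do
    case "$perm" in
      storage.objects.create|storage.objects.get|storage.objects.update)
        printf '%s\n' "$perm" ;;   # the minimum set named in this section
      storage.objects.*) ;;          # drop every other storage.objects.* permission
      '') ;;                         # skip blank lines
      *) printf '%s\n' "$perm" ;;   # keep all other permissions unchanged
    esac
  done
}

# Possible usage (assumes gcloud prints one permission per line with --flatten):
#   gcloud iam roles describe roles/dataproc.worker \
#       --flatten="includedPermissions[]" \
#       --format="value(includedPermissions)" \
#     | limit_storage_object_permissions
```

The filtered list could then be passed, comma-separated, to the `--permissions` flag of `gcloud iam roles create`. Review the result before creating the role, because jobs that delete or list objects would need additional permissions.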

Create the cluster

  • Create the cluster in your project.

gcloud Command

Use the gcloud dataproc clusters create command to create a cluster with the custom VM service account.

gcloud dataproc clusters create CLUSTER_NAME \
    --region=REGION \
    --service-account=SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com \
    --scopes=SCOPE

Replace the following:

  • CLUSTER_NAME: The cluster name, which must be unique within a project. The name must start with a lowercase letter, and can contain up to 51 lowercase letters, numbers, and hyphens. It cannot end with a hyphen. The name of a deleted cluster can be reused.
  • REGION: The region where the cluster will be located.
  • SERVICE_ACCOUNT_NAME: The service account name.
  • PROJECT_ID: The Google Cloud project ID of the project containing your VM service account. This is the ID of the project where your cluster will be created, or the ID of another project if you are creating a cluster with a custom VM service account in another project.
  • SCOPE: Access scope(s) for cluster VM instances (for example, https://www.googleapis.com/auth/cloud-platform).
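
The CLUSTER_NAME rules above can be checked locally before calling gcloud. The following is a minimal sketch, assuming the 51-character limit applies to the whole name (the wording above is ambiguous on that point). The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: validate a cluster name against the rules listed above.
# Assumption: "up to 51" is read as a total length limit of 51 characters.

is_valid_cluster_name() {
  local LC_ALL=C            # make [a-z] match ASCII lowercase letters only
  local name="$1"
  [ "${#name}" -le 51 ] || return 1
  case "$name" in
    '') return 1 ;;                # empty name
    [!a-z]*) return 1 ;;           # must start with a lowercase letter
    *-) return 1 ;;                # must not end with a hyphen
    *[!a-z0-9-]*) return 1 ;;      # only lowercase letters, digits, hyphens
  esac
  return 0
}
```

For example, `is_valid_cluster_name "my-cluster-1"` succeeds, while `is_valid_cluster_name "cluster-"` fails because the name ends with a hyphen. The check does not cover uniqueness within the project.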

REST API

When completing the GceClusterConfig as part of the clusters.create API request, set the following fields:
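
As an illustration, the request body might look like the output of the sketch below. It assumes the relevant GceClusterConfig fields are `serviceAccount` and `serviceAccountScopes`, nested under `config.gceClusterConfig` of the cluster resource; verify the field names against the API reference before relying on them. The function name and all argument values are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: print a clusters.create request body that sets a custom VM
# service account and access scope. Field names are assumed as described
# in the lead-in; the function name is illustrative.

# Arguments: PROJECT_ID CLUSTER_NAME SERVICE_ACCOUNT_EMAIL SCOPE
cluster_request_body() {
  local project_id="$1" cluster_name="$2" sa_email="$3" scope="$4"
  cat <<EOF
{
  "projectId": "${project_id}",
  "clusterName": "${cluster_name}",
  "config": {
    "gceClusterConfig": {
      "serviceAccount": "${sa_email}",
      "serviceAccountScopes": ["${scope}"]
    }
  }
}
EOF
}

# Possible usage with curl (PROJECT_ID and REGION are placeholders):
#   cluster_request_body PROJECT_ID CLUSTER_NAME SA_EMAIL SCOPE \
#     | curl -X POST \
#         -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#         -H "Content-Type: application/json" \
#         -d @- \
#         "https://dataproc.googleapis.com/v1/projects/PROJECT_ID/regions/REGION/clusters"
```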

Console

Setting a Dataproc VM service account in the Google Cloud console is not supported. You can set the cloud-platform access scope on cluster VMs when you create the cluster by clicking "Enables the cloud-platform scope for this cluster" in the Project access section of the Manage security panel on the Dataproc Create a cluster page in the Google Cloud console.

Create a cluster with a custom VM service account from another project

When you create a cluster, you can specify a custom VM service account that your cluster will use for Dataproc data plane operations instead of using the default VM service account. You can't specify a custom VM service account after the cluster is created. Using a custom VM service account with assigned IAM roles allows you to provide your cluster with fine-grained access to project resources.

Objective: This section shows you how to specify a custom VM service account for your cluster from a project that is different from the project where your cluster will be created. This page refers to that project as the "service account project" to distinguish it from the project where your cluster will be created (the "cluster project"). See Create a cluster with a custom VM service account to use a custom VM service account from the project where the cluster will be created.

Preliminary steps

  1. In the service account project (the project where the custom VM service account is located):

    1. Enable service accounts to be attached across projects.

    2. Enable the Dataproc API.

      Roles required to enable APIs

      To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

  2. Grant to your email account (the user who is creating the cluster) the Service Account User role on either the service account project or, for more granular control, the custom VM service account in the service account project.

    For more information: See Manage access to projects, folders, and organizations to grant roles at the project level and Manage access to service accounts to grant roles at the service account level.

    gcloud CLI examples:

    • The following sample command grants to the user the Service Account User role at the project level:

        gcloud projects add-iam-policy-binding SERVICE_ACCOUNT_PROJECT_ID \
            --member=USER_EMAIL \
            --role="roles/iam.serviceAccountUser"

    Notes: USER_EMAIL: Provide your user account email address, in the format: user:user-name@example.com.

    • The following sample command grants to the user the Service Account User role at the service account level:

        gcloud iam service-accounts add-iam-policy-binding VM_SERVICE_ACCOUNT_EMAIL \
            --member=USER_EMAIL \
            --role="roles/iam.serviceAccountUser"

    Notes: USER_EMAIL: Provide your user account email address, in the format: user:user-name@example.com.

  3. Grant the custom VM service account the Dataproc Worker role on the cluster project.

    gcloud CLI example:

        gcloud projects add-iam-policy-binding CLUSTER_PROJECT_ID \
            --member=serviceAccount:SERVICE_ACCOUNT_NAME@SERVICE_ACCOUNT_PROJECT_ID.iam.gserviceaccount.com \
            --role="roles/dataproc.worker"

  4. Grant the Dataproc Service Agent service account in the cluster project the Service Account User and the Service Account Token Creator roles on either the service account project or, for more granular control, the custom VM service account in the service account project. By doing this, you allow the Dataproc service agent service account in the cluster project to create tokens for the custom Dataproc VM service account in the service account project.

    For more information: See Manage access to projects, folders, and organizations to grant roles at the project level and Manage access to service accounts to grant roles at the service account level.

    gcloud CLI examples:

    • The following sample commands grant the Dataproc Service Agent service account in the cluster project the Service Account User and Service Account Token Creator roles at the project level:

        gcloud projects add-iam-policy-binding SERVICE_ACCOUNT_PROJECT_ID \
            --member=serviceAccount:service-CLUSTER_PROJECT_NUMBER@dataproc-accounts.iam.gserviceaccount.com \
            --role="roles/iam.serviceAccountUser"

        gcloud projects add-iam-policy-binding SERVICE_ACCOUNT_PROJECT_ID \
            --member=serviceAccount:service-CLUSTER_PROJECT_NUMBER@dataproc-accounts.iam.gserviceaccount.com \
            --role="roles/iam.serviceAccountTokenCreator"

    • The following sample commands grant the Dataproc Service Agent service account in the cluster project the Service Account User and Service Account Token Creator roles at the VM service account level:

        gcloud iam service-accounts add-iam-policy-binding VM_SERVICE_ACCOUNT_EMAIL \
            --member=serviceAccount:service-CLUSTER_PROJECT_NUMBER@dataproc-accounts.iam.gserviceaccount.com \
            --role="roles/iam.serviceAccountUser"

        gcloud iam service-accounts add-iam-policy-binding VM_SERVICE_ACCOUNT_EMAIL \
            --member=serviceAccount:service-CLUSTER_PROJECT_NUMBER@dataproc-accounts.iam.gserviceaccount.com \
            --role="roles/iam.serviceAccountTokenCreator"

    Note: In clusters created with Dataproc image versions prior to 2.0.58, 2.1.4, or 2.2.1 (released in 2023), you must also grant the Google APIs Service Agent service account (PROJECT_NUMBER@cloudservices.gserviceaccount.com) in the cluster project the Service Account Token Creator role on either the service account project or the custom VM service account in the service account project.

  5. Grant the Compute Engine Service Agent service account in the cluster project the Service Account Token Creator role on either the service account project or, for more granular control, the custom VM service account in the service account project. By doing this, you grant the Compute Engine Service Agent service account in the cluster project the ability to create tokens for the custom Dataproc VM service account in the service account project.

    For more information: See Manage access to projects, folders, and organizations to grant roles at the project level and Manage access to service accounts to grant roles at the service account level.

    gcloud CLI examples:

    • The following sample command grants the Compute Engine Service Agent service account in the cluster project the Service Account Token Creator role at the project level:

        gcloud projects add-iam-policy-binding SERVICE_ACCOUNT_PROJECT_ID \
            --member=serviceAccount:service-CLUSTER_PROJECT_NUMBER@compute-system.iam.gserviceaccount.com \
            --role="roles/iam.serviceAccountTokenCreator"

    • The following sample command grants the Compute Engine Service Agent service account in the cluster project the Service Account Token Creator role at the VM service account level:

        gcloud iam service-accounts add-iam-policy-binding VM_SERVICE_ACCOUNT_EMAIL \
            --member=serviceAccount:service-CLUSTER_PROJECT_NUMBER@compute-system.iam.gserviceaccount.com \
            --role="roles/iam.serviceAccountTokenCreator"
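
Steps 3 through 5 can be collected into one script. The following is a minimal sketch that uses the more granular service-account-level grants and derives both service agent emails from the cluster project number. The function names, the `run` helper, and the `DRY_RUN` variable are hypothetical. Commands are printed for review and only executed when `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Sketch: apply the role grants from steps 3 to 5 at the service account level.
# Helper names and the DRY_RUN switch are illustrative only.

# Print the command (default) or execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Arguments: CLUSTER_PROJECT_ID CLUSTER_PROJECT_NUMBER VM_SERVICE_ACCOUNT_EMAIL
grant_cross_project_roles() {
  local cluster_project_id="$1" cluster_project_number="$2" vm_sa_email="$3"
  local dataproc_agent="service-${cluster_project_number}@dataproc-accounts.iam.gserviceaccount.com"
  local compute_agent="service-${cluster_project_number}@compute-system.iam.gserviceaccount.com"
  local role

  # Step 3: Dataproc Worker role for the VM service account on the cluster project.
  run gcloud projects add-iam-policy-binding "$cluster_project_id" \
    --member="serviceAccount:${vm_sa_email}" \
    --role="roles/dataproc.worker"

  # Step 4: the Dataproc Service Agent may act as, and mint tokens for, the VM service account.
  for role in roles/iam.serviceAccountUser roles/iam.serviceAccountTokenCreator; do
    run gcloud iam service-accounts add-iam-policy-binding "$vm_sa_email" \
      --member="serviceAccount:${dataproc_agent}" \
      --role="$role"
  done

  # Step 5: the Compute Engine Service Agent may mint tokens for the VM service account.
  run gcloud iam service-accounts add-iam-policy-binding "$vm_sa_email" \
    --member="serviceAccount:${compute_agent}" \
    --role="roles/iam.serviceAccountTokenCreator"
}
```

The cluster project number can be looked up with `gcloud projects describe CLUSTER_PROJECT_ID --format="value(projectNumber)"`. The sketch does not cover steps 1 and 2, or the extra grant needed for older image versions.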

Create the cluster
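
A minimal sketch, assuming the `gcloud dataproc clusters create` flags shown earlier on this page apply unchanged when the VM service account lives in the service account project. The `--project` flag selects the cluster project, and the service account email carries the service account project ID. The function name, the `run` helper, and the `DRY_RUN` variable are hypothetical, and the cloud-platform scope is used only because it is the example scope given earlier.

```shell
#!/usr/bin/env bash
# Sketch: create a cluster in the cluster project with a VM service account
# from another project. Helper names are illustrative only.

# Print the command (default) or execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Arguments: CLUSTER_PROJECT_ID CLUSTER_NAME REGION VM_SERVICE_ACCOUNT_EMAIL
create_cluster_with_cross_project_sa() {
  local cluster_project_id="$1" cluster_name="$2" region="$3" vm_sa_email="$4"
  run gcloud dataproc clusters create "$cluster_name" \
    --project="$cluster_project_id" \
    --region="$region" \
    --service-account="$vm_sa_email" \
    --scopes="https://www.googleapis.com/auth/cloud-platform"
}
```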

Last updated 2025-12-17 UTC.