Troubleshoot service accounts in GKE

Misconfigured or missing permissions for Google Kubernetes Engine (GKE) service accounts can lead to various issues, such as nodes failing to register or workloads being unable to access Google Cloud services.

Use this document to mitigate issues caused by misconfigured, disabled, or deleted service accounts.

This information is important for Platform admins and operators and Security engineers who configure and manage project-level IAM permissions for GKE nodes and core GKE components. For more information about the common roles and example tasks that we reference in Google Cloud content, see Common GKE user roles and tasks.

Grant the required role for GKE to node service accounts

For GKE clusters using Kubernetes version 1.33 or earlier, the IAM service accounts that your GKE nodes use must have all of the permissions that are included in the Kubernetes Engine Default Node Service Account (roles/container.defaultNodeServiceAccount) IAM role. If a GKE node service account is missing one or more of these permissions, GKE can't perform system tasks like the following:

Node service accounts might not have certain required permissions for reasons like the following:

If your node service account is missing the permissions that GKE requires, you might see errors and notices like the following:

  • In the Google Cloud console, on the Kubernetes clusters page, a Grant critical permissions error message appears in the Notifications column for a specific cluster.
  • In the Google Cloud console, on the cluster details page for a specific cluster, the following error message appears:

    Grant roles/container.defaultNodeServiceAccount role to Node service account to allow for non-degraded operations.
  • In Cloud Audit Logs, Admin Activity logs for Google Cloud APIs like monitoring.googleapis.com have the following values if the corresponding permissions to access those APIs are missing from the node service account:

    • Severity: ERROR
    • Message: Permission denied (or the resource may not exist)
  • Logs for specific nodes are missing from Cloud Logging and the Pod logs for the logging agent on those nodes show 401 errors. To check these Pod logs for 401 errors, run the following command:

    [[ $(kubectl logs -l k8s-app=fluentbit-gke -n kube-system -c fluentbit-gke | grep -cw "Received 401") -gt 0 ]] && echo "true" || echo "false"

    If the output is true, then the system workload is experiencing 401 errors, which indicate a lack of permissions.
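The 401 check above can also be wrapped in a small reusable function. The following is a minimal sketch, not part of the official tooling: the function name has_401_errors is hypothetical, and it only inspects whatever log text is piped into it.

```shell
# has_401_errors (hypothetical helper): reads log lines on stdin and prints
# "true" if at least one line contains the whole-word phrase "Received 401",
# otherwise prints "false".
has_401_errors() {
  local count
  # grep -c prints the number of matching lines. It exits non-zero when the
  # count is 0, so "|| true" keeps the helper usable under "set -e".
  count=$(grep -cw "Received 401" || true)
  if [ "$count" -gt 0 ]; then
    echo "true"
  else
    echo "false"
  fi
}

# Example usage (requires kubectl access to the cluster):
#   kubectl logs -l k8s-app=fluentbit-gke -n kube-system -c fluentbit-gke | has_401_errors
```

Separating the log retrieval from the check makes it possible to run the same test against saved log files.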

To resolve this issue, grant the Kubernetes Engine Default Node Service Account (roles/container.defaultNodeServiceAccount) role on the project to the service account that's causing the errors. Select one of the following options:

Console

To find the name of the service account that your nodes use, do the following:

  1. Go to the Kubernetes clusters page:

    Go to Kubernetes clusters

  2. In the cluster list, click the name of the cluster that you want to inspect.

  3. Find the name of the node service account. You need this name later.

    • For Autopilot mode clusters, in the Security section, find the Service account field.
    • For Standard mode clusters, do the following:
      1. Click the Nodes tab.
      2. In the Node pools table, click a node pool name. The Node pool details page opens.
      3. In the Security section, find the Service account field.

    If the value in the Service account field is default, your nodes use the Compute Engine default service account. If the value in this field is not default, your nodes use a custom service account.

To grant the Kubernetes Engine Default Node Service Account role to the service account, do the following:

  1. Go to the Welcome page:

    Go to Welcome

  2. In the Project number field, click Copy to clipboard.

  3. Go to the IAM page:

    Go to IAM

  4. Click Grant access.

  5. In the New principals field, specify the name of your node service account. If your nodes use the default Compute Engine service account, specify the following value:

    PROJECT_NUMBER-compute@developer.gserviceaccount.com

    Replace PROJECT_NUMBER with the project number that you copied.

  6. In the Select a role menu, select the Kubernetes Engine Default Node Service Account role.

  7. Click Save.

To verify that the role was granted, do the following:

  1. On the IAM page, click the View by roles tab.
  2. Expand the Kubernetes Engine Default Node Service Account section. A list of principals that have this role is displayed.
  3. Find your node service account in the list of principals.

gcloud

  1. Find the name of the service account that your nodes use:

    • For Autopilot mode clusters, run the following command:
    gcloud container clusters describe CLUSTER_NAME \
        --location=LOCATION \
        --flatten=autoscaling.autoprovisioningNodePoolDefaults.serviceAccount
    • For Standard mode clusters, run the following command:
    gcloud container clusters describe CLUSTER_NAME \
        --location=LOCATION \
        --format="table(nodePools.name,nodePools.config.serviceAccount)"

    If the output is default, your nodes use the Compute Engine default service account. If the output is not default, your nodes use a custom service account.

  2. Find your Google Cloud project number:

    gcloud projects describe PROJECT_ID \
        --format="value(projectNumber)"

    Replace PROJECT_ID with your project ID.

    The output is similar to the following:

    12345678901
  3. Grant the roles/container.defaultNodeServiceAccount role to the service account:

    gcloud projects add-iam-policy-binding PROJECT_ID \
        --member="SERVICE_ACCOUNT_NAME" \
        --role="roles/container.defaultNodeServiceAccount"

    Replace SERVICE_ACCOUNT_NAME with the name of the service account, which you found in step 1. If your nodes use the Compute Engine default service account, specify the following value:

    serviceAccount:PROJECT_NUMBER-compute@developer.gserviceaccount.com

    Replace PROJECT_NUMBER with the project number from the previous step.

  4. Verify that the role was granted successfully:

    gcloud projects get-iam-policy PROJECT_ID \
        --flatten="bindings[].members" --filter=bindings.role:roles/container.defaultNodeServiceAccount \
        --format='value(bindings.members)'

    The output is the name of your service account.
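The mapping from the service account value reported for a cluster to the value passed to --member can be sketched as a small helper. This is an illustrative sketch only: the function name node_sa_member is hypothetical, and it assumes the first argument is either the literal string default or the email address of a custom service account.

```shell
# node_sa_member (hypothetical helper): prints the IAM member string for a
# node service account.
#   $1: service account value reported for the cluster or node pool
#       ("default" or a custom service account email address)
#   $2: project number, used only when $1 is "default"
node_sa_member() {
  local sa project_number
  sa="$1"
  project_number="$2"
  if [ "$sa" = "default" ]; then
    # "default" means the Compute Engine default service account.
    sa="${project_number}-compute@developer.gserviceaccount.com"
  fi
  printf 'serviceAccount:%s\n' "$sa"
}

# Example usage with the grant command from step 3:
#   gcloud projects add-iam-policy-binding PROJECT_ID \
#       --member="$(node_sa_member default 12345678901)" \
#       --role="roles/container.defaultNodeServiceAccount"
```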

Identify clusters that have node service accounts with missing permissions

Use GKE recommendations of the NODE_SA_MISSING_PERMISSIONS recommender subtype to identify Autopilot and Standard clusters that have node service accounts with missing permissions. Recommender identifies only clusters that were created on or after January 1, 2024. To find and fix the missing permissions by using Recommender, do the following:

  1. Find active recommendations in your project for the NODE_SA_MISSING_PERMISSIONS recommender subtype:

    gcloud recommender recommendations list \
        --recommender=google.container.DiagnosisRecommender \
        --location LOCATION \
        --project PROJECT_ID \
        --format yaml \
        --filter="recommenderSubtype:NODE_SA_MISSING_PERMISSIONS"

    Replace the following:

    • LOCATION: the location to find recommendations in.
    • PROJECT_ID: your Google Cloud project ID.

    The output is similar to the following, which indicates that a cluster has a node service account with missing permissions:

    associatedInsights:
    # lines omitted for clarity
    recommenderSubtype: NODE_SA_MISSING_PERMISSIONS
    stateInfo:
      state: ACTIVE
    targetResources:
    - //container.googleapis.com/projects/12345678901/locations/us-central1/clusters/cluster-1

    It might take up to 24 hours for the recommendation to appear. For detailed instructions, see view insights and recommendations.

  2. For every cluster that's in the output of the previous step, find the associated node service accounts and grant the required role to those service accounts. For details, see the instructions in the Grant the required role for GKE to node service accounts section.

    After you grant the required role to the identified node service accounts, the recommendation might persist for up to 24 hours unless you manually dismiss it.
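When the recommendations list is long, the affected clusters can be pulled out of the YAML output with a short filter. The following sketch assumes that each target resource appears on its own line in the form shown in the sample output above (//container.googleapis.com/projects/.../locations/.../clusters/...). The function name clusters_from_recommendations is hypothetical.

```shell
# clusters_from_recommendations (hypothetical helper): reads the YAML output
# of the recommendations list command on stdin and prints one line per
# target cluster in the form "CLUSTER_NAME LOCATION".
clusters_from_recommendations() {
  # Splitting a target resource line on "/" gives:
  #   $7 = location, $9 = cluster name
  awk -F'/' '/^ *- \/\/container\.googleapis\.com\// { print $9, $7 }'
}

# Example usage:
#   gcloud recommender recommendations list \
#       --recommender=google.container.DiagnosisRecommender \
#       --location LOCATION \
#       --project PROJECT_ID \
#       --format yaml \
#       --filter="recommenderSubtype:NODE_SA_MISSING_PERMISSIONS" \
#     | clusters_from_recommendations
```

Each output line supplies the cluster name and location needed to look up the node service account for that cluster.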

Identify all node service accounts with missing permissions

You can run a script that searches node pools in your project's Standard and Autopilot clusters for any node service accounts that don't have the required permissions for GKE. This script uses the gcloud CLI and the jq utility. The script is as follows:

#!/bin/bash

# Set your project ID
project_id=PROJECT_ID
project_number=$(gcloud projects describe "$project_id" --format="value(projectNumber)")
declare -a all_service_accounts
declare -a sa_missing_permissions

# Function to check if a service account has a specific permission
# $1: project_id
# $2: service_account
# $3: permission
service_account_has_permission() {
  local project_id="$1"
  local service_account="$2"
  local permission="$3"

  local roles=$(gcloud projects get-iam-policy "$project_id" \
          --flatten="bindings[].members" \
          --format="table[no-heading](bindings.role)" \
          --filter="bindings.members:\"$service_account\"")

  for role in $roles; do
    if role_has_permission "$role" "$permission"; then
      echo "Yes" # Has permission
      return
    fi
  done

  echo "No" # Does not have permission
}

# Function to check if a role has the specific permission
# $1: role
# $2: permission
role_has_permission() {
  local role="$1"
  local permission="$2"
  gcloud iam roles describe "$role" --format="json" | \
  jq -r ".includedPermissions" | \
  grep -q "$permission"
}

# Function to add $1 into the service account array all_service_accounts
# $1: service account
add_service_account() {
  local service_account="$1"
  all_service_accounts+=( ${service_account} )
}

# Function to add service accounts into the global array all_service_accounts for a Standard GKE cluster
# $1: project_id
# $2: location
# $3: cluster_name
add_service_accounts_for_standard() {
  local project_id="$1"
  local cluster_location="$2"
  local cluster_name="$3"

  while read nodepool; do
    nodepool_name=$(echo "$nodepool" | awk '{print $1}')
    if [[ "$nodepool_name" == "" ]]; then
      # skip the empty line which is from running `gcloud container node-pools list` in GCP console
      continue
    fi
    while read nodepool_details; do
      service_account=$(echo "$nodepool_details" | awk '{print $1}')

      if [[ "$service_account" == "default" ]]; then
        service_account="${project_number}-compute@developer.gserviceaccount.com"
      fi
      if [[ -n "$service_account" ]]; then
        printf "%-60s| %-40s| %-40s| %-10s| %-20s\n" $service_account $project_id $cluster_name $cluster_location $nodepool_name
        add_service_account "${service_account}"
      else
        echo "cannot find service account for node pool $project_id\t$cluster_name\t$cluster_location\t$nodepool_details"
      fi
    done <<< "$(gcloud container node-pools describe "$nodepool_name" --cluster "$cluster_name" --zone "$cluster_location" --project "$project_id" --format="table[no-heading](config.serviceAccount)")"
  done <<< "$(gcloud container node-pools list --cluster "$cluster_name" --zone "$cluster_location" --project "$project_id" --format="table[no-heading](name)")"
}

# Function to add service accounts into the global array all_service_accounts for an Autopilot GKE cluster
# Autopilot cluster only has one node service account.
# $1: project_id
# $2: location
# $3: cluster_name
add_service_account_for_autopilot() {
  local project_id="$1"
  local cluster_location="$2"
  local cluster_name="$3"

  while read service_account; do
    if [[ "$service_account" == "default" ]]; then
      service_account="${project_number}-compute@developer.gserviceaccount.com"
    fi
    if [[ -n "$service_account" ]]; then
      printf "%-60s| %-40s| %-40s| %-10s| %-20s\n" $service_account $project_id $cluster_name $cluster_location $nodepool_name
      add_service_account "${service_account}"
    else
      echo "cannot find service account" for cluster "$project_id\t$cluster_name\t$cluster_location\t"
    fi
  done <<< "$(gcloud container clusters describe "$cluster_name" --location "$cluster_location" --project "$project_id" --format="table[no-heading](autoscaling.autoprovisioningNodePoolDefaults.serviceAccount)")"
}

# Function to check whether the cluster is an Autopilot cluster or not
# $1: project_id
# $2: location
# $3: cluster_name
is_autopilot_cluster() {
  local project_id="$1"
  local cluster_location="$2"
  local cluster_name="$3"
  autopilot=$(gcloud container clusters describe "$cluster_name" --location "$cluster_location" --format="table[no-heading](autopilot.enabled)")
  echo "$autopilot"
}

echo "--- 1. List all service accounts in all GKE node pools"
printf "%-60s| %-40s| %-40s| %-10s| %-20s\n" "service_account" "project_id" "cluster_name" "cluster_location" "nodepool_name"
while read cluster; do
  cluster_name=$(echo "$cluster" | awk '{print $1}')
  cluster_location=$(echo "$cluster" | awk '{print $2}')
  # how to find a cluster is a Standard cluster or an Autopilot cluster
  autopilot=$(is_autopilot_cluster "$project_id" "$cluster_location" "$cluster_name")
  if [[ "$autopilot" == "True" ]]; then
    add_service_account_for_autopilot "$project_id" "$cluster_location" "$cluster_name"
  else
    add_service_accounts_for_standard "$project_id" "$cluster_location" "$cluster_name"
  fi
done <<< "$(gcloud container clusters list --project "$project_id" --format="value(name,location)")"

echo "--- 2. Check if service accounts have permissions"
unique_service_accounts=($(echo "${all_service_accounts[@]}" | tr ' ' '\n' | sort -u | tr '\n' ' '))

echo "Service accounts: ${unique_service_accounts[@]}"
printf "%-60s| %-40s| %-40s| %-20s\n" "service_account" "has_logging_permission" "has_monitoring_permission" "has_performance_hpa_metric_write_permission"
for sa in "${unique_service_accounts[@]}"; do
  logging_permission=$(service_account_has_permission "$project_id" "$sa" "logging.logEntries.create")
  time_series_create_permission=$(service_account_has_permission "$project_id" "$sa" "monitoring.timeSeries.create")
  metric_descriptors_create_permission=$(service_account_has_permission "$project_id" "$sa" "monitoring.metricDescriptors.create")
  if [[ "$time_series_create_permission" == "No" || "$metric_descriptors_create_permission" == "No" ]]; then
    monitoring_permission="No"
  else
    monitoring_permission="Yes"
  fi
  performance_hpa_metric_write_permission=$(service_account_has_permission "$project_id" "$sa" "autoscaling.sites.writeMetrics")
  printf "%-60s| %-40s| %-40s| %-20s\n" $sa $logging_permission $monitoring_permission $performance_hpa_metric_write_permission

  if [[ "$logging_permission" == "No" || "$monitoring_permission" == "No" || "$performance_hpa_metric_write_permission" == "No" ]]; then
    sa_missing_permissions+=( ${sa} )
  fi
done

echo "--- 3. List all service accounts that don't have the above permissions"
if [[ "${#sa_missing_permissions[@]}" -gt 0 ]]; then
  printf "Grant roles/container.defaultNodeServiceAccount to the following service accounts: %s\n" "${sa_missing_permissions[@]}"
else
  echo "All service accounts have the above permissions"
fi

This script applies to all of the GKE clusters in your project.

After you identify the names of the service accounts with missing permissions, grant them the required role. For details, see the instructions in the Grant the required role for GKE to node service accounts section.
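The permission comparison at the core of the script can be illustrated in isolation. The following sketch is not the script's own implementation: it checks the same four permissions that the script tests, but it takes a plain list of granted permissions on stdin and uses exact whole-line matching instead of the script's pattern match. The function name missing_required_permissions is hypothetical.

```shell
# missing_required_permissions (hypothetical helper): reads granted
# permissions on stdin, one per line, and prints each permission checked by
# the script above that is absent from the input.
missing_required_permissions() {
  local granted perm
  granted=$(cat)
  for perm in \
    logging.logEntries.create \
    monitoring.timeSeries.create \
    monitoring.metricDescriptors.create \
    autoscaling.sites.writeMetrics
  do
    # -F: fixed string, -x: match the whole line, -q: no output.
    if ! printf '%s\n' "$granted" | grep -Fxq -- "$perm"; then
      echo "$perm"
    fi
  done
}

# Example usage (requires the gcloud CLI and jq), listing what a role lacks:
#   gcloud iam roles describe ROLE --format="json" \
#     | jq -r '.includedPermissions[]' \
#     | missing_required_permissions
```

Empty output means that none of the four checked permissions are missing from the input.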

Restore the default service account to your Google Cloud project

GKE's default service account, container-engine-robot, can accidentally become unbound from a project. The Kubernetes Engine Service Agent role (roles/container.serviceAgent) is an Identity and Access Management (IAM) role that grants the service account the permissions to manage cluster resources. If you remove this role binding from the service account, the default service account becomes unbound from the project, which can prevent you from deploying applications and performing other cluster operations.

To see if the service account is removed from your project, you can use the Google Cloud console or Google Cloud CLI.

Console

gcloud

  • Run the following command:

    gcloud projects get-iam-policy PROJECT_ID

    Replace PROJECT_ID with your project ID.

If the dashboard or the command doesn't display container-engine-robot among your service accounts, the role is unbound.

To restore the Kubernetes Engine Service Agent role (roles/container.serviceAgent) binding, run the following commands:

PROJECT_NUMBER=$(gcloud projects describe "PROJECT_ID" \
    --format 'get(projectNumber)')
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member "serviceAccount:service-${PROJECT_NUMBER}@container-engine-robot.iam.gserviceaccount.com" \
    --role roles/container.serviceAgent

Confirm that the role binding is restored:

gcloud projects get-iam-policy PROJECT_ID

If you see the service account name along with the container.serviceAgent role, the role binding is restored. For example:

- members:
  - serviceAccount:service-1234567890@container-engine-robot.iam.gserviceaccount.com
  role: roles/container.serviceAgent
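Scanning the full policy by eye can be error-prone in large projects. As a rough sketch, the check can be narrowed to a yes-or-no answer by listing only the members bound to roles/container.serviceAgent (using the same --flatten, --filter, and --format pattern shown earlier in this document) and testing for the service agent. The function names service_agent_member and has_service_agent_binding are hypothetical.

```shell
# service_agent_member (hypothetical helper): prints the IAM member string
# of the GKE service agent for the given project number.
service_agent_member() {
  printf 'serviceAccount:service-%s@container-engine-robot.iam.gserviceaccount.com\n' "$1"
}

# has_service_agent_binding (hypothetical helper): reads IAM members on
# stdin, one per line, and succeeds only if the GKE service agent for
# project number $1 is among them.
has_service_agent_binding() {
  grep -Fxq -- "$(service_agent_member "$1")"
}

# Example usage (requires the gcloud CLI):
#   gcloud projects get-iam-policy PROJECT_ID \
#       --flatten="bindings[].members" \
#       --filter="bindings.role:roles/container.serviceAgent" \
#       --format='value(bindings.members)' \
#     | has_service_agent_binding PROJECT_NUMBER \
#     && echo "binding present" || echo "binding missing"
```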

Enable the Compute Engine default service account

The service account used for the node pool is usually the Compute Engine default service account. If this default service account is deactivated, your nodes might fail to register with the cluster.

To see if the service account is deactivated in your project, you can use the Google Cloud console or gcloud CLI.

Console

gcloud

  • Run the following command:
    gcloud iam service-accounts list --filter="NAME~'compute' AND disabled=true"

If the service account is deactivated, run the following commands to enable the service account:

  1. Find your Google Cloud project number:

    gcloud projects describe PROJECT_ID \
        --format="value(projectNumber)"

    Replace PROJECT_ID with your project ID.

    The output is similar to the following:

    12345678901
  2. Enable the service account:

    gcloud iam service-accounts enable PROJECT_NUMBER-compute@developer.gserviceaccount.com

    Replace PROJECT_NUMBER with your project number from the output of the preceding step.

For more information, see Troubleshoot node registration.
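A common mistake in the enable step is passing the project ID where the project number is expected. The following sketch wraps the enable command with a numeric check. The function name enable_default_compute_sa and the GCLOUD environment variable are hypothetical: GCLOUD is only an override that lets you substitute echo for a dry run, and it isn't a gcloud CLI feature.

```shell
# enable_default_compute_sa (hypothetical helper): enables the Compute
# Engine default service account for the given project number.
#   $1: project number (digits only)
# Set the hypothetical GCLOUD variable to "echo" to print the command
# instead of running it.
enable_default_compute_sa() {
  local project_number
  project_number="$1"
  case "$project_number" in
    ''|*[!0-9]*)
      echo "expected a numeric project number, got: '$project_number'" >&2
      return 1
      ;;
  esac
  ${GCLOUD:-gcloud} iam service-accounts enable \
    "${project_number}-compute@developer.gserviceaccount.com"
}

# Example dry run:
#   GCLOUD=echo
#   enable_default_compute_sa 12345678901
```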

Error 400/403: Missing edit permissions on account

If your service account is deleted, you might see a missing edit permissions error. To learn how to troubleshoot this error, see Error 400/403: Missing edit permissions on account.



Last updated 2025-12-15 UTC.