Troubleshoot policy and access problems
This document provides an overview of Google Cloud access policy enforcement controls and the tools that are available to help troubleshoot access problems. This document is for support teams who want to help customers in their organization resolve issues related to accessing their Google Cloud resources.
Google Cloud access policy enforcement controls
This section describes the policies that you or your organization administrator can implement that affect access to your Google Cloud resources. You implement access policies by using all or some of the following products and tools.
Labels, tags, and network tags
Google Cloud offers several ways to label and group resources. You can use labels, tags, and network tags to help enforce policies.
Labels are key-value pairs that help you organize your Google Cloud resources. Many Google Cloud services support labels. You can also use labels to filter and group resources for other use cases, for example, to identify all the resources that are in a test environment as opposed to resources that are in production. In the context of policy enforcement, labels can identify where resources should be located. For example, the access policies that you apply to resources that are labeled as test are different from the access policies that you apply to resources that are labeled as production resources.
Tags are key-value pairs that provide a mechanism for identifying resources and applying policy. You can attach tags to an organization, folder, or project. A tag applies to all resources at the hierarchy level that the tag is applied to. You can use tags to conditionally allow or deny access policies based on whether a resource has a specific tag. You can also use tags with firewall policies to control traffic in a Virtual Private Cloud (VPC) network. Understanding how tags are inherited and combined with access and firewall policies is important in troubleshooting.
Network tags are different from the preceding Resource Manager tags. Network tags apply to VM instances, and they are another way that you can control network traffic to and from a VM. On Google Cloud networks, network tags identify which VMs are subject to firewall rules and network routes. You can use network tags as source and destination values in firewall rules. You can also use network tags to identify which VMs a certain route applies to. Understanding network tags can help you to troubleshoot access problems because network tags are used to define network and routing rules.
VPC firewall rules
You can configure VPC firewall rules to allow or deny traffic to and from your virtual machine (VM) instances and products built on VMs. Every VPC network functions as a distributed firewall. Although VPC firewall rules are defined at the network level, connections are allowed or denied on a per-instance basis. You can apply VPC firewall rules to the VPC network, VMs grouped by tags, and VMs grouped by service accounts.
VPC Service Controls
VPC Service Controls provides a perimeter security solution that helps mitigate data exfiltration from Google Cloud services such as Cloud Storage and BigQuery. You create a service perimeter that creates a security boundary around Google Cloud resources, and you can manage what is allowed in and out of the perimeter. VPC Service Controls also provides context-aware access controls by implementing policies based on contextual attributes such as IP address and identity.
Resource Manager
You use Resource Manager to set up an organization resource. Resource Manager provides tools that let you map your organization and the way you develop applications to a resource hierarchy. Along with helping you to group resources logically, Resource Manager provides attach points and inheritance for access control and organization policies.
Identity and Access Management
Identity and Access Management (IAM) lets you define who (identity) has what access (role) for which resource. An IAM policy is a collection of statements that defines who has what type of access, such as read or write access. The IAM policy is attached to a resource, and the policy enforces access control whenever a user attempts to access the resource.
A feature of IAM is IAM Conditions. When you implement IAM Conditions as part of your policy definition, you can choose to grant resource access to identities (principals) only if configured conditions are met. For example, you can use IAM Conditions to limit access to resources to employees who make requests from your corporate office.
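To make the who, what, and which model concrete, the following minimal sketch represents a policy as a Python dictionary in the general shape of the IAM policy JSON format: a list of bindings that each pair one role with a list of members and, optionally, a condition. The members, the condition title, and the helper `roles_for` are hypothetical and exist only for illustration. The sketch inspects the policy structure; it doesn't evaluate the condition expression.

```python
# Hypothetical policy in the general shape of the IAM policy JSON format.
policy = {
    "version": 3,
    "bindings": [
        {
            "role": "roles/storage.objectViewer",
            "members": ["user:alice@example.com", "group:analysts@example.com"],
        },
        {
            "role": "roles/storage.objectAdmin",
            "members": ["user:alice@example.com"],
            # Conditional binding: the grant applies only while the
            # expression evaluates to true.
            "condition": {
                "title": "expires-end-of-2030",
                "expression": 'request.time < timestamp("2031-01-01T00:00:00Z")',
            },
        },
    ],
}


def roles_for(policy: dict, member: str) -> list:
    """Return (role, is_conditional) pairs for bindings that name the member."""
    results = []
    for binding in policy.get("bindings", []):
        if member in binding.get("members", []):
            results.append((binding["role"], "condition" in binding))
    return results
```

The helper matches members literally, so a user who receives a role only through group membership doesn't appear in the result. When you troubleshoot, a conditional binding is worth a second look because the role is listed in the policy even when the condition currently evaluates to false.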
Organization Policy Service
The Organization Policy Service lets you enforce constraints on supported resources across your organization hierarchy. Each resource that the Organization Policy Service supports has a set of constraints that describes the ways that the resource can be restricted. You define a policy that specifies rules that restrict resource configuration.
The Organization Policy Service lets you as an authorized administrator override the default organization policies at the folder or project level as required. Organization policies focus on how you configure resources, while IAM policies focus on what permissions your identities have been granted to those resources.
Quotas
Google Cloud enforces quotas on resources, which set a limit on how much of a particular Google Cloud resource your project can use. The number of projects that you have is also subject to a quota. The following types of resource usage are limited by quotas:
- Rate quota, such as API requests per day. This quota resets after a specified time, such as a minute or a day.
- Allocation quota, such as the number of virtual machines or load balancers used by your project. This quota doesn't reset over time. An allocation quota must be explicitly released when you no longer want to use the resource, for example, by deleting a Google Kubernetes Engine (GKE) cluster.
If you reach an allocation quota limit, you can't start new resources. If you reach a rate quota, you can't complete API requests. Both of these issues can look like an access-related issue.
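Because a quota failure can look like an access failure, it can help to sort errors before you investigate policies. The following sketch is a rough triage helper, assuming that the failing API reports the canonical status names (`RESOURCE_EXHAUSTED`, `PERMISSION_DENIED`, `UNAUTHENTICATED`) or their usual HTTP equivalents (429, 403, 401). The function name is hypothetical, and some APIs report quota problems differently, so treat the result as a hint rather than a diagnosis.

```python
def classify_api_error(http_status: int, status: str = "") -> str:
    """Roughly sort a failed API call into quota, permission, or other.

    The status name, when present, takes precedence over the HTTP code.
    """
    if status == "RESOURCE_EXHAUSTED":
        return "quota"
    if status == "PERMISSION_DENIED":
        return "permission"
    if status == "UNAUTHENTICATED":
        return "authentication"
    # Fall back to the HTTP code when no status name is available.
    if http_status == 429:
        return "quota"
    if http_status == 403:
        return "permission"
    if http_status == 401:
        return "authentication"
    return "other"
```

For example, `classify_api_error(403, "RESOURCE_EXHAUSTED")` returns `"quota"`, which points you toward the quota page instead of the IAM policy.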
Chrome Enterprise Premium
Chrome Enterprise Premium uses various Google Cloud products to enforce granular access control based on a user's identity and context of the request. You can configure Chrome Enterprise Premium to restrict access to the Google Cloud console and to Google Cloud APIs.
Chrome Enterprise Premium access protection works by using the following Google Cloud services:
- Identity-Aware Proxy (IAP): A service that verifies user identity and uses context to determine whether a user should be granted access to a resource.
- IAM: The identity management and authorization service for Google Cloud.
- Access Context Manager: A rules engine that enables fine-grained access control.
- Endpoint Verification: A Google Chrome extension that collects user device details.
IAM Recommender
IAM includes Policy Intelligence tools that provide proactive guidance to help you be more efficient and secure when using Google Cloud. Recommended actions are provided to you through notifications in the console, which you can apply directly or by using an event sent to a Pub/Sub topic.
IAM Recommender is part of the Policy Intelligence suite, and you can use it to help apply the principle of least privilege. Recommender compares project-level role grants with the permissions that each principal used during the past 90 days. If you grant a project-level role to a principal, and the principal doesn't use all of that role's permissions, then Recommender might recommend that you revoke the role. If necessary, Recommender also recommends less permissive roles as a replacement.
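The comparison between granted and used permissions can be pictured with a short sketch. This is a deliberate simplification for illustration and not Recommender's actual logic: the function name, the three outcome strings, and the rule that picks between them are all assumptions.

```python
def suggest_action(role_permissions: set, used_permissions: set) -> str:
    """Compare a role's permissions with the ones a principal used.

    Simplified assumption: an entirely unused role is a candidate for
    revocation, and a partially used role is a candidate for replacement
    with a less permissive role.
    """
    unused = set(role_permissions) - set(used_permissions)
    if not unused:
        return "keep"
    if unused == set(role_permissions):
        return "revoke"
    return "replace with a less permissive role"
```

The "replace" outcome is the one most likely to surface later as an access problem: if the observation window missed a rarely used permission, the replacement role no longer contains it.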
If you automatically apply a recommendation, you can inadvertently cause a user or service account to be denied access to a resource. If you decide to use automations, use the IAM Recommender best practices to help you decide how much automation you are comfortable with.
Kubernetes namespaces and RBAC
Kubernetes is operated as a managed service on Google Cloud as Google Kubernetes Engine (GKE). GKE can enforce policies that are consistent no matter where your GKE cluster is running. The policies that affect access to resources are a combination of built-in Kubernetes controls and Google Cloud-specific controls.
In addition to VPC firewalls and VPC Service Controls, GKE uses namespaces, role-based access control (RBAC), and workload identities to manage policies that affect access to resources.
Namespaces
Namespaces are virtual clusters that are backed by the same physical cluster, and they provide a scope for names. Names of resources must be unique within a namespace, but you can use the same name in different namespaces. Namespaces let you use resource quotas to divide cluster resources between multiple users.
RBAC
RBAC includes the following features:
- Fine-grained control over how users access the API resources that are running on your cluster.
- Lets you create detailed policies that define which operations and resources you allow users and service accounts to access.
- Can control access for Google Accounts, Google Cloud service accounts, and Kubernetes service accounts.
- Lets you create RBAC permissions that apply to your entire cluster or to specific namespaces within your cluster.
  - Cluster-wide permissions are useful for limiting access to specific API resources for certain users. These API resources include security policies and secrets.
  - Namespace-specific permissions are useful if, for example, you have multiple groups of users who operate within their own respective namespaces. RBAC can help you ensure that users only have access to cluster resources within their own namespace.
- A role that can only be used to grant access to resources within a single namespace.
- A role that contains rules that represent a set of permissions. Permissions are purely additive, and there are no deny rules.
IAM and Kubernetes RBAC are integrated so that users are authorized to perform actions if they have sufficient permissions according to either tool.
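The additive model and the "either tool" behavior described above can be sketched as follows. This is a toy model under stated assumptions, not the Kubernetes authorizer: the `Rule` class and both functions are hypothetical, and the IAM decision is passed in as a plain boolean.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One RBAC-style rule: a set of resources and the verbs allowed on them."""
    resources: frozenset
    verbs: frozenset


def rbac_allows(rules, resource: str, verb: str) -> bool:
    """Rules are purely additive: any matching rule allows the request.

    There is no deny rule, so the absence of a match is the only way a
    request is refused. "*" is treated as a wildcard.
    """
    return any(
        (resource in rule.resources or "*" in rule.resources)
        and (verb in rule.verbs or "*" in rule.verbs)
        for rule in rules
    )


def is_authorized(iam_allows: bool, rules, resource: str, verb: str) -> bool:
    """A request succeeds if either IAM or RBAC grants it."""
    return iam_allows or rbac_allows(rules, resource, verb)
```

One consequence for troubleshooting: removing an RBAC binding doesn't block a user whose IAM role still grants the same action, so you need to check both systems before you conclude that access was revoked.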
Figure 1 shows how to use IAM with RBAC and namespaces to implement policies.
Figure 1 shows the following policy implementations:
- At the project level, IAM defines roles for cluster administrators to manage clusters and to let container developers access APIs within clusters.
- At the cluster level, RBAC defines permissions on individual clusters.
- At the namespace level, RBAC defines permissions on namespaces.
Workload identity
In addition to RBAC and IAM, you also need to understand the impact of workload identities. Workload Identity lets you configure a Kubernetes service account to act as a Google service account. Any application that runs as the Kubernetes service account automatically authenticates as the Google service account when accessing Google Cloud APIs. This authentication lets you assign fine-grained identity and authorization for applications in your cluster.
Workload Identity Federation for GKE relies on IAM permissions to control what Google Cloud APIs your GKE application can access. For example, if IAM permissions change, a GKE application might become unable to write to Cloud Storage.
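When a GKE application loses access, one thing to check is the IAM binding that links the Kubernetes service account to the Google service account. The following sketch builds the member string and the binding that you would expect to find on the Google service account, assuming the cluster uses the default workload identity pool named `PROJECT_ID.svc.id.goog`. The function names and the example values are hypothetical.

```python
def workload_identity_member(project_id: str, namespace: str, ksa_name: str) -> str:
    """Build the IAM member string for a Kubernetes service account.

    Assumes the default workload identity pool, PROJECT_ID.svc.id.goog.
    """
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"


def expected_binding(project_id: str, namespace: str, ksa_name: str) -> dict:
    """Binding that lets the Kubernetes service account act as the
    Google service account that the binding is attached to."""
    return {
        "role": "roles/iam.workloadIdentityUser",
        "members": [workload_identity_member(project_id, namespace, ksa_name)],
    }
```

A mismatch in any of the three parts (project, namespace, or Kubernetes service account name) is enough to break authentication, for example after a workload is moved to a different namespace.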
Troubleshooting tools
This section describes the tools that are available to help you troubleshoot your policies. You can use different products and features to apply a combination of policies. For example, you can use firewalls and subnets to manage communication between resources within your environment and within any security zones that you have defined. You can also use IAM to restrict who can access what within the security zone and any VPC Service Controls zones that you have defined.
Logs
When a problem occurs, the first place to start troubleshooting is typically the logs. The Google Cloud logs that provide insight into access-related issues are Cloud Audit Logs, Firewall Rules Logging, and VPC Flow Logs.
Cloud Audit Logs
Cloud Audit Logs consists of the following audit log streams for each project, folder, and organization: Admin Activity, Data Access, and System Event. Google Cloud services write audit log entries to these logs to help you identify which user performed an action within your Google Cloud projects, where they did it, and when.
- Admin Activity logs contain log entries for API calls or other administrative actions that modify the configuration or metadata of resources. Admin Activity logs are always enabled. For information about Admin Activity logs pricing and quotas, see the Cloud Audit Logs overview.
- Data Access logs record API calls that create, modify, or read user-provided data. Data Access audit logs are disabled by default, except for BigQuery. The Data Access logs can grow to be large, and can incur costs. For information about Data Access logs usage limits, see Quotas and limits. For information about potential costs, see Pricing.
- System Event logs contain log entries for when Compute Engine performs a system event. For example, each live migration is recorded as a system event. For information about System Event logs pricing and quotas, see the Cloud Audit Logs overview.
In Cloud Logging, the protoPayload field contains an AuditLog object that stores the audit logging data. For an example of an audit log entry, see the sample audit log entry.
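As an illustration of how the who, where, and when questions map to an audit log entry, the following sketch pulls the relevant fields out of an entry that has already been loaded as a Python dictionary (for example, from an exported JSON log). The field names follow the AuditLog structure inside `protoPayload`; the function name and the keys of the returned summary are hypothetical.

```python
# Numeric value of PERMISSION_DENIED in the status codes used by Google APIs.
PERMISSION_DENIED = 7


def summarize_audit_entry(entry: dict) -> dict:
    """Extract who did what, where, and when from one audit log entry."""
    payload = entry.get("protoPayload", {})
    status = payload.get("status", {})
    return {
        "who": payload.get("authenticationInfo", {}).get("principalEmail"),
        "what": payload.get("methodName"),
        "where": payload.get("resourceName"),
        "when": entry.get("timestamp"),
        # True when the call was rejected for lack of permission.
        "denied": status.get("code") == PERMISSION_DENIED,
    }
```

Missing fields come back as `None` instead of raising an error, which keeps the helper usable on entries from services that don't populate every field.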
To view Admin Activity audit logs, you must have either the Logs Viewer role (roles/logging.viewer) or the basic Viewer role (roles/viewer). Where possible, select the role with the least privileges required to complete the task.
Individual audit log entries are stored for a specified length of time. For longer retention, you can export the log entries to Cloud Storage, BigQuery, or Pub/Sub. To export log entries from all the projects, folders, and billing accounts of your organization, you can use aggregated exports. Aggregated exports provide you with a centralized way to review logs across the organization.
To use your audit logs to help with troubleshooting, do the following:
- Ensure that you have the required IAM roles to view the logs. If you export the logs, you also need permissions to view the exported logs in the sink.
- Follow the best practices for using audit logs to meet your audit strategy.
- Select a team strategy to view logs. There are several ways to view logs in Cloud Audit Logs, and everyone on your troubleshooting team should use the same method.
- Use the Google Cloud console Activity page to get a high-level view of your activity logs.
- View exported logs from the sink that they were exported to. Logs that are outside the retention period are only visible in the sink. You can also use exported logs to do a comparison investigation, for example, to a time when everything worked as expected.
Firewall Rules Logging
Firewall Rules Logging lets you audit, verify, and analyze the effects of your firewall rules. For example, you can determine if a firewall rule that is designed to deny traffic is functioning as intended.
You enable Firewall Rules Logging individually for each firewall rule whose connections you need to log. Firewall Rules Logging is an option for any firewall rule, regardless of the action (allow or deny) or direction (ingress or egress) of the rule. Firewall Rules Logging can generate a lot of data. Firewall Rules Logging has a charge associated with it, so you need to carefully plan what connections you want to log.
Determine where you want to store your firewall logs. If you want an organization-wide view of your logs, export the firewall logs to the same sink as your audit logs. Use filters to search for specific firewall events.
Firewall Insights
Firewall Insights provides reports that contain information about firewall usage and the impact of various firewall rules on your VPC network. You can use Firewall Insights to verify that firewall rules allow or block their intended connections.
You can also use Firewall Insights to detect firewall rules that are shadowed by other rules. A shadowed rule is a firewall rule that has all of its relevant attributes, such as IP address range and ports, overlapped by attributes from one or more other firewall rules that have higher or equal priority. Shadowed rules are calculated within 24 hours after you enable Firewall Rules Logging.
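The definition of a shadowed rule can be illustrated with a simplified check. The sketch below covers only the case where a single other rule overlaps both the source range and the ports; it ignores direction, protocol, targets, and shadowing by a combination of several rules, and it isn't how Firewall Insights is implemented. The `FirewallRule` class and the function are hypothetical. In Google Cloud firewall rules, a lower priority number means a higher priority.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class FirewallRule:
    name: str
    priority: int        # lower number = higher priority
    source_range: str    # CIDR notation, for example "10.0.0.0/16"
    ports: frozenset     # set of port numbers


def is_shadowed_by(rule: FirewallRule, other: FirewallRule) -> bool:
    """Return True if `other` has higher or equal priority and covers
    every address and port that `rule` matches.

    Both ranges must use the same IP version; mixing IPv4 and IPv6
    raises a TypeError from the ipaddress module.
    """
    if other.priority > rule.priority:
        return False  # `other` has lower priority, so it can't shadow `rule`
    rule_net = ipaddress.ip_network(rule.source_range)
    other_net = ipaddress.ip_network(other.source_range)
    return rule_net.subnet_of(other_net) and rule.ports <= other.ports
```

If a deny rule turns out to be shadowed by a broader allow rule, traffic that you expected to be blocked is allowed, which is worth ruling out before you look for a permissions cause.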
When you enable Firewall Rules Logging, Firewall Insights analyzes logs to suggest insights for any deny rule that is used in the observation period that you specify (by default, the last 24 hours). The deny rule insights provide you with firewall packet-drop signals. You can use the packet-drop signals to verify that the dropped packets are expected due to security protections, or that dropped packets are unexpected due to issues such as network misconfigurations.
VPC Flow Logs
VPC Flow Logs records a sample of network flows sent from and received by VM instances. VPC Flow Logs covers traffic that affects a VM. All egress (outgoing) traffic is logged, even if an egress deny firewall rule blocks the traffic. Ingress (incoming) traffic is logged if an ingress allow firewall rule permits the traffic. Ingress traffic isn't logged if an ingress deny firewall rule blocks the traffic.
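The logging rules in the preceding paragraph reduce to a small decision function. The following sketch encodes them as stated; the function name is hypothetical, and because VPC Flow Logs records only a sample of flows, a `True` result means that the flow is eligible for logging, not that an entry is guaranteed to exist.

```python
def flow_is_eligible_for_logging(direction: str, firewall_action: str) -> bool:
    """Apply the VPC Flow Logs rules for firewall-affected traffic.

    direction: "ingress" or "egress"
    firewall_action: "allow" or "deny"
    """
    direction = direction.lower()
    firewall_action = firewall_action.lower()
    if direction not in ("ingress", "egress"):
        raise ValueError(f"unknown direction: {direction}")
    if firewall_action not in ("allow", "deny"):
        raise ValueError(f"unknown firewall action: {firewall_action}")
    if direction == "egress":
        # Egress traffic is logged even when a deny rule blocks it.
        return True
    # Ingress traffic is logged only when an allow rule permits it.
    return firewall_action == "allow"
```

The asymmetry matters when you read the logs: a missing ingress entry is consistent with a deny rule dropping the traffic, whereas a blocked egress flow can still appear in the logs.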
Flow logs are collected for each VM connection at specific intervals. All the sampled packets that are collected for a given connection during a given interval (the aggregation interval) are aggregated into a single flow log entry. The flow log entry is then sent to Cloud Logging.
VPC Flow Logs is enabled or disabled for each VPC subnet. When you enable VPC Flow Logs, it generates a lot of data. We recommend that you carefully manage the subnets that you enable VPC Flow Logs on. For example, we recommend that you don't enable flow logs for a sustained period on subnets that are used by development projects. You can query VPC Flow Logs directly by using Cloud Logging or the exported sink. When you troubleshoot perceived traffic-related issues, you can use VPC Flow Logs to see whether traffic is leaving or entering a VM through the expected port.
Alerting
Alerts let you get timely notification of any out-of-policy events that might affect access to your Google Cloud resources.
Real-time notifications
Cloud Asset Inventory keeps a five-week history of Google Cloud asset metadata. An asset is a supported Google Cloud resource. Supported resources include IAM, Compute Engine with associated network features such as firewall rules and GKE namespaces, and role and cluster role bindings. All the preceding resources affect access to Google Cloud resources.
To monitor deviations from your resource configurations, such as firewall rules and forwarding rules, you can subscribe to real-time notifications. If your resource configurations change, real-time notifications immediately send a notification through Pub/Sub. Notifications can alert you to any issues early, preempting a support call.
Cloud Audit Logs and Cloud Run functions
To complement the use of real-time notifications, you can monitor Cloud Logging and alert on calls to sensitive actions. For example, you can create a Cloud Logging sink that filters calls to the SetIamPolicy method at the organization level. The sink sends logs to a Pub/Sub topic that you can use to trigger a Cloud Run function.
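The function at the end of that pipeline mostly needs to decode the Pub/Sub message and decide whether to alert. The following sketch shows that decoding step only, assuming that the message carries the log entry as base64-encoded JSON in a `data` field. The filter string, the function name, and the `event` dictionary shape are illustrative assumptions; the sketch doesn't include deployment or notification code.

```python
import base64
import json

# Illustrative sink filter: match entries whose method name contains
# SetIamPolicy. Adjust it to the services that you want to watch.
SINK_FILTER = 'protoPayload.methodName:"SetIamPolicy"'


def is_set_iam_policy_event(event: dict) -> bool:
    """Return True if the Pub/Sub message wraps a SetIamPolicy audit entry.

    `event` is assumed to be a dictionary with a base64-encoded "data"
    field that contains the JSON log entry.
    """
    raw = base64.b64decode(event["data"]).decode("utf-8")
    entry = json.loads(raw)
    method = entry.get("protoPayload", {}).get("methodName") or ""
    # Some services qualify the method name, so match on the suffix.
    return method.endswith("SetIamPolicy")
```

In a real function, you would follow a `True` result with whatever alerting your team uses, such as opening a ticket that includes the principal and resource from the entry.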
Connectivity Tests
To determine if an access problem is network-related or permission-related, use the Network Intelligence Center Connectivity Tests tool. Connectivity Tests is a static configuration analyzer and diagnostics tool that lets you check connectivity between a source and destination endpoint. Connectivity Tests helps you identify the root cause for network-related access problems that are associated with your Google Cloud network configuration.
Connectivity Tests performs tests that include your VPC network, VPC Network Peering, and VPN tunnels to your on-premises network. For example, Connectivity Tests might identify that a firewall rule is blocking connectivity. For more information, see Common use cases.
Policy Troubleshooter
Many tasks in Google Cloud require an IAM role and associated permissions. We recommend that you check what permissions are contained within a role and check for each permission that's required to complete a task. For example, to use Compute Engine images to create an instance, a user needs the compute.imageUser role, which includes nine permissions. Therefore, the user must have a combination of roles and permissions that include all nine permissions.
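Checking each required permission against a combination of roles is a set comparison, as the following sketch shows. The helper name and the role contents are hypothetical, and the three image permissions are an illustrative subset, not the full list of nine.

```python
def missing_permissions(required, granted_by_role: dict) -> set:
    """Return the required permissions that none of the user's roles grant.

    granted_by_role maps each role that the user holds to the set of
    permissions that the role contains.
    """
    granted = set()
    for permissions in granted_by_role.values():
        granted |= set(permissions)
    return set(required) - granted


# Illustrative subset of the permissions needed to use an image.
REQUIRED = {
    "compute.images.get",
    "compute.images.list",
    "compute.images.useReadOnly",
}
```

An empty result means that the combination of roles covers the task. A non-empty result names the exact permissions to look for, which is also the form of input that Policy Troubleshooter expects.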
Policy Troubleshooter is a Google Cloud console tool that helps you debug why a user or service account doesn't have permission to access a resource. To troubleshoot access problems, you use the IAM part of the Policy Troubleshooter.
For example, you might want to check why a particular user can create objects in buckets in a project while another user can't. The Policy Troubleshooter can help you see what permissions the first user has that the second user doesn't have.
The Policy Troubleshooter requires the following inputs:
- Principal (individual user, service account, or groups)
- Permission (note that these are the underlying permissions, not the IAM roles)
- Resource
IAM Recommender
Although IAM Recommender is a policy enforcement control as described in the previous IAM Recommender section, you can also use it as a troubleshooting tool. Recommender runs a daily job that analyzes IAM access log data and the permissions granted from the previous 90 days. You can use Recommender to check whether a recommendation was approved and applied recently that could have affected a user's access to a previously allowed resource. In this case, you can grant the permissions that were removed.
Escalating to Customer Care
When you troubleshoot access-related problems, it's important to have a good internal support process and a well-defined process for escalating to Cloud Customer Care. This section describes an example support setup and how you can communicate with Customer Care to help them resolve your issues quickly.
If you're unable to resolve a problem by using the tools described in this document, a clearly defined support process helps Customer Care to troubleshoot your issues. We recommend that you have a systematic approach to troubleshooting, as described in the effective troubleshooting chapter of Google's Site Reliability Engineering (SRE) book.
We recommend that your internal support process does the following:
- Detail the procedures to be followed if there is a problem.
- Have a clearly defined escalation path.
- Set up an on-call process.
- Create an incident response plan.
- Set up a bug tracking or help desk system.
- Ensure that your support personnel have been authorized to communicate with Customer Care and are named contacts.
- Communicate support processes to internal staff, including how to contact Google Cloud named contacts.
- Regularly analyze support issues, iterate, and improve based on things that you learned.
- Include a standardized retrospective form.
If you need to escalate to Customer Care, have the following information available to share with Customer Care when troubleshooting access issues:
- The identity (user or service account email) that is requesting access.
- Whether this issue impacts all identities or only some.
  - If only some identities are impacted, provide an example identity that works and an example identity that fails.
- Whether the identity was recently recreated.
- The resource that the user is attempting to access (include project ID).
- The request or method that is being called.
  - Provide a copy of the request and response.
- The permissions that were granted to the identity for this access.
  - Provide a copy of the IAM policy.
- The source (location) from which the identity is attempting to access resources. For example, the source might be a Google Cloud resource (such as a Compute Engine instance), the Google Cloud console, the Google Cloud CLI, Cloud Shell, or an external source such as an on-premises network or the internet.
  - If the source is from another project, provide the source project ID.
- The time (timestamp) when the error first occurred and whether it's still an issue.
- The last known time that the identity successfully accessed the resource (include timestamps).
- Any changes that were made before the issue started (include timestamps).
- Any errors that are recorded in Cloud Logging. Before you share with Customer Care, make sure that you redact sensitive data such as access tokens, credentials, and credit card numbers.
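Redaction is easier to apply consistently when it's scripted. The following sketch shows one possible approach that uses regular expressions; the patterns are assumptions that catch common shapes (a bearer token in an Authorization header, a token that starts with `ya29.`, and 13 to 16 digit sequences that look like card numbers). They can miss secrets in other formats and can flag harmless numbers, so review the output manually before you share it.

```python
import re

# Each pair is (pattern, replacement). The patterns are illustrative
# assumptions, not an exhaustive list of secret formats.
_REDACTIONS = [
    # "Bearer <token>" in an Authorization header; keep the scheme name.
    (re.compile(r"(?i)(bearer\s+)[A-Za-z0-9._~+/=-]+"), r"\1[REDACTED]"),
    # Bare tokens that start with the "ya29." prefix.
    (re.compile(r"ya29\.[A-Za-z0-9._-]+"), "[REDACTED]"),
    # 13 to 16 digits, optionally separated by spaces or hyphens.
    (re.compile(r"\b\d(?:[ -]?\d){12,15}\b"), "[REDACTED]"),
]


def redact(text: str) -> str:
    """Replace likely secrets in a log excerpt with a placeholder."""
    for pattern, replacement in _REDACTIONS:
        text = pattern.sub(replacement, text)
    return text
```

Running the helper over a copied request or log excerpt before you attach it to a support case reduces the chance that a live credential leaves your organization.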
What's next
For more reference architectures, diagrams, and best practices, explore the Cloud Architecture Center.
Last updated 2025-12-17 UTC.