Troubleshoot proxyless gRPC deployments

This document provides information to help you resolve configuration issues when you deploy proxyless gRPC services with Cloud Service Mesh. For information about how to use the Client Status Discovery Service (CSDS) API to help you investigate issues with Cloud Service Mesh, see Understanding Cloud Service Mesh client status.

Note: This guide only supports Cloud Service Mesh with Google Cloud APIs and does not support Istio APIs. For more information, see Cloud Service Mesh overview.

Troubleshooting RPC failures in a gRPC application

There are two common ways to troubleshoot remote procedure call (RPC) failures in a gRPC application:

  1. Review the status returned when an RPC fails. Usually, the status contains enough information to help you understand the cause of an RPC failure.

  2. Enable logging in gRPC runtime. Sometimes you need to review the gRPC runtime logs to understand a failure that might not get propagated back to an RPC return status. For example, when an RPC fails with a status indicating that the deadline has been exceeded, the logs can help you to understand the underlying failure that caused the deadline to be exceeded.

    Different language implementations of gRPC have different ways to enable logging in the gRPC runtime:

    • gRPC in Java: gRPC uses java.util.logging for logging. Set io.grpc.level to the FINE level to enable sufficiently verbose logging in gRPC runtime. A typical way to enable logging in Java is to load the logging config from a file and provide the file location to JVM by using a command-line flag. For example:

      # Create a file called logging.properties with the following contents:
      handlers=java.util.logging.ConsoleHandler
      io.grpc.level=FINE
      io.grpc.xds.level=FINEST
      java.util.logging.ConsoleHandler.level=ALL
      java.util.logging.ConsoleHandler.formatter=java.util.logging.SimpleFormatter

      # Pass the location of the file to JVM by using this command-line flag:
      -Djava.util.logging.config.file=logging.properties

      To enable logging specific to xDS modules, set io.grpc.xds.level to FINE. To see more detailed logging, set the level to FINER or FINEST.

    • gRPC in Go: Turn on logging by setting environment variables.

      GRPC_GO_LOG_VERBOSITY_LEVEL=99 GRPC_GO_LOG_SEVERITY_LEVEL=info

    • gRPC in C++: To enable logging with gRPC in C++, see the instructions in Troubleshooting gRPC. To enable logging specific to xDS modules, enable the following tracers by using the GRPC_TRACE environment variable: xds_client, xds_resolver, cds_lb, eds_lb, priority_lb, weighted_target_lb, and lrs_lb.

    • gRPC in Node.js: To enable logging with gRPC in Node.js, see the instructions in Troubleshooting gRPC-JS. To enable logging specific to xDS modules, enable the following tracers by using the GRPC_TRACE environment variable: xds_client, xds_resolver, cds_balancer, eds_balancer, priority, and weighted_target.
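If you launch the gRPC client from a wrapper script, the environment variables described in the preceding list can be collected in one place. The following Python sketch is only an illustration: the helper name `xds_logging_env` is hypothetical, and `GRPC_VERBOSITY=DEBUG` is an assumption taken from the general gRPC troubleshooting guides rather than from this page. The tracer names and the Go variables are copied from the list.

```python
import os

# Variables and tracer names as listed in this section. GRPC_VERBOSITY=DEBUG is
# an assumption taken from the general gRPC troubleshooting guides.
XDS_LOGGING_ENV = {
    "go": {
        "GRPC_GO_LOG_VERBOSITY_LEVEL": "99",
        "GRPC_GO_LOG_SEVERITY_LEVEL": "info",
    },
    "cpp": {
        "GRPC_VERBOSITY": "DEBUG",
        "GRPC_TRACE": "xds_client,xds_resolver,cds_lb,eds_lb,"
                      "priority_lb,weighted_target_lb,lrs_lb",
    },
    "node": {
        "GRPC_VERBOSITY": "DEBUG",
        "GRPC_TRACE": "xds_client,xds_resolver,cds_balancer,eds_balancer,"
                      "priority,weighted_target",
    },
}


def xds_logging_env(language, base_env=None):
    """Return a copy of base_env with the xDS logging variables added.

    language is one of "go", "cpp", or "node" (hypothetical keys).
    base_env defaults to the current process environment.
    """
    if language not in XDS_LOGGING_ENV:
        raise ValueError(f"no logging settings known for {language!r}")
    env = dict(os.environ if base_env is None else base_env)
    env.update(XDS_LOGGING_ENV[language])
    return env
```

You can pass the returned dictionary as the `env` argument of `subprocess.run` when you start the client binary. Java is not covered here because it is configured through the logging properties file rather than through environment variables.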

Depending on the error in the RPC status or in the runtime logs, your issue might fall in one of the following categories.

Unable to connect to Cloud Service Mesh

To troubleshoot connection issues, try the following:

  • Check that the server_uri value in the bootstrap file is trafficdirector.googleapis.com:443.
  • Ensure that the environment variable GRPC_XDS_BOOTSTRAP is defined and pointing to the bootstrap file.
  • Ensure that you are using the xds scheme in the URI when you create a gRPC channel.
  • Make sure that you granted the required IAM permissions for creating compute instances and modifying a network in a project.
  • Make sure that you enabled the service account to access the Traffic Director API. In the Google Cloud console, under APIs & services for your project, look for errors in the Traffic Director API.
  • Confirm that the service account has the correct permissions. The gRPC applications running in the VM or the Pod use the service account of the Compute Engine VM host or the Google Kubernetes Engine (GKE) node instance.
  • Confirm that the API access scope of the Compute Engine VMs or GKE clusters is set to allow full access to the Compute Engine APIs. Do this by specifying the following when you create the VMs or cluster:

    --scopes=https://www.googleapis.com/auth/cloud-platform

  • Confirm that you can access trafficdirector.googleapis.com:443 from the VM. If there are access issues, the possible reasons include a firewall preventing access to trafficdirector.googleapis.com over TCP port 443 or DNS resolution issues for the trafficdirector.googleapis.com hostname.
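Some of the preceding checks can be scripted and run on the VM or in the Pod. The following Python sketch covers only the bootstrap checks and the TCP reachability check. It assumes that the bootstrap file is JSON with a top-level `xds_servers` list whose entries carry a `server_uri` field; the function names `check_bootstrap` and `can_reach` are hypothetical.

```python
import json
import os
import socket

EXPECTED_SERVER_URI = "trafficdirector.googleapis.com:443"


def check_bootstrap(env=None):
    """Return a list of problems found with the bootstrap setup (empty if none)."""
    env = os.environ if env is None else env
    path = env.get("GRPC_XDS_BOOTSTRAP")
    if not path:
        return ["GRPC_XDS_BOOTSTRAP is not set"]
    try:
        with open(path, encoding="utf-8") as f:
            bootstrap = json.load(f)
    except (OSError, ValueError) as exc:
        return [f"cannot read bootstrap file {path}: {exc}"]
    if not isinstance(bootstrap, dict):
        return [f"bootstrap file {path} is not a JSON object"]
    # Assumed layout: {"xds_servers": [{"server_uri": "..."}], ...}
    servers = bootstrap.get("xds_servers") or []
    uris = [s.get("server_uri") for s in servers if isinstance(s, dict)]
    if EXPECTED_SERVER_URI not in uris:
        return [f"server_uri values are {uris}, expected {EXPECTED_SERVER_URI}"]
    return []


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    A False result can mean a firewall block or a DNS resolution failure.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `check_bootstrap()` inspects the current environment, and `can_reach("trafficdirector.googleapis.com", 443)` tests the connection from the machine where the script runs. The IAM, service account, and access scope checks are not covered by this sketch.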

Hostname specified in the URI cannot be resolved

You might encounter an error message like the following one in your logs:

[Channel<1>: (xds:///my-service:12400)] Failed to resolve name. status=Status{code=UNAVAILABLE, description=NameResolver returned no usable address. addrs=[], attrs={}

To troubleshoot hostname resolution issues, try the following:

  • Ensure that you are using a supported gRPC version and language.
  • Ensure that the port used in the URI to create a gRPC channel matches the port value in the forwarding rule used in your configuration. If a port is not specified in the URI, then the value 80 is used to match a forwarding rule.
  • Ensure that the hostname and port used in the URI to create a gRPC channel exactly match a host rule in the URL map used in your configuration.
  • Ensure that the same host rule is not configured in more than one URL map.
  • Ensure that no wildcards are in use. Host rules containing a * wildcard character are ignored.
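To compare your channel URI with your configuration, it can help to split the URI into its hostname and port and to scan your host rules for the two problems in the preceding list. The following Python sketch is a simplified illustration, not the matching logic that Cloud Service Mesh uses. The function names and the `url_maps` dictionary (URL map name mapped to a list of host rule strings) are hypothetical, and IPv6 literals are not handled.

```python
from urllib.parse import urlsplit

DEFAULT_PORT = 80  # Used when the URI does not specify a port.


def parse_xds_target(uri):
    """Split a URI such as 'xds:///my-service:12400' into (hostname, port)."""
    parts = urlsplit(uri)
    if parts.scheme != "xds":
        raise ValueError(f"expected the xds scheme, got {parts.scheme!r}")
    name = parts.path.lstrip("/")
    host, sep, port = name.rpartition(":")
    if not sep:
        return name, DEFAULT_PORT
    return host, int(port)


def wildcard_rules(url_maps):
    """Return host rules that contain '*'; such rules are ignored."""
    return sorted(
        {rule for rules in url_maps.values() for rule in rules if "*" in rule}
    )


def duplicated_rules(url_maps):
    """Return host rules that appear in more than one URL map."""
    owners = {}
    for map_name, rules in url_maps.items():
        for rule in set(rules):
            owners.setdefault(rule, []).append(map_name)
    return {
        rule: sorted(names) for rule, names in owners.items() if len(names) > 1
    }
```

For the URI in the sample error message, `parse_xds_target("xds:///my-service:12400")` returns `("my-service", 12400)`, which you can then compare with the port in your forwarding rule and the host rules in your URL map.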

RPC fails because the service isn't available

To troubleshoot RPC failures when a service isn't available, try the following:

  • Check the overall status of Cloud Service Mesh and the status of your backend services in the Google Cloud console:

    • In the Associated routing rule maps column, ensure that the correct URL maps reference the backend services. Click the column to check that the backend services specified in the host matching rules are correct.
    • In the Backends column, check that the backends associated with your backend services are healthy.
    • If the backends are unhealthy, click the corresponding backend service and ensure that the correct health check is configured. Health checks commonly fail because of incorrect or missing firewall rules or a mismatch in the tags specified in the VM and in the firewall rules. For more information, see Creating health checks.
  • For gRPC health checks to work correctly, the gRPC backends must implement the gRPC health checking protocol. If this protocol is not implemented, use a TCP health check instead. Don't use an HTTP, HTTPS, or HTTP/2 health check with gRPC services.

  • When you use instance groups, ensure that the named port specified in the instance group matches the port used in the health check. When you use network endpoint groups (NEGs), ensure that the GKE service spec has the correct NEG annotation, and the health check is configured to use the NEG serving port.

  • Check that the endpoint protocol is configured as GRPC.

RPC fails because the load balancing policy is not supported

You might encounter an error message like one of the following in your logs:

error parsing "CDS" response: resource "cloud-internal-istio:cloud_mp_248715": unexpected lbPolicy RING_HASH in response

error={"description":"errors parsing CDS response","file":"external/com_github_grpc_grpc/src/core/ext/xds/xds_api.cc", "file_line":3304,"referenced_errors":[{"description":"cloud-internal-istio:cloud_mp_248715: LB policy is not supported."

WARNING: RPC failed: Status{code=INTERNAL, description=Panic! This is a bug!, cause=java.lang.NullPointerException: provider
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:910)
    at io.grpc.internal.ServiceConfigUtil$PolicySelection.<init>(ServiceConfigUtil.java:418)
    at io.grpc.xds.CdsLoadBalancer2$CdsLbState.handleClusterDiscovered(CdsLoadBalancer2.java:190)

This is because RING_HASH is not supported by the particular language and version of the client being used. To fix the problem, update the backend service configuration to use only supported load balancing policies, or upgrade the client to a supported version. For supported client versions, see xDS features in gRPC.
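If you collect client logs in files, you can search them for the sample messages in this section. The following Python sketch is a plain substring scan. The marker strings are taken from the sample messages, the function names are hypothetical, and other gRPC versions might word these errors differently, so treat a match as a hint rather than a diagnosis.

```python
# Substrings taken from the three sample messages in this section.
LB_POLICY_MARKERS = (
    "unexpected lbPolicy",
    "LB policy is not supported",
    "ServiceConfigUtil$PolicySelection",
)


def mentions_unsupported_lb_policy(line):
    """Return True if the log line contains one of the known markers."""
    return any(marker in line for marker in LB_POLICY_MARKERS)


def scan_log(lines):
    """Return (line_number, text) pairs for lines that match a marker."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if mentions_unsupported_lb_policy(line)
    ]
```

For example, you can pass an open log file to `scan_log`, because iterating over a file object yields its lines.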

Security configuration is not generated as expected

If you are configuring service security and the security configuration is not generated as expected, examine the endpoint policies in your deployment.

Cloud Service Mesh does not support scenarios where there are two or more endpoint policy resources that match equally to an endpoint, for example, two policies with the same labels and ports, or two or more policies with different labels that match equally with an endpoint's labels. For more information on how endpoint policies are matched to an endpoint's labels, see the APIs for EndpointPolicy.EndpointMatcher.MetadataLabelMatcher. In such situations, Cloud Service Mesh does not generate security configuration from any of the conflicting policies.
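The simplest of these conflicts, two policies with the same labels and ports, can be detected by grouping policies on those two fields. The following Python sketch covers only that case. It does not reproduce the label matching rules of MetadataLabelMatcher, so it cannot find policies with different labels that match an endpoint equally. The dictionary layout (`name`, `labels`, `ports`) and the function name are hypothetical and would need to be filled from your own export of the endpoint policies.

```python
from collections import defaultdict


def find_duplicate_policies(policies):
    """Return groups of policy names that share identical labels and ports.

    Each policy is a dict with hypothetical keys:
      "name":   str
      "labels": dict of label name to label value
      "ports":  iterable of port numbers
    """
    groups = defaultdict(list)
    for policy in policies:
        key = (
            frozenset(policy["labels"].items()),
            frozenset(policy["ports"]),
        )
        groups[key].append(policy["name"])
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

An empty result means only that no two policies are exact duplicates on labels and ports. Policies can still conflict in the second way described in this section.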

Troubleshoot the health of your service mesh

This guide provides information to help you resolve Cloud Service Mesh configuration issues.

Cloud Service Mesh behavior when most endpoints are unhealthy

For better reliability, when 99% of endpoints are unhealthy, Cloud Service Mesh configures the data plane to disregard the health status of the endpoints. Instead, the data plane balances traffic among all of the endpoints because it is possible that the serving port is still functional.
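The effect of this rule can be illustrated with a small calculation. The following Python sketch is not the data plane's implementation. It assumes that the rule applies when at least 99% of endpoints are unhealthy, and the function name and the input format (endpoint address mapped to a health flag) are hypothetical.

```python
UNHEALTHY_PERCENT_THRESHOLD = 99  # Assumed to mean "at least 99%".


def endpoints_to_balance(endpoints):
    """Return the endpoint addresses that would receive traffic.

    endpoints maps an endpoint address to True (healthy) or False (unhealthy).
    """
    if not endpoints:
        return []
    unhealthy = sum(1 for healthy in endpoints.values() if not healthy)
    # Integer comparison avoids floating-point rounding at the threshold.
    if unhealthy * 100 >= UNHEALTHY_PERCENT_THRESHOLD * len(endpoints):
        # Health status is disregarded: balance across every endpoint.
        return sorted(endpoints)
    return sorted(addr for addr, healthy in endpoints.items() if healthy)
```

With 100 endpoints of which 99 are unhealthy, the function returns all 100 addresses. With 98 unhealthy endpoints, it returns only the 2 healthy ones.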

Unhealthy backends cause suboptimal distribution of traffic

Cloud Service Mesh uses the information in the HealthCheck resource attached to a backend service to evaluate the health of your backends. Cloud Service Mesh uses this health status to route traffic to the closest healthy backend. If some of your backends are unhealthy, traffic might continue to be processed, but with suboptimal distribution. For example, traffic might flow to a region where healthy backends are still present, but which is much farther from the client, introducing latency. To identify and monitor the health status of your backends, try the following steps:



Last updated 2026-02-19 UTC.