Resolving workload startup issues in Cloud Service Mesh

This document explains common Cloud Service Mesh problems and how to resolve them. If you need additional assistance, see Getting support.

Gateway fails to start with distroless proxy when a privileged port is exposed

By default, the distroless proxy starts with non-root permissions, which in some cases might cause bind failures on privileged ports. If you see errors similar to the following during proxy startup, apply an additional securityContext to the gateway deployment.

  Error adding/updating listener(s) 0.0.0.0_80: cannot bind '0.0.0.0:80': Permission denied

The following example is the YAML for an egress gateway deployment:

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: istio-egressgateway
  spec:
    selector:
      matchLabels:
        app: istio-egressgateway
        istio: egressgateway
    template:
      metadata:
        annotations:
          # This is required to tell Anthos Service Mesh to inject the gateway with the
          # required configuration.
          inject.istio.io/templates: gateway
        labels:
          app: istio-egressgateway
          istio: egressgateway
      spec:
        containers:
        - name: istio-proxy
          image: auto # The image will automatically update each time the pod starts.
          resources:
            limits:
              cpu: 2000m
              memory: 1024Mi
            requests:
              cpu: 100m
              memory: 128Mi
        # Allow binding to all ports (such as 80 and 443)
        securityContext:
          sysctls:
          - name: net.ipv4.ip_unprivileged_port_start
            value: "0"
        serviceAccountName: istio-egressgateway
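To illustrate why the net.ipv4.ip_unprivileged_port_start sysctl in the manifest matters, the following minimal Python sketch mirrors the rule Linux applies when a non-root process binds a port. The helper names are hypothetical and are not part of Cloud Service Mesh. The sketch assumes a Linux host that exposes the sysctl under /proc (kernel 4.11 or later) and falls back to the usual default of 1024 when the file cannot be read.

```python
from pathlib import Path

# Standard Linux location of the sysctl named in the manifest.
SYSCTL_PATH = Path("/proc/sys/net/ipv4/ip_unprivileged_port_start")
# Assumed fallback: the usual Linux default when the value cannot be read.
DEFAULT_UNPRIVILEGED_PORT_START = 1024


def read_unprivileged_port_start(path: Path = SYSCTL_PATH) -> int:
    """Return the lowest port a non-root process may bind (hypothetical helper)."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # File missing or unreadable (non-Linux host, older kernel).
        return DEFAULT_UNPRIVILEGED_PORT_START


def needs_privilege(port: int, unprivileged_port_start: int) -> bool:
    """True if binding `port` requires root or CAP_NET_BIND_SERVICE."""
    return port < unprivileged_port_start
```

With the default threshold, needs_privilege(80, 1024) is true, which corresponds to the bind failure shown earlier. With the sysctl set to "0" as in the manifest, needs_privilege(80, 0) is false, so a non-root proxy can bind ports 80 and 443.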

Connection Refused when reaching a Cloud Service Mesh endpoint

You might intermittently experience connection refused (ECONNREFUSED) errors when your clusters communicate with your endpoints, for example Memorystore Redis, Cloud SQL, or any external service that your application workload needs to reach.

This can occur when your application workload starts faster than the istio-proxy (Envoy) container and tries to reach an external endpoint. Because istio-init (initContainer) has already executed at this stage, there are iptables rules in place that redirect all outgoing traffic to Envoy. Because istio-proxy is not ready yet, the iptables rules redirect traffic to a sidecar proxy that has not started, and the application gets the ECONNREFUSED error.

The following steps detail how to check if this is the error you are experiencing:

  1. Check the Stackdriver logs with the following filter to identify which pods had the problem.

    The following example shows a typical error message:

    Error: failed to create connection to feature-store redis, err=dial tcp 192.168.9.16:19209: connect: connection refused
    [ioredis] Unhandled error event: Error: connect ECONNREFUSED
  2. Search for an occurrence of the problem. If you are using legacy Stackdriver, then use resource.type="container".

    resource.type="k8s_container"
    textPayload:"$ERROR_MESSAGE$"
  3. Expand the latest occurrence to obtain the name of the pod, and then make note of the pod_name under resource.labels.

  4. Obtain the first occurrence of the issue for that pod:

    resource.type="k8s_container"
    resource.labels.pod_name="$POD_NAME$"

    Example output:

    E 2020-03-31T10:41:15.552128897Z post-feature-service post-feature-service-v1-67d56cdd-g7fvb failed to create connection to feature-store redis, err=dial tcp 192.168.9.16:19209: connect: connection refused post-feature-service post-feature-service-v1-67d56cdd-g7fvb
  5. Make note of the timestamp of the first error for this pod.

  6. Use the following filter to see the pod startup events.

    resource.type="k8s_container"
    resource.labels.pod_name="$POD_NAME$"

    Example output:

    I 2020-03-31T10:41:15Z spec.containers{istio-proxy} Container image "docker.io/istio/proxyv2:1.3.3" already present on machine  spec.containers{istio-proxy}
    I 2020-03-31T10:41:15Z spec.containers{istio-proxy} Created container  spec.containers{istio-proxy}
    I 2020-03-31T10:41:15Z spec.containers{istio-proxy} Started container  spec.containers{istio-proxy}
    I 2020-03-31T10:41:15Z spec.containers{APP-CONTAINER-NAME} Created container  spec.containers{APP-CONTAINER-NAME}
    W 2020-03-31T10:41:17Z spec.containers{istio-proxy} Readiness probe failed: HTTP probe failed with statuscode: 503  spec.containers{istio-proxy}
    W 2020-03-31T10:41:26Z spec.containers{istio-proxy} Readiness probe failed: HTTP probe failed with statuscode: 503  spec.containers{istio-proxy}
    W 2020-03-31T10:41:28Z spec.containers{istio-proxy} Readiness probe failed: HTTP probe failed with statuscode: 503  spec.containers{istio-proxy}
    W 2020-03-31T10:41:31Z spec.containers{istio-proxy} Readiness probe failed: HTTP probe failed with statuscode: 503  spec.containers{istio-proxy}
    W 2020-03-31T10:41:58Z spec.containers{istio-proxy} Readiness probe failed: HTTP probe failed with statuscode: 503  spec.containers{istio-proxy}
  7. Use the timestamps of the errors and the istio-proxy startup events to confirm that the errors happen while Envoy is not ready.

    If the errors occur while the istio-proxy container is not ready yet, connection refused errors are expected. In the preceding example, the pod was trying to connect to Redis as early as 2020-03-31T10:41:15.552128897Z, but at 2020-03-31T10:41:58Z istio-proxy was still failing readiness probes.

    Even though the istio-proxy container started first, it might not have become ready before the app started trying to connect to the external endpoint.

    If this is the problem you are experiencing, then continue through the following troubleshooting steps.

  8. Annotate the config at the pod level. This is only available at the pod level and not at a global level.

    annotations:
      proxy.istio.io/config: '{ "holdApplicationUntilProxyStarts": true }'
  9. Modify the application code so that it checks if Envoy is ready before it tries to make any other requests to external services. For example, on application start, initiate a loop that makes requests to the istio-proxy health endpoint and only continues once a 200 is obtained. The istio-proxy health endpoint is as follows:

    http://localhost:15020/healthz/ready
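
The readiness loop described in the last step could look like the following minimal Python sketch. It is an illustration under stated assumptions, not an official client: the function names, the 60 attempts, and the one-second delay are hypothetical choices, and only the health endpoint URL comes from this document. It uses only the Python standard library.

```python
import time
import urllib.error
import urllib.request

# Health endpoint given in the step above.
READY_URL = "http://localhost:15020/healthz/ready"


def proxy_is_ready(url: str = READY_URL, timeout: float = 1.0) -> bool:
    """Return True only if the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an error status such as 503.
        return False


def wait_for_proxy(check=proxy_is_ready, attempts: int = 60, delay: float = 1.0) -> bool:
    """Poll `check` until it succeeds or `attempts` polls have been made."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # Assumed back-off; tune for your workload.
    return False
```

An application would call wait_for_proxy() once at startup and open connections to Redis, Cloud SQL, or other external services only after it returns True. If it returns False, failing fast lets Kubernetes restart the container instead of leaving it running with refused connections.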

Race condition during sidecar injection between Vault and Cloud Service Mesh

When using Vault for secrets management, Vault sometimes injects its sidecar before Istio does, which causes Pods to get stuck in Init status. When this happens, newly created Pods get stuck in Init status after you restart any deployment or deploy a new one. For example:

  E 2020-03-31T10:41:15.552128897Z post-feature-service post-feature-service-v1-67d56cdd-g7fvb failed to create connection to feature-store redis, err=dial tcp 192.168.9.16:19209: connect: connection refused post-feature-service post-feature-service-v1-67d56cdd-g7fvb

This issue is caused by a race condition. Both Istio and Vault inject a sidecar, and Istio must be the last to do so, because the Istio proxy is not running while init containers run. The Istio init container sets up iptables rules to redirect all traffic to the proxy. Because the proxy is not running yet, those rules redirect to nothing, blocking all traffic. This is why the Istio init container must be last, so that the proxy is up and running immediately after the iptables rules are set up. Unfortunately, the injection order is not deterministic, so if Istio is injected first, the Pod breaks.

To work around this condition, allow the IP address of Vault so that traffic going to the Vault IP is not redirected to the Envoy proxy, which is not ready yet and therefore blocks the communication. To achieve this, add a new annotation named excludeOutboundIPRanges.

For managed Cloud Service Mesh, this is only possible at the Deployment or Pod level under spec.template.metadata.annotations, for example:

  apiVersion: apps/v1
  kind: Deployment
  ...
  ...
  ...
  spec:
    template:
      metadata:
        annotations:
          traffic.sidecar.istio.io/excludeOutboundIPRanges:

For in-cluster Cloud Service Mesh, there is an option to set it globally with an IstioOperator under spec.values.global.proxy.excludeIPRanges, for example:

  apiVersion: install.istio.io/v1alpha1
  kind: IstioOperator
  spec:
    values:
      global:
        proxy:
          excludeIPRanges: ""

After adding the annotation, restart your workloads.
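
The excludeOutboundIPRanges annotation takes a comma-separated list of IP ranges in CIDR form. As a hedged illustration of how such a value can be assembled, the following Python sketch converts one or more addresses into that format with the standard ipaddress module. The function name and the sample addresses are hypothetical; substitute the actual IP address or range of your Vault server.

```python
import ipaddress


def exclude_ranges_value(addresses):
    """Build a comma-separated CIDR list from IP addresses or CIDR blocks."""
    networks = []
    for item in addresses:
        # strict=False turns a bare host address such as 10.0.0.5 into
        # 10.0.0.5/32 and masks any host bits in a CIDR block.
        networks.append(str(ipaddress.ip_network(item, strict=False)))
    return ",".join(networks)


# Hypothetical Vault address, for illustration only.
print(exclude_ranges_value(["10.0.0.5"]))
```

Using a /32 range for a single Vault address keeps the exclusion as narrow as possible, so all other outbound traffic still passes through the Envoy proxy.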


Last updated 2026-02-19 UTC.