Resolving proxy issues in Cloud Service Mesh

This document explains common Cloud Service Mesh problems and how to resolve them. If you need additional assistance, see Getting support.

Connection Refused when reaching an endpoint with Istio

You might intermittently experience connection refused (ECONNREFUSED) errors with communication from your clusters to your endpoints, for example Memorystore Redis, CloudSQL, or any external service your application workload needs to reach.

This can occur when your application workload starts faster than the istio-proxy (Envoy) container and tries to reach an external endpoint. Because istio-init (an initContainer) has already executed at this stage, iptables rules are in place that redirect all outgoing traffic to Envoy. Because istio-proxy is not ready yet, the iptables rules redirect traffic to a sidecar proxy that has not yet started, and the application therefore gets the ECONNREFUSED error.

The following steps detail how to check if this is the error you are experiencing:

  1. Check the Stackdriver logs with the following filter to identify which pods had the problem.

    The following example shows a typical error message:

    Error: failed to create connection to feature-store redis, err=dial tcp 192.168.9.16:19209: connect: connection refused
    [ioredis] Unhandled error event: Error: connect ECONNREFUSED
  2. Search for an occurrence of the problem. If you are using legacy Stackdriver, then use resource.type="container".

    resource.type="k8s_container"
    textPayload:"$ERROR_MESSAGE$"
  3. Expand the latest occurrence to obtain the name of the pod, and then make note of the pod_name under resource.labels.

  4. Obtain the first occurrence of the issue for that pod:

    resource.type="k8s_container"
    resource.labels.pod_name="$POD_NAME$"

    Example output:

    E 2020-03-31T10:41:15.552128897Z post-feature-service post-feature-service-v1-67d56cdd-g7fvb failed to create connection to feature-store redis, err=dial tcp 192.168.9.16:19209: connect: connection refused post-feature-service post-feature-service-v1-67d56cdd-g7fvb
  5. Make note of the timestamp of the first error for this pod.

  6. Use the following filter to see the pod startup events.

    resource.type="k8s_container"
    resource.labels.pod_name="$POD_NAME$"

    Example output:

    I 2020-03-31T10:41:15Z spec.containers{istio-proxy} Container image "docker.io/istio/proxyv2:1.3.3" already present on machine  spec.containers{istio-proxy}
    I 2020-03-31T10:41:15Z spec.containers{istio-proxy} Created container  spec.containers{istio-proxy}
    I 2020-03-31T10:41:15Z spec.containers{istio-proxy} Started container  spec.containers{istio-proxy}
    I 2020-03-31T10:41:15Z spec.containers{APP-CONTAINER-NAME} Created container  spec.containers{APP-CONTAINER-NAME}
    W 2020-03-31T10:41:17Z spec.containers{istio-proxy} Readiness probe failed: HTTP probe failed with statuscode: 503  spec.containers{istio-proxy}
    W 2020-03-31T10:41:26Z spec.containers{istio-proxy} Readiness probe failed: HTTP probe failed with statuscode: 503  spec.containers{istio-proxy}
    W 2020-03-31T10:41:28Z spec.containers{istio-proxy} Readiness probe failed: HTTP probe failed with statuscode: 503  spec.containers{istio-proxy}
    W 2020-03-31T10:41:31Z spec.containers{istio-proxy} Readiness probe failed: HTTP probe failed with statuscode: 503  spec.containers{istio-proxy}
    W 2020-03-31T10:41:58Z spec.containers{istio-proxy} Readiness probe failed: HTTP probe failed with statuscode: 503  spec.containers{istio-proxy}
  7. Use the timestamps of the errors and the istio-proxy startup events to confirm that the errors are happening when Envoy is not ready.

    If the errors occur while the istio-proxy container is not ready yet, connection refused errors are expected. In the preceding example, the pod was trying to connect to Redis as early as 2020-03-31T10:41:15.552128897Z, but at 2020-03-31T10:41:58Z istio-proxy was still failing readiness probes.

    Even though the istio-proxy container started first, it is possible that it did not become ready before the app was already trying to connect to the external endpoint.

    If this is the problem you are experiencing, then continue through the following troubleshooting steps.

  8. Annotate the config at the pod level. This is only available at the pod level and not at a global level.

    annotations:
      proxy.istio.io/config: '{ "holdApplicationUntilProxyStarts": true }'
  9. Modify the application code so that it checks if Envoy is ready before it tries to make any other requests to external services. For example, on application start, initiate a loop that makes requests to the istio-proxy health endpoint and only continues once a 200 is obtained. The istio-proxy health endpoint is as follows:

    http://localhost:15020/healthz/ready
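
The startup loop described in the last step could look roughly like the following Python sketch. It only illustrates the idea and is not an official client: the function names, the timeout and interval defaults, and the `probe` test hook are assumptions made for this example. Only the health endpoint URL and the "continue once a 200 is obtained" rule come from the step above.

```python
import time
import urllib.error
import urllib.request

# Health endpoint quoted in the step above.
READY_URL = "http://localhost:15020/healthz/ready"


def _http_status(url):
    """Return the HTTP status for url, or 0 if no connection could be made."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # for example 503 while Envoy is still starting
    except (urllib.error.URLError, OSError):
        return 0  # nothing is listening on the port yet


def wait_for_proxy(url=READY_URL, timeout_s=60.0, interval_s=0.5, probe=None):
    """Poll the health endpoint until it returns 200 or the timeout expires.

    probe is a hypothetical hook for tests: a callable that takes the URL
    and returns a status code. By default a real HTTP GET is issued.
    """
    probe = probe or _http_status
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(url) == 200:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

An application would call `wait_for_proxy()` once at startup, before opening connections to Redis or other external endpoints, and decide for itself whether to exit or continue if the function returns `False`.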

Race condition during sidecar injection between Vault and Istio

When using Vault for secrets management, Vault sometimes injects its sidecar before Istio, which causes Pods to get stuck in Init status. When this happens, newly created Pods stay in Init status after you restart any deployment or deploy a new one. For example:

E 2020-03-31T10:41:15.552128897Z post-feature-service post-feature-service-v1-67d56cdd-g7fvb failed to create connection to feature-store redis, err=dial tcp 192.168.9.16:19209: connect: connection refused post-feature-service post-feature-service-v1-67d56cdd-g7fvb

This issue is caused by a race condition. Both Istio and Vault inject a sidecar, and Istio must be the last to do so, because the Istio proxy is not running while init containers execute. The Istio init container sets up iptables rules to redirect all traffic to the proxy. Because the proxy is not running yet, those rules redirect to nothing, which blocks all traffic. This is why the Istio init container must be last, so that the proxy is up and running immediately after the iptables rules are set up. Unfortunately, the injection order is not deterministic, so if Istio is injected first, the Pod fails to initialize.
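
As a rough diagnostic for the ordering problem described above, the following Python sketch inspects a Pod manifest and reports whether `istio-init` is the final init container. The function names and the sample manifest are hypothetical, and `other-init` is a stand-in for whatever init container a second injector adds. The sketch assumes the Pod JSON has the standard `spec.initContainers` list, which Kubernetes runs in the order listed.

```python
import json


def init_container_names(pod_json):
    """Return init container names in the order they are listed in the Pod."""
    pod = json.loads(pod_json)
    return [c["name"] for c in pod.get("spec", {}).get("initContainers", [])]


def istio_init_is_last(pod_json):
    """True only if an istio-init container exists and is the final init container."""
    names = init_container_names(pod_json)
    return bool(names) and names[-1] == "istio-init"


# Hypothetical manifest in which istio-init was injected first.
sample = json.dumps(
    {"spec": {"initContainers": [{"name": "istio-init"}, {"name": "other-init"}]}}
)
print(init_container_names(sample), istio_init_is_last(sample))
```

For a real Pod, the JSON could come from `kubectl get pod POD_NAME -o json`. A Pod without any `istio-init` container also returns `False`, so the result is only meaningful for Pods that have that init container.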

To work around this condition, exclude the Vault IP address from redirection so that traffic going to Vault is not sent to the Envoy proxy, which is not ready yet and therefore blocks the communication. To achieve this, add a new annotation named excludeOutboundIPRanges.

For managed Cloud Service Mesh, this is only possible at the Deployment or Pod level under spec.template.metadata.annotations, for example:

apiVersion: apps/v1
kind: Deployment
...
...
...
spec:
  template:
    metadata:
      annotations:
        traffic.sidecar.istio.io/excludeOutboundIPRanges:

For in-cluster Cloud Service Mesh, there is an option to set it globally with an IstioOperator under spec.values.global.proxy.excludeIPRanges, for example:

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  values:
    global:
      proxy:
        excludeIPRanges: ""

After adding the annotation, restart your workloads.
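
One possible way to apply the Deployment-level annotation without editing the manifest by hand is to generate a merge patch. The Python sketch below builds such a patch. The helper name and the `192.0.2.10/32` address are placeholders invented for this example, so substitute the actual Vault IP range. The annotation value is assumed to be a comma-separated list of CIDR ranges.

```python
import json

ANNOTATION = "traffic.sidecar.istio.io/excludeOutboundIPRanges"


def exclude_ranges_patch(cidrs):
    """Build a merge patch that sets the annotation on the pod template."""
    return {
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {ANNOTATION: ",".join(cidrs)}
                }
            }
        }
    }


# Hypothetical placeholder address; replace it with the real Vault IP range.
patch = exclude_ranges_patch(["192.0.2.10/32"])
print(json.dumps(patch))
```

The printed JSON could then be passed to `kubectl patch deployment DEPLOYMENT_NAME --type merge -p`, where `DEPLOYMENT_NAME` is your own workload.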

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.