Scaling Limits for Cloud Service Mesh on GKE

This document describes the scaling limits of the control plane for managed Cloud Service Mesh architectures on GKE so you can make informed decisions regarding your deployments.

Overview

The scalability of Cloud Service Mesh on GKE depends on the efficient operation of its two main components, the data plane and the control plane. This document focuses on the scaling limits of the control plane. For data plane scalability best practices, refer to Scalability Best Practices.

Some of the scaling limits documented here are enforced by quota restrictions; exceeding them requires a quota increase request. Others are not strictly enforced, but exceeding them can lead to undefined behavior and degraded performance.

To understand how Istio resources are translated to Google Cloud resources, refer to the Understanding API resources guide first.

Note: The limitations documented in this guide apply only to the TRAFFIC_DIRECTOR control plane implementation.

Service scaling limits

Service scaling is limited along two dimensions: per Google Cloud project and per GKE cluster.

Note that once Cloud Service Mesh is enabled for a particular membership (that is, a GKE cluster), all Kubernetes services in the cluster are translated to Cloud Service Mesh services, including those that target workloads without a Cloud Service Mesh sidecar. Cloud Service Mesh creates zonal network endpoint groups (NEGs) for all services in the GKE cluster. If the cluster is regional, network endpoint groups are created for all node pool zones in the region.

Note: At larger scales, configuration changes can take longer to propagate to the data plane. Latency increases with the number of configurations being changed at the same time.

Cloud Service Mesh services versus Kubernetes services

Cloud Service Mesh services are not the same as Kubernetes services: Cloud Service Mesh creates one service per port.

For example, this Kubernetes service is internally translated into two Cloud Service Mesh services, one for each port.

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP
    name: http
  - port: 443
    targetPort: 443
    protocol: TCP
    name: https

Cloud Service Mesh service usage can be tracked by looking at the usage of the BackendService quota in the project.
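As a sketch, this quota can be inspected with the gcloud CLI; the project ID below is a placeholder:

```shell
# my-project is a placeholder project ID.
# The BACKEND_SERVICES quota metric counts backend services in the
# project, which include the services that Cloud Service Mesh generates.
gcloud compute project-info describe --project=my-project \
    --format="table(quotas)" | grep -i backend_services
```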

Multi-cluster deployments

When a single mesh is deployed across workloads in different Google Cloud clusters, all Cloud Service Mesh service resources are replicated across all clusters by default. This means that a Kubernetes service deployed in a single cluster creates several Cloud Service Mesh services, one for each cluster.

Destination rule subsets

When configuring the Istio DestinationRule API with subsets, each subset can result in the generation of multiple new Cloud Service Mesh services.

For example, consider the following DestinationRule that targets the Kubernetes service defined earlier:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: my-service-destinationrule
spec:
  host: my-service
  subsets:
  - name: testversion
    labels:
      version: v3
  - name: prodversion
    labels:
      version: v2

New synthetic services are created for each of the defined subsets. If the original Kubernetes service created two Cloud Service Mesh services, the DestinationRule creates four additional Cloud Service Mesh services, two for each subset, for a total of six Cloud Service Mesh services.
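The arithmetic above generalizes: a service with P ports produces one base Cloud Service Mesh service per port, plus one synthetic service per port for each subset. A minimal sketch of this counting rule (the helper name is illustrative, not part of any API):

```python
def csm_service_count(ports: int, subsets: int) -> int:
    """Total Cloud Service Mesh services generated for one Kubernetes
    service with the given number of ports and DestinationRule subsets."""
    # One base service per port, plus one synthetic service
    # per (port, subset) pair.
    return ports * (1 + subsets)

# The two-port service above with two subsets yields 6 services total.
print(csm_service_count(ports=2, subsets=2))  # 6
```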

Multi-project deployments

When a single mesh is deployed across workloads in different Google Cloud projects, all Cloud Service Mesh service resources are created in the fleet host project. This means they are all subject to the Cloud Service Mesh scalability limitations in the fleet host project.

Kubernetes headless services

Kubernetes headless services have a lower limit than regular services: Cloud Service Mesh supports only 50 headless Cloud Service Mesh services per cluster. See the Kubernetes networking documentation for an example.
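For reference, a headless service is one whose clusterIP is explicitly set to None; a minimal example, with illustrative names:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-headless-service   # illustrative name
spec:
  clusterIP: None             # makes the service headless
  selector:
    app: my-app
  ports:
  - port: 80
    name: http
```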

Istio Sidecar Resources

For the Istio Sidecar API, the following limits apply:

  • Sidecar without workloadSelector: 150 per cluster

  • Sidecar with workloadSelector: 20 per cluster
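For context, a Sidecar resource with a workloadSelector (which counts against the lower limit above) looks like the following; names and labels are illustrative:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: my-sidecar            # illustrative name
  namespace: my-namespace
spec:
  workloadSelector:           # scopes this Sidecar to matching pods
    labels:
      app: my-app
  egress:
  - hosts:
    - "./*"                   # restrict egress to the same namespace
```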

Endpoint scaling limits

Endpoint scaling limits typically apply per:

  • Cloud Service Mesh service

  • GKE cluster

Regular Kubernetes services

Endpoints per NEG quotas limit the maximum number of endpoints that can belong to a single Kubernetes service.

Kubernetes headless services

For Kubernetes headless services, Cloud Service Mesh supports no more than 36 endpoints per headless service. Refer to the Kubernetes networking documentation for an example.

GKE cluster limits

Cloud Service Mesh supports up to 5000 endpoints (Pod IPs) per cluster.

Gateway scaling limit

When using Istio Gateways, especially to terminate HTTPS connections using TLS credentials in Kubernetes secrets, Cloud Service Mesh supports at most the following numbers of pods:

  • 1500 gateway pods when using regional GKE clusters

  • 500 gateway pods when using zonal or Autopilot GKE clusters
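For reference, a gateway of the kind described above terminates TLS through a credentialName that points at a Kubernetes secret; a minimal sketch with illustrative names:

```yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: my-gateway                    # illustrative name
spec:
  selector:
    istio: ingressgateway             # selects the gateway pods
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE
      credentialName: my-tls-secret   # Kubernetes secret with the TLS cert/key
    hosts:
    - "example.com"
```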


Last updated 2026-02-19 UTC.