From edge to multi-cluster mesh: Globally distributed applications exposed through GKE Gateway and Cloud Service Mesh

This reference architecture describes the benefits of exposing applications externally through Google Kubernetes Engine (GKE) Gateways running on multiple GKE clusters within a service mesh. This guide is intended for platform administrators.

You can increase the resiliency and redundancy of your services by deploying applications consistently across multiple GKE clusters, where each cluster becomes an additional failure domain. For example, a service's compute infrastructure with a service level objective (SLO) of 99.9% when deployed in a single GKE cluster achieves an SLO of 99.9999% when deployed across two GKE clusters (1 - (0.001)^2). You can also provide users with an experience where incoming requests are automatically directed to the least latent and available mesh ingress gateway.
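To make the arithmetic explicit, here is a sketch of the availability calculation, assuming the two clusters fail independently:

```latex
A_{\text{single}} = 0.999
\quad\Rightarrow\quad
A_{\text{two clusters}} = 1 - (1 - A_{\text{single}})^2
                        = 1 - (0.001)^2
                        = 0.999999
```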

If you're interested in the benefits of exposing service-mesh-enabled applications that run on a single cluster, see From edge to mesh: Expose service mesh applications through GKE Gateway.

Architecture

The following architecture diagram shows how data flows through cloud ingress and mesh ingress:

Diagram showing TLS encryption from the client, at the load balancer, and in the mesh.

The preceding diagram shows the following data flow scenarios:

  • From the client, terminating at the Google Cloud load balancer using its own Google-managed TLS certificate.
  • From the Google Cloud load balancer to the mesh ingress proxy, using its own self-signed TLS certificate.
  • From the mesh ingress gateway proxy to the workload sidecar proxies, using service mesh-enabled mTLS.

This reference architecture contains the following two ingress layers:

  • Cloud ingress: In this reference architecture, you use the Kubernetes Gateway API (and the GKE Gateway controller) to program the external, multi-cluster HTTP(S) load balancing layer. The load balancer checks the health of the mesh ingress proxies across multiple regions and sends requests to the nearest healthy cluster. It also implements a Google Cloud Armor security policy.
  • Mesh ingress: In the mesh, you perform health checks on the backends directly so that you can run load balancing and traffic management locally.
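As an illustration of the cloud ingress layer, the following sketch shows what a multi-cluster external Gateway resource might look like. The resource name, namespace, and certificate Secret are hypothetical; gke-l7-global-external-managed-mc is one of the multi-cluster GatewayClasses:

```yaml
# Sketch of a multi-cluster external Gateway (names are illustrative).
kind: Gateway
apiVersion: gateway.networking.k8s.io/v1beta1
metadata:
  name: external-http
  namespace: asm-ingress
spec:
  gatewayClassName: gke-l7-global-external-managed-mc  # "-mc" = multi-cluster
  listeners:
  - name: https
    protocol: HTTPS
    port: 443
    tls:
      mode: Terminate
      certificateRefs:
      - name: edge-cert  # Kubernetes Secret holding the TLS certificate (assumed)
```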

When you use the ingress layers together, each layer plays a complementary role. To achieve the following goals, Google Cloud optimizes the most appropriate features from the cloud ingress layer and the mesh ingress layer:

  • Provide low latency.
  • Increase availability.
  • Use the security features of the cloud ingress layer.
  • Use the security, authorization, and observability features of the meshingress layer.

Cloud ingress

When paired with mesh ingress, the cloud ingress layer is best used for edge security and global load balancing. Because the cloud ingress layer is integrated with the following services, it excels at running those services at the edge, outside the mesh:

  • DDoS protection
  • Cloud firewalls
  • Authentication and authorization
  • Encryption

The routing logic is typically straightforward at the cloud ingress layer. However, it can be more complex for multi-cluster and multi-region environments.

Because of the critical function of internet-facing load balancers, the cloud ingress layer is likely managed by a platform team that has exclusive control over how applications are exposed and secured on the internet. This control makes this layer less flexible and dynamic than developer-driven infrastructure. Consider these factors when determining administrative access rights to this layer and how you provide that access.

Mesh ingress

When paired with cloud ingress, the mesh ingress layer provides a point of entry for traffic to enter the service mesh. The layer also provides backend mTLS, authorization policies, and flexible regex matching.
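For the mesh ingress layer, an Istio Gateway resource of roughly the following shape exposes the ingress proxies inside the mesh. The resource names, selector label, and credential Secret are illustrative:

```yaml
# Sketch of a mesh ingress gateway (Istio Gateway resource; names assumed).
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: asm-ingressgateway
  namespace: asm-ingress
spec:
  selector:
    asm: ingressgateway  # label on the mesh ingress proxy Deployment (assumed)
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    hosts:
    - "*"
    tls:
      mode: SIMPLE
      credentialName: edge-to-mesh-cert  # self-signed TLS cert Secret (assumed)
```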

Deploying external application load balancing outside of the mesh, paired with a mesh ingress layer, offers significant advantages, especially for internet traffic management. Although service mesh and Istio ingress gateways provide advanced routing and traffic management in the mesh, some functions are better served at the edge of the network. Taking advantage of internet-edge networking through Google Cloud's external Application Load Balancer might provide significant performance, reliability, or security-related benefits over mesh-based ingress.

Note: This reference architecture refers to the routing for cloud ingress layers as north-south routing. The routing between mesh ingress layers and service layers, or from service layers to other service layers, is referred to as east-west routing.

Products and features used

The following list summarizes all the Google Cloud products and features that this reference architecture uses:

  • GKE: A managed Kubernetes service that you can use to deploy and operate containerized applications at scale using Google's infrastructure. For the purpose of this reference architecture, each of the GKE clusters serving an application must be in the same fleet.
  • Fleets and multi-cluster Gateways: Services that are used to create containerized applications at enterprise scale using Google's infrastructure and GKE.
  • Google Cloud Armor: A service that helps you to protect your applications and websites against denial of service and web attacks.
  • Cloud Service Mesh: A fully managed service mesh based on Envoy and Istio.
  • Application Load Balancer: A proxy-based L7 load balancer that lets you run and scale your services.
  • Certificate Manager: A service that lets you acquire and manage TLS certificates for use with Cloud Load Balancing.

Fleets

To manage multi-cluster deployments, GKE and Google Cloud use fleets to logically group and normalize Kubernetes clusters.

Using one or more fleets can help you uplevel management from individual clusters to entire groups of clusters. To reduce cluster-management friction, use the fleet principle of namespace sameness. For each GKE cluster in a fleet, ensure that you configure all mesh ingress gateways the same way.

Also, consistently deploy application services so that the service balance-reader in the namespace account corresponds to an identical service in each GKE cluster in the fleet. The principles of sameness and trust that are assumed within a fleet are what let you use the full range of fleet-enabled features in GKE and Google Cloud.

East-west routing rules within the service mesh and traffic policies are handled at the mesh ingress layer. The mesh ingress layer is deployed on every GKE cluster in the fleet. Configure each mesh ingress gateway in the same manner, adhering to the fleet's principle of namespace sameness.

Although there's a single configuration cluster for GKE Gateway, you should synchronize your GKE Gateway configurations across all GKE clusters in the fleet.

If you need to nominate a new configuration cluster, use Config Sync. Config Sync helps ensure that all such configurations are synchronized across all GKE clusters in the fleet and helps avoid reconciling with a non-current configuration.
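A Config Sync setup can be sketched with a RootSync resource along the following lines; the repository URL, branch, and directory are placeholders for your own configuration repository:

```yaml
# Sketch of a Config Sync RootSync (repository details are hypothetical).
apiVersion: configsync.gke.io/v1beta1
kind: RootSync
metadata:
  name: root-sync
  namespace: config-management-system
spec:
  sourceFormat: unstructured
  git:
    repo: https://example.com/org/gateway-config.git  # placeholder repo
    branch: main
    dir: gateways   # directory holding the Gateway configurations
    auth: none
```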

Mesh ingress gateway

Istio 0.8 introduced the mesh ingress gateway. The gateway provides a dedicated set of proxies whose ports are exposed to traffic coming from outside the service mesh. These mesh ingress proxies let you control network exposure behavior separately from application routing behavior.

The proxies also let you apply routing and policy to mesh-external traffic before it arrives at an application sidecar. Mesh ingress defines the treatment of traffic when it reaches a node in the mesh, but external components must define how traffic first arrives at the mesh.

To manage external traffic, you need a load balancer that's external to the mesh. To automate deployment, this reference architecture uses Cloud Load Balancing, which is provisioned through GKE Gateway resources.

GKE Gateway and multi-cluster services

There are many ways to provide application access to clients that are outside the cluster. GKE Gateway is an implementation of the Kubernetes Gateway API. GKE Gateway evolves and improves the Ingress resource.

As you deploy GKE Gateway resources to your GKE cluster, the Gateway controller watches the Gateway API resources. The controller reconciles Cloud Load Balancing resources to implement the networking behavior that's specified by the Gateway resources.

When using GKE Gateway, the type of load balancer you use to expose applications to clients depends largely on the following factors:

  • Whether the backend services are in a single GKE cluster or distributed across multiple GKE clusters (in the same fleet).
  • The status of the clients (external or internal).
  • The required capabilities of the load balancer, including the capability to integrate with Google Cloud Armor security policies.
  • The spanning requirements of the service mesh. Service meshes can span multiple GKE clusters or can be contained in a single cluster.

In Gateway, this behavior is controlled by specifying the appropriate GatewayClass. Gateway classes that can be used in multi-cluster scenarios have a class name ending in -mc.

This reference architecture discusses how to expose application services externally through an external Application Load Balancer. However, when using Gateway, you can also create a multi-cluster regional internal Application Load Balancer.

To deploy application services in multi-cluster scenarios, you can define the Google Cloud load balancer components in the following two ways:

  • Multi-cluster Ingress
  • Multi-cluster Gateway

For more information about these two approaches to deploying application services, see Choose your multi-cluster load balancing API for GKE.

Note: Both multi-cluster Ingress and multi-cluster GKE Gateway use multi-cluster services. Multi-cluster services can be deployed across GKE clusters in a fleet, and are identical for the purposes of cross-cluster service discovery. However, the manner in which these services are exposed across clusters within the fleet differs.

Multi-cluster Ingress relies on creating MultiClusterService resources. Multi-cluster Gateway relies on creating ServiceExport resources and referring to ServiceImport resources.
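As a sketch of the multi-cluster Gateway approach, a workload is exported from each cluster that runs it, and an HTTPRoute in the config cluster then references the fleet-wide ServiceImport. The service name, namespace, parent Gateway name, and port here are hypothetical:

```yaml
# Exported in every cluster that runs the workload (names are illustrative).
apiVersion: net.gke.io/v1
kind: ServiceExport
metadata:
  name: frontend       # must match the Service name
  namespace: frontend
---
# Route in the config cluster, targeting the fleet-wide ServiceImport.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: frontend-route
  namespace: frontend
spec:
  parentRefs:
  - name: external-http      # the multi-cluster Gateway (assumed name)
    namespace: asm-ingress
  rules:
  - backendRefs:
    - group: net.gke.io
      kind: ServiceImport    # multi-cluster backend, not a plain Service
      name: frontend
      port: 8080
```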

When you use a multi-cluster Gateway, you can enable the additional capabilities of the underlying Google Cloud load balancer by creating Policies. The deployment guide associated with this reference architecture shows how to configure a Google Cloud Armor security policy to help protect backend services from cross-site scripting.

These policy resources target the backend services in the fleet that are exposed across multiple clusters. In multi-cluster scenarios, all such policies must reference the ServiceImport resource and API group.
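For example, a Google Cloud Armor policy can be attached through a GCPBackendPolicy that targets the ServiceImport. The policy, service, and namespace names below are assumptions:

```yaml
# Sketch: attach an existing Cloud Armor policy to a multi-cluster backend.
apiVersion: networking.gke.io/v1
kind: GCPBackendPolicy
metadata:
  name: frontend-security-policy
  namespace: frontend
spec:
  default:
    securityPolicy: edge-fw-policy  # existing Cloud Armor policy (assumed name)
  targetRef:
    group: net.gke.io
    kind: ServiceImport             # must target the ServiceImport in multi-cluster scenarios
    name: frontend
```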

Health checking

One complexity of using two layers of L7 load balancing is health checking. You must configure each load balancer to check the health of the next layer. The GKE Gateway checks the health of the mesh ingress proxies, and the mesh, in turn, checks the health of the application backends.

  • Cloud ingress: In this reference architecture, you configure the Google Cloud load balancer through GKE Gateway to check the health of the mesh ingress proxies on their exposed health check ports. If a mesh proxy is down, or if the cluster, mesh, or region is unavailable, the Google Cloud load balancer detects this condition and doesn't send traffic to the mesh proxy. In this case, traffic is routed to an alternate mesh proxy in a different GKE cluster or region.
  • Mesh ingress: In the mesh application, you perform health checks on the backends directly so that you can run load balancing and traffic management locally.
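The cloud-ingress health checks can be configured with a HealthCheckPolicy along the following lines. Port 15021 and the /healthz/ready path are the Istio proxy's readiness endpoint; the resource and service names are illustrative:

```yaml
# Sketch: health-check the mesh ingress proxies from the cloud ingress layer.
apiVersion: networking.gke.io/v1
kind: HealthCheckPolicy
metadata:
  name: ingress-gateway-healthcheck
  namespace: asm-ingress
spec:
  default:
    config:
      type: HTTP
      httpHealthCheck:
        port: 15021                  # Istio/Envoy proxy health port
        requestPath: /healthz/ready  # proxy readiness endpoint
  targetRef:
    group: net.gke.io
    kind: ServiceImport
    name: asm-ingressgateway         # mesh ingress gateway service (assumed name)
```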

Design considerations

This section provides guidance to help you use this reference architecture to develop an architecture that meets your specific requirements for security and compliance, reliability, and cost.

Security, privacy, and compliance

The architecture diagram in this document contains several security elements. The most critical elements are how you configure encryption and deploy certificates. GKE Gateway integrates with Certificate Manager for these security purposes.

Internet clients authenticate against public certificates and connect to the external load balancer as the first hop in the Virtual Private Cloud (VPC). You can refer to a Certificate Manager CertificateMap in your Gateway definition. The next hop is between the Google Front End (GFE) and the mesh ingress proxy. That hop is encrypted by default.
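A Certificate Manager certificate map can be referenced from the Gateway with an annotation, roughly as follows; the map name, resource names, and namespace are hypothetical:

```yaml
# Sketch: reference a Certificate Manager map from the Gateway (names assumed).
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: external-http
  namespace: asm-ingress
  annotations:
    networking.gke.io/certmap: edge-cert-map  # Certificate Manager map (assumed)
spec:
  gatewayClassName: gke-l7-global-external-managed-mc
  listeners:
  - name: https
    protocol: HTTPS
    port: 443  # with the certmap annotation, no certificateRefs on the listener
```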

Network-level encryption between the GFEs and their backends is applied automatically. If your security requirements dictate that the platform owner retain ownership of the encryption keys, you can enable HTTP/2 with TLS encryption between the cluster gateway (the GFE) and the mesh ingress (the Envoy proxy instance).

When you enable HTTP/2 with TLS encryption between the cluster gateway and the mesh ingress, you can use a self-signed or a public certificate to encrypt traffic, because the GFE doesn't authenticate against it. This additional layer of encryption is demonstrated in the deployment guide associated with this reference architecture.

To help prevent the mishandling of certificates, don't reuse public certificates. Use separate certificates for each load balancer in the service mesh.

To help create external DNS entries and TLS certificates, the deployment guide for this reference architecture uses Cloud Endpoints. Using Cloud Endpoints lets you create an externally available cloud.goog subdomain. In enterprise-level scenarios, use a more appropriate domain name, and create an A record that points to the global Application Load Balancer IP address in your DNS service provider.
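The Cloud Endpoints DNS entry is created from a small OpenAPI document of roughly this shape, where PROJECT_ID and GCLB_IP_ADDRESS are placeholders for your project ID and the global load balancer IP address:

```yaml
# Sketch of a Cloud Endpoints spec that creates a cloud.goog DNS record.
swagger: "2.0"
info:
  description: "Cloud Endpoints DNS"
  title: "Cloud Endpoints DNS"
  version: "1.0.0"
paths: {}
host: "frontend.endpoints.PROJECT_ID.cloud.goog"    # PROJECT_ID is a placeholder
x-google-endpoints:
- name: "frontend.endpoints.PROJECT_ID.cloud.goog"
  target: "GCLB_IP_ADDRESS"                         # global load balancer IP (placeholder)
```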

If the service mesh you're using mandates TLS, then all traffic between sidecar proxies and all traffic to the mesh ingress is encrypted. The architecture diagram shows HTTPS encryption from the client to the Google Cloud load balancer, from the load balancer to the mesh ingress proxy, and from the ingress proxy to the sidecar proxy.

Reliability and resiliency

A key advantage of the multi-cluster, multi-regional edge-to-mesh pattern is that it can use all of the features of service mesh for east-west load balancing, such as traffic between application services.

This reference architecture uses a multi-cluster GKE Gateway to route incoming cloud-ingress traffic to a GKE cluster. The system selects a GKE cluster based on its proximity to the user (based on latency) and on its availability and health. When traffic reaches the Istio ingress gateway (the mesh ingress), it's routed to the appropriate backends through the service mesh.

An alternative approach for handling the east-west traffic is through multi-cluster services for all application services deployed across GKE clusters. When using multi-cluster services across GKE clusters in a fleet, service endpoints are collected together in a ClusterSet. If a service needs to call another service, then it can target any healthy endpoint for the second service. Because endpoints are chosen on a rotating basis, the selected endpoint could be in a different zone or a different region.

A key advantage of using service mesh for east-west traffic rather than using multi-cluster services is that service mesh can use locality load balancing. Locality load balancing isn't a feature of multi-cluster services, but you can configure it through a DestinationRule.

Once configured, a call from one service to another first tries to reach a service endpoint in the same zone, then tries one in the same region as the calling service. Finally, the call targets an endpoint in another region only if a service endpoint in the same zone or same region is unavailable.
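A locality-aware DestinationRule might look like the following sketch. Note that Istio requires an outlierDetection section for locality load balancing to take effect; the host, names, and thresholds here are illustrative:

```yaml
# Sketch: enable locality load balancing for a service (names assumed).
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: frontend-locality
  namespace: frontend
spec:
  host: frontend.frontend.svc.cluster.local
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        enabled: true        # prefer same-zone, then same-region endpoints
    outlierDetection:        # required for locality load balancing to activate
      consecutive5xxErrors: 1
      interval: 1s
      baseEjectionTime: 1m
```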

Deployment

To deploy this architecture, see From edge to multi-cluster mesh: Deploy globally distributed applications through GKE Gateway and Cloud Service Mesh.



Last updated 2024-06-30 UTC.