Explore GKE networking documentation and use cases
Networking in Google Kubernetes Engine (GKE) covers a broad set of concepts, including Pods, services, DNS, load balancing, security, and IP address management. Although the documentation explains each feature in detail, it can be difficult to know where to start when facing a real-world problem.
This document helps you navigate the GKE networking documentation by linking common challenges to the features and sections that solve them. Each use case presents a scenario, identifies the challenge, and points you to the relevant documentation. This document is for cloud architects, developers, and operations teams who must understand and solve common networking challenges in GKE.
If you're already familiar with common networking challenges and prefer to delve straight into the technical details, explore the following resources to build your foundational knowledge of GKE networking:
- Learn GKE networking fundamentals.
- Learn GKE networking architecture.
- Glossary of GKE networking terms (for a quick refresher on any unfamiliar terms).
Use case: Design the network foundation for GKE
In this use case, you're a cloud architect who needs to design a scalable, secure, and reliable network foundation for a new GKE platform.
Challenge: Prevent IP address exhaustion
Scenario: your application's complexity and usage are expected to grow, so you need to design a network that can scale to handle the increased traffic and support Pod, service, and node growth. You also need to plan your IP address allocation to avoid exhaustion.
Solution: plan your IP addressing scheme to account for the number of nodes, Pods, and Services you'll need. This plan includes choosing appropriate IP address ranges for each, considering Pod density, and avoiding overlaps with other networks. For more information, see Manage IP address migration in GKE.
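As a rough illustration of this kind of planning, the following Python sketch uses the standard ipaddress module to check that candidate ranges are large enough and don't overlap each other or an existing network. All CIDR ranges and growth targets are hypothetical placeholders, and the check ignores GKE-specific reservations, so treat it as a starting point rather than a sizing tool.

```python
import ipaddress

# Hypothetical candidate ranges; substitute your own plan.
candidate_ranges = {
    "nodes": ipaddress.ip_network("10.0.0.0/22"),
    "pods": ipaddress.ip_network("10.4.0.0/14"),
    "services": ipaddress.ip_network("10.8.0.0/20"),
}

# Hypothetical ranges already used by peered or on-premises networks.
existing_networks = [ipaddress.ip_network("10.8.0.0/16")]

# Hypothetical growth targets: how many addresses each range must hold.
required_addresses = {"nodes": 500, "pods": 50_000, "services": 2_000}


def find_overlaps(ranges, others):
    """Return pairs of names whose ranges overlap each other or an existing network."""
    conflicts = []
    names = sorted(ranges)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if ranges[first].overlaps(ranges[second]):
                conflicts.append((first, second))
        for other in others:
            if ranges[first].overlaps(other):
                conflicts.append((first, str(other)))
    return conflicts


def find_undersized(ranges, required):
    """Return the names of ranges with fewer addresses than the target."""
    return [name for name in sorted(ranges)
            if ranges[name].num_addresses < required[name]]


print("Overlaps:", find_overlaps(candidate_ranges, existing_networks))
print("Undersized:", find_undersized(candidate_ranges, required_addresses))
```

In this made-up plan, the Services range collides with an existing network, which is the kind of conflict that is much cheaper to catch before the cluster is created.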
Challenge: Enforce defense-in-depth security
Scenario: you need to secure your cluster perimeters and enforce zero-trust, Pod-to-Pod rules.
Solution: use firewall policies for cluster perimeters and network policies for Pod-to-Pod rules. For more information, see Control communication between Pods and Services using network policies.
Challenge: Route traffic to different types of applications
Scenario: you need to make sure that other services and users can reach different types of applications, such as private backends and public HTTP(S) applications.
Solution: use internal load balancers for private backends. For public HTTP(S) applications, use Ingress or Gateway API. For more information, see About load balancing in GKE.
Challenge: Use observability tools to monitor and troubleshoot workload issues
Scenario: you must fix problems with network traffic, and need to understand and monitor GKE traffic flows to diagnose issues effectively.
Solution: implement observability tools to monitor and troubleshoot network traffic. For more information, see Observe your traffic using GKE Dataplane V2 observability.
Use case: Expose a new microservice
In this use case, you're a developer deploying a new microservice in GKE. You need to make the microservice accessible to other services in the cluster, and later, to external clients.
Challenge: Provide a stable endpoint for Pod-to-Pod communication
Scenario: your application needs Pods to communicate with other Pods, but the dynamic IP addresses used by Pods make this communication unreliable.
Solution: create a Kubernetes service. A ClusterIP service provides a stable virtual IP address and DNS name, load-balanced across Pods. For more information, see Understand Kubernetes services.
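To make the shape of such a Service concrete, here is a minimal sketch that builds a ClusterIP Service manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The Service name, label, and port numbers are hypothetical and stand in for your own workload.

```python
import json

# Hypothetical names and ports, chosen only for illustration.
service_manifest = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "inventory"},
    "spec": {
        "type": "ClusterIP",
        # Traffic is load-balanced across Pods that carry this label.
        "selector": {"app": "inventory"},
        # Clients connect to port 80; Pods listen on port 8080.
        "ports": [{"protocol": "TCP", "port": 80, "targetPort": 8080}],
    },
}

# Print the manifest so it can be piped to a tool such as kubectl.
print(json.dumps(service_manifest, indent=2))
```

Other Pods can then reach the workload through the Service name instead of tracking individual Pod IP addresses.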
Challenge: Expose the service for external access
Scenario: the microservice must be reachable from the internet for a demo.
Solution: create a LoadBalancer service. GKE provisions a regional external passthrough Network Load Balancer with a public IP address. For HTTP(S) traffic, consider using Ingress or Gateway, which provide Layer 7 features. For more information, see About LoadBalancer Services.
Challenge: Assign a permanent, user-friendly URL
Scenario: the service needs a stable domain name for clients.
Solution: reserve a static IP address and configure DNS for a custom domain. For more information, see Configure domain names with static IP addresses.
Challenge: Manage advanced traffic routing
Scenario: as your application grows, you need more sophisticated control over how traffic is routed. For example, you might need to do the following:
- Host multiple websites (like api.example.com and shop.example.com) on a single load balancer to conserve costs.
- Route requests to different services based on the URL path (for example, sending / to the frontend workload and /api/v1 to the backend workload).
- Secure your application with HTTPS by managing TLS certificates.
- Safely deploy new features in stages by using canary releases, where you send a small portion of traffic to a new version before a full rollout.
Solution: use Gateway API. GKE's implementation of Gateway API provides a powerful and standardized way to manage this kind of north-south traffic, supporting advanced features like path-based routing, header matching, and traffic splitting. For more information, see About Gateway API.
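The following sketch shows how path-based routing and a canary traffic split might be expressed in a Gateway API HTTPRoute, built here as a Python dictionary and printed as JSON. The route, Gateway, and Service names, the port, and the 10% canary weight are all hypothetical, and the apiVersion you need may differ depending on the Gateway API version available in your cluster.

```python
import json


def build_http_route(name, gateway, hostname, stable, canary, canary_percent, port=8080):
    """Build an HTTPRoute that splits /api/v1 traffic between two Services."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return {
        "apiVersion": "gateway.networking.k8s.io/v1",
        "kind": "HTTPRoute",
        "metadata": {"name": name},
        "spec": {
            # The Gateway (load balancer) that this route attaches to.
            "parentRefs": [{"name": gateway}],
            "hostnames": [hostname],
            "rules": [
                {
                    # Path-based routing: match requests under /api/v1.
                    "matches": [{"path": {"type": "PathPrefix", "value": "/api/v1"}}],
                    # Traffic splitting: weights are relative proportions.
                    "backendRefs": [
                        {"name": stable, "port": port, "weight": 100 - canary_percent},
                        {"name": canary, "port": port, "weight": canary_percent},
                    ],
                }
            ],
        },
    }


# Hypothetical names: send 10% of API traffic to the new version.
route = build_http_route(
    "api-route", "external-http", "api.example.com",
    "backend-v1", "backend-v2", canary_percent=10,
)
print(json.dumps(route, indent=2))
```

Raising the canary weight in small steps and reapplying the route is one way to stage a rollout without changing the load balancer itself.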
Use case: Scale service discovery for a growing application
As your microservice-based application grows in traffic and complexity, DNS queries between services increase significantly. Although developers need to understand how to build resilient applications in this environment, platform and operations teams are often responsible for implementing scalable networking solutions.
Challenge: Enable service-to-service communication
Scenario: Pods need a reliable way to locate other services.
Solution: GKE provides an in-cluster DNS service (such as kube-dns or Cloud DNS) that resolves stable DNS names for Services, enabling reliable Pod-to-Pod communication. For more information, see Service discovery and DNS.
Challenge: Improve DNS performance at scale
Scenario: high query volume causes lookup delays.
Solution: enable NodeLocal DNSCache. Each node caches DNS queries locally, reducing latency. For more information, see Set up NodeLocal DNSCache overview.
Challenge: Provide service discovery across the VPC
Scenario: Compute Engine VMs need to access services inside the cluster.
Solution: integrate with Cloud DNS so service DNS records resolve across the VPC. For more information, see Use Cloud DNS for GKE.
Use case: Secure a multi-tier application
In this use case, you're on a platform engineering team that's deploying a three-tier application (frontend, billing, database), and you must enforce zero-trust communication.
Challenge: Enforce strict traffic rules
Scenario: only specific services should communicate with each other.
Solution: enable network policy enforcement and apply default deny policies, then define explicit allow rules (for example, frontend allows traffic to billing, billing allows traffic to database). For more information, see Configure network policies for applications.
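A default-deny baseline with explicit allow rules could look roughly like the following sketch, which builds Kubernetes NetworkPolicy manifests as Python dictionaries and prints them as a JSON list. The app labels and port numbers are hypothetical and assume each tier's Pods carry an app label that matches the tier name.

```python
import json

# An empty podSelector selects every Pod in the namespace; with no
# ingress rules listed, all inbound traffic to those Pods is denied.
DEFAULT_DENY_INGRESS = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "default-deny-ingress"},
    "spec": {"podSelector": {}, "policyTypes": ["Ingress"]},
}


def allow_ingress(target_app, source_app, port):
    """Allow TCP traffic on `port` from source_app Pods to target_app Pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"allow-{source_app}-to-{target_app}"},
        "spec": {
            "podSelector": {"matchLabels": {"app": target_app}},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [{"podSelector": {"matchLabels": {"app": source_app}}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


# Hypothetical labels and ports for the three tiers.
policies = [
    DEFAULT_DENY_INGRESS,
    allow_ingress("billing", "frontend", 8080),
    allow_ingress("database", "billing", 5432),
]
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": policies}, indent=2))
```

Because no policy allows frontend Pods to reach the database directly, that path stays blocked by the default-deny policy.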
Challenge: Audit and verify network policies
Scenario: security requires proof of enforcement and visibility.
Solution: enable network policy logging to record allowed and denied connections. For more information, see Use network policy logging.
Challenge: Expose a service privately to consumers
Scenario: a backend service, like a database or API, needs to be accessible to consumers in other VPC networks without exposing it to the public internet or dealing with VPC peering complexities.
Solution: use Private Service Connect (PSC) to publish the service. Consumers can then create a PSC endpoint in their own VPC to access your service privately and securely. For more information, see Expose services with Private Service Connect.
Use case: Achieve high availability across multiple clusters
In this use case, you're an SRE running workloads for an ecommerce company in multiple GKE clusters across different regions to improve reliability.
Challenge: Enable cross-cluster communication
Scenario: services in one cluster must discover and call services in another.
Solution: use GKE multi-cluster Services (MCS) to create a global DNS name and route traffic automatically to healthy backends. For more information, see Multi-cluster Services.
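As a small sketch of what exporting a Service involves, the following Python code builds a ServiceExport manifest and the DNS name that clients in other clusters would use. The Service name and namespace are hypothetical, and the net.gke.io/v1 API version and the svc.clusterset.local suffix reflect my understanding of the GKE MCS documentation, so verify both against the linked page before relying on them.

```python
import json


def build_service_export(name, namespace):
    """Build a ServiceExport for an existing Service with the same name and namespace."""
    return {
        "apiVersion": "net.gke.io/v1",  # assumed API version; check the MCS docs
        "kind": "ServiceExport",
        "metadata": {"namespace": namespace, "name": name},
    }


def clusterset_dns_name(name, namespace):
    """Return the cross-cluster DNS name for an exported Service."""
    return f"{name}.{namespace}.svc.clusterset.local"


# Hypothetical Service and namespace.
export = build_service_export("checkout", "store")
print(json.dumps(export, indent=2))
print(clusterset_dns_name("checkout", "store"))
```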
Challenge: Ensure resilient failover
Scenario: if one regional service becomes unavailable, traffic must reroute automatically.
Solution: MCS provides health-aware service discovery, allowing clients to resolve a single DNS name to a healthy backend in the nearest available cluster. This approach enables resilient failover. For more information, see Multi-cluster Services.
Use case: Build a secure and efficient multi-tenant GKE environment
As part of a platform engineering team, you provide GKE clusters to multiple application teams. You need to centralize network control, conserve IP addresses, and enforce strict security.
Challenge: Centralize network control
Scenario: multiple app teams need their own clusters, but networking must be centrally managed.
Solution: use Shared VPC. Networking resources reside in a host project, but app clusters run in service projects. For more information, see Configure clusters with Shared VPC.
Challenge: Efficiently manage limited IP addresses
Scenario: IP address space is limited and needs to be used efficiently.
Solution: adjust maximum Pods per node and, if required, use non-RFC 1918 ranges for Pod IP addresses. For more information, see Manage IP address migration in GKE.
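To see why the maximum Pods per node setting matters, the sketch below estimates how many nodes fit in a given Pod range. It assumes that each node reserves a Pod range with at least twice as many addresses as its maximum Pods per node, rounded up to a power of two; confirm that rule and the supported limits in the linked documentation. The /21 Pod range is a hypothetical example.

```python
import ipaddress


def per_node_prefix(max_pods_per_node):
    """Prefix length of the smallest range holding twice max_pods_per_node addresses."""
    addresses = 2 * max_pods_per_node
    # Number of host bits needed to cover `addresses` (rounds up to a power of two).
    host_bits = (addresses - 1).bit_length()
    return 32 - host_bits


def max_nodes(pod_range, max_pods_per_node):
    """Number of per-node ranges that fit inside pod_range."""
    node_prefix = per_node_prefix(max_pods_per_node)
    if node_prefix < pod_range.prefixlen:
        return 0  # A single node's range is larger than the whole Pod range.
    return 2 ** (node_prefix - pod_range.prefixlen)


# Hypothetical Pod range with 2,048 addresses.
pod_range = ipaddress.ip_network("10.4.0.0/21")
for max_pods in (110, 64, 32):
    print(f"max Pods per node {max_pods}: /{per_node_prefix(max_pods)} per node, "
          f"up to {max_nodes(pod_range, max_pods)} nodes")
```

Under these assumptions, lowering the maximum from 110 to 32 Pods per node lets the same range serve four times as many nodes.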
Challenge: Use a modern, secure dataplane, and provision clusters with the new dataplane
Scenarios:
- The enterprise requires high performance and built-in policy enforcement to support demanding workloads and a zero-trust security posture. For example, you might be running large-scale microservices that are sensitive to network latency, or you might need to enforce strict security boundaries between applications in a multi-tenant cluster to meet regulatory compliance requirements.
- Clusters must be configured to use a modern networking dataplane for high performance and security, and they must be deployed within the organization's centrally managed network structure.
Solution: use GKE Dataplane V2, which is eBPF-based and provides high performance and built-in network policy enforcement. For more information, see GKE Dataplane V2.
Use case: Observe and troubleshoot traffic
As an SRE, you're investigating why a checkout service can't connect to a payment service.
Challenge: Resolve connectivity issues
Scenario: packets are dropped, but the cause is unclear.
Solution: enable GKE Dataplane V2 observability. Metrics like hubble_drop_total confirm that packets are being dropped. For more information, see Troubleshoot with Hubble.
Challenge: Pinpoint root cause of dropped packets
Scenario: after confirming that network packets are being dropped (for example, by using hubble_drop_total), you need to identify which specific network policy is blocking traffic between services.
Solution: use the Hubble command-line interface or UI to trace flows. The Hubble UI provides a visual representation of the traffic, highlighting the exact misconfigured policy that is denying the connection. This visualization allows the team to quickly pinpoint the root cause of the issue and correct the policy. For more information, see Observe your traffic using GKE Dataplane V2 observability.
End-to-end use case: Deploy and scale a secure retail application
In this end-to-end scenario, a platform engineering team builds a standardized GKE platform for multiple application teams. The team deploys and optimizes a three-tier retail application (frontend, billing, database). This process includes securing, scaling, enhancing performance for machine learning workloads, and integrating advanced security appliances.
The following diagram illustrates the end-to-end architecture of a secure, multi-tier retail application deployed on GKE. The architecture evolves through several phases:
- Phase 1: build a foundational setup by using Shared VPC and GKE Dataplane V2.
- Phase 2: expose the application by using Gateway API and multi-cluster services for high availability.
- Phase 3: accelerate ML tasks by using gVNIC and Tier 1 networking.
- Phase 4: deploy advanced security appliances by using multi-network support.
Phase 1: Build the platform foundation
Challenge: centralize networking for multiple application teams and allocate sufficient IP addresses to handle scaling.
Solution:
- Use Shared VPC for centralized control.
- Plan IP addressing to ensure scalability.
- Enable GKE Dataplane V2 for a high-performance and secure data plane.
- Use Private Service Connect to securely connect to the GKE control plane.
Phase 2: Deploy and secure the application
Challenge: ensure reliable service-to-service communication and enforce zero-trust security.
Solution:
- Create ClusterIP services for stable internal endpoints.
- Apply network policies with a default-deny baseline and explicit allow rules.
Phase 3: Expose the application and scale for growth
Challenge: provide external access and reduce DNS lookup latency as traffic increases.
Solution:
- Expose the frontend with Gateway API for advanced traffic management.
- Assign a static IP address with DNS.
- Enable NodeLocal DNSCache for faster lookups.
Phase 4: Achieve high availability and troubleshoot issues
Challenge: ensure regional failover and debug dropped traffic.
Solution:
- Use multi-cluster services for cross-region failover.
- Enable GKE Dataplane V2 observability with Hubble to diagnose and fix misconfigured network policies.
Phase 5: Accelerate machine learning workloads
Challenge: eliminate network bottlenecks for GPU-based model training.
Solution:
- Enable gVNIC for higher bandwidth.
- Configure Tier 1 networking on critical nodes for maximum throughput.
Phase 6: Deploy advanced security appliances
Challenge: deploy a third-party firewall and IDS with separate management and data plane traffic at ultra-low latency.
Solution:
- Enable multi-network support to attach multiple interfaces to Pods.
- Configure device-mode networking (DPDK).
Last updated 2025-11-26 UTC.