External Application Load Balancer performance best practices

Cloud Load Balancing provides mechanisms to distribute user traffic to multiple instances of an application. They do this by spreading the load across application instances and delivering optimal application performance to end users. This page describes some best practices to ensure that the load balancer is optimized for your application. To ensure optimal performance, we recommend benchmarking your application's traffic patterns.

Place backends close to clients

The closer your users or client applications are to your workloads (load balancer backends), the lower the network latency between them. Therefore, create your load balancer backends in the region closest to where you anticipate your users' traffic to arrive at the Google frontend. In many cases, running your backends in multiple regions is necessary to minimize latency to clients in different parts of the world.
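
As a rough illustration of choosing a region from measurements, the following sketch picks the region with the lowest median round-trip time. The function name `closest_region` and the sample latencies are hypothetical; in practice you would collect samples from where your users actually are.

```python
from statistics import median


def closest_region(rtt_samples_ms):
    """Return the region whose median round-trip time (ms) is lowest.

    rtt_samples_ms maps a region name to a list of measured RTTs.
    Regions without samples are ignored.
    """
    medians = {
        region: median(samples)
        for region, samples in rtt_samples_ms.items()
        if samples
    }
    if not medians:
        raise ValueError("no latency samples provided")
    return min(medians, key=medians.get)


# Hypothetical measurements taken from one client location.
samples = {
    "us-central1": [42.0, 45.5, 41.2],
    "europe-west1": [118.3, 121.0, 119.4],
    "asia-east1": [190.1, 187.6, 192.3],
}
print(closest_region(samples))  # us-central1
```

Repeating the measurement from several client locations shows whether a single region is enough or whether backends in multiple regions are needed.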


Enable caching with Cloud CDN

Turn on Cloud CDN and caching as part of your default, global external Application Load Balancer configuration. For more information, see Cloud CDN.

When you enable Cloud CDN, it might take a few minutes before responses begin to be cached. Cloud CDN caches only responses with cacheable content. If responses for a URL aren't being cached, check which response headers are being returned for that URL, and how cacheability is configured for your backend. For more details, see Cloud CDN troubleshooting.
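
When checking response headers, a small pre-check can flag the most common reasons a shared cache skips a response. The sketch below is a simplified heuristic and does not reproduce Cloud CDN's full cacheability rules, which also depend on the cache mode configured on the backend. The function name `cache_blockers` is hypothetical, and treating `Set-Cookie`, `private`, and `no-store` as blockers is an assumption to verify against the Cloud CDN documentation.

```python
def cache_blockers(headers):
    """Return reasons a shared cache is likely to skip this response.

    A rough pre-check only; an empty list does not guarantee caching.
    """
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    # Collect Cache-Control directive names, ignoring any "=value" part.
    directives = {
        part.split("=", 1)[0].strip().lower()
        for part in normalized.get("cache-control", "").split(",")
        if part.strip()
    }
    reasons = []
    if "set-cookie" in normalized:
        reasons.append("response sets a cookie")
    for directive in ("private", "no-store"):
        if directive in directives:
            reasons.append(f"Cache-Control includes {directive}")
    return reasons


print(cache_blockers({"Cache-Control": "public, max-age=3600"}))
print(cache_blockers({"Cache-Control": "private", "Set-Cookie": "id=1"}))
```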

Forwarding rule protocol selection

  • For the global external Application Load Balancer and the classic Application Load Balancer, we recommend HTTP/3, which is an internet protocol built on top of IETF QUIC. HTTP/3 is enabled by default in all major browsers, Android Cronet, and iOS. To use HTTP/3 for your applications, ensure that UDP traffic is not blocked or rate-limited on your network and that HTTP/3 was not previously disabled on your global external Application Load Balancers. Clients that don't yet support HTTP/3, such as older browsers or networking libraries, won't be impacted. For more information, see HTTP/3 QUIC.

  • For the regional external Application Load Balancer, we support HTTP/1.1, HTTPS, and HTTP/2. Both HTTPS and HTTP/2 require some upfront overhead to set up TLS.
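
One way to confirm that HTTP/3 is still being offered is to inspect the `Alt-Svc` response header (RFC 7838), which servers use to advertise alternative protocols such as `h3`. The sketch below parses a header value you have already captured. The function name `advertises_http3` is hypothetical, and the parser assumes no commas appear inside quoted values.

```python
def advertises_http3(alt_svc):
    """Return True if an Alt-Svc header value lists an h3 alternative."""
    if alt_svc.strip().lower() == "clear":
        # "clear" tells clients to forget previously advertised alternatives.
        return False
    for entry in alt_svc.split(","):
        # Each entry looks like: h3=":443"; ma=2592000
        protocol_id = entry.split(";", 1)[0].split("=", 1)[0].strip()
        # Also accept draft identifiers such as h3-29.
        if protocol_id == "h3" or protocol_id.startswith("h3-"):
            return True
    return False


print(advertises_http3('h3=":443"; ma=2592000,h3-29=":443"; ma=2592000'))  # True
print(advertises_http3('h2=":443"; ma=3600'))  # False
```

If the header is present but clients still fall back to HTTP/2, blocked or rate-limited UDP on the network path is a likely cause.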

Backend service protocol selection

Your choice of backend protocol (HTTP, HTTPS, or HTTP/2) impacts application latency and the network bandwidth available for your application. For example, using HTTP/2 between the load balancer and the backend instance can require significantly more TCP connections to the instance than HTTP(S). Connection pooling, an optimization that reduces the number of these connections with HTTP(S), is not available with HTTP/2. As a result, you might see high backend latencies because backend connections are made more frequently.

The backend service protocol also impacts how the traffic is encrypted in transit. With external Application Load Balancers, all traffic going to backends that reside within Google Cloud VPC networks is automatically encrypted. This is called automatic network-level encryption. However, automatic network-level encryption is only available for communications with instance groups and zonal NEG backends. For all other backend types, we recommend you use secure protocol options such as HTTPS and HTTP/2 to encrypt communication with the backend service. For details, see Encryption from the load balancer to the backends.

Recommended connection duration

Network conditions change and the set of backends might change based on load. For applications that generate a lot of traffic to a single service, a long-running connection isn't always an optimal setup. Instead of using a single connection to the backend indefinitely, we recommend that you choose a maximum connection lifetime (for example, between 10 and 20 minutes), a maximum number of requests (for example, between 1000 and 2000 requests), or both, after which a new connection is used for new requests. The old connection is closed when all active requests using it are done.

This lets the client application benefit from changes in the set of backends, which include the load balancer's proxies and any network reoptimization that's required to serve the clients.
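
A client can enforce these limits with a small amount of bookkeeping per connection. The following is a minimal sketch, not tied to any particular HTTP library. The class name `ConnectionBudget` and its defaults (15 minutes, 1500 requests) are hypothetical values chosen from within the ranges suggested above.

```python
import time


class ConnectionBudget:
    """Tracks the age and request count of one connection."""

    def __init__(self, max_age_s=15 * 60, max_requests=1500,
                 clock=time.monotonic):
        self._clock = clock  # injectable so the logic can be tested
        self.max_age_s = max_age_s
        self.max_requests = max_requests
        self.opened_at = clock()
        self.requests = 0

    def record_request(self):
        """Call once for every request sent on the connection."""
        self.requests += 1

    def should_rotate(self):
        """True once either limit is reached.

        New requests should then go to a fresh connection; the old one
        is closed after its in-flight requests finish.
        """
        age_s = self._clock() - self.opened_at
        return age_s >= self.max_age_s or self.requests >= self.max_requests


budget = ConnectionBudget(max_requests=2)
budget.record_request()
print(budget.should_rotate())  # False
budget.record_request()
print(budget.should_rotate())  # True
```

Using a monotonic clock avoids premature or delayed rotation when the system wall clock is adjusted.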

Balancing mode selection criteria

For better performance, consider choosing the backend group for each new request based on which backend is the most responsive. This can be achieved by using the RATE balancing mode. In this case, the backend group with the lowest average latency over recent requests, or, for HTTP/2 and HTTP/3, the backend group with the fewest outstanding requests, is chosen.

The UTILIZATION balancing mode applies only to instance group backends and distributes traffic based on the utilization of VM instances in an instance group.
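
The selection rule described above can be pictured with a toy model. This sketch only illustrates the stated criteria and is not the load balancer's actual implementation. The `BackendGroup` type, the `pick_group` function, and the sample numbers are all hypothetical.

```python
from typing import NamedTuple


class BackendGroup(NamedTuple):
    name: str
    avg_latency_ms: float      # average latency over recent requests
    outstanding_requests: int  # requests currently in flight


def pick_group(groups, protocol):
    """Pick a group using the criterion described for RATE mode."""
    if protocol in ("HTTP/2", "HTTP/3"):
        # Multiplexed protocols: prefer the fewest outstanding requests.
        return min(groups, key=lambda g: g.outstanding_requests)
    # Otherwise: prefer the lowest recent average latency.
    return min(groups, key=lambda g: g.avg_latency_ms)


groups = [
    BackendGroup("group-a", avg_latency_ms=20.0, outstanding_requests=9),
    BackendGroup("group-b", avg_latency_ms=35.0, outstanding_requests=2),
]
print(pick_group(groups, "HTTP/1.1").name)  # group-a
print(pick_group(groups, "HTTP/2").name)    # group-b
```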

Configure session affinity

In some cases, it might be beneficial for the same backend to handle requests that are from the same end users, or related to the same end user, at least for a short period of time. This can be configured by using session affinity, a setting configured on the backend service. Session affinity controls the distribution of new connections from clients to the load balancer's backends. You can use session affinity to ensure that the same backend handles requests from the same resource, for example, related to the same user account or from the same document.

Session affinity is specified for the entire backend service resource, and not on a per backend basis. However, a URL map can point to multiple backend services. Therefore, you don't have to use just one session affinity type for the load balancer. Depending on your application, you can use different backend services with different session affinity settings. For example, if a part of your application is serving static content to many users, it is unlikely to benefit from session affinity. You would use a Cloud CDN-enabled backend service to serve cached responses instead.

For more information, see session affinity.
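
To build intuition for what affinity provides, the following sketch maps an affinity key (for example, a client IP address or a cookie value) to a backend with a stable hash. It is a conceptual illustration only and does not describe how the load balancer implements session affinity. The function name `backend_for` and the backend names are hypothetical.

```python
import hashlib


def backend_for(affinity_key, backends):
    """Map a key to a backend; stable while the backend list is unchanged."""
    if not backends:
        raise ValueError("no backends available")
    digest = hashlib.sha256(affinity_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer index.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["backend-1", "backend-2", "backend-3"]
first = backend_for("user-1234", backends)
second = backend_for("user-1234", backends)
print(first == second)  # True
```

With simple modulo hashing like this, adding or removing a backend remaps many keys, which is one reason affinity is best treated as best effort rather than a guarantee.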

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.