Backend services overview
A backend service defines how Cloud Load Balancing distributes traffic. The backend service configuration contains a set of values, such as the protocol used to connect to backends, various distribution and session settings, health checks, and timeouts. These settings provide fine-grained control over how your load balancer behaves. To get you started, most of the settings have default values that allow for fast configuration. A backend service is either global or regional in scope.
Load balancers, Envoy proxies, and proxyless gRPC clients use the configuration information in the backend service resource to do the following:

- Direct traffic to the correct backends, which are instance groups or network endpoint groups (NEGs).
- Distribute traffic according to a balancing mode, which is a setting for each backend.
- Determine which health check is monitoring the health of the backends.
- Specify session affinity.
- Determine whether other services are enabled, including the following services that are only available for certain load balancers:
- Cloud CDN
- Google Cloud Armor security policies
- Identity-Aware Proxy
- Designate global and regional backend services as a service in App Hub applications.
You set these values when you create a backend service or add a backend to the backend service.
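As a minimal sketch of that workflow, the following gcloud commands create a global backend service and attach an instance group backend with a balancing mode and target capacity. All resource names (`my-backend-service`, `my-health-check`, `my-instance-group`) and the zone are hypothetical placeholders, and only a few of the available flags are shown:

```shell
# Create a global backend service (hypothetical resource names).
gcloud compute backend-services create my-backend-service \
    --protocol=HTTP \
    --health-checks=my-health-check \
    --global

# Add an instance group backend, specifying a balancing mode
# and a per-instance target capacity.
gcloud compute backend-services add-backend my-backend-service \
    --instance-group=my-instance-group \
    --instance-group-zone=us-central1-a \
    --balancing-mode=RATE \
    --max-rate-per-instance=100 \
    --global
```

Defaults cover most of the remaining settings; you can revisit them later with `gcloud compute backend-services update`.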
Note: If you're using either the global external Application Load Balancer or the classic Application Load Balancer, and your backends serve static content, consider using backend buckets instead of backend services. See backend buckets for global external Application Load Balancer or backend buckets for classic Application Load Balancer.

The following table summarizes which load balancers use backend services. The product that you are using also determines the maximum number of backend services, the scope of a backend service, the type of backends supported, and the backend service's load balancing scheme. The load balancing scheme is an identifier that Google uses to classify forwarding rules and backend services. Each load balancing product uses one load balancing scheme for its forwarding rules and backend services. Some schemes are shared among products.
| Product | Maximum number of backend services | Scope of backend service | Supported backend types | Load balancing scheme |
|---|---|---|---|---|
| Global external Application Load Balancer | Multiple | Global | Each backend service supports one of the following backend combinations: | EXTERNAL_MANAGED |
| Classic Application Load Balancer | Multiple | Global³ | Each backend service supports one of the following backend combinations: | EXTERNAL⁴ |
| Regional external Application Load Balancer | Multiple | Regional | Each backend service supports one of the following backend combinations: | EXTERNAL_MANAGED |
| Cross-region internal Application Load Balancer | Multiple | Global | Each backend service supports one of the following backend combinations: | INTERNAL_MANAGED |
| Regional internal Application Load Balancer | Multiple | Regional | Each backend service supports one of the following backend combinations: | INTERNAL_MANAGED |
| Global external proxy Network Load Balancer | 1 | Global³ | The backend service supports one of the following backend combinations: | EXTERNAL_MANAGED |
| Classic proxy Network Load Balancer | 1 | Global³ | The backend service supports one of the following backend combinations: | EXTERNAL |
| Regional external proxy Network Load Balancer | 1 | Regional | The backend service supports one of the following backend combinations: | EXTERNAL_MANAGED |
| Regional internal proxy Network Load Balancer | 1 | Regional | The backend service supports one of the following backend combinations: | INTERNAL_MANAGED |
| Cross-region internal proxy Network Load Balancer | Multiple | Global | The backend service supports one of the following backend combinations: | INTERNAL_MANAGED |
| External passthrough Network Load Balancer | 1 | Regional | The backend service supports one of the following backend combinations: | EXTERNAL |
| Internal passthrough Network Load Balancer | 1 | Regional, but configurable to be globally accessible | The backend service supports one of the following backend combinations: | INTERNAL |
| Cloud Service Mesh | Multiple | Global | Each backend service supports one of the following backend combinations: | INTERNAL_SELF_MANAGED |
- The forwarding rule and its external IP address are regional.
- All backends connected to the backend service must be located in the same region as the forwarding rule.
- You can attach EXTERNAL_MANAGED backend services to EXTERNAL forwarding rules. However, EXTERNAL backend services cannot be attached to EXTERNAL_MANAGED forwarding rules. To take advantage of new features available only with the global external Application Load Balancer, we recommend that you migrate your existing EXTERNAL resources to EXTERNAL_MANAGED by using the migration process described at Migrate resources from classic to global external Application Load Balancer.

Load balancer naming
For proxy Network Load Balancers and passthrough Network Load Balancers, the name of the load balancer is always the same as the name of the backend service. The behavior for each Google Cloud interface is as follows:
- Google Cloud console. If you create either a proxy Network Load Balancer or a passthrough Network Load Balancer by using the Google Cloud console, the backend service is automatically assigned the same name that you entered for the load balancer name.
- Google Cloud CLI or API. If you create either a proxy Network Load Balancer or a passthrough Network Load Balancer by using the gcloud CLI or the API, you enter a name of your choice while creating the backend service. This backend service name is then reflected in the Google Cloud console as the name of the load balancer.
To learn about how naming works for Application Load Balancers, see URL maps overview: Load balancer naming.
Backends
A backend is one or more endpoints that receive traffic from a Google Cloud load balancer, a Cloud Service Mesh-configured Envoy proxy, or a proxyless gRPC client. There are several types of backends:

- Instance group containing virtual machine (VM) instances. An instance group can be a managed instance group (MIG), with or without autoscaling, or it can be an unmanaged instance group. More than one backend service can reference an instance group, but all backend services that reference the instance group must use compatible balancing modes. For more information, see Restrictions and guidance for instance groups in this document.
- Zonal NEG
- Serverless NEG
- Private Service Connect NEG
- Internet NEG
- Hybrid connectivity NEG
- Port mapping NEG
- Service Directory service bindings (Preview)
You cannot delete a backend instance group or NEG that is associated with a backend service. Before you delete an instance group or NEG, you must first remove it as a backend from all backend services that reference it.
Instance groups
This section discusses how instance groups work with the backend service.
Backend VMs and external IP addresses
Backend VMs in backend services don't need external IP addresses:
- For global external Application Load Balancers and external proxy Network Load Balancers: Clients communicate with a Google Front End (GFE), which hosts your load balancer's external IP address. GFEs communicate with backend VMs or endpoints by sending packets to an internal address created by joining an identifier for the backend's VPC network with the internal IPv4 address of the backend. Communication between GFEs and backend VMs or endpoints is facilitated through special routes.
  - For instance group backends, the internal IPv4 address is always the primary internal IPv4 address that corresponds to the `nic0` interface of the VM.
  - For `GCE_VM_IP_PORT` endpoints in a zonal NEG, you can specify the endpoint's IP address as either the primary IPv4 address associated with any network interface of a VM or any IPv4 address from an alias IP address range associated with any network interface of a VM.
- For regional external Application Load Balancers: Clients communicate with an Envoy proxy, which hosts your load balancer's external IP address. Envoy proxies communicate with backend VMs or endpoints by sending packets to an internal address created by joining an identifier for the backend's VPC network with the internal IPv4 address of the backend.
  - For instance group backends, the internal IPv4 address is always the primary internal IPv4 address that corresponds to the `nic0` interface of the VM, and `nic0` must be in the same network as the load balancer.
  - For `GCE_VM_IP_PORT` endpoints in a zonal NEG, you can specify the endpoint's IP address as either the primary IPv4 address associated with any network interface of a VM or any IPv4 address from an alias IP address range associated with any network interface of a VM, as long as the network interface is in the same network as the load balancer.
- For external passthrough Network Load Balancers: Clients communicate directly with backends by way of Google's Maglev pass-through load balancing infrastructure. Packets are routed and delivered to backends with the original source and destination IP addresses preserved. Backends respond to clients using direct server return. The methods used to select a backend and to track connections are configurable.
  - For instance group backends, packets are always delivered to the `nic0` interface of the VM.
  - For `GCE_VM_IP` endpoints in a zonal NEG, packets are delivered to the VM's network interface that is in the subnetwork associated with the NEG.
Named ports
The backend service's named port attribute is only applicable to proxy-based load balancers (Application Load Balancers and proxy Network Load Balancers) using instance group backends. The named port defines the destination port used for the TCP connection between the proxy (GFE or Envoy) and the backend instance.
Named ports are configured as follows:
On each instance group backend, you must configure one or more named ports using key-value pairs. The key represents a meaningful port name that you choose, and the value represents the port number you assign to the name. The mapping of names to numbers is done individually for each instance group backend.

On the backend service, you specify a single named port using just the port name (`--port-name`).

On a per-instance group backend basis, the backend service translates the port name to a port number. When an instance group's named port matches the backend service's `--port-name`, the backend service uses this port number for communication with the instance group's VMs.
For example, you might set the named port on an instance group with the name my-service-name and the port 8888:

gcloud compute instance-groups unmanaged set-named-ports my-unmanaged-ig \
    --named-ports=my-service-name:8888
Then you refer to the named port in the backend service configuration with the `--port-name` on the backend service set to my-service-name:

gcloud compute backend-services update my-backend-service \
    --port-name=my-service-name
A backend service can use a different port number when communicating with VMs in different instance groups if each instance group specifies a different port number for the same port name.
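For instance, two instance groups can map the same port name to different numbers, and a backend service that references both groups uses each group's own number. The group names and zones below are hypothetical:

```shell
# Hypothetical: the same port name resolves to a different port per group.
gcloud compute instance-groups unmanaged set-named-ports ig-a \
    --named-ports=my-service-name:8888 \
    --zone=us-central1-a

gcloud compute instance-groups unmanaged set-named-ports ig-b \
    --named-ports=my-service-name:9999 \
    --zone=us-central1-b
```

A backend service with `--port-name=my-service-name` would then connect to VMs in `ig-a` on port 8888 and to VMs in `ig-b` on port 9999.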
The resolved port number used by the proxy load balancer's backend service doesn't need to match the port number used by the load balancer's forwarding rules. A proxy load balancer listens for TCP connections sent to the IP address and destination port of its forwarding rules. Because the proxy opens a second TCP connection to its backends, the second TCP connection's destination port can be different.
Named ports are only applicable to instance group backends. Zonal NEGs with `GCE_VM_IP_PORT` endpoints, hybrid NEGs with `NON_GCP_PRIVATE_IP_PORT` endpoints, and internet NEGs define ports using a different mechanism, namely, on the endpoints themselves. Serverless NEGs reference Google services, and PSC NEGs reference service attachments, using abstractions that don't involve specifying a destination port.
Internal passthrough Network Load Balancers and external passthrough Network Load Balancers don't use named ports. This is because they are pass-through load balancers that route connections directly to backends instead of creating new connections. Packets are delivered to the backends with the destination IP address and port of the load balancer's forwarding rule preserved.
To learn how to create named ports, see the following instructions:
- Unmanaged instance groups: Working with named ports
- Managed instance groups: Assigning named ports to managed instance groups
Restrictions and guidance for instance groups
Keep the following in mind when you use instance group backends:
A VM instance can only belong to a single load-balanced instance group. The VM can still be a member of multiple instance groups; for example, a VM can be a member of two unmanaged instance groups, or a VM can be a member of one managed instance group and one unmanaged instance group. When a VM is a member of two or more instance groups, only one of those instance groups can be referenced by one or more load balancer backend services.
The same instance group can be used by two or more backend services. Each mapping between an instance group and a backend service can use a different balancing mode, except for the incompatible balancing mode combinations.
The incompatible balancing mode combinations are as follows:

- The `UTILIZATION` balancing mode is incompatible with all other balancing modes. If an instance group is a backend of multiple backend services, the instance group must use the `UTILIZATION` balancing mode on every backend service.
- The `CUSTOM_METRICS` balancing mode is incompatible with all other balancing modes. If an instance group is a backend of multiple backend services, the instance group must use the `CUSTOM_METRICS` balancing mode on every backend service.

As a consequence of the incompatible balancing mode combinations, if an instance group uses either the `UTILIZATION` or `CUSTOM_METRICS` balancing mode as a backend for at least one backend service, the same instance group can't be used as a backend for a passthrough Network Load Balancer because passthrough Network Load Balancers require the `CONNECTION` balancing mode.
There's no single command that can change the balancing mode of the same instance group on multiple backend services. To change the balancing mode for an instance group that's a backend of two or more backend services, you can use this technique:

- Remove the instance group as a backend from all backend services except for one backend service.
- Change the instance group's balancing mode for the one remaining backend service.
- Re-add the instance group as a backend to the other backend services.
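The technique above can be sketched with the gcloud CLI. The backend service names, instance group, zone, balancing mode, and target capacity below are all hypothetical placeholders; repeat the remove and re-add steps for each additional backend service that references the group:

```shell
# 1. Remove the instance group from every backend service except one.
gcloud compute backend-services remove-backend other-backend-service \
    --instance-group=my-instance-group \
    --instance-group-zone=us-central1-a \
    --global

# 2. Change the balancing mode on the one remaining backend service.
gcloud compute backend-services update-backend main-backend-service \
    --instance-group=my-instance-group \
    --instance-group-zone=us-central1-a \
    --balancing-mode=RATE \
    --max-rate-per-instance=100 \
    --global

# 3. Re-add the instance group to the other backend services,
#    using the same (compatible) balancing mode.
gcloud compute backend-services add-backend other-backend-service \
    --instance-group=my-instance-group \
    --instance-group-zone=us-central1-a \
    --balancing-mode=RATE \
    --max-rate-per-instance=100 \
    --global
```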
Consider the following best practices, which provide more flexible options:
Avoid using the same instance group as a backend for two or more backend services. Instead, use multiple NEGs.

Unlike instance groups, a VM can have an endpoint in two or more load-balanced NEGs.

For example, if a VM needs to simultaneously be a backend of both a passthrough Network Load Balancer and either a proxy Network Load Balancer or an Application Load Balancer, use multiple load-balanced NEGs. Place a VM endpoint in a unique NEG compatible with each load balancer type. Then associate each NEG with the corresponding load balancer backend service.
Don't add an autoscaled managed instance group to more than one backend service when using the HTTP Load Balancing Utilization autoscaling metric. Two or more backend services referencing the same autoscaled managed instance group can conflict with one another unless the autoscaling metric is unrelated to load balancer activity.
Zonal network endpoint groups
Network endpoints represent services by their IP address or an IP address and port combination, rather than referring to a VM in an instance group. A network endpoint group (NEG) is a logical grouping of network endpoints.

Zonal NEGs are zonal resources that represent collections of either IP addresses or IP address and port combinations for Google Cloud resources within a single subnet.

A backend service that uses zonal NEGs as its backends distributes traffic among applications or containers running within VMs.
There are two types of network endpoints available for zonal NEGs:
- `GCE_VM_IP` endpoints (supported only with internal passthrough Network Load Balancers and backend service-based external passthrough Network Load Balancers).
- `GCE_VM_IP_PORT` endpoints.
To see which products support zonal NEG backends, see Table: Backend services and supported backend types.

For details, see Zonal NEGs overview.
Internet network endpoint groups
Internet NEGs are resources that define external backends. An external backend is a backend that is hosted within on-premises infrastructure or on infrastructure provided by third parties.

An internet NEG is a combination of a hostname or an IP address, plus an optional port. There are two types of network endpoints available for internet NEGs: `INTERNET_FQDN_PORT` and `INTERNET_IP_PORT`.

For details, see Internet network endpoint group overview.
Serverless network endpoint groups
A network endpoint group (NEG) specifies a group of backend endpoints for a load balancer. A serverless NEG is a backend that points to a Cloud Run, App Engine, Cloud Run functions, or API Gateway resource.
A serverless NEG can represent one of the following:
- A Cloud Run resource or a group of resources.
- A Cloud Run function or group of functions (formerly Cloud Run functions 2nd gen).
- A Cloud Run function (1st gen) or group of functions.
- An App Engine standard environment or App Engine flexible environment app, a specific service within an app, a specific version of an app, or a group of services.
- An API Gateway that provides access to your services through a REST API consistent across all services, regardless of service implementation. This capability is in Preview.
To set up a serverless NEG for serverless applications that share a URL pattern, you use a URL mask. A URL mask is a template of your URL schema (for example, example.com/<service>). The serverless NEG uses this template to extract the <service> name from the incoming request's URL and route the request to the matching Cloud Run, Cloud Run functions, or App Engine service with the same name.
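As an illustration of the idea only (not Google's implementation), extracting the `<service>` segment for a mask like `example.com/<service>` amounts to taking the first path component after the host. The request URL below is hypothetical:

```shell
# Hypothetical request matched against the URL mask "example.com/<service>".
request="example.com/login/page.html"
mask_host="example.com/"

# Drop the fixed host part of the mask, leaving "login/page.html".
rest="${request#"$mask_host"}"

# The first remaining path segment is the <service> name: "login".
service="${rest%%/*}"
echo "$service"
```

A request for `example.com/login/page.html` would therefore be routed to the service named `login`.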
To see which load balancers support serverless NEG backends, see Table: Backend services and supported backend types.

For more information about serverless NEGs, see the Serverless network endpoint groups overview.
Service bindings
Preview
This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA products and features are available "as is" and might have limited support. For more information, see the launch stage descriptions.
A service binding is a backend that establishes a connection between a backend service in Cloud Service Mesh and a service registered in Service Directory. A backend service can reference several service bindings. A backend service with a service binding cannot reference any other type of backend. For more information, see Cloud Service Mesh integration with Service Directory.
Mixed backends
The following usage considerations apply when you add different types of backends to a single backend service:

- A single backend service cannot simultaneously use both instance groups and zonal NEGs.
- You can use a combination of different types of instance groups on the same backend service. For example, a single backend service can reference a combination of both managed and unmanaged instance groups. For complete information about which backends are compatible with which backend services, see the table in the previous section.
- With certain proxy load balancers, you can use a combination of zonal NEGs (with `GCE_VM_IP_PORT` endpoints) and hybrid connectivity NEGs (with `NON_GCP_PRIVATE_IP_PORT` endpoints) to configure hybrid load balancing. To see which load balancers have this capability, see Table: Backend services and supported backend types.
Protocol to the backends
When you create a backend service, you must specify the protocol used to communicate with the backends. You can specify only one protocol per backend service; you cannot specify a secondary protocol to use as a fallback.

Which protocols are valid depends on the type of load balancer and on whether you are using Cloud Service Mesh.
| Product | Backend service protocol options |
|---|---|
| Application Load Balancer | HTTP, HTTPS, HTTP/2 |
| Proxy Network Load Balancer | TCP or SSL. Regional proxy Network Load Balancers support only TCP. |
| Passthrough Network Load Balancer | TCP, UDP, or UNSPECIFIED |
| Cloud Service Mesh | HTTP, HTTPS, HTTP/2, gRPC, TCP |
Changing a backend service's protocol makes the backends inaccessible through load balancers for a few minutes.
IP address selection policy
This field is applicable to proxy load balancers. You must use the IP address selection policy to specify the traffic type that is sent from the backend service to your backends.

When you select the IP address selection policy, ensure that your backends support the selected traffic type. For more information, see Table: Backend services and supported backend types.

Use the IP address selection policy when you want to convert your load balancer backend service to support a different traffic type. For more information, see Convert from single-stack to dual-stack.
You can specify the following values for the IP address selection policy:
| IP address selection policy | Description |
|---|---|
| Only IPv4 | Only send IPv4 traffic to the backends of the backend service, regardless of traffic from the client to the GFE. Only IPv4 health checks are used to check the health of the backends. |
| Prefer IPv6 | Prioritize the backend's IPv6 connection over the IPv4 connection (provided there is a healthy backend with IPv6 addresses). The health checks periodically monitor the backends' IPv6 and IPv4 connections. The GFE first attempts the IPv6 connection; if the IPv6 connection is broken or slow, the GFE uses happy eyeballs to fall back and connect to IPv4. Even if one of the IPv6 or IPv4 connections is unhealthy, the backend is still treated as healthy, and both connections can be tried by the GFE, with happy eyeballs ultimately selecting which one to use. |
| Only IPv6 | Only send IPv6 traffic to the backends of the backend service, regardless of traffic from the client to the proxy. Only IPv6 health checks are used to check the health of the backends. There is no validation to check if the backend traffic type matches the IP address selection policy. For example, if you have IPv4-only backends and select Only IPv6, traffic cannot reach those backends. |
Encryption between the load balancer and backends
For information about encryption between the load balancer and backends, see Encryption to the backends.
Balancing mode, target capacity, and capacity scaler
For Application Load Balancers, Cloud Service Mesh, and proxy Network Load Balancers, the balancing mode, target capacity, and capacity scaler are parameters you provide when you add a supported backend to a backend service. The load balancers use these parameters to manage the distribution of new requests or new connections to zones that contain supported backends:

- The balancing mode defines how the load balancer measures capacity. Google Cloud has the following balancing modes:
  - `CONNECTION`: defines capacity based on the number of new TCP connections.
  - `RATE`: defines capacity based on the rate of new HTTP requests.
  - `IN_FLIGHT` (Preview): defines capacity based on the number of in-flight HTTP requests instead of the rate of HTTP requests. Use this balancing mode instead of `RATE` if requests take more than a second to complete.
  - `UTILIZATION`: defines capacity based on the approximated CPU utilization of VMs in a zone of an instance group.
  - `CUSTOM_METRICS`: defines capacity based on user-defined custom metrics.
- The target capacity defines the target number for the balancing mode's measure, such as a target number of connections or a target request rate.
  - The target capacity isn't a circuit breaker.
  - When capacity usage reaches the target capacity, the load balancer directs new requests or new connections to a different zone if backends are configured in two or more zones.
  - Global external Application Load Balancers, global external proxy Network Load Balancers, cross-region internal Application Load Balancers, and cross-region internal proxy Network Load Balancers also use capacity to direct requests to zones in different regions, if you've configured backends in more than one region.
  - When all zones have reached target capacity, new requests or new connections are distributed by overfilling proportionally.
- The capacity scaler provides a way to scale the target capacity manually. The values for the capacity scaler are as follows:
  - `0`: indicates that the backend is completely drained. You can't use a value of `0` if a backend service only has one backend.
  - `0.1` (10%) to `1.0` (100%): indicates the percentage of backend capacity that is in use.
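As a numeric sketch of how the scaler interacts with target capacity (all values hypothetical): with a per-instance target rate of 80 requests per second on a 4-instance zonal group, a capacity scaler of 0.5 halves the zone's effective target capacity:

```shell
# Hypothetical values: per-instance target rate, instance count, and scaler.
max_rate_per_instance=80
instances=4
capacity_scaler="0.5"

# Unscaled zonal target capacity: 80 * 4 = 320 requests per second.
zonal_target=$((max_rate_per_instance * instances))

# Effective capacity after applying the scaler (awk handles the fraction).
effective=$(awk "BEGIN {print $zonal_target * $capacity_scaler}")
echo "$effective"
```

Setting the scaler back to `1.0` restores the full 320 requests per second without touching the per-instance target.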
Passthrough Network Load Balancers symbolically use the `CONNECTION` balancing mode, but don't support a target capacity or capacity scaler. For more information about how passthrough Network Load Balancers distribute new connections, see the following:
- Traffic distribution for internal passthrough Network Load Balancers
- Traffic distribution for external passthrough Network Load Balancers
Supported backends
For Application Load Balancers, Cloud Service Mesh, and proxy Network Load Balancers, the following types of backends support the balancing mode, target capacity, and capacity scaler parameters:

Internet NEGs, serverless NEGs, and Private Service Connect NEGs don't support the balancing mode, target capacity, and capacity scaler parameters.
Balancing modes for Application Load Balancers and Cloud Service Mesh
Available balancing modes for Application Load Balancer and Cloud Service Mesh backends depend on the type of supported backend and a traffic duration setting (Preview).
Traffic duration setting
Preview
For Application Load Balancer and Cloud Service Mesh backends, you can optionally specify a traffic duration setting. This setting is unique to the mapping between a supported backend and a backend service. The traffic duration setting has two valid values:

- `SHORT`: recommended for HTTP requests answered with responses from backends in less than one second. If you don't explicitly specify a traffic duration, the load balancer operates as if you'd specified `SHORT`.
- `LONG`: recommended for HTTP requests for which the backend needs more than one second to generate responses.
To explicitly set the traffic duration when you add a backend to a backend service, do one of the following:
- Run the `gcloud compute backend-services add-backend` command with the `--traffic-duration` flag.
- Create a backend service or update a backend service with the `trafficDuration` attribute.
Balancing modes for short traffic duration
When the traffic duration setting isn't specified or is set to `SHORT` (Preview), the available balancing modes for Application Load Balancer and Cloud Service Mesh backends depend on the type of supported backend.
| Supported backend | CONNECTION | RATE | IN_FLIGHT | UTILIZATION | CUSTOM_METRICS |
|---|---|---|---|---|---|
| Instance groups | | | | | |
| Zonal NEGs with GCE_VM_IP_PORT endpoints | | | | | |
| Zonal hybrid connectivity NEGs | | | | | |
Balancing modes for long traffic duration
Preview
When the traffic duration setting is `LONG`, the available balancing modes for Application Load Balancer and Cloud Service Mesh backends depend on the type of supported backend.
| Supported backend | CONNECTION | RATE | IN_FLIGHT | UTILIZATION | CUSTOM_METRICS |
|---|---|---|---|---|---|
| Instance groups | | | | | |
| Zonal NEGs with GCE_VM_IP_PORT endpoints | | | | | |
| Zonal hybrid connectivity NEGs | | | | | |
Balancing modes for Proxy Network Load Balancers
Available balancing modes for proxy Network Load Balancer backends depend on the type of supported backend.
| Supported backend | CONNECTION | RATE | IN_FLIGHT | UTILIZATION | CUSTOM_METRICS |
|---|---|---|---|---|---|
| Instance groups | | | | | |
| Zonal NEGs with GCE_VM_IP_PORT endpoints | | | | | |
| Zonal hybrid connectivity NEGs | | | | | |
Target capacity specifications
Target capacity specifications are relevant to Application Load Balancer, Cloud Service Mesh, and proxy Network Load Balancer backends that support balancing mode, target capacity, and capacity scaler settings.

Target capacity specifications aren't relevant to passthrough Network Load Balancers.
Connection balancing mode
Proxy Network Load Balancer backends can use the `CONNECTION` balancing mode with one of the following required target capacity parameters:
| Target capacity parameter | Zonal (managed or unmanaged) instance groups | Regional managed instance groups | Zonal NEGs with GCE_VM_IP_PORT endpoints | Zonal hybrid connectivity NEGs |
|---|---|---|---|---|
| `max-connections`: target TCP connections per backend zone | | | | |
| `max-connections-per-instance`: target TCP connections per VM instance. Cloud Load Balancing uses this parameter to calculate target TCP connections per backend zone. | | | | |
| `max-connections-per-endpoint`: target TCP connections per NEG endpoint. Cloud Load Balancing uses this parameter to calculate target TCP connections per backend zone. | | | | |
Using the max-connections parameter

When you specify the `max-connections` parameter, the value you provide defines the capacity for an entire zone.
- For a zonal instance group with `N` total instances and `h` healthy instances (where `h ≤ N`), the calculations are as follows:
  - If you set `max-connections` to `X`, the zonal target capacity is `X`.
  - The average connections per instance is `X / h`.
- Regional managed instance groups don't support the `max-connections` parameter because they consist of multiple zones. Instead, use the `max-connections-per-instance` parameter.
- For a zonal NEG with `N` total endpoints and `h` healthy endpoints (where `h ≤ N`), the calculations are as follows:
  - If you set `max-connections` to `X`, the zonal target capacity is `X`.
  - The average connections per endpoint is `X / h`.
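A quick shell sketch of the zonal math (all numbers hypothetical): `max-connections` fixes the zone's capacity, so the average load per healthy instance rises as instances become unhealthy:

```shell
# Hypothetical zonal instance group: max-connections=X, N instances, h healthy.
X=120   # max-connections: the zonal target capacity
N=4     # total instances (shown for context; capacity doesn't depend on N)
h=3     # healthy instances

zonal_capacity=$X                         # capacity stays 120 for the zone
avg_per_healthy=$((zonal_capacity / h))   # 120 / 3 = 40 connections each
echo "$zonal_capacity $avg_per_healthy"
```

If `h` dropped from 3 to 2, the zonal capacity would still be 120, but the average per healthy instance would rise to 60.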
Using the max-connections-per-instance or max-connections-per-endpoint parameter

When you specify either the `max-connections-per-instance` or `max-connections-per-endpoint` parameter, the load balancer uses the value you provide to calculate a per-zone capacity:
For a zonal instance group with
Ntotal instances andhhealthy instances(whereh≤N), the calculations are as follows:- If you set
max-connections-per-instancetoX, the zonal targetcapacity isN * X. This is equivalent to settingmax-connectionstoN * X. - The average connections per instance is
(N * X) / h.
- If you set
For a regional managed instance group, if you set
max-connections-per-instancetoX, Google Cloud calculates aper-zone target capacity for each zone of the instance group. In each zone, ifthere areKtotal instances andhhealthy instances (whereh≤K), thecalculations are as follows:- The zone's target capacity is
K * X. - The average connections per instance in the zone is
(K * X) / h.
- The zone's target capacity is
For a zonal NEG with
Ntotal endpoints andhhealthy endpoints(whereh≤N), the calculations are as follows:- If you set
max-connections-per-endpointtoX, the zonal targetcapacity isN * X. This is equivalent to settingmax-connectionstoN * X. - The average connections per endpoint is
(N * X) / h.
- If you set
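The per-instance and per-endpoint arithmetic above can be sketched as follows (hypothetical fleet sizes; the helper name is ours, not an API):

```python
def zonal_capacity_from_per_instance(x: int, total: int, healthy: int):
    """max-connections-per-instance=X with N total instances yields a zonal
    target of N * X; the average across h healthy instances is (N * X) / h.
    The same formula covers max-connections-per-endpoint for zonal NEGs."""
    if healthy == 0:
        raise ValueError("no healthy instances")
    zonal_target = total * x
    return zonal_target, zonal_target / healthy

# Zonal instance group: N=3, h=2, max-connections-per-instance=100.
print(zonal_capacity_from_per_instance(100, total=3, healthy=2))  # (300, 150.0)

# Regional MIG: the formula is applied separately in each zone with that
# zone's own K total and h healthy instances.
zones = {"zone-a": (2, 2), "zone-b": (4, 4)}  # zone -> (K, h)
print({z: zonal_capacity_from_per_instance(100, k, h) for z, (k, h) in zones.items()})
# {'zone-a': (200, 100.0), 'zone-b': (400, 100.0)}
```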
Rate balancing mode
Application Load Balancer and Cloud Service Mesh backends with an unspecified or short traffic duration setting (Preview) can use the `RATE` balancing mode with one of the following required target capacity parameters:
| Target capacity parameter | Zonal (managed or unmanaged) instance groups | Regional managed instance groups | Zonal NEGs with `GCE_VM_IP_PORT` endpoints | Zonal hybrid connectivity NEGs |
|---|---|---|---|---|
| `max-rate`: Target HTTP request rate per backend zone | ✓ | | ✓ | ✓ |
| `max-rate-per-instance`: Target HTTP request rate per VM instance. Cloud Load Balancing uses this parameter to calculate the target HTTP request rate per backend zone. | ✓ | ✓ | | |
| `max-rate-per-endpoint`: Target HTTP request rate per NEG endpoint. Cloud Load Balancing uses this parameter to calculate the target HTTP request rate per backend zone. | | | ✓ | ✓ |
Using the `max-rate` parameter

When you specify the `max-rate` parameter, the value you provide defines the capacity for an entire zone.

For a zonal instance group with N total instances and h healthy instances (where h ≤ N), the calculations are as follows:

- If you set `max-rate` to X, the zonal target capacity is X requests per second.
- The average requests per second per instance is X / h.

Regional managed instance groups don't support the `max-rate` parameter because they consist of multiple zones. Instead, use the `max-rate-per-instance` parameter.

For a zonal NEG with N total endpoints and h healthy endpoints (where h ≤ N), the calculations are as follows:

- If you set `max-rate` to X, the zonal target capacity is X requests per second.
- The average requests per second per endpoint is X / h.
Using the `max-rate-per-instance` or `max-rate-per-endpoint` parameter

When you specify either the `max-rate-per-instance` or `max-rate-per-endpoint` parameter, the load balancer uses the value you provide to calculate a per-zone capacity:

For a zonal instance group with N total instances and h healthy instances (where h ≤ N), the calculations are as follows:

- If you set `max-rate-per-instance` to X, the zonal target capacity is N * X requests per second. This is equivalent to setting `max-rate` to N * X.
- The average requests per second per instance is (N * X) / h.

For a regional managed instance group, if you set `max-rate-per-instance` to X, Google Cloud calculates a per-zone target capacity for each zone of the instance group. In each zone, if there are K total instances and h healthy instances (where h ≤ K), the calculations are as follows:

- The zone's target capacity is K * X requests per second.
- The average requests per second per instance in the zone is (K * X) / h.

For a zonal NEG with N total endpoints and h healthy endpoints (where h ≤ N), the calculations are as follows:

- If you set `max-rate-per-endpoint` to X, the zonal target capacity is N * X requests per second. This is equivalent to setting `max-rate` to N * X.
- The average requests per second per endpoint is (N * X) / h.
In-flight balancing mode
Preview
This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA products and features are available "as is" and might have limited support. For more information, see the launch stage descriptions.
Application Load Balancer and Cloud Service Mesh backends with a long traffic duration setting can use the `IN_FLIGHT` balancing mode with one of the following required target capacity parameters:
| Target capacity parameter | Zonal (managed or unmanaged) instance groups | Regional managed instance groups | Zonal NEGs with `GCE_VM_IP_PORT` endpoints | Zonal hybrid connectivity NEGs |
|---|---|---|---|---|
| `max-in-flight-requests`: Target number of in-progress HTTP requests per backend zone | ✓ | | ✓ | ✓ |
| `max-in-flight-requests-per-instance`: Target number of in-progress HTTP requests per VM instance. Cloud Load Balancing uses this parameter to calculate the target number of in-progress HTTP requests per backend zone. | ✓ | ✓ | | |
| `max-in-flight-requests-per-endpoint`: Target number of in-progress HTTP requests per NEG endpoint. Cloud Load Balancing uses this parameter to calculate the target number of in-progress HTTP requests per backend zone. | | | ✓ | ✓ |
Using the `max-in-flight-requests` parameter

When you specify the `max-in-flight-requests` parameter, the value you provide defines the capacity for an entire zone.

For a zonal instance group with N total instances and h healthy instances (where h ≤ N), the calculations are as follows:

- If you set `max-in-flight-requests` to X, the zonal target capacity is X in-progress HTTP requests.
- The average number of in-progress HTTP requests per instance is X / h.

Regional managed instance groups don't support the `max-in-flight-requests` parameter because they consist of multiple zones. Instead, use the `max-in-flight-requests-per-instance` parameter.

For a zonal NEG with N total endpoints and h healthy endpoints (where h ≤ N), the calculations are as follows:

- If you set `max-in-flight-requests` to X, the zonal target capacity is X in-progress HTTP requests.
- The average number of in-progress HTTP requests per endpoint is X / h.
Using the `max-in-flight-requests-per-instance` or `max-in-flight-requests-per-endpoint` parameters

When you specify either the `max-in-flight-requests-per-instance` or `max-in-flight-requests-per-endpoint` parameter, the load balancer uses the value you provide to calculate a per-zone capacity:

For a zonal instance group with N total instances and h healthy instances (where h ≤ N), the calculations are as follows:

- If you set `max-in-flight-requests-per-instance` to X, the zonal target capacity is N * X in-progress HTTP requests. This is equivalent to setting `max-in-flight-requests` to N * X.
- The average in-progress HTTP requests per instance is (N * X) / h.

For a regional managed instance group, if you set `max-in-flight-requests-per-instance` to X, Google Cloud calculates a per-zone target capacity for each zone of the instance group. In each zone, if there are K total instances and h healthy instances (where h ≤ K), the calculations are as follows:

- The zone's target capacity is K * X in-progress HTTP requests.
- The average in-progress HTTP requests per instance in the zone is (K * X) / h.

For a zonal NEG with N total endpoints and h healthy endpoints (where h ≤ N), the calculations are as follows:

- If you set `max-in-flight-requests-per-endpoint` to X, the zonal target capacity is N * X in-progress HTTP requests. This is equivalent to setting `max-in-flight-requests` to N * X.
- The average in-progress HTTP requests per endpoint is (N * X) / h.
Utilization balancing mode
Application Load Balancer, Cloud Service Mesh, and proxy Network Load Balancer instance group backends can use the `UTILIZATION` balancing mode. NEG backends don't support this balancing mode.

The `UTILIZATION` balancing mode depends on VM CPU utilization along with other factors. When these factors fluctuate, the load balancer might calculate utilization in a way that leads to some VMs receiving more requests or connections than others. Therefore, keep the following in mind:
- Only use the `UTILIZATION` balancing mode with session affinity set to `NONE`. If your backend service uses a session affinity that's different from `NONE`, then use the `RATE`, `IN_FLIGHT`, or `CONNECTION` balancing modes instead.
- If the average utilization of VMs in all instance groups is less than 10%, some load balancers prefer to distribute new requests or connections to specific zones. This zonal preference becomes less prevalent when the request rate or connection count increases.
The `UTILIZATION` balancing mode has no mandatory target capacity setting, but you can optionally define a target capacity by using one of the target capacity parameters or combinations of target capacity parameters described in the following sections.
Utilization target capacity parameters for Application Load Balancer and Cloud Service Mesh backends with an unspecified or short traffic duration setting
Application Load Balancer and Cloud Service Mesh backends with an unspecified or short traffic duration setting (Preview) can use the `UTILIZATION` balancing mode with one of the following target capacity parameters or combinations of parameters:
| Target capacity parameter or parameter combination | Zonal (managed or unmanaged) instance groups | Regional managed instance groups | Zonal NEGs with `GCE_VM_IP_PORT` endpoints | Zonal hybrid connectivity NEGs |
|---|---|---|---|---|
| `max-utilization`: Target utilization per backend zone | ✓ | ✓ | | |
| `max-rate`: Target HTTP request rate per backend zone | ✓ | | | |
| `max-rate` and `max-utilization`: Target is the first to be reached in the backend zone | ✓ | | | |
| `max-rate-per-instance`: Target HTTP request rate per VM instance. Cloud Load Balancing uses this parameter to calculate the target HTTP request rate per backend zone. | ✓ | ✓ | | |
| `max-rate-per-instance` and `max-utilization`: Target is the first to be reached in the backend zone | ✓ | ✓ | | |
For more information about the `max-rate` and `max-rate-per-instance` target capacity parameters, see Rate balancing mode in this document.
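When both a rate target and a utilization target are configured, the effective limit in a zone is whichever target is reached first. A minimal sketch of that rule (the helper is our own, not part of any API):

```python
def zone_at_target(current_rps: float, max_rate: float,
                   current_utilization: float, max_utilization: float) -> bool:
    """With max-rate and max-utilization both set, the backend zone is
    considered at its target as soon as either threshold is reached."""
    return current_rps >= max_rate or current_utilization >= max_utilization

print(zone_at_target(80, max_rate=100, current_utilization=0.85, max_utilization=0.8))  # True
print(zone_at_target(80, max_rate=100, current_utilization=0.50, max_utilization=0.8))  # False
```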
Utilization target capacity parameters for Application Load Balancer and Cloud Service Mesh backends with a long traffic duration setting
Application Load Balancer and Cloud Service Mesh backends with a long traffic duration setting (Preview) can use the `UTILIZATION` balancing mode with one of the following target capacity parameters or combinations of parameters:
| Target capacity parameter or parameter combination | Zonal (managed or unmanaged) instance groups | Regional managed instance groups | Zonal NEGs with `GCE_VM_IP_PORT` endpoints | Zonal hybrid connectivity NEGs |
|---|---|---|---|---|
| `max-utilization`: Target utilization per backend zone | ✓ | ✓ | | |
| `max-in-flight-requests`: Target number of in-progress HTTP requests per backend zone | ✓ | | | |
| `max-in-flight-requests` and `max-utilization`: Target is the first to be reached in the backend zone | ✓ | | | |
| `max-in-flight-requests-per-instance`: Target number of in-progress HTTP requests per VM instance. Cloud Load Balancing uses this parameter to calculate the target number of in-progress HTTP requests per backend zone. | ✓ | ✓ | | |
| `max-in-flight-requests-per-instance` and `max-utilization`: Target is the first to be reached in the backend zone | ✓ | ✓ | | |
For more information about the `max-in-flight-requests` and `max-in-flight-requests-per-instance` target capacity parameters, see In-flight balancing mode in this document.
Utilization target capacity parameters for proxy Network Load Balancers
Instance group backends of proxy Network Load Balancers can use the `UTILIZATION` balancing mode with one of the following target capacity parameters or combinations of parameters.
| Target capacity parameter or parameter combination | Zonal (managed or unmanaged) instance groups | Regional managed instance groups | Zonal NEGs with `GCE_VM_IP_PORT` endpoints | Zonal hybrid connectivity NEGs |
|---|---|---|---|---|
| `max-utilization`: Target utilization per backend zone | ✓ | ✓ | | |
| `max-connections`: Target TCP connections per backend zone | ✓ | | | |
| `max-connections` and `max-utilization`: Target is the first to be reached in the backend zone | ✓ | | | |
| `max-connections-per-instance`: Target TCP connections per VM instance. Cloud Load Balancing uses this parameter to calculate target TCP connections per backend zone. | ✓ | ✓ | | |
| `max-connections-per-instance` and `max-utilization`: Target is the first to be reached in the backend zone | ✓ | ✓ | | |
For more information about the `max-connections` and `max-connections-per-instance` target capacity parameters, see Connection balancing mode in this document.
Custom metrics balancing mode
Application Load Balancer and proxy Network Load Balancer backends can use the `CUSTOM_METRICS` balancing mode. Custom metrics let you define target capacity based on the application or infrastructure data that's most important to you. For more information, see Custom metrics for Application Load Balancers.

The `CUSTOM_METRICS` balancing mode has no mandatory target capacity setting, but you can optionally define a target capacity by using one of the target capacity parameters or combinations of target capacity parameters described in the following sections.
Custom metrics target capacity parameters for Application Load Balancer backends with an unspecified or short traffic duration setting
Application Load Balancer backends with an unspecified or short traffic duration setting (Preview) can use the `CUSTOM_METRICS` balancing mode with one of the following target capacity parameters or combinations of parameters:
| Target capacity parameter or parameter combination | Zonal (managed or unmanaged) instance groups | Regional managed instance groups | Zonal NEGs with `GCE_VM_IP_PORT` endpoints | Zonal hybrid connectivity NEGs |
|---|---|---|---|---|
| `backends[].customMetrics[].maxUtilization`: Target custom metric utilization per backend zone | ✓ | ✓ | ✓ | ✓ |
| `max-rate`: Target HTTP request rate per backend zone | ✓ | | ✓ | ✓ |
| `max-rate` and `backends[].customMetrics[].maxUtilization`: Target is the first to be reached in the backend zone | ✓ | | ✓ | ✓ |
| `max-rate-per-instance`: Target HTTP request rate per VM instance. Cloud Load Balancing uses this parameter to calculate the target HTTP request rate per backend zone. | ✓ | ✓ | | |
| `max-rate-per-instance` and `backends[].customMetrics[].maxUtilization`: Target is the first to be reached in the backend zone | ✓ | ✓ | | |
| `max-rate-per-endpoint`: Target HTTP request rate per NEG endpoint. Cloud Load Balancing uses this parameter to calculate the target HTTP request rate per backend zone. | | | ✓ | ✓ |
| `max-rate-per-endpoint` and `backends[].customMetrics[].maxUtilization`: Target is the first to be reached in the backend zone | | | ✓ | ✓ |
For more information about the `max-rate`, `max-rate-per-instance`, and `max-rate-per-endpoint` target capacity parameters, see Rate balancing mode in this document.
Custom metrics target capacity parameters for Application Load Balancer backends with a long traffic duration setting
Application Load Balancer backends with a long traffic duration setting can use the `CUSTOM_METRICS` balancing mode with one of the following target capacity parameters or combinations of parameters:
| Target capacity parameter or parameter combination | Zonal (managed or unmanaged) instance groups | Regional managed instance groups | Zonal NEGs with `GCE_VM_IP_PORT` endpoints | Zonal hybrid connectivity NEGs |
|---|---|---|---|---|
| `backends[].customMetrics[].maxUtilization`: Target custom metric utilization per backend zone | ✓ | ✓ | ✓ | ✓ |
| `max-in-flight-requests`: Target number of in-progress HTTP requests per backend zone | ✓ | | ✓ | ✓ |
| `max-in-flight-requests` and `backends[].customMetrics[].maxUtilization`: Target is the first to be reached in the backend zone | ✓ | | ✓ | ✓ |
| `max-in-flight-requests-per-instance`: Target number of in-progress HTTP requests per VM instance. Cloud Load Balancing uses this parameter to calculate the target number of in-progress HTTP requests per backend zone. | ✓ | ✓ | | |
| `max-in-flight-requests-per-instance` and `backends[].customMetrics[].maxUtilization`: Target is the first to be reached in the backend zone | ✓ | ✓ | | |
| `max-in-flight-requests-per-endpoint`: Target number of in-progress HTTP requests per NEG endpoint. Cloud Load Balancing uses this parameter to calculate the target number of in-progress HTTP requests per backend zone. | | | ✓ | ✓ |
| `max-in-flight-requests-per-endpoint` and `backends[].customMetrics[].maxUtilization`: Target is the first to be reached in the backend zone | | | ✓ | ✓ |
For more information about the `max-in-flight-requests`, `max-in-flight-requests-per-instance`, and `max-in-flight-requests-per-endpoint` target capacity parameters, see In-flight balancing mode in this document.
Service load balancing policy
A service load balancing policy (`serviceLbPolicy`) is a resource associated with the load balancer's backend service. It lets you customize the parameters that influence how traffic is distributed within the backends associated with a backend service:
- Customize the load balancing algorithm used to determine how traffic isdistributed among regions or zones.
- Enable auto-capacity draining so that the load balancer can quickly draintraffic from unhealthy backends.
Additionally, you can designate specific backends as preferred backends. These backends must be used to capacity (that is, the target capacity specified by the backend's balancing mode) before requests are sent to the remaining backends.

To learn more, see Advanced load balancing optimizations.
Load balancing locality policy
For a backend service, traffic distribution is based on a balancing mode and a load balancing locality policy. The balancing mode determines the fraction of traffic that should be sent to each backend (instance group or NEG). The load balancing locality policy (`localityLbPolicy`) then determines how traffic is distributed across instances or endpoints within each zone. For regional managed instance groups, the locality policy applies to each constituent zone.
The load balancing locality policy is configured per backend service. The following settings are available:
- `ROUND_ROBIN` (default): This is the default load balancing locality policy setting, in which the load balancer selects a healthy backend in round-robin order.
- `LEAST_REQUEST`: An O(1) algorithm in which the load balancer selects two random healthy hosts and picks the host that has fewer active requests.
- `RING_HASH`: This algorithm implements consistent hashing to backends. The algorithm has the property that the addition or removal of a host from a set of N hosts only affects 1/N of the requests.
- `RANDOM`: The load balancer selects a random healthy host.
- `ORIGINAL_DESTINATION`: The load balancer selects a backend based on the client connection metadata. Connections are opened to the original destination IP address specified in the incoming client request, before the request was redirected to the load balancer.
- `MAGLEV`: Implements consistent hashing to backends and can be used as a replacement for the `RING_HASH` policy. Maglev is not as stable as `RING_HASH`, but it has faster table lookup build times and host selection times. For more information about Maglev, see the Maglev whitepaper.
- `WEIGHTED_MAGLEV`: Implements per-instance weighted load balancing for external passthrough Network Load Balancers by using weights reported by health checks. If this policy is used, the backend service must configure a non-legacy HTTP-based health check, and health check replies are expected to contain the non-standard HTTP response header field `X-Load-Balancing-Endpoint-Weight` to specify the per-instance weights. Load balancing decisions are made based on the per-instance weights reported in the last processed health check replies, as long as every instance reports a valid weight or reports `UNAVAILABLE_WEIGHT`. Otherwise, load balancing remains equal-weight. For an example, see Set up weighted load balancing for external passthrough Network Load Balancers.
- `WEIGHTED_ROUND_ROBIN`: The load balancer uses user-defined custom metrics to select the optimal instance or endpoint within the backend to serve the request.
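To illustrate the `LEAST_REQUEST` policy described above, the "sample two, keep the less busy" selection can be modeled like this (a simplified simulation, not the load balancer's actual implementation):

```python
import random

def least_request_pick(active_requests: dict, rng: random.Random) -> str:
    """Sample two distinct healthy hosts and return the one with fewer
    active requests: an O(1) choice regardless of fleet size."""
    a, b = rng.sample(list(active_requests), 2)
    return a if active_requests[a] <= active_requests[b] else b

rng = random.Random(0)
load = {"backend-1": 12, "backend-2": 3, "backend-3": 40}
picks = [least_request_pick(load, rng) for _ in range(1000)]
# backend-3 carries the most active requests, so it loses every pairing
# it is sampled into and receives no new requests in this snapshot.
print(picks.count("backend-3"))  # 0
```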
| Load balancer | Load balancing locality policy options |
|---|---|
| Global external Application Load Balancer, regional external Application Load Balancer, cross-region internal Application Load Balancer, regional internal Application Load Balancer | Supported options |
| Global external proxy Network Load Balancer, regional external proxy Network Load Balancer, cross-region internal proxy Network Load Balancer, regional internal proxy Network Load Balancer | Supported options |
| External passthrough Network Load Balancer | Supported options |
| Internal passthrough Network Load Balancer, classic Application Load Balancer, classic proxy Network Load Balancer | Not supported |
Note that the effective default value of the load balancing locality policy (`localityLbPolicy`) changes according to your session affinity settings. If session affinity is not configured (that is, if session affinity remains at the default value of `NONE`), then the default value for `localityLbPolicy` is `ROUND_ROBIN`. If session affinity is set to a value other than `NONE`, then the default value for `localityLbPolicy` is `MAGLEV`.
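This defaulting rule can be restated as a tiny helper (our own paraphrase of the behavior, not a Google Cloud API):

```python
def default_locality_lb_policy(session_affinity: str = "NONE") -> str:
    """Session affinity NONE -> default localityLbPolicy is ROUND_ROBIN;
    any other session affinity -> default is MAGLEV."""
    return "ROUND_ROBIN" if session_affinity == "NONE" else "MAGLEV"

print(default_locality_lb_policy("NONE"))       # ROUND_ROBIN
print(default_locality_lb_policy("CLIENT_IP"))  # MAGLEV
```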
To configure a load balancing locality policy, you can use the Google Cloud console, gcloud (`--locality-lb-policy`), or the API (`localityLbPolicy`).
Backend subsetting
Backend subsetting is an optional feature that improves performance and scalability by assigning a subset of backends to each of the proxy instances.
Backend subsetting is supported for the following:
- Regional internal Application Load Balancer
- Internal passthrough Network Load Balancer
Backend subsetting for regional internal Application Load Balancers
Preview
This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA products and features are available "as is" and might have limited support. For more information, see the launch stage descriptions.
The cross-region internal Application Load Balancer doesn't support backend subsetting. For regional internal Application Load Balancers, backend subsetting automatically assigns only a subset of the backends within the regional backend service to each proxy instance. By default, each proxy instance opens connections to all the backends within a backend service. When the number of proxy instances and the number of backends are both large, opening connections to all the backends can lead to performance issues.

By enabling subsetting, each proxy only opens connections to a subset of the backends, reducing the number of connections that are kept open to each backend. Reducing the number of simultaneously open connections to each backend can improve performance for both the backends and the proxies.

The following diagram shows a load balancer with two proxies. Without backend subsetting, traffic from both proxies is distributed to all the backends in backend service 1. With backend subsetting enabled, traffic from each proxy is distributed to a subset of the backends. Traffic from proxy 1 is distributed to backends 1 and 2, and traffic from proxy 2 is distributed to backends 3 and 4.
You can additionally refine the load balancing traffic to the backends by setting the `localityLbPolicy` policy. For more information, see Traffic policies.

To read about setting up backend subsetting for internal Application Load Balancers, see Configure backend subsetting.
Caveats related to backend subsetting for internal Application Load Balancer
- Although backend subsetting is designed to ensure that all backend instances remain well utilized, it can introduce some bias in the amount of traffic that each backend receives. Setting the `localityLbPolicy` to `LEAST_REQUEST` is recommended for backend services that are sensitive to the balance of backend load.
- Enabling or disabling subsetting breaks existing connections.
- Backend subsetting requires that the session affinity is `NONE` (a 5-tuple hash). Other session affinity options can only be used if backend subsetting is disabled. The default values of the `--subsetting-policy` and `--session-affinity` flags are both `NONE`, and only one of them at a time can be set to a different value.
Backend subsetting for internal passthrough Network Load Balancer
Backend subsetting for internal passthrough Network Load Balancers lets you scale your internal passthrough Network Load Balancer to support a larger number of backend VM instances per internal backend service.

For information about how subsetting affects this limit, see Backend services in "Quotas and limits".

By default, subsetting is disabled, which limits the backend service to distributing to up to 250 backend instances or endpoints. If your backend service needs to support more than 250 backends, you can enable subsetting. When subsetting is enabled, a subset of backend instances is selected for each client connection.
The following diagram shows a scaled-down model of the difference between thesetwo modes of operation.
Without subsetting, the complete set of healthy backends is better utilized, and new client connections are distributed among all healthy backends according to traffic distribution. Subsetting imposes load balancing restrictions but allows the load balancer to support more than 250 backends.

For configuration instructions, see Subsetting.
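The effect of subsetting can be modeled as a deterministic per-client choice of a few backends. The sketch below is purely illustrative: the real subset-selection algorithm is internal to Cloud Load Balancing, and the helper and hash choice here are our own.

```python
import hashlib

def backend_subset(backends, client_key: str, subset_size: int):
    """Rank backends by a per-client hash and keep the first few, so that a
    given client consistently connects to the same small subset."""
    def rank(b):
        return hashlib.sha256(f"{client_key}/{b}".encode()).hexdigest()
    return sorted(backends, key=rank)[:subset_size]

backends = [f"vm-{i}" for i in range(300)]
subset = backend_subset(backends, client_key="10.0.0.7", subset_size=4)
print(len(subset))                                        # 4
print(subset == backend_subset(backends, "10.0.0.7", 4))  # True
```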
Caveats related to backend subsetting for internal passthrough Network Load Balancer
- When subsetting is enabled, not all backends receive traffic from a given sender, even when the number of backends is small.
- For the maximum number of backend instances when subsetting is enabled, see the quotas page.
- Only 5-tuple session affinity is supported with subsetting.
- Packet Mirroring is not supported with subsetting.
- Enabling or disabling subsetting breaks existing connections.
- If on-premises clients need to access an internal passthrough Network Load Balancer, subsetting can substantially reduce the number of backends that receive connections from your on-premises clients. This is because the region of the Cloud VPN tunnel or Cloud Interconnect VLAN attachment determines the subset of the load balancer's backends. All Cloud VPN and Cloud Interconnect endpoints in a specific region use the same subset. Different subsets are used in different regions.
Backend subsetting pricing
There is no charge for using backend subsetting. For more information, see All networking pricing.
Session affinity
Session affinity lets you control how the load balancer selects backends for new connections in a predictable way, as long as the number of healthy backends remains constant. This is useful for applications that need multiple requests from a given user to be directed to the same backend or endpoint. Such applications usually include stateful servers used by ads serving, games, or services with heavy internal caching.

Google Cloud load balancers provide session affinity on a best-effort basis. Factors such as changing backend health check states, adding or removing backends, changes in backend weights (including enabling or disabling weighted balancing), or changes to backend fullness, as measured by the balancing mode, can break session affinity.

Load balancing with session affinity works well when there is a reasonably large distribution of unique connections. Reasonably large means at least several times the number of backends. Testing a load balancer with a small number of connections won't result in an accurate representation of the distribution of client connections among backends.

By default, all Google Cloud load balancers select backends by using a five-tuple hash (`--session-affinity=NONE`), as follows:
- Packet's source IP address
- Packet's source port (if present in the packet's header)
- Packet's destination IP address
- Packet's destination port (if present in the packet's header)
- Packet's protocol
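A toy model of this five-tuple selection (the helper and hash choice are ours; the load balancer's actual hash is internal):

```python
import zlib

def pick_backend(backends, src_ip, src_port, dst_ip, dst_port, protocol):
    """Hash the connection five-tuple and map it onto the healthy backends.
    Packets of the same flow map to the same backend for as long as the
    set of healthy backends stays unchanged."""
    flow = (src_ip, src_port, dst_ip, dst_port, protocol)
    return backends[zlib.crc32(repr(flow).encode()) % len(backends)]

backends = ["vm-a", "vm-b", "vm-c"]
flow = ("203.0.113.9", 51514, "198.51.100.1", 443, "TCP")
print(pick_backend(backends, *flow) == pick_backend(backends, *flow))  # True
# Changing the backend count generally remaps flows, which is why adding
# or removing backends can break this default (NONE) affinity.
```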
To learn more about session affinity for passthrough Network Load Balancers, see the following documents:
- Traffic distribution for external passthrough Network Load Balancers
- Traffic distribution for internal passthrough Network Load Balancers
To learn more about session affinity for Application Load Balancers, see the following documents:
- Session affinity for external Application Load Balancers
- Session affinity for internal Application Load Balancers
To learn more about session affinity for proxy Network Load Balancers, see the following documents:
- Session affinity for external proxy Network Load Balancers
- Session affinity for internal proxy Network Load Balancers
Backend service timeout
Most Google Cloud load balancers have a backend service timeout. The default value is 30 seconds. The full range of allowed timeout values is 1 to 2,147,483,647 seconds.

For external Application Load Balancers and internal Application Load Balancers using the HTTP, HTTPS, or HTTP/2 protocol, the backend service timeout is a request and response timeout for HTTP(S) traffic.
For more details about the backend service timeout for each load balancer, see the following:

- For global external Application Load Balancers and regional external Application Load Balancers, see Timeouts and retries.
- For internal Application Load Balancers, see Timeouts and retries.

For external proxy Network Load Balancers and internal proxy Network Load Balancers, the configured backend service timeout is the length of time that the load balancer keeps the TCP connection open in the absence of any data transmitted from either the client or the backend. After this time has passed without any data transmitted, the proxy closes the connection.
- Default value: 30 seconds
- Configurable range: 1 to 2,147,483,647 seconds
- For internal passthrough Network Load Balancers and external passthrough Network Load Balancers, you can set the value of the backend service timeout using `gcloud` or the API, but the value is ignored. Backend service timeout has no meaning for these pass-through load balancers.
- For Cloud Service Mesh, the backend service timeout field (specified using `timeoutSec`) is not supported with proxyless gRPC services. For such services, configure the backend service timeout using the `maxStreamDuration` field. This is because gRPC does not support the semantics of `timeoutSec`, which specifies the amount of time to wait for a backend to return a full response after the request is sent. gRPC's timeout specifies the amount of time to wait from the beginning of the stream until the response has been completely processed, including all retries.
Health checks
Each backend service whose backends are instance groups or zonal NEGs must have an associated health check. Backend services using a serverless NEG or a global internet NEG as a backend must not reference a health check.

When you create a load balancer using the Google Cloud console, you can create the health check (if one is required) when you create the load balancer, or you can reference an existing health check.

When you create a backend service using either instance group or zonal NEG backends using the Google Cloud CLI or the API, you must reference an existing health check. Refer to the load balancer guide in the Health Checks Overview for details about the type and scope of health check required.
For more information, read the following documents:
Additional features enabled on the backend service resource
The following optional features are supported by some backend services.
Cloud CDN
Cloud CDN uses Google's global edge network to serve content closer to users, which accelerates your websites and applications. Cloud CDN is enabled on backend services used by global external Application Load Balancers. The load balancer provides the frontend IP addresses and ports that receive requests, and the backends that respond to the requests.

For more details, see the Cloud CDN documentation.

Cloud CDN is incompatible with IAP. They can't be enabled on the same backend service.
Cloud Armor
If you use one of the following load balancers, you can add protection to your applications by enabling Cloud Armor on the backend service during load balancer creation:
- Global external Application Load Balancer
- Classic Application Load Balancer
- Global external proxy Network Load Balancer
- Classic proxy Network Load Balancer
If you use the Google Cloud console, you can do one of the following:
- Select an existing Cloud Armor security policy.
- Accept the configuration of a default Cloud Armor rate-limiting security policy with a customizable name, request count, interval, key, and rate limiting parameters. If you use Cloud Armor with an upstream proxy service, such as a CDN provider, `Enforce_on_key` should be set as an XFF IP address.
- Choose to opt out of Cloud Armor protection by selecting None.
IAP
IAP lets you establish a central authorization layer for applications accessed by HTTPS, so you can use an application-level access control model instead of relying on network-level firewalls. IAP is supported by certain Application Load Balancers.

IAP is incompatible with Cloud CDN. They can't be enabled on the same backend service.
Advanced traffic management features
To learn about advanced traffic management features that are configured on the backend services and URL maps associated with load balancers, see the following:

- Traffic management overview for internal Application Load Balancers
- Traffic management overview for global external Application Load Balancers
- Traffic management overview for regional external Application Load Balancers
API and gcloud reference
For more information about the properties of the backend service resource, see the following references:

- Global backend service API resource
- Regional backend service API resource
- `gcloud compute backend-services` page, for both global and regional backend services
What's next
For related documentation and information about how backend services are used in load balancing, review the following:
- Create custom headers
- Create an external Application Load Balancer
- External Application Load Balancer overview
- Enable connection draining
- Encryption in transit in Google Cloud
Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-02-18 UTC.