Serverless network endpoint groups overview

A network endpoint group (NEG) specifies a group of backend endpoints for a load balancer. A serverless NEG is a backend that points to a Cloud Run, App Engine, Cloud Run functions, or API Gateway resource.

A serverless NEG can represent one of the following:

A Cloud Run resource or a group of resources.
A Cloud Run function or group of functions (formerly Cloud Run functions 2nd gen).
A Cloud Run function (1st gen) or group of functions.
An App Engine standard environment or App Engine flexible environment app, a specific service within an app, a specific version of an app, or a group of services.
An API Gateway that provides access to your services through a REST API consistent across all services, regardless of service implementation. This capability is in Preview.

Important: For a group of resources to be in the same serverless NEG, they must have a common URL pattern.

Supported load balancers

The following table lists the serverless products supported by each Application Load Balancer. Serverless NEGs are not supported by proxy Network Load Balancers and passthrough Network Load Balancers.

Serverless NEG type	Regional internal	Cross-region internal	Global external	Classic	Regional external
Cloud Run (supports Cloud Run and Cloud Run functions (2nd gen))	Supported	Supported	Supported	Supported	Supported
App Engine	Not supported	Not supported	Supported	Supported	Not supported
Cloud Functions (supports Cloud Run functions (1st gen), formerly known as Cloud Functions 1st gen)	Not supported	Not supported	Supported	Supported	Not supported

Use cases

When your load balancer is enabled for serverless apps, you can do the following:

Configure your serverless app to serve from a dedicated IPv4 address that is not shared with other services.
Map a single URL to multiple serverless functions or services that serve at the same domain. In this document, see URL masks.
Share URL space with other Google Cloud compute platforms. By using multiple backend services, a single load balancer can send traffic to multiple backend types. The load balancer selects the correct backend service based on the host or path of the request URL.
Reuse the same SSL certificates and private keys that you use for Compute Engine, Google Kubernetes Engine, and Cloud Storage. Reusing the same certificates eliminates the need to manage separate certificates for serverless apps.

Global external Application Load Balancer and classic Application Load Balancer

Setting up a global external Application Load Balancer or a classic Application Load Balancer enables your serverless apps to integrate with existing cloud services. You can do the following:

Protect your service with Google Cloud Armor, an edge DDoS defense and WAF security product available to all services accessed through an external Application Load Balancer. There are some limitations associated with this capability, especially for Cloud Run and App Engine.
Enable your service to optimize delivery using Cloud CDN. Cloud CDN caches content close to your users. Cloud CDN provides capabilities like cache invalidation and Cloud CDN signed URLs.
Use Google's edge infrastructure to terminate users' HTTP(S) connections closer to the user, which decreases latency.

To learn how to configure a load balancer with a serverless compute backend, see the load balancer setup documentation for your serverless product.

Integrating an external Application Load Balancer with API Gateway enables your serverless backends to take advantage of all the features provided by Cloud Load Balancing. For more information, see External Application Load Balancer for API Gateway. To configure an external Application Load Balancer to route traffic to an API Gateway, see Getting started with an external Application Load Balancer for API Gateway. This capability is in Preview.

Regional external Application Load Balancer

Using a regional external Application Load Balancer lets you run workloads with regulatory or compliance requirements on Cloud Run or Cloud Run functions (2nd gen) backends. For example, if you require that your application's network configurations and traffic termination reside in a specific region, a regional external Application Load Balancer is often the preferred option to comply with the necessary jurisdictional controls.

To learn how to configure a regional external Application Load Balancer with a serverless compute backend, see Set up a regional external Application Load Balancer with Cloud Run.

Regional internal Application Load Balancer and cross-region internal Application Load Balancer

When an internal Application Load Balancer is configured with Cloud Run or Cloud Run functions (2nd gen) backends, you can do the following:

Enable advanced traffic management features such as fault injection, header rewrites, redirects, traffic splitting, and more, for your Cloud Run and Cloud Run functions (2nd gen) services.
Seamlessly migrate legacy services from Compute Engine, GKE, or on-premises to Cloud Run and Cloud Run functions (2nd gen), using weight-based traffic splitting to gradually shift traffic to Cloud Run without any downtime.
Protect your Cloud Run and Cloud Run functions (2nd gen) services with VPC Service Controls.
Establish a single, policy-enforcing internal ingress point for your services running in Cloud Run, Cloud Run functions (2nd gen), Compute Engine, and GKE.

To learn how to configure a regional internal Application Load Balancer with a serverless compute backend, see Set up a regional internal Application Load Balancer with Cloud Run.

The rest of this page discusses how to use serverless NEGs with your Application Load Balancers. For more information about other types of NEGs, see Network endpoint groups overview.

Endpoint types

Serverless NEGs don't have any network endpoints such as ports or IP addresses. They can only point to an existing Cloud Run, App Engine, API Gateway, or Cloud Run functions resource residing in the same region as the NEG.

When you create a serverless NEG, you specify the fully qualified domain name (FQDN) of the Cloud Run, App Engine, API Gateway, or Cloud Run functions resource. The endpoint is of type SERVERLESS. Other endpoint types are not supported in a serverless NEG.

A serverless NEG cannot have more than one endpoint. The endpoint points to either a serverless application or a URL mask. The load balancer serves as the frontend for the serverless compute application and proxies traffic to the specified endpoint. However, if the backend service contains multiple serverless NEGs in different regions, the load balancer sends traffic to the NEG in the closest region to minimize request latency.

Network tier

For global external Application Load Balancers, you can use a serverless NEG in a load balancer using either Standard or Premium Network Service Tiers. The Premium Tier is required only if you want to set up serverless NEGs in multiple regions.

Regional external Application Load Balancers are always Standard Tier.

Cross-region internal Application Load Balancers and regional internal Application Load Balancers are always Premium Tier.

Load balancing components

A load balancer using a serverless NEG backend requires special configuration only for the backend service. The frontend configuration is the same as any other proxy-based Google Cloud load balancer. Additionally, internal Application Load Balancers require a proxy-only subnet to run Envoy proxies on your behalf.

The following diagrams show a sample serverless NEG deployment.

Global external

This diagram shows how a serverless NEG fits into a global external Application Load Balancer architecture.

Regional external

This diagram shows how a serverless NEG fits into a regional external Application Load Balancer architecture.

Regional internal

This diagram shows how a serverless NEG fits into the regional internal Application Load Balancer model.

Cross-region

This diagram shows how a serverless NEG fits into the cross-region internal Application Load Balancer model.

Frontend components

No special frontend configuration is required for load balancing with serverless NEG backends. Forwarding rules are used to route traffic by IP address, port, and protocol to a target proxy. The target proxy then terminates connections from clients.

URL maps are used by Application Load Balancers to set up URL-based routing of requests to the appropriate backend services.

For more details on each of these components, refer to the architecture sections of the specific load balancer overviews.

Backend service

Backend services provide configuration information to the load balancer. Load balancers use the information in a backend service to direct incoming traffic to one or more attached backends. Serverless NEGs can be used as backends for certain load balancers.

The following restrictions apply depending on the type of load balancer:

A global backend service used by global external Application Load Balancers can have several serverless NEGs attached to it, but only one serverless NEG per region.
A regional backend service used by regional internal Application Load Balancers and regional external Application Load Balancers can only have one serverless NEG attached to it.
A global backend service used by cross-region internal Application Load Balancers can only have Cloud Run and Cloud Run functions (2nd gen) resources attached to it.

Each serverless NEG can point to either of the following:

The FQDN for a single resource
A URL mask that points to multiple resources that serve at the same domain

A URL mask is a URL schema template that tells the serverless NEG backend how to map the user request to the correct service. URL masks are useful if you are using a custom domain for your serverless application and have multiple services serving at the same domain. Instead of creating a separate serverless NEG for each resource, you can create the NEG with a generic URL mask for the custom domain. For more information and examples, see URL masks.

For additional restrictions when adding a serverless NEG as a backend, see Limitations.

Outlier detection for serverless NEGs

Outlier detection is an optional configuration that can be enabled on a global backend service that has serverless NEGs attached to it. Outlier detection analysis is available only for cross-region internal Application Load Balancers and global external Application Load Balancers, not for classic Application Load Balancers. The analysis identifies unhealthy serverless NEGs based on their HTTP response patterns, and reduces the error rate by routing most new requests away from unhealthy resources to healthy resources. To learn how the outlier detection algorithm works and understand its limitations, see the following example.

Assume that there is a backend service with two serverless NEGs attached to it: one in the REGION_A region and another in the REGION_B region. If the serverless NEG that serves as a backend to a global external Application Load Balancer in the REGION_A region is not responsive, outlier detection identifies the serverless NEG as unhealthy. Based on outlier detection analysis, some of the new requests are then sent to the serverless NEG in the REGION_B region.

Based on the type of server error that is encountered, you can use one of the following outlier detection methods to enable outlier detection:

Consecutive 5xx errors. A 5xx series HTTP status code qualifies as an error.
Consecutive gateway errors. Only 502, 503, and 504 HTTP status codes qualify as an error.
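
The difference between the two methods can be sketched in a few lines of Python. This is only an illustration of which status codes each method counts: the method labels, the `should_eject` helper, and the `threshold` parameter are hypothetical names chosen for this sketch, and the code is not the load balancer's actual algorithm.

```python
def is_error(status: int, method: str) -> bool:
    """Return True if `status` counts as an error under the given method.

    The method labels are hypothetical names used only in this sketch.
    """
    if method == "consecutive_5xx":
        # Any 5xx series status code qualifies.
        return 500 <= status <= 599
    if method == "consecutive_gateway":
        # Only gateway-style failures qualify.
        return status in {502, 503, 504}
    raise ValueError(f"unknown method: {method}")


def should_eject(statuses, method: str, threshold: int) -> bool:
    """Return True once `threshold` errors in a row have been observed."""
    streak = 0
    for status in statuses:
        streak = streak + 1 if is_error(status, method) else 0
        if streak >= threshold:
            return True
    return False
```

For example, three consecutive 500 responses trip the 5xx method at a threshold of 3, but they never trip the gateway method, because 500 is not one of 502, 503, or 504.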

Note that even after enabling outlier detection, you'll likely see some requests being sent to the unhealthy resource and thus returning 5xx errors to the clients. This is because results of the outlier detection algorithm (ejection of endpoints from the load balancing pool and returning them back to the pool) are executed independently by each proxy instance of the load balancer. In most cases, more than one proxy instance handles the traffic received by a backend service. Thus, it is possible that an unhealthy endpoint is detected and ejected by only some of the proxies, and while this happens, other proxies may continue to send requests to the same unhealthy endpoint.

To reduce error rates further, you can configure more aggressive outlier detection parameters. We recommend configuring higher values for the ejection thresholds (outlierDetection.baseEjectionTime). For example, our tests show that setting outlierDetection.baseEjectionTime to 180 seconds with a sustained QPS of higher than 100 results in less than 5% observed error rates. To learn more about the outlier detection API, see outlierDetection in the global backend service API documentation.

The following outlierDetection fields are not supported when the backend service has a serverless NEG attached to it:

outlierDetection.enforcingSuccessRate
outlierDetection.successRateMinimumHosts
outlierDetection.successRateRequestVolume
outlierDetection.successRateStdevFactor
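
As a minimal sketch, the following Python snippet checks an outlierDetection configuration for the unsupported fields listed above before it is applied. The dictionary layout is an assumption made for illustration (in particular, expressing baseEjectionTime as a seconds value inside a nested object, and the consecutiveErrors value of 5); the `unsupported_fields` helper is hypothetical and not part of any Google Cloud client library.

```python
# Fields listed above as unsupported with serverless NEG backends.
UNSUPPORTED_FIELDS = {
    "enforcingSuccessRate",
    "successRateMinimumHosts",
    "successRateRequestVolume",
    "successRateStdevFactor",
}


def unsupported_fields(outlier_detection: dict) -> list:
    """Return the unsupported field names present in the config, sorted."""
    return sorted(UNSUPPORTED_FIELDS & outlier_detection.keys())


# Illustrative configuration; the exact shape of each value is an assumption.
outlier_detection = {
    "consecutiveErrors": 5,                # illustrative threshold
    "baseEjectionTime": {"seconds": 180},  # value mentioned in the text above
}

problems = unsupported_fields(outlier_detection)
```

An empty `problems` list means the configuration avoids the success-rate fields. Running such a check before submitting an update is one way to catch the issue early.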

To learn how to configure outlier detection, see Set up a global external Application Load Balancer with a serverless backend: Enable outlier detection.

Note: Outlier detection analysis for serverless NEGs can't be configured on backend services that have Identity-Aware Proxy (IAP) enabled. This is because the outlier detection analysis process on an IAP-enabled backend service cannot identify unhealthy serverless resources. As a result, it cannot route new requests to a healthy serverless resource.

URL masks

A serverless NEG backend can point to either a single Cloud Run (or App Engine or Cloud Run functions if applicable) resource, or a URL mask that points to multiple resources. A URL mask is a template of your URL schema. The serverless NEG uses this template to map the request to the appropriate resource.

URL masks are an optional feature that makes it easier to configure serverless NEGs when your serverless application consists of multiple Cloud Run, Cloud Run functions, or App Engine resources. Serverless NEGs used with internal Application Load Balancers can only use a URL mask that points to Cloud Run or Cloud Run functions (2nd gen) services.

URL masks are useful if your serverless app is mapped to a custom domain rather than the default address that Google Cloud provides. With a custom domain such as example.com, you could have multiple resources deployed to different subdomains or paths on the same domain. In such cases, instead of creating a separate serverless NEG backend for each resource, you can create a single serverless NEG with a generic URL mask for the custom domain (for example, example.com/<service>). The NEG extracts the service name from the request's URL.

The following illustration shows an external Application Load Balancer with a single backend service and serverless NEG that uses a URL mask to map user requests to different services.

Figure: Using a URL mask to distribute traffic to different services.

URL masks work best when your application's resources use a predictable URL schema. The advantage of using a URL mask instead of a URL map is that you don't need to create separate serverless NEGs for the login and search services. You also don't need to modify your load balancer configuration each time you add a new resource to your application.
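
To make the idea of a URL mask concrete, here is a minimal Python sketch that treats a mask such as example.com/<service> as a template and extracts the placeholder value from a request URL. It is a simplified illustration only: the `compile_mask` and `extract` helpers are hypothetical, and the matching rules (a placeholder stops at the next "/", "." or "?", and anything after the matched portion is ignored) are assumptions of this sketch rather than the load balancer's documented behavior.

```python
import re


def compile_mask(mask: str) -> "re.Pattern":
    """Turn a mask like 'example.com/<service>' into a compiled regex."""
    pattern = ""
    # Split the mask into literal text and <placeholder> tokens.
    for part in re.split(r"(<[A-Za-z_]\w*>)", mask):
        token = re.fullmatch(r"<([A-Za-z_]\w*)>", part)
        if token:
            # Assumption: a placeholder captures one host or path segment.
            pattern += f"(?P<{token.group(1)}>[^/.?]+)"
        else:
            pattern += re.escape(part)
    return re.compile(pattern)


def extract(mask: str, url: str):
    """Return the placeholder values found in `url`, or None if no match."""
    match = compile_mask(mask).match(url)
    return match.groupdict() if match else None


login = extract("example.com/<service>", "example.com/login")
search = extract("example.com/<service>", "example.com/search")
```

With this sketch, both requests resolve through the same mask to different service names, which mirrors why a single serverless NEG with a URL mask can stand in for one NEG per service.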

Limitations

A serverless NEG cannot have any network endpoints such as an IP address or port.
Serverless NEGs can point only to serverless resources residing in the same region where the NEG is created.
For a load balancer that is using a serverless NEG backend, the serverless NEG must be created in the same project as the backing Cloud Run, App Engine, API Gateway, or Cloud Run functions resources pointed to by the NEG. You might see requests failing if you connect a service that is not in the same project as the serverless NEG.
A load balancer configured with a serverless NEG cannot detect whether the underlying serverless resource is working as expected. This means that even if your resource is returning errors, the load balancer continues to direct traffic to it. Make sure to thoroughly test new versions of your resources before routing user traffic to them.
Google Cloud console. You can only create serverless NEGs while you are creating or editing a load balancer by using the Load balancing page in the Google Cloud console. While you can't create or edit serverless NEGs on the Network endpoint groups page, you can use this page to see a list of all the serverless NEGs in your project.

Limitations with backend services

The following limitations apply to backend services that have a serverless NEG backend:

A global backend service used by global external Application Load Balancers can have only one serverless NEG per region. To combine multiple serverless NEGs in a single backend service, all the NEGs must represent functionally equivalent deployments in different regions. For example, the NEGs can point to the same Cloud Run, App Engine, or Cloud Run functions resource deployed in different regions.
A global backend service used by cross-region internal Application Load Balancers can have only one Cloud Run or Cloud Run functions (2nd gen) resource attached to it.
A regional backend service can only have one serverless NEG attached to it.
Cross-project service referencing in a Shared VPC deployment is supported with configurations that contain a serverless NEG. To use this feature, you create the load balancer's frontend components (IP address, forwarding rule, target proxy, and URL map) in a project different from the load balancer's backend components (backend service and serverless NEGs). Note that the backend service, associated serverless NEGs, and the backing serverless resource (Cloud Run, App Engine, API Gateway, or Cloud Run functions) must always be created in the same project.
The backend service timeout setting does not apply to backend services with serverless NEG backends. Attempting to modify the backend service's resource.timeoutSec property results in the following error: Timeout sec is not supported for a backend service with Serverless network endpoint groups.
For backend services with serverless NEG backends, the default timeout is 60 minutes. This timeout is not configurable. If your application needs long-running connections, configure your clients to retry requests on failure.
All serverless NEGs combined in a backend service must also use the same type of backend. This means Cloud Run serverless NEGs can only be combined with other Cloud Run serverless NEGs, and App Engine serverless NEGs can only be combined with App Engine serverless NEGs.
You cannot mix serverless NEGs with other types of NEGs in the same backend service. For example, you cannot route to a GKE cluster and a Cloud Run service from the same backend service.
When setting up backend services that route to serverless NEGs, certain fields are restricted:
- You cannot specify a balancing mode. That is, the RATE, UTILIZATION, and CONNECTION values have no effect on the load balancer's traffic distribution.
- Health checks are not supported for serverless backends. Therefore, backend services that contain serverless NEG backends cannot be configured with health checks. However, you can optionally enable outlier detection to identify unhealthy serverless resources and route new requests to a healthy serverless resource.
You cannot use the gcloud compute backend-services edit command to modify a backend service with a serverless NEG backend. As a workaround, use the gcloud compute backend-services update command instead.
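
Several of the rules above (one serverless NEG per region, a single backend type, and no mixing with other NEG types) can be expressed as a small pre-flight check. The following Python sketch is a simplified model under stated assumptions: the `Neg` class, its field names, and the `check_global_backend_service` helper are hypothetical and do not correspond to any Google Cloud API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Neg:
    """Hypothetical, simplified description of a NEG on a backend service."""
    name: str
    neg_type: str   # "serverless", or any other label for a non-serverless NEG
    platform: str   # e.g. "cloud-run" or "app-engine"; unused for other types
    region: str


def check_global_backend_service(negs):
    """Return a list of rule violations for a global backend service."""
    problems = []
    serverless = [n for n in negs if n.neg_type == "serverless"]

    # Serverless NEGs cannot share a backend service with other NEG types.
    if serverless and len(serverless) != len(negs):
        problems.append("serverless NEGs mixed with other NEG types")

    # All serverless NEGs must use the same type of backend.
    if len({n.platform for n in serverless}) > 1:
        problems.append("serverless NEGs point to different platforms")

    # Only one serverless NEG is allowed per region.
    regions = [n.region for n in serverless]
    if len(regions) != len(set(regions)):
        problems.append("more than one serverless NEG in a region")

    return problems


valid = [
    Neg("neg-a", "serverless", "cloud-run", "us-central1"),
    Neg("neg-b", "serverless", "cloud-run", "europe-west1"),
]
valid_result = check_global_backend_service(valid)
```

The check returns an empty list for two Cloud Run NEGs in different regions, and one message per broken rule otherwise.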

Additional limitations apply depending on the type of load balancer and the serverless backend.

Limitations with regional internal Application Load Balancers and regional external Application Load Balancers

Serverless NEGs used with regional internal Application Load Balancers or regional external Application Load Balancers can only point to Cloud Run or Cloud Run functions (2nd gen) resources.
For projects that are using serverless NEGs, the queries per second (QPS) limit is 5000 QPS per project for traffic sent to any serverless NEGs configured with regional external Application Load Balancers or regional internal Application Load Balancers. This limit is aggregated across all regional external Application Load Balancers and regional internal Application Load Balancers in the project. This is not a per load balancer limit.
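
Because the limit is aggregated per project, two load balancers that each stay under 5000 QPS can still exceed it together. A short Python sketch of that arithmetic follows; the load balancer names and traffic figures are made up for illustration.

```python
PROJECT_QPS_LIMIT = 5000  # per-project limit stated above


def project_headroom(qps_by_load_balancer: dict) -> int:
    """Return remaining QPS for the project; negative means over the limit."""
    return PROJECT_QPS_LIMIT - sum(qps_by_load_balancer.values())


# Hypothetical traffic: each load balancer is individually below 5000 QPS.
traffic = {"regional-external-lb": 3000, "regional-internal-lb": 2500}
headroom = project_headroom(traffic)
```

Here the combined 5500 QPS leaves a headroom of -500, so the project is over the limit even though neither load balancer reaches 5000 QPS on its own.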

Limitations with cross-region internal Application Load Balancers

Serverless NEGs used with cross-region internal Application Load Balancers can only point to Cloud Run or Cloud Run functions (2nd gen) resources.

Limitations with global external Application Load Balancers

This section lists the limitations you'll encounter when configuring serverless NEGs with global external Application Load Balancers.

Limitations with Cloud Run

An external Application Load Balancer with serverless NEGs does not support Knative serving.
External Application Load Balancers don't support authenticating end-user requests to Cloud Run resources. However, you can use IAP to authenticate users within your organization. If you want to enable IAP, remember that IAP and Cloud CDN are incompatible with each other. They cannot be enabled on the same backend service.

Limitations with App Engine

Multi-region load balancing is not supported with App Engine. This is because App Engine requires one region per project.
If you're using IAP, you must use the same OAuth client ID for all App Engine services associated with a single load balancer.
Only one IAP policy is allowed on the request path. For example, if you have already set an IAP policy in the backend service, you shouldn't set another IAP policy on the App Engine app.
Global external Application Load Balancers with both App Engine flexible environment backends and App Engine standard environment backends don't support cross-project service referencing.

We recommend that you use ingress controls so that your app only receives requests sent from the load balancer (and the VPC if you use it). Otherwise, users can use your app's App Engine URL to bypass the load balancer, Cloud Armor security policies, SSL certificates, and private keys that are passed through the load balancer.

Limitations with API Gateway

For more information, see Limitations on serverless NEGs and API Gateway.

Limitations with traffic management features

Advanced traffic management features like load balancing locality policy and session affinity aren't supported with serverless NEG backends.
Specifying a session affinity on a backend service with a serverless NEG backend won't work. As a workaround for Cloud Run, use its specific session affinity feature.

Pricing

To see pricing information for load balancers with serverless NEGs, see All networking pricing: Cloud Load Balancing.


Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies.

Last updated 2025-12-15 UTC.
