Using load balancing for highly available applications
This tutorial explains how to use load balancing with a regional managed instance group to redirect traffic away from busy or unavailable VM instances, allowing you to provide high availability even during a zonal outage.
A regional managed instance group distributes an application on multiple instances across multiple zones. A global load balancer directs traffic across multiple regions via a single IP address. By using both of these services to distribute your application across multiple zones, you can help ensure that your application is available even in extreme cases, like a zonal disruption.
Load balancers can be used to direct a variety of traffic types. This tutorial shows you how to create a global load balancer that directs external HTTP traffic, but much of the content of this tutorial is still relevant to other types of load balancers. To learn about other types of traffic that can be directed with a load balancer, see Types of Cloud Load Balancing.
This tutorial includes detailed steps for launching a web application on a regional managed instance group, configuring network access, creating a load balancer for directing traffic to the web application, and observing the load balancer by simulating a zonal outage. Depending on your experience with these features, this tutorial takes about 45 minutes to complete.
Objectives
- Launch a demo web application on a regional managed instance group.
- Configure a global load balancer that directs HTTP traffic across multiple zones.
- Observe the effects of the load balancer by simulating a zonal outage.
Costs
In this document, you use the following billable components of Google Cloud:
- Compute Engine
To generate a cost estimate based on your projected usage, use the pricing calculator.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.
Roles required to select or create a project
- Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
- Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
Verify that billing is enabled for your Google Cloud project.
Application architecture
The application includes the following Compute Engine components:
- VPC network: a virtual network within Google Cloud that can provide global connectivity using its own routes and firewall rules.
- Firewall rule: a Google Cloud firewall lets you allow or deny traffic to your instances.
- Instance template: a template used to create each VM instance in the managed instance group.
- Regional managed instance group: a group of VM instances running the same application across multiple zones.
- Global static external IP address: a static IP address that is accessible on external networks and can be attached to a global resource.
- Global load balancer: a load balancer that allows backend instances to be distributed across multiple regions. Use a global load balancer when your users need access to the same applications and content, and you want to provide access using a single anycast IP address.
- Health check: a policy used by the load balancer to evaluate the responsiveness of the application on each VM instance.
Launching the web application
This tutorial uses a web application that is stored on GitHub. If you would like to learn more about how the application was implemented, see the GoogleCloudPlatform/python-docs-samples repository on GitHub.
Launch the web application on every VM in an instance group by including a startup script in an instance template. Additionally, run the instance group in a dedicated VPC network to keep this tutorial's firewall rules from interfering with any existing resources running in your project.
Create a VPC network
Using a VPC network protects existing resources in your project from being affected by the resources that you will create for this tutorial. A VPC network is also required to restrict incoming traffic so that it must go through the load balancer.
Create a VPC network to encapsulate the firewall rules for the demo web application:
In the Google Cloud console, go to the VPC networks page.
Click Create VPC Network.
Under Name, enter web-app-vpc.
Set Subnet creation mode to Custom.
Create a new subnet as follows:
- In the Subnets section, in the Name field, enter web-app-vpc-subnet.
- In the Region drop-down, select us-central1.
- Make sure that the IP stack type option is set to IPv4.
- In the Primary IPv4 range section, enter the IPv4 range 10.2.0.0/24.
At the bottom of the page, click Create.
Wait until the VPC network is created before continuing.
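If you prefer the command line, the console steps above can be approximated with the gcloud CLI. The following is a minimal sketch rather than part of the original procedure: it assumes the gcloud CLI is installed and configured for your project (as it is in Cloud Shell), and the function name create_web_app_network is a hypothetical wrapper introduced only for this example.

```shell
# Sketch: create the tutorial's custom-mode VPC network and its one subnet.
# Resource names, region, and IP range are the values used in the console steps.
# The function name is illustrative only.
create_web_app_network() {
  # Custom subnet mode means that no subnets are created automatically.
  gcloud compute networks create web-app-vpc \
    --subnet-mode=custom || return 1

  # Add the single IPv4 subnet that the instance template will use.
  gcloud compute networks subnets create web-app-vpc-subnet \
    --network=web-app-vpc \
    --region=us-central1 \
    --range=10.2.0.0/24
}
```

Defining the function does not create anything. Run create_web_app_network in your terminal when you are ready to create the network.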
Create a firewall rule
After the VPC network is created, set up a firewall rule to allow HTTP traffic to the VPC network:
Note: This example creates an ingress allow VPC firewall rule whose target is all instances in the network. For production applications, consider using a more specific target. You can also use rules in a global network firewall policy, regional network firewall policy, or hierarchical firewall policy. For more information, see Firewall policies and best practices for network security.
In the Google Cloud console, go to the Firewalls page.
Click Create firewall rule.
In the Name field, enter allow-web-app-http.
Set Network to web-app-vpc.
Make sure that the following options are set as given:
- Direction of traffic option is set to Ingress.
- Action on match option is set to Allow.
In the Targets drop-down, select All instances in the network.
Set Source filter to IPv4 ranges.
In the Source IP ranges field, enter 130.211.0.0/22, 35.191.0.0/16 to allow for load balancer health checks.
Note: Health check probes for the load balancer come from addresses in the ranges 130.211.0.0/22 and 35.191.0.0/16. For this tutorial, your health check uses the HTTP protocol, so the firewall rule should allow connections to port 80. For more information on firewall rules for health checks, see Probe IP ranges and firewall rules.
Under Protocols and ports, do the following:
- Select Specified protocols and ports.
- Select TCP.
- In the Ports field, enter 80 to allow access for HTTP traffic.
Click Create.
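The same firewall rule can be sketched with the gcloud CLI. This example is an approximation of the console steps, not the tutorial's own method. It assumes the web-app-vpc network already exists, and the function name create_web_app_firewall_rule is hypothetical.

```shell
# Sketch: ingress allow rule for TCP port 80 from the health check probe ranges.
# No target tags or service accounts are given, so the rule applies to all
# instances in the network, matching "All instances in the network".
create_web_app_firewall_rule() {
  gcloud compute firewall-rules create allow-web-app-http \
    --network=web-app-vpc \
    --direction=INGRESS \
    --action=ALLOW \
    --rules=tcp:80 \
    --source-ranges=130.211.0.0/22,35.191.0.0/16
}
```

Run create_web_app_firewall_rule to create the rule.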
Create an instance template
Create a template that you will use to create a group of VM instances. Each instance created from the template launches a demo web application by using a startup script.
In the Google Cloud console, go to the Instance templates page.
Click Create instance template.
Under Name, enter load-balancing-web-app-template.
Under Machine configuration, set the Machine type to e2-medium.
Click the Advanced options section to expand.
Click the Networking section and do the following:
- In the Network interfaces section, delete any existing network interfaces by clicking the delete icon next to them.
- Click Add a network interface, and then select the web-app-vpc network. This forces each instance created with this template to run on the previously created network.
- In the Subnetwork drop-down, select web-app-vpc-subnet.
- Click Done.
Click the Management section and do the following:
In the Automation section, enter the following startup script:
apt-get update
apt-get -y install git python3-pip python3-venv
git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git
python3 -m venv venv
./venv/bin/pip3 install -Ur ./python-docs-samples/compute/managed-instances/demo/requirements.txt
./venv/bin/pip3 install gunicorn
./venv/bin/gunicorn --bind 0.0.0.0:80 app:app --daemon --chdir ./python-docs-samples/compute/managed-instances/demo
The script gets, installs, and launches the web application when a VM instance starts up.
Leave the default values for the other options.
Click Create.
Wait until the template is created before continuing.
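As a command-line alternative, the template can be sketched with the gcloud CLI. This is a minimal sketch under stated assumptions: the startup script shown earlier has been saved to a local file, whose path you pass as the first argument, and the function name create_web_app_template is hypothetical. Options that the console leaves at their defaults are also left at the gcloud defaults here, which may not be identical.

```shell
# Sketch: create the instance template with the startup script attached as
# instance metadata.
# $1: path to a local file that contains the startup script (an assumption of
#     this sketch; the console steps paste the script into a text field).
create_web_app_template() {
  gcloud compute instance-templates create load-balancing-web-app-template \
    --machine-type=e2-medium \
    --network=web-app-vpc \
    --subnet=web-app-vpc-subnet \
    --region=us-central1 \
    --metadata-from-file=startup-script="$1"
}
```

For example, if you saved the script as startup.sh (a hypothetical file name), run create_web_app_template startup.sh.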
Create a regional managed instance group
To run the web application, use the instance template to create a regional managed instance group:
In the Google Cloud console, go to the Instance groups page.
Click Create instance group.
For Name, enter load-balancing-web-app-group.
For Instance template, select load-balancing-web-app-template.
Set Number of instances to 6. If this field is disabled, turn off autoscaling first.
To turn off autoscaling, go to the Autoscaling section. In the Autoscaling mode drop-down, select Off: do not autoscale.
Pro Tip: When creating a regional managed instance group, Compute Engine recommends that you provision enough instances so that, if all of the instances in any one zone are unavailable, the remaining instances still meet the minimum number of instances that you require. However, provisioning more instances than you need might incur additional costs. For more information, see How to increase availability by overprovisioning.
For Location, select Multiple zones.
Pro Tip: To ensure your application is available during extreme events, like zonal outages, Compute Engine recommends that you distribute your application across multiple zones.
For Region, select us-central1.
For Zones, select the following zones from the drop-down list:
- us-central1-b
- us-central1-c
- us-central1-f
Leave the default values for the other options.
Click Create. This redirects you back to the Instance groups page.
You might need to wait a few minutes until all of the instances in the group are running.
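The steps above can be sketched with the gcloud CLI, together with a small helper that illustrates the overprovisioning advice. Both function names are hypothetical, and the helper assumes that instances are spread evenly across zones, which is an assumption of this sketch.

```shell
# Sketch: create the regional managed instance group with 6 instances spread
# over the three zones selected in the console steps.
create_web_app_group() {
  gcloud compute instance-groups managed create load-balancing-web-app-group \
    --template=load-balancing-web-app-template \
    --size=6 \
    --zones=us-central1-b,us-central1-c,us-central1-f
}

# Instances expected to remain if one whole zone becomes unavailable,
# assuming an even spread across zones.
# $1: total number of instances, $2: number of zones
remaining_after_zone_loss() {
  echo $(( $1 - $1 / $2 ))
}

remaining_after_zone_loss 6 3   # prints 4
```

With 6 instances in 3 zones, losing one zone leaves about 4 instances, so size the group so that this remainder still covers the minimum capacity you need.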
Configuring the load balancer
To use a load balancer to direct traffic to your web application, you must reserve an external IP address to receive all incoming traffic. Then, create a load balancer that accepts traffic from that IP address and redirects that traffic to the instance group.
Reserve a static IP address
Use a global static external IP address to provide the load balancer with a single point of entry for receiving all user traffic. Compute Engine preserves static IP addresses even if you change or delete any affiliated Google Cloud resources. This allows the web application to always have the same entry point, even if other parts of the web application might change.
In the Google Cloud console, go to the IP addresses page.
Click Reserve external static IP address.
In the Name field, enter web-app-ipv4.
Set IP version to IPv4.
Set Type to Global.
Click Reserve.
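A gcloud sketch of the same reservation follows. The function names are hypothetical. The second function prints the reserved address so that you can confirm the reservation.

```shell
# Sketch: reserve a global static external IPv4 address.
reserve_web_app_ip() {
  gcloud compute addresses create web-app-ipv4 \
    --ip-version=IPV4 \
    --global
}

# Print only the reserved IP address.
show_web_app_ip() {
  gcloud compute addresses describe web-app-ipv4 \
    --global \
    --format="get(address)"
}
```

Run reserve_web_app_ip once, and then show_web_app_ip whenever you need the address.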
Create a load balancer
This section explains the steps required to create a global load balancer that directs HTTP traffic.
This load balancer uses a frontend to receive incoming traffic and a backend to distribute this traffic to healthy instances. Because the load balancer is made of multiple components, this task is divided into five parts:
- Select the load balancer type
- Name the load balancer
- Configure the frontend
- Configure the backend
- Review and finalize
Complete all the parts to create the load balancer.
Note: For simplicity, this tutorial uses an HTTP load balancer. To learn how to support HTTPS and HTTP/2, see Creating content-based HTTP(S) load balancing. For other types of traffic, see Choosing a load balancer.
Select the load balancer type
In the Google Cloud console, go to the Load balancing page.
- Click Create load balancer.
- For Type of load balancer, select Application Load Balancer (HTTP/HTTPS) and click Next.
- For Public facing or internal, select Public facing (external) and click Next.
- For Global or single region deployment, select Best for global workloads and click Next.
- For Load balancer generation, select Global external Application Load Balancer and click Next.
- Click Configure.
Name the load balancer
- In the left panel, for Load balancer name, enter web-app-load-balancer.
Configure the frontend
- On the Frontend configuration page, under Name, enter web-app-ipv4-frontend.
- Set the Protocol to HTTP.
- Set the IP version to IPv4.
- Set the IP address to web-app-ipv4.
- Set the Port to 80.
- Click Done to create the frontend.
Configure the backend
- In the left panel, click Backend configuration.
- Click the Backend services & backend buckets drop-down to open a menu, and then click Create a backend service.
- In the new window, for the Name of the backend service, enter web-app-backend.
- In the Backends section, do the following:
  - Set Instance group to load-balancing-web-app-group.
  - Set Port numbers to 80. This allows HTTP traffic between the load balancer and the instance group.
  - Under Balancing mode, select Utilization.
  - Click Done.
Create the health check for the backend of the load balancer as follows:
Pro Tip: Health checks are used for both load balancing and autohealing, but for different purposes:
- Health checks for load balancing are used for detecting unresponsive instances and directing traffic away from them.
- Health checks for autohealing are used for detecting and recreating failed instances.
Use separate health checks for load balancing and for autohealing. Using the same health check for these services would remove the distinction between unresponsive instances and failed instances, causing unnecessary latency and/or unavailability for your users. For more information, see Health check concepts.
- Click the Health check drop-down, and then click Create a health check. A new window opens.
- In the new window under Name, enter web-app-load-balancer-check.
- Set the Protocol to HTTP.
- Under Port, enter 80.
- For this tutorial, set the Request path to /health, which is a path that the demo web application is set up to respond to.
- Set the following Health criteria:
  - Set Check interval to 3 seconds. This defines the amount of time from the start of one probe to the start of the next one.
  - Set Timeout to 3 seconds. This defines the amount of time that Google Cloud waits for a response to a probe. Its value must be less than or equal to the check interval.
  - Set Healthy Threshold to 2 consecutive successes. This defines the number of sequential probes that must succeed in order for the instance to be considered healthy.
  - Set Unhealthy Threshold to 2 consecutive failures. This defines the number of sequential probes that must fail in order for the instance to be considered unhealthy.
Pro Tip: For information about refining the Check interval and Timeout values for your own application, see How health checks work. For detailed information about optimizing and measuring latency, see Optimizing Application Latency with Load Balancing.
Click Create to create the health check.
Leave the default values for the other options.
Click Create to create the backend service.
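To get a feel for how the health criteria interact, a back-of-the-envelope estimate can help. The sketch below multiplies the check interval by the consecutive-probe threshold. This is a simplification assumed for illustration, not a documented guarantee: it ignores probe response time and other factors, so real transitions can be faster or slower. The function name is hypothetical.

```shell
# Rough estimate, in seconds, of how long an instance takes to change health
# state: the check interval multiplied by the number of consecutive probes
# that must agree. Illustrative only.
# $1: check interval in seconds, $2: healthy or unhealthy threshold
health_transition_seconds() {
  echo $(( $1 * $2 ))
}

health_transition_seconds 3 2   # prints 6
```

With this tutorial's values (a 3-second interval and a threshold of 2), the estimate is on the order of 6 seconds in either direction.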
Review and finalize
Verify your load balancing settings before creating the load balancer:
- In the left panel of the Create global external Application Load Balancer page, click Review and finalize.
On the Review and finalize page, verify that Frontend uses an IP address with a Protocol of HTTP.
On the same page, verify the following Backend settings:
- The Backend service is web-app-backend.
- The Endpoint protocol is HTTP.
- The Health check is web-app-load-balancer-check.
- The Instance group is load-balancing-web-app-group.
Click Create to finish creating the load balancer.
You might need to wait a few minutes for the load balancer to finish being created.
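For reference, the five console parts roughly correspond to the following gcloud sequence. Treat this as a sketch under assumptions rather than an exact equivalent of what the console creates: the target proxy name web-app-http-proxy and the function name are hypothetical, the URL map is assumed to share the load balancer's name, and the named port http:80 is set explicitly because the backend service refers to the instance group's port by name.

```shell
# Hypothetical proxy name; the console generates its own name for this resource.
TARGET_PROXY=web-app-http-proxy

create_web_app_load_balancer() {
  # Health check with the same criteria as the console steps.
  gcloud compute health-checks create http web-app-load-balancer-check \
    --port=80 --request-path=/health \
    --check-interval=3s --timeout=3s \
    --healthy-threshold=2 --unhealthy-threshold=2 || return 1

  # Map the port name "http" to port 80 on the instance group.
  gcloud compute instance-groups set-named-ports load-balancing-web-app-group \
    --named-ports=http:80 --region=us-central1 || return 1

  # Backend service that uses the health check, then attach the group to it.
  gcloud compute backend-services create web-app-backend \
    --load-balancing-scheme=EXTERNAL_MANAGED \
    --protocol=HTTP --port-name=http \
    --health-checks=web-app-load-balancer-check --global || return 1
  gcloud compute backend-services add-backend web-app-backend \
    --instance-group=load-balancing-web-app-group \
    --instance-group-region=us-central1 \
    --balancing-mode=UTILIZATION --global || return 1

  # URL map, HTTP proxy, and the frontend forwarding rule on the reserved IP.
  gcloud compute url-maps create web-app-load-balancer \
    --default-service=web-app-backend || return 1
  gcloud compute target-http-proxies create "$TARGET_PROXY" \
    --url-map=web-app-load-balancer || return 1
  gcloud compute forwarding-rules create web-app-ipv4-frontend \
    --load-balancing-scheme=EXTERNAL_MANAGED \
    --address=web-app-ipv4 --global \
    --target-http-proxy="$TARGET_PROXY" --ports=80
}
```

The function stops at the first command that fails, so a partially created load balancer is easier to diagnose. Use either the console or this sketch, not both, because the resource names would collide.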
Test the load balancer
Verify that you can connect to the web application by using the load balancer as follows:
In the Google Cloud console, go to the Load balancing page.
In the Name column, click web-app-load-balancer to expand the load balancer you just created.
To connect to the web application using the external static IP address, do the following:
- In the Frontend section, copy the IP address shown in the IP:Port column.
- Open a new browser tab and paste the IP address into the address bar. This should display the demo web application.
Notice that, whenever you refresh the page, the load balancer connects to different instances in different zones. This happens because you are not connecting to an instance directly; you are connecting to the load balancer, which selects the instance you are redirected to.
When you are done, close the browser tab for the demo web application.
Simulating a zonal outage
You can observe the functionality of the load balancer by simulating the widespread unavailability of a zonal outage. This simulation works by forcing all of the instances located in a specified zone to report an unhealthy status on the /health request path. When these instances report an unhealthy status, they fail the load balancing health check, prompting the load balancer to stop directing traffic to these instances.
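At any point during the simulation, you can also ask the load balancer which instances it currently considers healthy. The following sketch wraps the gcloud command for that. The function name is hypothetical, and the command assumes the backend service is named web-app-backend as in this tutorial.

```shell
# Sketch: list the health state that the load balancer reports for each
# instance behind the backend service.
show_backend_health() {
  gcloud compute backend-services get-health web-app-backend --global
}
```

Run show_backend_health before and after disabling a zone to compare the reported states.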
Monitor which zones the load balancer is directing traffic to.
In the Google Cloud console, go to Cloud Shell.
Cloud Shell opens in a pane of the Google Cloud console. It can take a few seconds for the session to initialize.
Pro Tip: You can open Cloud Shell from any Google Cloud console page by using the Activate Cloud Shell button.
Save the static external IP address of your load balancer as follows:
Get the external IP address from the frontend forwarding rule of the load balancer by entering the following command in your terminal:
gcloud compute forwarding-rules describe web-app-ipv4-frontend --global
The output looks as follows. Copy the EXTERNAL_IP_ADDRESS from the output.
IPAddress: EXTERNAL_IP_ADDRESS
...
Create a local bash variable:
export LOAD_BALANCER_IP=EXTERNAL_IP_ADDRESS
Replace EXTERNAL_IP_ADDRESS with the external IP address that you copied.
To monitor which zones the load balancer is directing traffic to, run the following bash script:
while true
do
  BODY=$(curl -s "$LOAD_BALANCER_IP")
  NAME=$(echo -n "$BODY" | grep "load-balancing-web-app-group" | perl -pe 's/.+?load-balancing-web-app-group-(.+?)<.+/\1/')
  ZONE=$(echo -n "$BODY" | grep "us-" | perl -pe 's/.+?(us-.+?)<.+/\1/')
  echo $ZONE
  sleep 2 # Wait for 2 seconds
done
This script continuously attempts to connect to the web application by using the IP address for the frontend of the load balancer, and outputs which zone the web application is running from for each connection.
The resulting output should include zones us-central1-b, us-central1-c, and us-central1-f:
us-central1-f
us-central1-b
us-central1-c
us-central1-f
us-central1-f
us-central1-c
us-central1-f
us-central1-c
us-central1-c
Keep this terminal open.
Note: This monitor should run continuously, but you can stop it at any time by pressing Control+C in the terminal.
While your monitor is running, begin simulating the zonal outage.
- In Cloud Shell, open a second terminal session by clicking the Add button.
Create a local bash variable for the project ID:
export PROJECT_ID=PROJECT_ID
where PROJECT_ID is the project ID for your current project, which is displayed on each new line in the Cloud Shell:
user@cloudshell:~ (PROJECT_ID)$
Create a local bash variable for the zone that you want to disable. To simulate a failure of zone us-central1-f, use the following command:
export DISABLE_ZONE=us-central1-f
Then, run the following bash script. This script causes the demo web application instances in the disabled zone to output unhealthy responses to the load balancer health check. Unhealthy responses prompt the load balancer to direct traffic away from these instances.
export MACHINES=$(gcloud --project=$PROJECT_ID compute instances list --filter="zone:($DISABLE_ZONE)" --format="csv(name,networkInterfaces[0].accessConfigs[0].natIP)" | grep "load-balancing-web-app-group")
for i in $MACHINES;
do
  NAME=$(echo "$i" | cut -f1 -d,)
  IP=$(echo "$i" | cut -f2 -d,)
  echo "Simulating zonal failure for zone $DISABLE_ZONE, instance $NAME"
  curl -q -s "http://$IP/makeUnhealthy" >/dev/null --retry 2
done
After a short delay, the load balancer stops directing traffic to the unhealthy zones, so the output from the first terminal window stops listing zone us-central1-f:
us-central1-c
us-central1-c
us-central1-c
us-central1-b
us-central1-b
us-central1-c
us-central1-b
us-central1-c
us-central1-c
This indicates that the load balancer is directing traffic only to the healthy, responsive instances.
Note: Optionally, you can repeat this step to simulate failures of zones us-central1-b and us-central1-c.
Keep both terminals open.
In the second terminal, create a local bash variable for the zone that you want to restore. To restore traffic to zone us-central1-f, use the following command:
export ENABLE_ZONE=us-central1-f
Then, run the following bash script. This script causes the demo web application instances in the enabled zone to output healthy responses to the load balancer health check. Healthy responses prompt the load balancer to begin distributing traffic back toward these instances.
export MACHINES=$(gcloud --project=$PROJECT_ID compute instances list --filter="zone:($ENABLE_ZONE)" --format="csv(name,networkInterfaces[0].accessConfigs[0].natIP)" | grep "load-balancing-web-app-group")
for i in $MACHINES;
do
  NAME=$(echo "$i" | cut -f1 -d,)
  IP=$(echo "$i" | cut -f2 -d,)
  echo "Simulating zonal restoration for zone $ENABLE_ZONE, instance $NAME"
  curl -q -s "http://$IP/makeHealthy" >/dev/null --retry 2
done
After a few minutes, the output from the first terminal window gradually lists zone us-central1-f again:
us-central1-b
us-central1-b
us-central1-c
us-central1-f
us-central1-c
us-central1-c
us-central1-b
us-central1-c
us-central1-f
This indicates that the load balancer is directing incoming traffic to all zones again.
Note: If you also disabled zone us-central1-b or zone us-central1-c, you can repeat this step to restore traffic to them.
Close both terminals when you have finished.
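When the monitor output scrolls quickly, a per-zone tally can make the shift in traffic easier to see. The helper below is a small sketch that is not part of the tutorial's scripts. It assumes you have captured zone names one per line, for example by copying a stretch of monitor output, and the function name tally_zones is hypothetical. The sample input is the first example monitor output from this section.

```shell
# Count how many responses came from each zone.
# Reads one zone name per line on standard input and ignores empty lines.
tally_zones() {
  grep '.' | sort | uniq -c | awk '{ print $2, $1 }'
}

# Sample input: the example monitor output recorded before any zone is disabled.
printf '%s\n' us-central1-f us-central1-b us-central1-c us-central1-f \
  us-central1-f us-central1-c us-central1-f us-central1-c us-central1-c |
  tally_zones
# Prints:
# us-central1-b 1
# us-central1-c 4
# us-central1-f 4
```

A zone that has been disabled should disappear from the tally of output captured after the health check fails.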
Clean up
After you finish the tutorial, you can clean up the resources that you created so that they stop using quota and incurring charges. The following sections describe how to delete or turn off these resources.
If you created a separate project for this tutorial, delete the entire project. Otherwise, if the project has resources that you want to keep, only delete the resources created in this tutorial.
Deleting the project
Deleting specific resources
The following sections describe how to delete the specific resources that you created during this tutorial.
Deleting the load balancer
In the Google Cloud console, go to the Load balancing page.
Click the checkbox next to web-app-load-balancer.
Click Delete at the top of the page.
In the new window, select all checkboxes. Then, click Delete load balancer and selected resources to confirm the deletion.
Deleting the static external IP address
Wait until the load balancer is deleted before deleting the static external IP address.
In the Google Cloud console, go to the External IP addresses page.
Click the checkbox next to web-app-ipv4.
Click Release static address at the top of the page. In the new window, click Release to confirm the release.
Deleting the instance group
Wait until the load balancer is deleted before deleting the instance group.
- In the Google Cloud console, go to the Instance groups page.
- Select the checkbox for your load-balancing-web-app-group instance group.
- To delete the instance group, click Delete.
Deleting the instance template
You must finish deleting the instance group before deleting the instance template. You cannot delete an instance template if a managed instance group is using it.
In the Google Cloud console, go to the Instance Templates page.
Click the checkbox next to load-balancing-web-app-template.
Click Delete at the top of the page. In the new window, click Delete to confirm the deletion.
Deleting the VPC network
You must finish deleting the instance group before deleting the VPC network. You cannot delete a VPC network if other resources still use it.
In the Google Cloud console, go to the VPC networks page.
Click web-app-vpc.
Click Delete VPC network at the top of the page. In the new window, click Delete to confirm the deletion.
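Once the load balancer has been deleted, the remaining resources can also be removed from the command line, in the same dependency order as the sections above. This is a sketch under assumptions: the function name is hypothetical, and deleting the firewall rule and subnet explicitly before the network is a conservative choice made here. The --quiet flag skips the confirmation prompts, so check the names before you run it.

```shell
# Sketch: delete the tutorial's resources other than the load balancer.
# Stops at the first failure so that later deletions are not attempted
# while a dependency still exists.
delete_web_app_resources() {
  gcloud compute addresses delete web-app-ipv4 --global --quiet || return 1
  gcloud compute instance-groups managed delete load-balancing-web-app-group \
    --region=us-central1 --quiet || return 1
  gcloud compute instance-templates delete load-balancing-web-app-template \
    --quiet || return 1
  gcloud compute firewall-rules delete allow-web-app-http --quiet || return 1
  gcloud compute networks subnets delete web-app-vpc-subnet \
    --region=us-central1 --quiet || return 1
  gcloud compute networks delete web-app-vpc --quiet
}
```

Run delete_web_app_resources only in a project where these names refer to the resources created in this tutorial.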
What's next
- Try another tutorial:
- Learn more about Managed Instance Groups.
- Learn more about Load Balancing.
- Learn more about Optimizing Application Latency with Load Balancing.
- Learn more about Designing Robust Systems.
- Learn more about Building Scalable and Resilient Web Applications on Google Cloud.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-15 UTC.