Using load balancing for highly available applications

This tutorial explains how to use load balancing with a regional managed instance group to redirect traffic away from busy or unavailable VM instances, allowing you to provide high availability even during a zonal outage.

A regional managed instance group distributes an application on multiple instances across multiple zones. A global load balancer directs traffic across multiple regions via a single IP address. By using both of these services to distribute your application across multiple zones, you can help ensure that your application is available even in extreme cases, like a zonal disruption.

Load balancers can be used to direct a variety of traffic types. This tutorial shows you how to create a global load balancer that directs external HTTP traffic, but much of the content of this tutorial is still relevant to other types of load balancers. To learn about other types of traffic that can be directed with a load balancer, see Types of Cloud Load Balancing.

This tutorial includes detailed steps for launching a web application on a regional managed instance group, configuring network access, creating a load balancer for directing traffic to the web application, and observing the load balancer by simulating a zonal outage. Depending on your experience with these features, this tutorial takes about 45 minutes to complete.

Objectives

  • Launch a demo web application on a regional managed instance group.
  • Configure a global load balancer that directs HTTP traffic across multiple zones.
  • Observe the effects of the load balancer by simulating a zonal outage.

Costs

In this document, you use the following billable components of Google Cloud:

  • Compute Engine

To generate a cost estimate based on your projected usage, use the pricing calculator.

New Google Cloud users might be eligible for a free trial.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
    Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.

Application architecture

The application includes the following Compute Engine components:

  • VPC network: a virtual network within Google Cloud that can provide global connectivity using its own routes and firewall rules.
  • Firewall rule: a Google Cloud firewall lets you allow or deny traffic to your instances.
  • Instance template: a template used to create each VM instance in the managed instance group.
  • Regional managed instance group: a group of VM instances running the same application across multiple zones.
  • Global static external IP address: a static IP address that is accessible on external networks and can be attached to a global resource.
  • Global load balancer: a load balancer that allows backend instances to be distributed across multiple regions. Use a global load balancer when your users need access to the same applications and content, and you want to provide access using a single anycast IP address.
  • Health check: a policy used by the load balancer to evaluate the responsiveness of the application on each VM instance.

Launching the web application

This tutorial uses a web application that is stored on GitHub. If you would like to learn more about how the application was implemented, see the GoogleCloudPlatform/python-docs-samples repository on GitHub.

Launch the web application on every VM in an instance group by including a startup script in an instance template. Additionally, run the instance group in a dedicated VPC network to keep this tutorial's firewall rules from interfering with any existing resources running in your project.

Create a VPC network

Using a VPC network protects existing resources in your project from being affected by the resources that you will create for this tutorial. A VPC network is also required to restrict incoming traffic so that it must go through the load balancer.

Create a VPC network to encapsulate the firewall rules for the demo web application:

  1. In the Google Cloud console, go to the VPC networks page.

    Go to VPC networks

  2. Click Create VPC Network.

  3. Under Name, enter web-app-vpc.

  4. Set Subnet creation mode to Custom.

  5. Create a new subnet as follows:

    1. In the Subnets section, in the Name field, enter web-app-vpc-subnet.
    2. In the Region drop-down, select us-central1.
    3. Make sure that the IP stack type option is set to IPv4.
    4. In the Primary IPv4 range section, enter the IPv4 range 10.2.0.0/24.
  6. At the bottom of the page, click Create.

Wait until the VPC network is created before continuing.
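If you prefer the command line, the console steps above correspond roughly to the following gcloud commands. This is a minimal sketch rather than part of the original procedure: the run helper and the DRY_RUN variable are hypothetical conveniences that print each command by default so that you can review it before anything is created.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

NETWORK=web-app-vpc
SUBNET=web-app-vpc-subnet
REGION=us-central1
RANGE=10.2.0.0/24

# Custom subnet mode: no subnets are created automatically.
run gcloud compute networks create "$NETWORK" --subnet-mode=custom

# Add the single IPv4 subnet used by this tutorial.
run gcloud compute networks subnets create "$SUBNET" \
  --network="$NETWORK" --region="$REGION" --range="$RANGE"
```

To execute the commands instead of printing them, run the script with DRY_RUN=0 in a shell where gcloud is authenticated for your project.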

Create a firewall rule

After the VPC network is created, set up a firewall rule to allow HTTP traffic to the VPC network:

Note: This example creates an ingress allow VPC firewall rule whose target is all instances in the network. For production applications, consider using a more specific target. You can also use rules in a global network firewall policy, regional network firewall policy, or hierarchical firewall policy. For more information, see Firewall policies and best practices for network security.
  1. In the Google Cloud console, go to the Firewalls page.

    Go to Firewalls

  2. Click Create firewall rule.

  3. In the Name field, enter allow-web-app-http.

  4. Set Network to web-app-vpc.

  5. Make sure that the following options are set as given:

    • Direction of traffic option is set to Ingress.
    • Action on match option is set to Allow.
  6. In the Targets drop-down, select All instances in the network.

  7. Set Source filter to IPv4 ranges.

  8. In the Source IP ranges field, enter 130.211.0.0/22, 35.191.0.0/16 to allow for load balancer health checks.

    Note: Health check probes for the load balancer come from addresses in the ranges 130.211.0.0/22 and 35.191.0.0/16. For this tutorial, your health check uses the HTTP protocol, so the firewall rule should allow connections to port 80. For more information on firewall rules for health checks, see Probe IP ranges and firewall rules.
  9. Under Protocols and ports, do the following:

    1. Select Specified protocols and ports.
    2. Select TCP.
    3. In the Ports field, enter 80 to allow access for HTTP traffic.
  10. Click Create.
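To see which source addresses the two ranges in this rule admit, the following sketch tests a few addresses against them with integer arithmetic. The function names (ip_to_int, in_cidr, is_probe_source) and the sample addresses are hypothetical and serve only as an illustration; the firewall itself performs this matching for you.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Return success if address $1 falls inside CIDR block $2.
in_cidr() {
  local ip net bits mask
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Return success if the address is in either health check probe range.
is_probe_source() {
  in_cidr "$1" 130.211.0.0/22 || in_cidr "$1" 35.191.0.0/16
}

# Sample addresses chosen for illustration only.
for addr in 130.211.1.7 35.191.200.4 203.0.113.9; do
  if is_probe_source "$addr"; then
    echo "$addr allowed"
  else
    echo "$addr blocked"
  fi
done
```

The /22 range spans 130.211.0.0 through 130.211.3.255, so an address such as 130.211.4.0 falls just outside it.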

Create an instance template

Create a template that you will use to create a group of VM instances. Each instance created from the template launches a demo web application by using a startup script.

  1. In the Google Cloud console, go to the Instance templates page.

    Go to Instance templates

  2. Click Create instance template.

  3. Under Name, enter load-balancing-web-app-template.

  4. Under Machine configuration, set the Machine type to e2-medium.

  5. Click the Advanced options section to expand.

  6. Click the Networking section and do the following:

    1. In the Network interfaces section, delete any existing network interfaces by clicking the icon next to them.
    2. Click Add a network interface, and then select the web-app-vpc network. This forces each instance created with this template to run on the previously created network.
    3. In the Subnetwork drop-down, select web-app-vpc-subnet.
    4. Click Done.
  7. Click the Management section and do the following:

    1. In the Automation section, enter the following startup script:

      apt-get update
      apt-get -y install git python3-pip python3-venv
      git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git
      python3 -m venv venv
      ./venv/bin/pip3 install -Ur ./python-docs-samples/compute/managed-instances/demo/requirements.txt
      ./venv/bin/pip3 install gunicorn
      ./venv/bin/gunicorn --bind 0.0.0.0:80 app:app --daemon --chdir ./python-docs-samples/compute/managed-instances/demo

      The script gets, installs, and launches the web application when a VM instance starts up.

  8. Leave the default values for the other options.

  9. Click Create.

Wait until the template is created before continuing.

Create a regional managed instance group

To run the web application, use the instance template to create a regional managed instance group:

  1. In the Google Cloud console, go to the Instance groups page.

    Go to Instance groups

  2. Click Create instance group.

  3. For Name, enter load-balancing-web-app-group.

  4. For Instance template, select load-balancing-web-app-template.

  5. Set Number of instances to 6. If this field is disabled, turn off autoscaling first.

    To turn off autoscaling, go to the Autoscaling section. In the Autoscaling mode drop-down, select Off: do not autoscale.

    Pro Tip: When creating a regional managed instance group, Compute Engine recommends that you provision enough instances so that, if all of the instances in any one zone are unavailable, the remaining instances still meet the minimum number of instances that you require. However, provisioning more instances than you need might incur additional costs. For more information, see How to increase availability by overprovisioning.

  6. For Location, select Multiple zones.

    Pro Tip: To ensure your application is available during extreme events, like zonal outages, Compute Engine recommends that you distribute your application across multiple zones.

  7. For Region, select us-central1.

  8. For Zones, select the following zones from the drop-down list:

    • us-central1-b
    • us-central1-c
    • us-central1-f
  9. Leave the default values for the other options.

  10. Click Create. This redirects you back to the Instance groups page.

    You might need to wait a few minutes until all of the instances in the group are running.
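The overprovisioning advice in the steps above can be checked with simple arithmetic. The following sketch assumes that the group spreads its 6 instances evenly across the 3 selected zones, and it uses a hypothetical minimum of 4 instances (REQUIRED); substitute the capacity that your own application needs.

```shell
#!/usr/bin/env bash
TOTAL_INSTANCES=6   # Number of instances set in this tutorial
ZONES=3             # us-central1-b, us-central1-c, us-central1-f
REQUIRED=4          # Hypothetical minimum that your application needs

# Assumes an even spread of instances across zones.
PER_ZONE=$(( TOTAL_INSTANCES / ZONES ))
REMAINING=$(( TOTAL_INSTANCES - PER_ZONE ))

echo "Instances per zone: $PER_ZONE"
echo "Instances left after losing one zone: $REMAINING"

if [ "$REMAINING" -ge "$REQUIRED" ]; then
  echo "Capacity is sufficient during a single-zone outage."
else
  echo "Add instances to tolerate a single-zone outage."
fi
```

With these numbers, losing one zone leaves 4 of the 6 instances, so a requirement of 4 is still met; a requirement of 5 would call for a larger group.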

Configuring the load balancer

To use a load balancer to direct traffic to your web application, you must reserve an external IP address to receive all incoming traffic. Then, create a load balancer that accepts traffic from that IP address and redirects that traffic to the instance group.

Reserve a static IP address

Use a global static external IP address to provide the load balancer with a single point of entry for receiving all user traffic. Compute Engine preserves static IP addresses even if you change or delete any affiliated Google Cloud resources. This allows the web application to always have the same entry point, even if other parts of the web application might change.

  1. In the Google Cloud console, go to the IP addresses page.

    Go to IP addresses

  2. Click Reserve external static IP address.

  3. In the Name field, enter web-app-ipv4.

  4. Set IP version to IPv4.

  5. Set Type to Global.

  6. Click Reserve.
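As a command-line alternative to the console steps above, the address can be reserved and then read back with gcloud. This is a sketch under the same assumptions as any dry run: the run helper and DRY_RUN variable are hypothetical and only print the commands unless you opt in to executing them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

ADDRESS_NAME=web-app-ipv4

# Reserve a global static external IPv4 address.
run gcloud compute addresses create "$ADDRESS_NAME" --ip-version=IPV4 --global

# Print only the reserved address value.
run gcloud compute addresses describe "$ADDRESS_NAME" --global --format="get(address)"
```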

Create a load balancer

This section explains the steps required to create a global load balancer that directs HTTP traffic.

This load balancer uses a frontend to receive incoming traffic and a backend to distribute this traffic to healthy instances. Because the load balancer is made of multiple components, this task is divided into five parts:

  • Select the load balancer type
  • Name the load balancer
  • Configure the frontend
  • Configure the backend
  • Review and finalize

Complete all the parts to create the load balancer.

Note: For simplicity, this tutorial uses an HTTP load balancer. To learn how to support HTTPS and HTTP/2, see Creating content-based HTTP(S) load balancing. For other types of traffic, see Choosing a load balancer.

Select the load balancer type

  1. In the Google Cloud console, go to the Load balancing page.

    Go to Load balancing

  2. Click Create load balancer.
  3. For Type of load balancer, select Application Load Balancer (HTTP/HTTPS) and click Next.
  4. For Public facing or internal, select Public facing (external) and click Next.
  5. For Global or single region deployment, select Best for global workloads and click Next.
  6. For Load balancer generation, select Global external Application Load Balancer and click Next.
  7. Click Configure.

Name the load balancer

  1. In the left panel, for Load balancer name, enter web-app-load-balancer.

Configure the frontend

  1. On the Frontend configuration page, under Name, enter web-app-ipv4-frontend.
  2. Set the Protocol to HTTP.
  3. Set the IP version to IPv4.
  4. Set the IP address to web-app-ipv4.
  5. Set the Port to 80.
  6. Click Done to create the frontend.

Configure the backend

  1. In the left panel, click Backend configuration.
  2. Click the Backend services & backend buckets drop-down to open a menu, and then click Create a backend service.
  3. In the new window, for the Name of the backend service, enter web-app-backend.
  4. In the Backends section, do the following:
    1. Set Instance group to load-balancing-web-app-group.
    2. Set Port numbers to 80. This allows HTTP traffic between the load balancer and the instance group.
    3. Under Balancing mode, select Utilization.
    4. Click Done.
  5. Create the health check for the backend of the load balancer as follows:

    Pro Tip: Health checks are used for both load balancing and autohealing, but for different purposes:

    • Health checks for load balancing are used for detecting unresponsive instances and directing traffic away from them.
    • Health checks for autohealing are used for detecting and recreating failed instances.

    Use separate health checks for load balancing and for autohealing. Using the same health check for these services would remove the distinction between unresponsive instances and failed instances, causing unnecessary latency or unavailability for your users. For more information, see Health check concepts.

    1. Click the Health check drop-down, and then click Create a health check. A new window opens.
    2. In the new window under Name, enter web-app-load-balancer-check.
    3. Set the Protocol to HTTP.
    4. Under Port, enter 80.
    5. For this tutorial, set the Request path to /health, which is a path that the demo web application is set up to respond to.
    6. Set the following Health criteria:

      1. Set Check interval to 3 seconds. This defines the amount of time from the start of one probe to the start of the next one.
      2. Set Timeout to 3 seconds. This defines the amount of time that Google Cloud waits for a response to a probe. Its value must be less than or equal to the check interval.
      3. Set Healthy Threshold to 2 consecutive successes. This defines the number of sequential probes that must succeed in order for the instance to be considered healthy.
      4. Set Unhealthy Threshold to 2 consecutive failures. This defines the number of sequential probes that must fail in order for the instance to be considered unhealthy.

      Pro Tip: For information about refining the Check interval and Timeout values for your own application, see How health checks work. For detailed information about optimizing and measuring latency, see Optimizing Application Latency with Load Balancing.

    7. Click Create to create the health check.

  6. Leave the default values for the other options.

  7. Click Create to create the backend service.
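The health criteria set above give a rough idea of how quickly the load balancer reacts to a change in instance health. The following sketch multiplies the check interval by each threshold; treat the results as approximations, because they ignore probe scheduling details and the time spent waiting for a probe to time out.

```shell
#!/usr/bin/env bash
CHECK_INTERVAL=3       # Seconds between probe starts
TIMEOUT=3              # Seconds to wait for a probe response
HEALTHY_THRESHOLD=2    # Consecutive successes required
UNHEALTHY_THRESHOLD=2  # Consecutive failures required

# The timeout must be less than or equal to the check interval.
if [ "$TIMEOUT" -gt "$CHECK_INTERVAL" ]; then
  echo "Invalid: timeout exceeds check interval" >&2
  exit 1
fi

# Rough estimates only; real detection times can differ.
TIME_TO_UNHEALTHY=$(( CHECK_INTERVAL * UNHEALTHY_THRESHOLD ))
TIME_TO_HEALTHY=$(( CHECK_INTERVAL * HEALTHY_THRESHOLD ))

echo "Approximate time to mark unhealthy: ${TIME_TO_UNHEALTHY}s"
echo "Approximate time to mark healthy: ${TIME_TO_HEALTHY}s"
```

Shorter intervals and lower thresholds detect failures sooner, at the cost of more probe traffic and a higher chance of reacting to a single slow response.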

Review and finalize

Verify your load balancing settings before creating the load balancer:

  1. In the left panel of the Create global external Application Load Balancer page, click Review and finalize.
  2. On the Review and finalize page, verify that Frontend uses an IP address with a Protocol of HTTP.

  3. On the same page, verify the following Backend settings:

    • The Backend service is web-app-backend.
    • The Endpoint protocol is HTTP.
    • The Health check is web-app-load-balancer-check.
    • The Instance group is load-balancing-web-app-group.
  4. Click Create to finish creating the load balancer.

You might need to wait a few minutes for the load balancer to finish being created.

Test the load balancer

Verify that you can connect to the web application by using the load balancer as follows:

  1. In the Google Cloud console, go to the Load balancing page.

    Go to Load balancing

  2. In the Name column, click web-app-load-balancer to expand the load balancer you just created.

  3. To connect to the web application using the external static IP address, do the following:

    1. In the Frontend section, copy the IP address shown in the IP:Port column.
    2. Open a new browser tab and paste the IP address into the address bar. This should display the demo web application.

    Notice that, whenever you refresh the page, the load balancer connects to different instances in different zones. This happens because you are not connecting to an instance directly; you are connecting to the load balancer, which selects the instance you are redirected to.

    When you are done, close the browser tab for the demo web application.

Simulating a zonal outage

You can observe the functionality of the load balancer by simulating the widespread unavailability of a zonal outage. This simulation works by forcing all of the instances located in a specified zone to report an unhealthy status on the /health request path. When these instances report an unhealthy status, they fail the load balancing health check, prompting the load balancer to stop directing traffic to these instances.
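The following sketch models how a run of probe results changes an instance's status under the thresholds configured earlier (2 consecutive successes, 2 consecutive failures). It is a simplified illustration, not the load balancer's actual implementation, and the simulate_probes function is hypothetical.

```shell
#!/usr/bin/env bash
HEALTHY_THRESHOLD=2
UNHEALTHY_THRESHOLD=2

# Replay a sequence of probe results ("ok" or "fail") and print the
# instance state after each probe, starting from the given state.
simulate_probes() {
  local state=$1; shift
  local ok_run=0 fail_run=0 result
  for result in "$@"; do
    if [ "$result" = "ok" ]; then
      ok_run=$(( ok_run + 1 )); fail_run=0
      if [ "$ok_run" -ge "$HEALTHY_THRESHOLD" ]; then state=HEALTHY; fi
    else
      fail_run=$(( fail_run + 1 )); ok_run=0
      if [ "$fail_run" -ge "$UNHEALTHY_THRESHOLD" ]; then state=UNHEALTHY; fi
    fi
    echo "$result -> $state"
  done
}

# An isolated failure is tolerated; two failures in a row flip the state,
# and two successes in a row flip it back.
simulate_probes HEALTHY ok fail ok fail fail ok ok
```

In this model a single failed probe leaves the instance in service, which is why the simulation in the following steps takes a short while to show up in the monitor output.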

  1. Monitor which zones the load balancer is directing traffic to.

    1. In the Google Cloud console, go to Cloud Shell.

      Open Cloud Shell

      Cloud Shell opens in a pane of the Google Cloud console. It can take a few seconds for the session to initialize.

      Pro Tip: You can open Cloud Shell from any Google Cloud console page by using the Activate Cloud Shell button.

    2. Save the static external IP address of your load balancer as follows:

      1. Get the external IP address from the frontend forwarding rule of the load balancer by entering the following command in your terminal:

        gcloud compute forwarding-rules describe web-app-ipv4-frontend --global

        The output looks as follows. Copy the EXTERNAL_IP_ADDRESS from the output.

        IPAddress: EXTERNAL_IP_ADDRESS
        ...
      2. Create a local bash variable:

        export LOAD_BALANCER_IP=EXTERNAL_IP_ADDRESS

        Replace EXTERNAL_IP_ADDRESS with the external IP address that you copied.

    3. To monitor which zones the load balancer is directing traffic to, run the following bash script:

      while true
      do
          BODY=$(curl -s "$LOAD_BALANCER_IP")
          NAME=$(echo -n "$BODY" | grep "load-balancing-web-app-group" | perl -pe 's/.+?load-balancing-web-app-group-(.+?)<.+/\1/')
          ZONE=$(echo -n "$BODY" | grep "us-" | perl -pe 's/.+?(us-.+?)<.+/\1/')
          echo $ZONE
          sleep 2 # Wait for 2 seconds
      done

      This script continuously attempts to connect to the web application by using the IP address for the frontend of the load balancer, and outputs which zone the web application is running from for each connection.

      The resulting output should include zones us-central1-b, us-central1-c, and us-central1-f:

      us-central1-f
      us-central1-b
      us-central1-c
      us-central1-f
      us-central1-f
      us-central1-c
      us-central1-f
      us-central1-c
      us-central1-c

      Keep this terminal open.

      Note: This monitor should run continuously, but you can stop it at any time by pressing Control+C in the terminal.
  2. While your monitor is running, begin simulating the zonal outage.

    1. In Cloud Shell, open a second terminal session by clicking the Add button.
    2. Create a local bash variable for the project ID:

      export PROJECT_ID=PROJECT_ID

      where PROJECT_ID is the project ID for your current project, which is displayed on each new line in the Cloud Shell:

      user@cloudshell:~ (PROJECT_ID)$
    3. Create a local bash variable for the zone that you want to disable. To simulate a failure of zone us-central1-f, use the following command:

      export DISABLE_ZONE=us-central1-f

      Then, run the following bash script. This script causes the demo web application instances in the disabled zone to output unhealthy responses to the load balancer health check. Unhealthy responses prompt the load balancer to direct traffic away from these instances.

      export MACHINES=$(gcloud --project=$PROJECT_ID compute instances list --filter="zone:($DISABLE_ZONE)" --format="csv(name,networkInterfaces[0].accessConfigs[0].natIP)" | grep "load-balancing-web-app-group")
      for i in $MACHINES;
      do
        NAME=$(echo "$i" | cut -f1 -d,)
        IP=$(echo "$i" | cut -f2 -d,)
        echo "Simulating zonal failure for zone $DISABLE_ZONE, instance $NAME"
        curl -q -s "http://$IP/makeUnhealthy" >/dev/null --retry 2
      done

      After a short delay, the load balancer stops directing traffic to the unhealthy zone, so the output from the first terminal window stops listing zone us-central1-f:

      us-central1-c
      us-central1-c
      us-central1-c
      us-central1-b
      us-central1-b
      us-central1-c
      us-central1-b
      us-central1-c
      us-central1-c

      This indicates that the load balancer is directing traffic only to the healthy, responsive instances.

      Note: Optionally, you can repeat this step to simulate failures of zones us-central1-b and us-central1-c.

      Keep both terminals open.

    4. In the second terminal, create a local bash variable for the zone that you want to restore. To restore traffic to zone us-central1-f, use the following command:

      export ENABLE_ZONE=us-central1-f

      Then, run the following bash script. This script causes the demo web application instances in the enabled zone to output healthy responses to the load balancer health check. Healthy responses prompt the load balancer to begin distributing traffic back toward these instances.

      export MACHINES=$(gcloud --project=$PROJECT_ID compute instances list --filter="zone:($ENABLE_ZONE)" --format="csv(name,networkInterfaces[0].accessConfigs[0].natIP)" | grep "load-balancing-web-app-group")
      for i in $MACHINES;
      do
        NAME=$(echo "$i" | cut -f1 -d,)
        IP=$(echo "$i" | cut -f2 -d,)
        echo "Simulating zonal restoration for zone $ENABLE_ZONE, instance $NAME"
        curl -q -s "http://$IP/makeHealthy" >/dev/null --retry 2
      done

      After a few minutes, the output from the first terminal window gradually lists zone us-central1-f again:

      us-central1-b
      us-central1-b
      us-central1-c
      us-central1-f
      us-central1-c
      us-central1-c
      us-central1-b
      us-central1-c
      us-central1-f

      This indicates that the load balancer is directing incoming traffic to all zones again.

      Note: If you also disabled zone us-central1-b or zone us-central1-c, you can repeat this step to restore traffic to them.

      Close both terminals when you have finished.
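To quantify the shift in traffic seen during the simulation, you can tally the zone names printed by the monitor. The following sketch uses the sample monitor output shown in this tutorial; the count_zones function is a hypothetical helper, and your own counts will differ from run to run.

```shell
#!/usr/bin/env bash
# Count how many responses came from each zone (zone names on stdin).
count_zones() {
  sort | uniq -c | awk '{ print $2 ": " $1 }'
}

# Sample monitor output from before the simulated outage.
BEFORE=(us-central1-f us-central1-b us-central1-c us-central1-f us-central1-f
        us-central1-c us-central1-f us-central1-c us-central1-c)
# Sample monitor output while us-central1-f reports an unhealthy status.
DURING=(us-central1-c us-central1-c us-central1-c us-central1-b us-central1-b
        us-central1-c us-central1-b us-central1-c us-central1-c)

echo "Before:"
printf '%s\n' "${BEFORE[@]}" | count_zones
echo "During:"
printf '%s\n' "${DURING[@]}" | count_zones
```

To tally live traffic instead, redirect the monitor's output to a file and pipe that file through the same sort and uniq -c pipeline.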

Clean up

After you finish the tutorial, you can clean up the resources that you created so that they stop using quota and incurring charges. The following sections describe how to delete or turn off these resources.

If you created a separate project for this tutorial, delete the entire project. Otherwise, if the project has resources that you want to keep, only delete the resources created in this tutorial.

Deleting the project

    Caution: Deleting a project has the following effects:
    • Everything in the project is deleted. If you used an existing project for the tasks in this document, when you delete it, you also delete any other work you've done in the project.
    • Custom project IDs are lost. When you created this project, you might have created a custom project ID that you want to use in the future. To preserve the URLs that use the project ID, such as an appspot.com URL, delete selected resources inside the project instead of deleting the whole project.

    If you plan to explore multiple architectures, tutorials, or quickstarts, reusing projects can help you avoid exceeding project quota limits.

  1. In the Google Cloud console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Deleting specific resources

The following sections describe how to delete the specific resources that you created during this tutorial.

Deleting the load balancer

  1. In the Google Cloud console, go to the Load balancing page.

    Go to Load balancing

  2. Click the checkbox next to web-app-load-balancer.

  3. Click Delete at the top of the page.

  4. In the new window, select all checkboxes. Then, click Delete load balancer and selected resources to confirm the deletion.

Deleting the static external IP address

Wait until the load balancer is deleted before deleting the static external IP address.

  1. In the Google Cloud console, go to the External IP addresses page.

    Go to External IP addresses

  2. Click the checkbox next to web-app-ipv4.

  3. Click Release static address at the top of the page. In the new window, click Release to confirm the release.

Deleting the instance group

Wait until the load balancer is deleted before deleting the instance group.

  1. In the Google Cloud console, go to the Instance groups page.

    Go to Instance groups

  2. Select the checkbox for your load-balancing-web-app-group instance group.
  3. To delete the instance group, click Delete.

Deleting the instance template

You must finish deleting the instance group before deleting the instance template. You cannot delete an instance template if a managed instance group is using it.

  1. In the Google Cloud console, go to the Instance templates page.

    Go to Instance templates

  2. Click the checkbox next to load-balancing-web-app-template.

  3. Click Delete at the top of the page. In the new window, click Delete to confirm the deletion.

Deleting the VPC network

You must finish deleting the instance group before deleting the VPC network. You cannot delete a VPC network if other resources still use it.

  1. In the Google Cloud console, go to the VPC networks page.

    Go to VPC networks

  2. Click web-app-vpc.

  3. Click Delete VPC network at the top of the page. In the new window, click Delete to confirm the deletion.
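The remaining resources can also be removed from the command line, in the same dependency order as the console steps. This sketch assumes that you already deleted the load balancer in the console, and that the firewall rule and subnet must be deleted explicitly before the network when using gcloud. The run helper and DRY_RUN variable are hypothetical and print the commands by default so that nothing is deleted by accident.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

REGION=us-central1

# Release the static address once the load balancer no longer uses it.
run gcloud compute addresses delete web-app-ipv4 --global --quiet

# Delete the group before the template that it depends on.
run gcloud compute instance-groups managed delete load-balancing-web-app-group \
  --region="$REGION" --quiet
run gcloud compute instance-templates delete load-balancing-web-app-template --quiet

# Remove the firewall rule and subnet, and then the network itself.
run gcloud compute firewall-rules delete allow-web-app-http --quiet
run gcloud compute networks subnets delete web-app-vpc-subnet --region="$REGION" --quiet
run gcloud compute networks delete web-app-vpc --quiet
```

The --quiet flag suppresses the confirmation prompts, so review the printed commands before you rerun the script with DRY_RUN=0.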

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.