Load balancing and scaling

Google Cloud offers load balancing and autoscaling for groups of instances.

Load balancing

Google Cloud offers server-side load balancing so you can distribute incoming traffic across multiple virtual machine (VM) instances. Load balancing provides the following benefits:

  • Scale your app
  • Support heavy traffic
  • Detect and automatically remove unhealthy VM instances using health checks. Instances that become healthy again are automatically re-added.
  • Route traffic to the closest virtual machine
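The health-check behavior in the list above can be pictured with a small model. This is only a simplified sketch, not how Google Cloud implements health checking: the `Backend` class, the `apply_health_checks` and `serving_pool` helpers, and the probe results are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical stand-in for a VM instance behind a load balancer."""
    name: str
    healthy: bool = True


def apply_health_checks(backends, probe_results):
    """Update each backend's health flag from the latest probe results.

    probe_results maps a backend name to True (probe passed) or False.
    Backends without a result keep their previous state.
    """
    for backend in backends:
        if backend.name in probe_results:
            backend.healthy = probe_results[backend.name]


def serving_pool(backends):
    """Return the names of backends that are eligible to receive traffic."""
    return [b.name for b in backends if b.healthy]


backends = [Backend("vm-a"), Backend("vm-b"), Backend("vm-c")]

# vm-b fails its probe, so it drops out of the serving pool.
apply_health_checks(backends, {"vm-b": False})
pool_after_failure = serving_pool(backends)

# vm-b passes its probe again, so it rejoins the pool with no manual step.
apply_health_checks(backends, {"vm-b": True})
pool_after_recovery = serving_pool(backends)
```

In this model an instance is never deleted when it fails a probe; it is only skipped until a later probe passes, which mirrors the remove and re-add behavior described in the list.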

Google Cloud load balancing uses forwarding rule resources to match certain types of traffic and forward it to a load balancer. For example, a forwarding rule can match TCP traffic destined to port 80 on IP address 192.0.2.1, then forward it to a load balancer, which then directs it to healthy VM instances.
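The matching step in that example can be sketched as a lookup on protocol, destination IP address, and port. This is an illustrative model under stated assumptions rather than the Google Cloud API: the `ForwardingRule` class, the `match_rule` function, and the load balancer name `web-lb` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ForwardingRule:
    """Hypothetical model of a forwarding rule."""
    protocol: str
    ip_address: str
    port: int
    target: str  # name of the load balancer that receives matching traffic


def match_rule(rules, protocol, dest_ip, dest_port) -> Optional[str]:
    """Return the target of the first rule matching the traffic, else None."""
    for rule in rules:
        if (rule.protocol, rule.ip_address, rule.port) == (protocol, dest_ip, dest_port):
            return rule.target
    return None


# One rule: TCP traffic to 192.0.2.1:80 goes to the (hypothetical) "web-lb".
rules = [ForwardingRule("TCP", "192.0.2.1", 80, "web-lb")]

matched = match_rule(rules, "TCP", "192.0.2.1", 80)     # same protocol, IP, and port
unmatched = match_rule(rules, "TCP", "192.0.2.1", 443)  # different port, no rule applies
```

Once a rule has selected a load balancer, choosing a healthy VM instance is a separate step that this sketch leaves out.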

Google Cloud load balancing is a managed service, which means its components are redundant and highly available. If a load balancing component fails, it is restarted or replaced automatically and immediately.

Google Cloud offers several different types of load balancing that differ in capabilities, usage scenarios, and how you configure them. See the Google Cloud load balancing documentation for descriptions.

Autoscaling

Compute Engine offers autoscaling to automatically add or remove VM instances from a managed instance group (MIG) based on increases or decreases in load. Autoscaling lets your apps gracefully handle increases in traffic, and it reduces cost when the need for resources is lower. You can autoscale a MIG based on its CPU utilization, Cloud Monitoring metrics, schedules, or load balancing serving capacity.

When you set up an autoscaler to scale based on load balancing serving capacity, the autoscaler watches the serving capacity of an instance group and scales when the VM instances are over or under capacity. The serving capacity of an instance can be defined in the load balancer's backend service and can be based on either utilization or requests per second. For more information, see Scaling based on load balancing serving capacity.
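To make the idea of scaling on requests per second concrete, the sketch below estimates a group size from the total request rate and a per-instance target. It is a rough illustration only and should not be read as the Compute Engine autoscaler's actual decision logic: the function name `recommended_size`, its parameters, and the sample numbers are assumptions made for this example.

```python
def recommended_size(total_rps, target_rps_per_instance, min_size, max_size):
    """Estimate how many instances keep each one at or below its target rate.

    All arguments are integers. total_rps is the request rate across the
    whole group; target_rps_per_instance is the rate one instance should serve.
    """
    if target_rps_per_instance <= 0:
        raise ValueError("target_rps_per_instance must be positive")
    # Ceiling division on integers: round up so that no instance is over target.
    needed = -(-total_rps // target_rps_per_instance)
    # Keep the result inside the configured bounds of the group.
    return max(min_size, min(max_size, needed))


# Over capacity: 950 requests/s at 100 requests/s per instance needs 10 instances.
scale_out = recommended_size(950, 100, min_size=2, max_size=20)

# Under capacity: 120 requests/s needs only 2 instances.
scale_in = recommended_size(120, 100, min_size=2, max_size=20)

# Demand beyond the upper bound is capped at max_size.
capped = recommended_size(5000, 100, min_size=2, max_size=20)
```

The minimum and maximum bounds are part of the sketch so that a traffic spike or an idle period cannot push the group to an arbitrary size.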

To learn more about autoscaling, see Autoscaling groups of instances.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-18 UTC.