Regional deployment on Compute Engine

Last reviewed 2025-08-12 UTC

This document provides a reference architecture for a multi-tier application that runs on Compute Engine VMs in multiple zones within a Google Cloud region. You can use this reference architecture to efficiently rehost (lift and shift) on-premises applications to the cloud with minimal changes to the applications. The document also describes the design factors that you should consider when you build a regional architecture for your cloud applications. The intended audience for this document is cloud architects.

Architecture

The following diagram shows an architecture for an application that runs in active-active mode in isolated stacks that are deployed across three Google Cloud zones within a region. The architecture is aligned with the regional deployment archetype.


The architecture is based on the infrastructure as a service (IaaS) cloud model. You provision the required infrastructure resources (compute, networking, and storage) in Google Cloud. You retain full control over the infrastructure and responsibility for the operating system, middleware, and higher layers of the application stack. To learn more about IaaS and other cloud models, see PaaS vs. IaaS vs. SaaS vs. CaaS: How are they different?

The preceding diagram includes the following components:

Regional external load balancer

The regional external load balancer receives and distributes user requests to the web tier VMs.

Use an appropriate load balancer type depending on the traffic type and other requirements. For example, if the backend consists of web servers (as shown in the preceding architecture), then use an Application Load Balancer to forward HTTP(S) traffic. To load-balance TCP traffic, use a Network Load Balancer. For more information, see Choose a load balancer.

Regional managed instance group (MIG) for the web tier

The web tier of the application is deployed on Compute Engine VMs that are part of a regional MIG. The MIG is the backend for the regional external load balancer.

The MIG contains Compute Engine VMs in three different zones. Each of these VMs hosts an independent instance of the web tier of the application.

Regional internal load balancer

The regional internal load balancer distributes traffic from the web tier VMs to the application tier VMs.

Depending on your requirements, you can use a regional internal Application Load Balancer or Network Load Balancer. For more information, see Choose a load balancer.

Regional MIG for the application tier

The application tier is deployed on Compute Engine VMs that are part of a regional MIG, which is the backend for the internal load balancer.

The MIG contains Compute Engine VMs in three different zones. Each VM hosts an independent instance of the application tier.

Third-party database deployed on a Compute Engine VM

The architecture in this document shows a third-party database (like PostgreSQL) that's deployed on a Compute Engine VM. You can deploy a standby database in another zone. The database replication and failover capabilities depend on the database that you use.

Installing and managing a third-party database involves additional effort and operational cost for applying updates, monitoring, and ensuring availability. You can avoid the overhead of installing and managing a third-party database and take advantage of built-in high availability (HA) features by using a fully managed database service like Cloud SQL or AlloyDB for PostgreSQL. For more information about managed database options, see Database services later in this guide.

Virtual Private Cloud network and subnet

All the Google Cloud resources in the architecture use a single VPC network and subnet.

Depending on your requirements, you can choose to build an architecture that uses multiple VPC networks or multiple subnets. For more information, see Deciding whether to create multiple VPC networks in "Best practices and reference architectures for VPC design."

Cloud Storage dual-region bucket

Application and database backups are stored in a dual-region Cloud Storage bucket. If a zone or region outage occurs, your application and data aren't lost.

Alternatively, you can use Backup and DR Service to create, store, and manage the database backups.
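
To make the MIG components in the preceding list concrete, the following minimal Python sketch (using the google-cloud-compute client library) shows one way to create a regional MIG that spreads VMs across three zones. The project, region, template, and group names are hypothetical placeholders, and exact field values depend on your environment.

```python
from google.cloud import compute_v1

# Hypothetical project, region, template, and group names.
project = "my-project"
region = "us-central1"

mig = compute_v1.InstanceGroupManager(
    name="web-tier-mig",
    base_instance_name="web",
    instance_template=f"projects/{project}/global/instanceTemplates/web-tier-template",
    target_size=3,
    distribution_policy=compute_v1.DistributionPolicy(
        zones=[
            compute_v1.DistributionPolicyZoneConfiguration(zone=f"zones/{region}-{z}")
            for z in ("a", "b", "c")
        ],
    ),
)

client = compute_v1.RegionInstanceGroupManagersClient()
client.insert(
    project=project, region=region, instance_group_manager_resource=mig
).result()  # Wait for the regional MIG to be created.
```

A second MIG, created the same way from its own instance template, would back the internal load balancer for the application tier.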

Products used

This reference architecture uses the following Google Cloud products:

  • Compute Engine: A secure and customizable compute service that lets you create and run VMs on Google's infrastructure.
  • Cloud Load Balancing: A portfolio of high-performance, scalable, global and regional load balancers.
  • Cloud Storage: A low-cost, no-limit object store for diverse data types. Data can be accessed from within and outside Google Cloud, and it's replicated across locations for redundancy.
  • Virtual Private Cloud (VPC): A virtual system that provides global, scalable networking functionality for your Google Cloud workloads. VPC includes VPC Network Peering, Private Service Connect, private services access, and Shared VPC.

Use cases

This section describes use cases for which a regional deployment on Compute Engine is an appropriate choice.

Efficient migration of on-premises applications

You can use this reference architecture to build a Google Cloud topology to rehost (lift and shift) on-premises applications to the cloud with minimal changes to the applications. All the tiers of the application in this reference architecture are hosted on Compute Engine VMs. This approach lets you migrate on-premises applications efficiently to the cloud and take advantage of the cost benefits, reliability, performance, and operational simplicity that Google Cloud provides.

Highly available application with users within a geographic area

We recommend a regional deployment architecture for applications that need robustness against zone outages but can tolerate some downtime caused by region outages. If any part of the application stack fails, the application continues to run if at least one functioning component with adequate capacity exists in every tier. If a zone outage occurs, the application stack continues to run in the other zones.

Low latency for application users

If all the users of an application are within a single geographic area, such as a single country, a regional deployment architecture can help improve the user-perceived performance of the application. You can optimize network latency for user requests by deploying the application in the Google Cloud region that's closest to your users.

Low-latency networking between application components

A single-region architecture might be well suited for applications such as batch computing that need low-latency and high-bandwidth network connections among the compute nodes. All the resources are in a single Google Cloud region, so inter-resource network traffic remains within the region. The inter-resource network latency is low, and you don't incur cross-region data transfer costs. Intra-region network costs still apply.

Compliance with data residency requirements

You can use a single-region architecture to build a topology that helps you to meet data residency requirements. For example, a country in Europe might require that all user data be stored and accessed in data centers that are located physically within Europe. To meet this requirement, you can run the application in a Google Cloud region in Europe.

Design considerations

This section provides guidance to help you use this reference architecture to develop an architecture that meets your specific requirements for system design, security and compliance, reliability, operational efficiency, cost, and performance.

Note: The guidance in this section isn't exhaustive. Depending on the specific requirements of your application and the Google Cloud products and features that you use, there might be additional design factors and trade-offs that you should consider.

System design

This section provides guidance to help you to choose Google Cloud regions for your regional deployment and to select appropriate Google Cloud services.

Region selection

When you choose the Google Cloud regions where your applications must be deployed, consider the factors and requirements that are relevant to your workload.

Some of these factors and requirements might involve trade-offs. For example, the most cost-efficient region might not have the lowest carbon footprint. For more information, see Best practices for Compute Engine regions selection.

Compute infrastructure

The reference architecture in this document uses Compute Engine VMs for certain tiers of the application. Depending on the requirements of your application, you can choose from other Google Cloud compute services:

  • Containers: You can run containerized applications in Google Kubernetes Engine (GKE) clusters. GKE is a container-orchestration engine that automates deploying, scaling, and managing containerized applications.
  • Serverless: If you prefer to focus your IT efforts on your data and applications instead of setting up and operating infrastructure resources, then you can use serverless services like Cloud Run.

The decision of whether to use VMs, containers, or serverless services involves a trade-off between configuration flexibility and management effort. VMs and containers provide more configuration flexibility, but you're responsible for managing the resources. In a serverless architecture, you deploy workloads to a preconfigured platform that requires minimal management effort. For more information about choosing appropriate compute services for your workloads in Google Cloud, see Hosting Applications on Google Cloud.

Storage services

The architecture shown in this document uses regional Persistent Disk volumes for all the tiers. Persistent disks provide synchronous replication of data across two zones within a region.

Google Cloud Hyperdisk provides better performance, flexibility, and efficiency than Persistent Disk. With Hyperdisk Balanced, you can provision IOPS and throughput separately and dynamically, which lets you tune the volume to a wide variety of workloads.
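
As an illustration of provisioning IOPS and throughput separately, the following sketch creates a Hyperdisk Balanced volume with the google-cloud-compute client library. The project, zone, disk name, and performance values are hypothetical and should be tuned to your workload.

```python
from google.cloud import compute_v1

project = "my-project"
zone = "us-central1-a"

# Hyperdisk Balanced volume with separately provisioned IOPS and throughput.
disk = compute_v1.Disk(
    name="app-data-disk",
    type_=f"projects/{project}/zones/{zone}/diskTypes/hyperdisk-balanced",
    size_gb=500,
    provisioned_iops=6000,
    provisioned_throughput=400,  # MiB per second.
)

compute_v1.DisksClient().insert(
    project=project, zone=zone, disk_resource=disk
).result()
```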

For low-cost storage that's replicated across multiple locations, you can use Cloud Storage regional, dual-region, or multi-region buckets.

  • Data in regional buckets is replicated synchronously across the zones in the region.
  • Data in dual-region or multi-region buckets is stored redundantly in at least two separate geographic locations. Metadata is written synchronously across regions, and data is replicated asynchronously. For dual-region buckets, you can use turbo replication, which ensures that objects are replicated across region pairs, with a recovery point objective (RPO) of 15 minutes. For more information, see Data availability and durability.
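
The following sketch, using the google-cloud-storage client library, shows one way to create a dual-region bucket and enable turbo replication by setting the bucket's RPO setting. The bucket name is hypothetical, and NAM4 is used only as an example of a predefined dual-region.

```python
from google.cloud import storage

client = storage.Client(project="my-project")

# Create a bucket in a predefined dual-region (NAM4: Iowa and South Carolina).
bucket = client.create_bucket("my-app-backups", location="NAM4")

# Enable turbo replication, which targets an RPO of 15 minutes.
bucket.rpo = "ASYNC_TURBO"
bucket.patch()
```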

To store data that's shared across multiple VMs in a region, such as across all the VMs in the web tier or application tier, you can use a Filestore regional instance. The data that you store in a Filestore regional instance is replicated synchronously across three zones within the region. This replication ensures high availability and robustness against zone outages. You can store shared configuration files, common tools and utilities, and centralized logs in the Filestore instance, and mount the instance on multiple VMs. For robustness against region outages, you can replicate a Filestore instance to a different region. For more information, see Instance replication.

If your database is Microsoft SQL Server, we recommend using Cloud SQL for SQL Server. In scenarios where Cloud SQL doesn't support your configuration requirements, or if you need access to the operating system, you can deploy a Microsoft SQL Server failover cluster instance (FCI). In this scenario, you can use the fully managed Google Cloud NetApp Volumes service to provide continuous availability (CA) SMB storage for the database.

When you design storage for your workloads, consider the functional characteristics, resilience requirements, performance expectations, and cost goals. For more information, see Design an optimal storage strategy for your cloud workload.

Database services

The reference architecture in this document uses a third-party database that's deployed on Compute Engine VMs. Installing and managing a third-party database involves effort and cost for operations like applying updates, monitoring and ensuring availability, performing backups, and recovering from failures.

You can avoid the effort and cost of installing and managing a third-party database by using a fully managed database service like Cloud SQL, AlloyDB for PostgreSQL, Bigtable, Spanner, or Firestore. These Google Cloud database services provide uptime service-level agreements (SLAs), and they include default capabilities for scalability and observability.

If your workload needs an Oracle database, you can deploy the database on a Compute Engine VM or use Oracle Database@Google Cloud. For more information, see Oracle workloads in Google Cloud.

Network design

Choose a network design that meets your business and technical requirements. You can use a single VPC network or multiple VPC networks. For more information, see Deciding whether to create multiple VPC networks and Best practices and reference architectures for VPC design.

Security, privacy, and compliance

This section describes factors that you should consider when you use this reference architecture to design and build a regional topology in Google Cloud that meets the security, privacy, and compliance requirements of your workloads.

Protection against external threats

To protect your application against threats like distributed denial-of-service (DDoS) attacks and cross-site scripting (XSS), you can use Google Cloud Armor security policies. Each policy is a set of rules that specifies certain conditions that should be evaluated and actions to take when the conditions are met. For example, a rule could specify that if the source IP address of the incoming traffic matches a specific IP address or CIDR range, then the traffic must be denied. You can also apply preconfigured web application firewall (WAF) rules. For more information, see Security policy overview.
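
For example, the following sketch adds a deny rule for a specific CIDR range to an existing Google Cloud Armor security policy by using the google-cloud-compute client library. The policy name, priority, and IP range are hypothetical placeholders.

```python
from google.cloud import compute_v1

project = "my-project"

# Deny traffic from a hypothetical abusive CIDR range.
rule = compute_v1.SecurityPolicyRule(
    priority=1000,
    action="deny(403)",
    description="Block traffic from an abusive range",
    match=compute_v1.SecurityPolicyRuleMatcher(
        versioned_expr="SRC_IPS_V1",
        config=compute_v1.SecurityPolicyRuleMatcherConfig(
            src_ip_ranges=["203.0.113.0/24"],
        ),
    ),
)

compute_v1.SecurityPoliciesClient().add_rule(
    project=project,
    security_policy="web-tier-policy",  # Assumed to exist and be attached to the backend service.
    security_policy_rule_resource=rule,
).result()
```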

External access for VMs

In the reference architecture that this document describes, the Compute Engine VMs don't need inbound access from the internet. Don't assign external IP addresses to the VMs. Google Cloud resources that have only a private, internal IP address can still access certain Google APIs and services by using Private Service Connect or Private Google Access. For more information, see Private access options for services.

To enable secure outbound connections from Google Cloud resources that have only private IP addresses, like the Compute Engine VMs in this reference architecture, you can use Secure Web Proxy or Cloud NAT.

Service account privileges

For the Compute Engine VMs in the architecture, instead of using the default service accounts, we recommend that you create dedicated service accounts and specify the resources that the service account can access. The default service account has a broad range of permissions, including some that might not be necessary. You can tailor dedicated service accounts to have only the essential permissions. For more information, see Limit service account privileges.

SSH security

To enhance the security of SSH connections to the Compute Engine VMs in your architecture, implement Identity-Aware Proxy (IAP) and Cloud OS Login API. IAP lets you control network access based on user identity and Identity and Access Management (IAM) policies. Cloud OS Login API lets you control Linux SSH access based on user identity and IAM policies. For more information about managing network access, see Best practices for controlling SSH login access.

Network security

To control network traffic between the resources in the architecture, you must configure appropriate Cloud Next Generation Firewall (NGFW) policies.

More security considerations

When you build the architecture for your workload, consider the platform-level security best practices and recommendations that are provided in the Enterprise foundations blueprint and Google Cloud Well-Architected Framework: Security, privacy, and compliance.

Reliability

This section describes design factors that you should consider when you use this reference architecture to build and operate reliable infrastructure for your regional deployments in Google Cloud.

Infrastructure outages

In a regional architecture, if any individual component in the infrastructure stack fails, the application can process requests if at least one functioning component with adequate capacity exists in each tier. For example, if a web server instance fails, the load balancer forwards user requests to the other available web server instances. If a VM that hosts a web server or app server instance crashes, the MIG recreates the VM automatically.

If a zone outage occurs, the load balancer isn't affected, because it's a regional resource. A zone outage might affect individual Compute Engine VMs. But the application remains available and responsive because the VMs are in a regional MIG. A regional MIG ensures that new VMs are created automatically to maintain the configured minimum number of VMs. After Google resolves the zone outage, you must verify that the application runs as expected in all the zones where it's deployed.

If all the zones in this architecture have an outage or if a region outage occurs, then the application is unavailable. You must wait for Google to resolve the outage, and then verify that the application works as expected.

You can reduce the downtime caused by region outages by maintaining a passive (failover) replica of the infrastructure stack in another Google Cloud region. If an outage occurs in the primary region, you can activate the stack in the failover region and use DNS routing policies to route traffic to the load balancer in the failover region.

For applications that require robustness against region outages, consider using a multi-regional architecture. For more information, see Multi-regional deployment on Compute Engine.

MIG autoscaling

To control the autoscaling behavior of your stateless MIGs, you can specify target utilization metrics, such as average CPU utilization. You can also configure schedule-based autoscaling for stateless MIGs. Stateful MIGs can't be autoscaled. For more information, see Autoscaling groups of instances.
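
As a minimal sketch, the following Python code (google-cloud-compute) attaches an autoscaler with a target CPU utilization to a regional MIG. The names, replica limits, and 60% target are hypothetical values that you would tune for your workload.

```python
from google.cloud import compute_v1

project = "my-project"
region = "us-central1"

autoscaler = compute_v1.Autoscaler(
    name="web-tier-autoscaler",
    target=f"projects/{project}/regions/{region}/instanceGroupManagers/web-tier-mig",
    autoscaling_policy=compute_v1.AutoscalingPolicy(
        min_num_replicas=3,
        max_num_replicas=12,
        cpu_utilization=compute_v1.AutoscalingPolicyCpuUtilization(
            utilization_target=0.6,  # Scale out above roughly 60% average CPU.
        ),
    ),
)

compute_v1.RegionAutoscalersClient().insert(
    project=project, region=region, autoscaler_resource=autoscaler
).result()
```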

MIG size limit

When you decide the size of your MIGs, consider the default and maximum limits on the number of VMs that can be created in a MIG. For more information, see Add and remove VMs from a MIG.

VM autohealing

Sometimes the VMs that host your application might be running and available, but there might be issues with the application itself. The application might freeze, crash, or not have sufficient memory. To verify whether an application is responding as expected, you can configure application-based health checks as part of the autohealing policy of your MIGs. If the application on a particular VM isn't responding, the MIG autoheals (repairs) the VM. For more information about configuring autohealing, see About repairing VMs for high availability.
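
The following sketch shows one way to configure application-based autohealing: it creates an HTTP health check and attaches it to a regional MIG's autohealing policy by using the google-cloud-compute client library. The /healthz path, port, thresholds, and resource names are hypothetical.

```python
from google.cloud import compute_v1

project = "my-project"
region = "us-central1"

# Application-level health check (assumes the app serves /healthz on port 8080).
health_check = compute_v1.HealthCheck(
    name="app-tier-healthz",
    type_="HTTP",
    check_interval_sec=10,
    timeout_sec=5,
    unhealthy_threshold=3,
    http_health_check=compute_v1.HTTPHealthCheck(port=8080, request_path="/healthz"),
)
compute_v1.HealthChecksClient().insert(
    project=project, health_check_resource=health_check
).result()

# Attach the health check as the MIG's autohealing policy.
mig_patch = compute_v1.InstanceGroupManager(
    auto_healing_policies=[
        compute_v1.InstanceGroupManagerAutoHealingPolicy(
            health_check=f"projects/{project}/global/healthChecks/app-tier-healthz",
            initial_delay_sec=120,  # Give the application time to start before repairs.
        )
    ]
)
compute_v1.RegionInstanceGroupManagersClient().patch(
    project=project,
    region=region,
    instance_group_manager="app-tier-mig",
    instance_group_manager_resource=mig_patch,
).result()
```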

VM placement

In the architecture that this document describes, the application tier and web tier run on Compute Engine VMs that are distributed across multiple zones. This distribution ensures that your application is robust against zone outages.

To improve the robustness of the architecture, you can create a spread placement policy and apply it to the MIG template. When the MIG creates VMs, it places the VMs within each zone on different physical servers (called hosts), so your VMs are robust against failures of individual hosts. For more information, see Create and apply spread placement policies to VMs.
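
As an illustration, the following sketch creates a spread placement policy with the google-cloud-compute client library; you would then reference the policy from the instance template that the MIG uses. The policy name and availability-domain count are hypothetical.

```python
from google.cloud import compute_v1

project = "my-project"
region = "us-central1"

# Spread placement policy: place VMs on distinct hosts (availability domains).
policy = compute_v1.ResourcePolicy(
    name="web-tier-spread",
    group_placement_policy=compute_v1.ResourcePolicyGroupPlacementPolicy(
        availability_domain_count=3,
    ),
)

compute_v1.ResourcePoliciesClient().insert(
    project=project, region=region, resource_policy_resource=policy
).result()
# Reference the policy in the instance template's resource policies so that
# the MIG applies it to the VMs that it creates.
```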

VM capacity planning

To make sure that capacity for Compute Engine VMs is available when VMs need to be provisioned, you can create reservations. A reservation provides assured capacity in a specific zone for a specified number of VMs of a machine type that you choose. A reservation can be specific to a project, or shared across multiple projects. For more information about reservations, see Choose a reservation type.

Stateful storage

A best practice in application design is to avoid the need for stateful local disks. But if the requirement exists, you can configure your persistent disks to be stateful to ensure that the data is preserved when the VMs are repaired or recreated. However, we recommend that you keep the boot disks stateless, so that you can update them to the latest images with new versions and security patches. For more information, see Configuring stateful persistent disks in MIGs.

Data durability

You can use Backup and DR to create, store, and manage backups of the Compute Engine VMs. Backup and DR stores backup data in its original, application-readable format. When required, you can restore your workloads to production by directly using data from long-term backup storage and avoid the need to prepare or move data.

Compute Engine provides several options, such as snapshots and regional Persistent Disk volumes, to help you to ensure the durability of data that's stored in Persistent Disk volumes.

If you use a managed database service like Cloud SQL, backups are taken automatically based on the retention policy that you define. You can supplement the backup strategy with additional logical backups to meet regulatory, workflow, or business requirements.

If you use a third-party database and you need to store database backups and transaction logs, you can use regional Cloud Storage buckets. Regional Cloud Storage buckets provide low-cost backup storage that's redundant across zones.
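
For example, a backup job on the database VM could upload backup files and archived transaction logs to a regional bucket with the google-cloud-storage client library. The bucket, object, and file names below are hypothetical.

```python
from google.cloud import storage

client = storage.Client(project="my-project")

# Regional bucket that's assumed to exist in the same region as the database VM.
bucket = client.bucket("my-db-backups-us-central1")

# Upload a base backup and an archived transaction log.
bucket.blob("postgres/base/base.tar.gz").upload_from_filename(
    "/var/backups/postgres/base.tar.gz"
)
bucket.blob("postgres/wal/000000010000000000000042").upload_from_filename(
    "/var/lib/postgresql/archive/000000010000000000000042"
)
```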

Database availability

If you use a managed database service like Cloud SQL in HA configuration, then in the event of a failure of the primary database, Cloud SQL fails over automatically to the standby database. You don't need to change the IP address for the database endpoint. If you use a self-managed third-party database that's deployed on a Compute Engine VM, then you must use an internal load balancer or other mechanism to ensure that the application can connect to another database if the primary database is unavailable.

To implement cross-zone failover for a database that's deployed on a Compute Engine VM, you need a mechanism to identify failures of the primary database and a process to fail over to the standby database. The specifics of the failover mechanism depend on the database that you use. You can set up an observer instance to detect failures of the primary database and orchestrate the failover. You must configure the failover rules appropriately to avoid a split-brain situation and prevent unnecessary failover. For example architectures that you can use to implement failover for PostgreSQL databases, see Architectures for high availability of PostgreSQL clusters on Compute Engine.
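
The following sketch outlines, at a conceptual level only, the control loop that an observer instance might run. The probe, fencing, and promotion steps are placeholders because they depend entirely on the database and tooling that you use; for PostgreSQL, tools such as Patroni or pg_auto_failover implement this logic for you.

```python
import time

# Conceptual sketch only: thresholds and steps are hypothetical.
FAILURE_THRESHOLD = 3       # Consecutive failed checks before failover.
CHECK_INTERVAL_SEC = 10

def primary_is_healthy() -> bool:
    """Placeholder: probe the primary (for example, run a lightweight query)."""
    raise NotImplementedError

def standby_is_current() -> bool:
    """Placeholder: confirm the standby has applied recent transaction logs."""
    raise NotImplementedError

def fence_primary() -> None:
    """Placeholder: stop or isolate the old primary to prevent split-brain."""
    raise NotImplementedError

def promote_standby() -> None:
    """Placeholder: promote the standby and repoint the internal load balancer."""
    raise NotImplementedError

failures = 0
while True:
    failures = 0 if primary_is_healthy() else failures + 1
    if failures >= FAILURE_THRESHOLD and standby_is_current():
        fence_primary()      # Fence first so two primaries never accept writes.
        promote_standby()
        break
    time.sleep(CHECK_INTERVAL_SEC)
```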

More reliability considerations

When you build the cloud architecture for your workload, review the reliability-related best practices and recommendations that are provided in Google Cloud Well-Architected Framework: Reliability.

Cost optimization

This section provides guidance to optimize the cost of setting up and operating a regional Google Cloud topology that you build by using this reference architecture.

VM machine types

To help you optimize the resource utilization of your VM instances, Compute Engine provides machine type recommendations. Use the recommendations to choose machine types that match your workload's compute requirements. For workloads with predictable resource requirements, you can customize the machine type to your needs and save money by using custom machine types.

VM provisioning model

If your application is fault tolerant, then Spot VMs can help to reduce your Compute Engine costs for the VMs in the application and web tiers. The cost of Spot VMs is significantly lower than regular VMs. However, Compute Engine might preemptively stop or delete Spot VMs to reclaim capacity.

Spot VMs are suitable for batch jobs that can tolerate preemption and don't have high availability requirements. Spot VMs offer the same machine types, options, and performance as regular VMs. However, when the resource capacity in a zone is limited, MIGs might not be able to scale out (that is, create VMs) automatically to the specified target size until the required capacity becomes available again.
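
The following sketch shows how the Spot provisioning model is expressed when you define a VM (or an instance template) with the google-cloud-compute client library. The instance name, machine type, and termination action are hypothetical, and required fields such as disks and network interfaces are omitted.

```python
from google.cloud import compute_v1

# Scheduling settings that request the Spot provisioning model.
spot_scheduling = compute_v1.Scheduling(
    provisioning_model="SPOT",
    instance_termination_action="STOP",  # Or "DELETE", depending on the workload.
)

instance = compute_v1.Instance(
    name="batch-worker-1",
    machine_type="zones/us-central1-a/machineTypes/e2-standard-4",
    scheduling=spot_scheduling,
    # Disks, network interfaces, and other required fields are omitted here.
)
```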

VM resource utilization

The autoscaling capability of stateless MIGs enables your application to handle increases in traffic gracefully, and it helps you to reduce cost when the need for resources is low. Stateful MIGs can't be autoscaled.

Third-party licensing

When you migrate third-party workloads to Google Cloud, you might be able to reduce cost by bringing your own licenses (BYOL). For example, to deploy Microsoft Windows Server VMs, instead of using a premium image that incurs additional cost for the third-party license, you can create and use a custom Windows BYOL image. You then pay only for the VM infrastructure that you use on Google Cloud. This strategy helps you continue to realize value from your existing investments in third-party licenses.

If you decide to use the BYOL approach, then the following recommendations might help to reduce cost:

  • Provision the required number of compute CPU cores independently of memory by using custom machine types. By doing this, you limit the third-party licensing cost to the number of CPU cores that you need.
  • Reduce the number of vCPUs per core from 2 to 1 by disabling simultaneous multithreading (SMT).

If you deploy a third-party database like Microsoft SQL Server on Compute Engine VMs, then you must consider the license costs for the third-party software. When you use a managed database service like Cloud SQL, the database license costs are included in the charges for the service.

More cost considerations

When you build the architecture for your workload, also consider the general best practices and recommendations that are provided in Google Cloud Well-Architected Framework: Cost optimization.

Operational efficiency

This section describes the factors that you should consider when you use this reference architecture to design and build a regional Google Cloud topology that you can operate efficiently.

VM configuration updates

To update the configuration of the VMs in a MIG (such as the machine type or boot-disk image), you create a new instance template with the required configuration and then apply the new template to the MIG. The MIG updates the VMs by using the update method that you choose: automatic or selective. Choose an appropriate method based on your requirements for availability and operational efficiency. For more information about these MIG update methods, see Apply new VM configurations in a MIG.
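
As a minimal sketch, the following Python code (google-cloud-compute) points a regional MIG at a new instance template and requests automatic (proactive) updates. The project, region, MIG, and template names are hypothetical.

```python
from google.cloud import compute_v1

project = "my-project"
region = "us-central1"

# Point the MIG at a new instance template and request automatic (proactive) updates.
mig_patch = compute_v1.InstanceGroupManager(
    instance_template=f"projects/{project}/global/instanceTemplates/web-tier-template-v2",
    update_policy=compute_v1.InstanceGroupManagerUpdatePolicy(type_="PROACTIVE"),
)

compute_v1.RegionInstanceGroupManagersClient().patch(
    project=project,
    region=region,
    instance_group_manager="web-tier-mig",
    instance_group_manager_resource=mig_patch,
).result()
```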

VM images

For your VMs, instead of using Google-provided public images, we recommend that you create and use custom OS images that contain the configurations and software that your applications require. You can group your custom images into a custom image family. An image family always points to the most recent image in that family, so your instance templates and scripts can use that image without you having to update references to a specific image version. You must regularly update your custom images to include the security updates and patches that are provided by the OS vendor.

Deterministic instance templates

If the instance templates that you use for your MIGs include startup scripts to install third-party software, make sure that the scripts explicitly specify software-installation parameters such as the software version. Otherwise, when the MIG creates the VMs, the software that's installed on the VMs might not be consistent. For example, if your instance template includes a startup script to install Apache HTTP Server 2.0 (the apache2 package), then make sure that the script specifies the exact apache2 version that should be installed, such as version 2.4.53. For more information, see Deterministic instance templates.
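
For example, the startup script could be embedded in the instance template's metadata with the package version pinned. The script and version string below are hypothetical; use an exact version that exists in your distribution's package repository.

```python
from google.cloud import compute_v1

# Hypothetical startup script that pins the package version so that every VM
# the MIG creates gets an identical installation.
STARTUP_SCRIPT = """#!/bin/bash
set -euo pipefail
apt-get update
# Replace with the exact version string available in your package repository.
apt-get install -y apache2=2.4.53-1~deb11u1
systemctl enable --now apache2
"""

metadata = compute_v1.Metadata(
    items=[compute_v1.Items(key="startup-script", value=STARTUP_SCRIPT)]
)
# Assign `metadata` to the instance template's properties when you create the template.
```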

More operational considerations

When you build the architecture for your workload, consider the general best practices and recommendations for operational efficiency that are described in Google Cloud Well-Architected Framework: Operational excellence.

Performance optimization

This section describes the factors that you should consider when you use this reference architecture to design and build a regional topology in Google Cloud that meets the performance requirements of your workloads.

Compute performance

Compute Engine offers a wide range of predefined and customizable machine types for the workloads that you run on VMs. Choose an appropriate machine type based on your performance requirements. For more information, see Machine families resource and comparison guide.

VM multithreading

Each virtual CPU (vCPU) that you allocate to a Compute Engine VM is implemented as a single hardware multithread. By default, two vCPUs share a physical CPU core. For applications that involve highly parallel operations or that perform floating point calculations (such as genetic sequence analysis and financial risk modeling), you can improve performance by reducing the number of threads that run on each physical CPU core. For more information, see Set the number of threads per core.
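
The following sketch shows how the threads-per-core setting is expressed when you define a VM with the google-cloud-compute client library. The instance name and machine type are hypothetical, and other required fields are omitted.

```python
from google.cloud import compute_v1

# Run one thread per physical core (disables simultaneous multithreading).
features = compute_v1.AdvancedMachineFeatures(threads_per_core=1)

instance = compute_v1.Instance(
    name="hpc-node-1",
    machine_type="zones/us-central1-a/machineTypes/c2-standard-16",
    advanced_machine_features=features,
    # Disks, network interfaces, and other required fields are omitted here.
)
```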

VM multithreading might have licensing implications for some third-party software, like databases. For more information, read the licensing documentation for the third-party software.

Network Service Tiers

Network Service Tiers lets you optimize the network cost and performance of your workloads. You can choose Premium Tier or Standard Tier. Premium Tier delivers traffic on Google's global backbone to achieve minimal packet loss and low latency. Standard Tier delivers traffic using peering, internet service providers (ISP), or transit networks at an edge point of presence (PoP) that's closest to the region where your Google Cloud workload runs. To optimize performance, we recommend using Premium Tier. To optimize cost, we recommend using Standard Tier.

Network performance

For workloads that need low inter-VM network latency within the application and web tiers, you can create a compact placement policy and apply it to the MIG template that's used for those tiers. When the MIG creates VMs, it places the VMs on physical servers that are close to each other. While a compact placement policy helps improve inter-VM network performance, a spread placement policy can help improve VM availability as described earlier. To achieve an optimal balance between network performance and availability, when you create a compact placement policy, you can specify how far apart the VMs must be placed. For more information, see Placement policies overview.

Compute Engine has a per-VM limit for egress network bandwidth. This limit depends on the VM's machine type and whether traffic is routed through the same VPC network as the source VM. For VMs with certain machine types, to improve network performance, you can get a higher maximum egress bandwidth by enabling Tier_1 networking.
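
As an illustration, the following sketch sets the Tier_1 egress bandwidth tier on a VM definition by using the google-cloud-compute client library. The instance name and machine type are hypothetical; Tier_1 networking is available only on certain machine types and also requires a gVNIC network interface.

```python
from google.cloud import compute_v1

# Request the higher egress bandwidth tier on a supported machine type.
perf = compute_v1.NetworkPerformanceConfig(total_egress_bandwidth_tier="TIER_1")

instance = compute_v1.Instance(
    name="app-node-1",
    machine_type="zones/us-central1-a/machineTypes/n2-standard-32",
    network_performance_config=perf,
    # Disks and a gVNIC network interface (required for Tier_1) are omitted here.
)
```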

More performance considerations

When you build the architecture for your workload, consider the general best practices and recommendations that are provided in Google Cloud Well-Architected Framework: Performance optimization.

