Single-zone deployment on Compute Engine
This document provides a reference architecture for a multi-tier application that runs on Compute Engine VMs in a single zone in Google Cloud. You can use this reference architecture to efficiently rehost (lift and shift) on-premises applications to the cloud with minimal changes to the applications. The document also describes the design factors that you should consider when you build a zonal architecture for your cloud applications. The intended audience for this document is cloud architects.
Architecture
The following diagram shows an architecture for an application that runs in a single Google Cloud zone. This architecture is aligned with the Google Cloud zonal deployment archetype.
The architecture is based on the infrastructure as a service (IaaS) cloud model. You provision the required infrastructure resources (compute, networking, and storage) in Google Cloud. You retain full control over the infrastructure and responsibility for the operating system, middleware, and higher layers of the application stack. To learn more about IaaS and other cloud models, see PaaS vs. IaaS vs. SaaS vs. CaaS: How are they different?
The preceding diagram includes the following components:
| Component | Purpose |
|---|---|
| Regional external load balancer | The regional external load balancer receives and distributes user requests to the web tier VMs. |
| Zonal managed instance group (MIG) for the web tier | The web tier of the application is deployed on Compute Engine VMs that are part of a zonal MIG. The MIG is the backend for the regional external load balancer. Each VM in the MIG hosts an independent instance of the web tier of the application. |
| Regional internal load balancer | The regional internal load balancer distributes traffic from the web tier VMs to the application tier VMs. |
| Zonal MIG for the application tier | The application tier is deployed on Compute Engine VMs that are part of a zonal MIG, which is the backend for the internal load balancer. Each VM in the MIG hosts an independent instance of the application tier. |
| Third-party database deployed on a Compute Engine VM | The architecture in this document shows a third-party database (like PostgreSQL) that's deployed on a Compute Engine VM. You can deploy a standby database in another zone. The database replication and failover capabilities depend on the database that you use. Installing and managing a third-party database involves additional effort and operational cost for applying updates, monitoring, and ensuring availability. You can avoid the overhead of installing and managing a third-party database and take advantage of built-in high availability (HA) features by using a fully managed database service like Cloud SQL or AlloyDB for PostgreSQL. For more information about managed database options, see Database services. |
| Virtual Private Cloud network and subnet | All the Google Cloud resources in the architecture use a single VPC network and subnet. Depending on your requirements, you can choose to build an architecture that uses multiple VPC networks or multiple subnets. For more information, see Deciding whether to create multiple VPC networks. |
| Cloud Storage regional bucket | Application and database backups are stored in a regional Cloud Storage bucket. If a zone outage occurs, your application and data aren't lost. Alternatively, you can use Backup and DR Service to create, store, and manage the database backups. |
Products used
This reference architecture uses the following Google Cloud products:
- Compute Engine: A secure and customizable compute service that lets you create and run VMs on Google's infrastructure.
- Cloud Load Balancing: A portfolio of high performance, scalable, global and regional load balancers.
- Cloud Storage: A low-cost, no-limit object store for diverse data types. Data can be accessed from within and outside Google Cloud, and it's replicated across locations for redundancy.
- Virtual Private Cloud (VPC): A virtual system that provides global, scalable networking functionality for your Google Cloud workloads. VPC includes VPC Network Peering, Private Service Connect, private services access, and Shared VPC.
Use cases
This section describes use cases for which a single-zone deployment on Compute Engine is an appropriate choice.
- Cloud development and testing: You can use a single-zone deployment architecture to build a low-cost cloud environment for development and testing.
- Applications that don't need HA: A single-zone architecture might be sufficient for applications that can tolerate downtime due to infrastructure outages.
- Low-latency, low-cost networking between application components: A single-zone architecture might be well suited for applications such as batch computing that need low-latency and high-bandwidth network connections among the compute nodes. With a single-zone deployment, there's no cross-zone network traffic, and you don't incur costs for intra-zone traffic.
- Migration of commodity workloads: The zonal deployment architecture provides a cloud-migration path for commodity on-premises applications for which you have no control over the code or that can't support architectures beyond a basic active-passive topology.
- Running license-restricted software: A single-zone architecture might be well suited for license-restricted systems where running more than one instance at a time is either too expensive or isn't permitted.
Design considerations
This section provides guidance to help you use this reference architecture to develop an architecture that meets your specific requirements for system design, security, reliability, operational efficiency, cost, and performance.
Note: The guidance in this section isn't exhaustive. Depending on the specific requirements of your application and the Google Cloud products and features that you use, there might be additional design factors and trade-offs that you should consider. When you build the architecture for your workload, consider best practices and recommendations in the Google Cloud Well-Architected Framework.
System design
This section provides guidance to help you choose Google Cloud regions and zones for your zonal deployment and select appropriate Google Cloud services.
Region selection
When you choose the Google Cloud regions where your applications must be deployed, consider the following factors and requirements:
- Availability of Google Cloud services in each region. For more information, see Products available by location.
- Availability of Compute Engine machine types in each region. For more information, see Regions and zones.
- End-user latency requirements.
- Cost of Google Cloud resources.
- Cross-regional data transfer costs.
- Regulatory requirements.
- Sustainability requirements.
Some of these factors and requirements might involve trade-offs. For example, the most cost-efficient region might not have the lowest carbon footprint. For more information, see Best practices for Compute Engine regions selection.
Compute infrastructure
The reference architecture in this document uses Compute Engine VMs for certain tiers of the application. Depending on the requirements of your application, you can choose from other Google Cloud compute services:
- Containers: You can run containerized applications in Google Kubernetes Engine (GKE) clusters. GKE is a container-orchestration engine that automates deploying, scaling, and managing containerized applications.
- Serverless: If you prefer to focus your IT efforts on your data and applications instead of setting up and operating infrastructure resources, then you can use serverless services like Cloud Run.
The decision of whether to use VMs, containers, or serverless services involves a trade-off between configuration flexibility and management effort. VMs and containers provide more configuration flexibility, but you're responsible for managing the resources. In a serverless architecture, you deploy workloads to a preconfigured platform that requires minimal management effort. For more information about choosing appropriate compute services for your workloads in Google Cloud, see Hosting Applications on Google Cloud.
Storage services
The architecture shown in this document uses zonal Persistent Disk volumes for all the tiers. For more durable persistent storage, you can use regional Persistent Disk volumes, which provide synchronous replication of data across two zones within a region.
Google Cloud Hyperdisk provides better performance, flexibility, and efficiency than Persistent Disk. With Hyperdisk Balanced, you can provision IOPS and throughput separately and dynamically, which lets you tune the volume to a wide variety of workloads.
For low-cost storage that's replicated across multiple locations, you can use Cloud Storage regional, dual-region, or multi-region buckets.
- Data in regional buckets is replicated synchronously across the zones in the region.
- Data in dual-region or multi-region buckets is stored redundantly in at least two separate geographic locations. Metadata is written synchronously across regions, and data is replicated asynchronously. For dual-region buckets, you can use turbo replication, which ensures that objects are replicated across region pairs, with a recovery point objective (RPO) of 15 minutes. For more information, see Data availability and durability.
To store data that's shared across multiple VMs in a region, such as across all the VMs in the web tier or application tier, you can use a Filestore regional instance. The data that you store in a Filestore regional instance is replicated synchronously across three zones within the region. This replication ensures high availability and robustness against zone outages. You can store shared configuration files, common tools and utilities, and centralized logs in the Filestore instance, and mount the instance on multiple VMs. For robustness against region outages, you can replicate a Filestore instance to a different region. For more information, see Instance replication.
If your database is Microsoft SQL Server, we recommend using Cloud SQL for SQL Server. If Cloud SQL doesn't support your configuration requirements, or if you need access to the operating system, you can deploy a Microsoft SQL Server failover cluster instance (FCI). In this scenario, you can use the fully managed Google Cloud NetApp Volumes to provide continuous availability (CA) SMB storage for the database.
When you design storage for your workloads, consider the functional characteristics, resilience requirements, performance expectations, and cost goals. For more information, see Design an optimal storage strategy for your cloud workload.
Database services
The reference architecture in this document uses a third-party database that's deployed on Compute Engine VMs. Installing and managing a third-party database involves effort and cost for operations like applying updates, monitoring and ensuring availability, performing backups, and recovering from failures.
You can avoid the effort and cost of installing and managing a third-party database by using a fully managed database service like Cloud SQL, AlloyDB for PostgreSQL, Bigtable, Spanner, or Firestore. These Google Cloud database services provide uptime service-level agreements (SLAs), and they include default capabilities for scalability and observability.
If your workload needs an Oracle database, you can deploy the database on a Compute Engine VM or use Oracle Database@Google Cloud. For more information, see Oracle workloads in Google Cloud.
Network design
Choose a network design that meets your business and technical requirements. You can use a single VPC network or multiple VPC networks. For more information, see the following documentation:
- Deciding whether to create multiple VPC networks
- Decide the network design for your Google Cloud landing zone
Security, privacy, and compliance
This section describes factors that you should consider when you use this reference architecture to design and build a zonal topology in Google Cloud that meets the security and compliance requirements of your workloads.
Protection against external threats
To protect your application against threats like distributed denial-of-service (DDoS) attacks and cross-site scripting (XSS), you can use Google Cloud Armor security policies. Each policy is a set of rules that specify conditions to evaluate and the actions to take when the conditions are met. For example, a rule could specify that if the source IP address of the incoming traffic matches a specific IP address or CIDR range, then the traffic must be denied. You can also apply preconfigured web application firewall (WAF) rules. For more information, see Security policy overview.
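The rule example in the preceding paragraph can be modeled as a small sketch. This isn't the Google Cloud Armor API. The `Rule` class, the `evaluate` function, the priorities, the default action, and the IP ranges (taken from the documentation-reserved blocks) are all illustrative assumptions. The sketch only shows the idea of evaluating rules in priority order, where a lower number is evaluated first, and applying the action of the first rule whose CIDR range contains the client address.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    priority: int   # lower number is evaluated first
    src_range: str  # CIDR range compared against the client IP address
    action: str     # "allow" or "deny"


def evaluate(rules, client_ip, default_action="allow"):
    """Return the action of the first matching rule, in priority order."""
    ip = ipaddress.ip_address(client_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if ip in ipaddress.ip_network(rule.src_range):
            return rule.action
    # No rule matched, so fall back to the (assumed) default action.
    return default_action


# Hypothetical policy: deny one range and one single address.
policy = [
    Rule(priority=1000, src_range="203.0.113.0/24", action="deny"),
    Rule(priority=2000, src_range="198.51.100.7/32", action="deny"),
]
```

With this hypothetical policy, a request from `203.0.113.50` is denied, and a request from an address that no rule matches falls through to the default action.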
External access for VMs
In the reference architecture that this document describes, the Compute Engine VMs don't need inbound access from the internet. Don't assign external IP addresses to the VMs. Google Cloud resources that have only a private, internal IP address can still access certain Google APIs and services by using Private Service Connect or Private Google Access. For more information, see Private access options for services.
To enable secure outbound connections from Google Cloud resources that have only private IP addresses, like the Compute Engine VMs in this reference architecture, you can use Secure Web Proxy or Cloud NAT.
Service account privileges
For the Compute Engine VMs in the architecture, instead of using the default service accounts, we recommend that you create dedicated service accounts and specify the resources that each service account can access. The default service account has a broad range of permissions, including some that might not be necessary. You can tailor dedicated service accounts to have only the essential permissions. For more information, see Limit service account privileges.
SSH security
To enhance the security of SSH connections to the Compute Engine VMs in your architecture, implement Identity-Aware Proxy (IAP) and Cloud OS Login API. IAP lets you control network access based on user identity and Identity and Access Management (IAM) policies. Cloud OS Login API lets you control Linux SSH access based on user identity and IAM policies. For more information about managing network access, see Best practices for controlling SSH login access.
Network security
To control network traffic between the resources in the architecture, you must configure appropriate Cloud Next Generation Firewall (NGFW) policies.
Each firewall rule lets you control traffic based on parameters like the protocol, IP address, and port. For example, you can configure a firewall rule to allow TCP traffic from the web server VMs to a specific port of the database VMs, and block all other traffic.
More security considerations
When you build the architecture for your workload, consider the platform-level security best practices and recommendations that are provided in the Enterprise foundations blueprint and Google Cloud Well-Architected Framework: Security, privacy, and compliance.
Reliability
This section describes design factors that you should consider when you use this reference architecture to build and operate reliable infrastructure for your zonal deployments in Google Cloud.
Robustness against infrastructure outages
In a single-zone deployment architecture, if any component in the infrastructure stack fails, the application can process requests if each tier contains at least one functioning component with adequate capacity. For example, if a web server instance fails, the load balancer forwards user requests to the other available web server instances. If a VM that hosts a web server or app server instance crashes, the MIG recreates the VM automatically. If the database crashes, you must manually activate the standby database and update the app server instances to connect to it.
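The condition described above (every tier needs at least one functioning component with adequate capacity) can be expressed as a short check. The following is a minimal sketch with hypothetical tier names and instance counts. It doesn't query any Google Cloud API, and the "required" minimums are assumed values that you would derive from your own capacity planning.

```python
def tiers_below_capacity(healthy, required):
    """Return the tiers that can't serve traffic.

    healthy:  tier name -> number of instances currently passing health checks
    required: tier name -> minimum instances needed for adequate capacity
    """
    return sorted(
        tier
        for tier, minimum in required.items()
        # A tier always needs at least one functioning instance.
        if healthy.get(tier, 0) < max(minimum, 1)
    )


def can_process_requests(healthy, required):
    """The application can serve requests only if no tier is below capacity."""
    return not tiers_below_capacity(healthy, required)


# Hypothetical capacity plan and two failure scenarios.
required = {"web": 2, "app": 2, "database": 1}
after_vm_crash = {"web": 2, "app": 3, "database": 1}
after_db_crash = {"web": 3, "app": 3, "database": 0}
```

In the first scenario the web tier lost a VM but still meets its minimum, so the application keeps serving. In the second scenario the database tier has no functioning instance, which matches the case where you must activate the standby database manually.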
A zone outage or region outage affects all the Compute Engine VMs in a single-zone deployment. A zone outage doesn't affect the load balancer in this architecture because it's a regional resource. However, the load balancer can't distribute traffic, because there are no available backends. If a zone or region outage occurs, you must wait for Google to resolve the outage, and then verify that the application works as expected.
You can reduce the downtime caused by zone or region outages by maintaining a passive (failover) replica of the infrastructure stack in another Google Cloud zone or region. If an outage occurs in the primary zone, you can activate the stack in the failover zone or region, and use a DNS routing policy to route traffic to the load balancer in the failover zone or region.
For applications that require robustness against zone or region outages, consider using a regional or multi-regional architecture. See the following reference architectures:
MIG autoscaling
The autoscaling capability of stateless MIGs lets you maintain application availability and performance at predictable levels.
To control the autoscaling behavior of your stateless MIGs, you can specify target utilization metrics, such as average CPU utilization. You can also configure schedule-based autoscaling for stateless MIGs. Stateful MIGs can't be autoscaled. For more information, see Autoscaling groups of instances.
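To illustrate how a target utilization metric translates into a group size, the following sketch applies simple proportional scaling: the group is resized so that the observed load, spread across the new number of VMs, lands at or below the target. This is an approximation of the idea and not the exact algorithm that the Compute Engine autoscaler uses. The function name, the percentage-based inputs, and the size limits are assumptions for illustration.

```python
def recommended_size(current_size, observed_pct, target_pct, min_size, max_size):
    """Estimate the number of VMs needed to bring utilization to the target.

    observed_pct and target_pct are whole-number percentages (for example, 90
    and 60). Integer arithmetic avoids floating-point rounding surprises.
    """
    if not 0 < target_pct <= 100:
        raise ValueError("target_pct must be in the range 1-100")
    total_load = current_size * observed_pct
    # Ceiling division: round up so utilization doesn't exceed the target.
    needed = (total_load + target_pct - 1) // target_pct
    # Clamp the result to the configured group limits.
    return max(min_size, min(max_size, needed))
```

For example, four VMs at 90% average CPU utilization with a 60% target need `4 * 90 / 60 = 6` VMs. When the load drops to 30%, the same calculation gives 2 VMs, and the configured minimum and maximum sizes bound the result in both directions.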
MIG size limit
When you decide the size of your MIGs, consider the default and maximum limits on the number of VMs that can be created in a MIG. For more information, see Add and remove VMs from a MIG.
VM autohealing
Sometimes the VMs that host your application might be running and available, but there might be issues with the application itself. The application might freeze, crash, or run out of memory. To verify whether an application is responding as expected, you can configure application-based health checks as part of the autohealing policy of your MIGs. If the application on a particular VM isn't responding, the MIG autoheals (repairs) the VM. For more information about configuring autohealing, see About repairing VMs for high availability.
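Health checks typically don't react to a single failed probe. They change a VM's state only after several consecutive results disagree with the current state. The following sketch models that threshold behavior. The function name and the threshold values are assumptions for illustration, and the code doesn't call any Google Cloud API.

```python
def health_states(probe_results, unhealthy_threshold=3, healthy_threshold=2):
    """Return the health state after each probe.

    probe_results is a sequence of booleans (True means the probe succeeded).
    The state flips only after the configured number of consecutive results
    that contradict the current state.
    """
    state = "HEALTHY"
    streak = 0  # consecutive probes that contradict the current state
    states = []
    for ok in probe_results:
        if ok == (state == "HEALTHY"):
            streak = 0  # the probe agrees with the current state
        else:
            streak += 1
            limit = unhealthy_threshold if state == "HEALTHY" else healthy_threshold
            if streak >= limit:
                state = "UNHEALTHY" if state == "HEALTHY" else "HEALTHY"
                streak = 0
        states.append(state)
    return states
```

With these assumed thresholds, an application that fails three probes in a row is marked unhealthy, which is the point at which an autohealing policy would repair the VM. A single failed probe between successful ones doesn't trigger a repair.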
VM placement
In the architecture that this document describes, the application tier and web tier run on Compute Engine VMs within a single zone.
To improve the robustness of the architecture, you can create a spread placement policy and apply it to the MIG template. When the MIG creates VMs, it places the VMs within each zone on different physical servers (called hosts), so your VMs are robust against failures of individual hosts. For more information, see Create and apply spread placement policies to VMs.
VM capacity planning
To make sure that capacity for Compute Engine VMs is available when VMs need to be provisioned, you can create reservations. A reservation provides assured capacity in a specific zone for a specified number of VMs of a machine type that you choose. A reservation can be specific to a project, or shared across multiple projects. For more information about reservations, see Choose a reservation type.
Stateful storage
A best practice in application design is to avoid the need for stateful local disks. But if the requirement exists, you can configure your persistent disks to be stateful to ensure that the data is preserved when the VMs are repaired or recreated. However, we recommend that you keep the boot disks stateless, so that you can update them to the latest images with new versions and security patches. For more information, see Configuring stateful persistent disks in MIGs.
Data durability
You can use Backup and DR to create, store, and manage backups of the Compute Engine VMs. Backup and DR stores backup data in its original, application-readable format. When required, you can restore your workloads to production by directly using data from long-term backup storage and avoid the need to prepare or move data.
Compute Engine provides the following options to help you to ensure the durability of data that's stored in Persistent Disk volumes:
- You can use snapshots to capture the point-in-time state of Persistent Disk volumes. The snapshots are stored redundantly in multiple regions, with automatic checksums to ensure the integrity of your data. Snapshots are incremental by default, so they use less storage space and you save money. Snapshots are stored in a Cloud Storage location that you can configure. For more recommendations about using and managing snapshots, see Best practices for Compute Engine disk snapshots.
- To ensure that data in Persistent Disk remains available if a zone outage occurs, you can use Regional Persistent Disk or Hyperdisk Balanced High Availability. Data in these disk types is replicated synchronously between two zones in the same region. For more information, see About synchronous disk replication.
Database availability
If you use a managed database service like Cloud SQL in HA configuration, then in the event of a failure of the primary database, Cloud SQL fails over automatically to the standby database. You don't need to change the IP address for the database endpoint. If you use a self-managed third-party database that's deployed on a Compute Engine VM, then you must use an internal load balancer or other mechanism to ensure that the application can connect to another database if the primary database is unavailable.
To implement cross-zone failover for a database that's deployed on a Compute Engine VM, you need a mechanism to identify failures of the primary database and a process to fail over to the standby database. The specifics of the failover mechanism depend on the database that you use. You can set up an observer instance to detect failures of the primary database and orchestrate the failover. You must configure the failover rules appropriately to avoid a split-brain situation and prevent unnecessary failover. For example architectures that you can use to implement failover for PostgreSQL databases, see Architectures for high availability of PostgreSQL clusters on Compute Engine.
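As a rough illustration of failover rules that guard against split-brain and unnecessary failover, the following sketch promotes the standby only when three conditions hold: a strict majority of observers can't reach the primary, the failure has persisted for several consecutive checks, and the standby is healthy. Using more than one observer and these specific thresholds are assumptions made for this example. Real database HA tools implement their own, more complete rules.

```python
def should_fail_over(observer_votes, standby_healthy,
                     consecutive_failures, failure_threshold=3):
    """Decide whether to promote the standby database.

    observer_votes: list of booleans; True means that observer can't reach
                    the primary database.
    standby_healthy: whether the standby is ready to take over.
    consecutive_failures: number of check rounds in a row that failed.
    """
    # A strict majority prevents one partitioned observer from triggering
    # a failover while the primary is still serving (split-brain risk).
    majority_says_down = sum(observer_votes) > len(observer_votes) / 2
    # A persistence threshold avoids failing over on a transient blip.
    persistent = consecutive_failures >= failure_threshold
    return majority_says_down and persistent and standby_healthy
```

For example, if only one of three observers loses contact with the primary, the function returns `False`, because the likelier explanation is a network problem at that observer and not a failed database.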
More reliability considerations
When you build the cloud architecture for your workload, review the reliability-related best practices and recommendations that are provided in the following documentation:
- Google Cloud infrastructure reliability guide
- Patterns for scalable and resilient apps
- Designing resilient systems
- Google Cloud Well-Architected Framework: Reliability
Cost optimization
This section provides guidance to optimize the cost of setting up and operating a zonal Google Cloud topology that you build by using this reference architecture.
VM machine types
To help you optimize the resource utilization of your VM instances, Compute Engine provides machine type recommendations. Use the recommendations to choose machine types that match your workload's compute requirements. For workloads with predictable resource requirements, you can customize the machine type to your needs and save money by using custom machine types.
VM provisioning model
If your application is fault tolerant, then Spot VMs can help to reduce your Compute Engine costs for the VMs in the application and web tiers. The cost of Spot VMs is significantly lower than regular VMs. However, Compute Engine might preemptively stop or delete Spot VMs to reclaim capacity.
Spot VMs are suitable for batch jobs that can tolerate preemption and don't have high availability requirements. Spot VMs offer the same machine types, options, and performance as regular VMs. However, when the resource capacity in a zone is limited, MIGs might not be able to scale out (that is, create VMs) automatically to the specified target size until the required capacity becomes available again.
VM resource utilization
The autoscaling capability of stateless MIGs enables your application to handle increases in traffic gracefully, and it helps you to reduce cost when the need for resources is low. Stateful MIGs can't be autoscaled.
Third-party licensing
When you migrate third-party workloads to Google Cloud, you might be able to reduce cost by bringing your own licenses (BYOL). For example, to deploy Microsoft Windows Server VMs, instead of using a premium image that incurs additional cost for the third-party license, you can create and use a custom Windows BYOL image. You then pay only for the VM infrastructure that you use on Google Cloud. This strategy helps you continue to realize value from your existing investments in third-party licenses. If you decide to use the BYOL approach, then the following recommendations might help to reduce cost:
- Provision the required number of compute CPU cores independently of memory by using custom machine types. By doing this, you limit the third-party licensing cost to the number of CPU cores that you need.
- Reduce the number of vCPUs per core from 2 to 1 by disabling simultaneous multithreading (SMT).
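The effect of the second recommendation on vCPU counts can be shown with simple arithmetic. The following sketch assumes the default of two vCPUs per physical core that this document describes, and the function name is hypothetical. How a vendor counts licenses (per vCPU, per core, or otherwise) depends on that vendor's terms, so treat the result as an input to your own license calculation.

```python
def visible_vcpus(machine_type_vcpus, threads_per_core=2):
    """Return the vCPUs that a VM exposes for a given threads-per-core setting.

    machine_type_vcpus is the vCPU count of the machine type at the default
    of two threads (vCPUs) per physical core.
    """
    if threads_per_core not in (1, 2):
        raise ValueError("threads_per_core must be 1 or 2")
    if machine_type_vcpus % 2:
        raise ValueError("this sketch assumes an even vCPU count")
    physical_cores = machine_type_vcpus // 2  # default: 2 vCPUs share a core
    return physical_cores * threads_per_core
```

For a machine type with 8 vCPUs, the VM runs on 4 physical cores. With SMT disabled (one thread per core), the VM exposes 4 vCPUs instead of 8, which can reduce the cost of software that is licensed per vCPU.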
If you deploy a third-party database like Microsoft SQL Server on Compute Engine VMs, then you must consider the license costs for the third-party software. When you use a managed database service like Cloud SQL, the database license costs are included in the charges for the service.
More cost considerations
When you build the architecture for your workload, also consider the general best practices and recommendations that are provided in Google Cloud Well-Architected Framework: Cost optimization.
Operational efficiency
This section describes the factors that you should consider when you use this reference architecture to design and build a zonal Google Cloud topology that you can operate efficiently.
VM configuration updates
To update the configuration of the VMs in a MIG (such as the machine type or boot-disk image), you create a new instance template with the required configuration and then apply the new template to the MIG. The MIG updates the VMs by using the update method that you choose: automatic or selective. Choose an appropriate method based on your requirements for availability and operational efficiency. For more information about these MIG update methods, see Apply new VM configurations in a MIG.
VM images
For your VMs, instead of using Google-provided public images, we recommend that you create and use custom OS images that contain the configurations and software that your applications require. You can group your custom images into a custom image family. An image family always points to the most recent image in that family, so your instance templates and scripts can use that image without you having to update references to a specific image version. You must regularly update your custom images to include the security updates and patches that are provided by the OS vendor.
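The way an image family shields templates from image version changes can be modeled as a lookup that returns the newest image in the family that hasn't been deprecated. The following sketch uses a hypothetical `Image` record and made-up image names. It doesn't call the Compute Engine API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Image:
    name: str
    family: str
    created: date
    deprecated: bool = False


def resolve_family(images, family):
    """Return the most recent non-deprecated image in the given family."""
    candidates = [
        image for image in images
        if image.family == family and not image.deprecated
    ]
    if not candidates:
        raise LookupError(f"no usable image in family {family!r}")
    return max(candidates, key=lambda image: image.created)


# Hypothetical custom images in one family.
images = [
    Image("web-base-v1", "web-base", date(2025, 5, 1), deprecated=True),
    Image("web-base-v2", "web-base", date(2025, 6, 1)),
    Image("web-base-v3", "web-base", date(2025, 7, 1)),
]
```

An instance template that references the `web-base` family resolves to `web-base-v3` in this example. If you later add a patched `web-base-v4`, the same reference resolves to the new image with no change to the template.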
Deterministic instance templates
If the instance templates that you use for your MIGs include startup scripts to install third-party software, make sure that the scripts explicitly specify software-installation parameters such as the software version. Otherwise, when the MIG creates the VMs, the software that's installed on the VMs might not be consistent. For example, if your instance template includes a startup script to install Apache HTTP Server (the apache2 package), then make sure that the script specifies the exact apache2 version that should be installed, such as version 2.4.53. For more information, see Deterministic instance templates.
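One way to enforce version pinning is to generate the install command from explicit parameters and reject any request that omits the version. The following sketch assumes a Debian-based image where packages are installed with `apt-get install <package>=<version>`. The helper name is hypothetical, and the version string `2.4.53` is the document's example. The exact version string that a package repository accepts depends on the distribution.

```python
import shlex


def pinned_install_command(package, version):
    """Build an apt-get command that installs one exact package version."""
    if not package or not version:
        raise ValueError("both package and version are required for pinning")
    # shlex.quote protects the shell from unexpected characters in the spec.
    spec = shlex.quote(f"{package}={version}")
    return f"apt-get install -y {spec}"


command = pinned_install_command("apache2", "2.4.53")
```

Because the helper fails when the version is missing, a startup script that is built with it can't silently install whatever version happens to be the latest when the MIG creates a VM.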
More operational considerations
When you build the architecture for your workload, consider the general best practices and recommendations for operational efficiency that are described in Google Cloud Well-Architected Framework: Operational excellence.
Performance optimization
This section describes the factors that you should consider when you use this reference architecture to design and build a zonal topology in Google Cloud that meets the performance requirements of your workloads.
Compute performance
Compute Engine offers a wide range of predefined and customizable machine types for the workloads that you run on VMs. Choose an appropriate machine type based on your performance requirements. For more information, see Machine families resource and comparison guide.
VM multithreading
Each virtual CPU (vCPU) that you allocate to a Compute Engine VM is implemented as a single hardware multithread. By default, two vCPUs share a physical CPU core. For applications that involve highly parallel operations or that perform floating point calculations (such as genetic sequence analysis and financial risk modeling), you can improve performance by reducing the number of threads that run on each physical CPU core. For more information, see Set the number of threads per core.
VM multithreading might have licensing implications for some third-party software, like databases. For more information, read the licensing documentation for the third-party software.
Network Service Tiers
Network Service Tiers lets you optimize the network cost and performance of your workloads. You can choose Premium Tier or Standard Tier. Premium Tier delivers traffic on Google's global backbone to achieve minimal packet loss and low latency. Standard Tier delivers traffic using peering, internet service providers (ISP), or transit networks at an edge point of presence (PoP) that's closest to the region where your Google Cloud workload runs. To optimize performance, we recommend using Premium Tier. To optimize cost, we recommend using Standard Tier.
Network performance
For workloads that need low inter-VM network latency within the application and web tiers, you can create a compact placement policy and apply it to the MIG template that's used for those tiers. When the MIG creates VMs, it places the VMs on physical servers that are close to each other. While a compact placement policy helps improve inter-VM network performance, a spread placement policy can help improve VM availability as described earlier. To achieve an optimal balance between network performance and availability, when you create a compact placement policy, you can specify how far apart the VMs must be placed. For more information, see Placement policies overview.
Compute Engine has a per-VM limit for egress network bandwidth. This limit depends on the VM's machine type and whether traffic is routed through the same VPC network as the source VM. For VMs with certain machine types, to improve network performance, you can get a higher maximum egress bandwidth by enabling Tier_1 networking.
More performance considerations
When you build the architecture for your workload, consider the general best practices and recommendations that are provided in Google Cloud Well-Architected Framework: Performance optimization.
What's next
- Learn more about the Google Cloud products used in this reference architecture:
- Get started with migrating your workloads to Google Cloud.
- Explore and evaluate deployment archetypes that you can choose to build architectures for your cloud workloads.
- Review architecture options for designing reliable infrastructure for your workloads in Google Cloud.
- For more reference architectures, diagrams, and best practices, explore the Cloud Architecture Center.
Contributors
Authors:
- Kumar Dhanagopal | Cross-Product Solution Developer
- Samantha He | Technical Writer
Other contributors:
- Ben Good | Solutions Architect
- Carl Franklin | Director, PSO Enterprise Architecture
- Daniel Lees | Cloud Security Architect
- Gleb Otochkin | Cloud Advocate, Databases
- Mark Schlagenhauf | Technical Writer, Networking
- Pawel Wenda | Group Product Manager
- Sean Derrington | Group Product Manager, Storage
- Sekou Page | Outbound Product Manager
- Simon Bennett | Group Product Manager
- Steve McGhee | Reliability Advocate
- Victor Moreno | Product Manager, Cloud Networking
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-08-12 UTC.