Design reliable infrastructure for your workloads in Google Cloud
As described in Platform availability, Google Cloud infrastructure is designed to support a target availability of 99.9% for a workload that's deployed in a single zone. The target availability is 99.99% for a multi-zone deployment¹ and 99.999% for a multi-region deployment. This part of the Google Cloud infrastructure reliability guide provides deployment guidance, example architectures, and design techniques that can help to protect your workloads against failures at the resource, zone, and region level.
Avoid single points of failure
Applications are typically composed of multiple interdependent components, each designed to perform a specific function. These components are typically grouped into tiers based on the function that they perform and their relationship with the other components. For example, a content-serving application might have three tiers: a web tier containing a load balancer and web servers; an app tier with a cluster of application servers; and a data tier for persistence. If any component of this application stack depends on a single infrastructure resource, a failure of that resource can affect the availability of the entire stack. For example, if the app tier runs on a single VM, and if the VM crashes, then the entire stack is effectively unavailable. Such a component is a single point of failure (SPOF).
An application stack might have more than one SPOF. Consider the multi-tier application stack that's shown in the following diagram:
As shown in the preceding diagram, this example architecture contains a single load balancer, two web servers, a single app server, and a single database. The load balancer, app server, and database in this example are SPOFs. A failure of any of these components can cause user requests to the application to fail.
To remove the SPOFs in your application stack, distribute resources across locations and deploy redundant resources.
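As a minimal sketch of the reasoning above, the following Python snippet flags every component of the example stack that has no redundant instance. The inventory dictionary and the `find_spofs` helper are hypothetical names used only for illustration; the instance counts mirror the example architecture.

```python
# Hypothetical inventory of the example stack: component -> instance count.
# The counts mirror the example architecture (one load balancer, two web
# servers, one app server, one database).
stack = {
    "load balancer": 1,
    "web server": 2,
    "app server": 1,
    "database": 1,
}


def find_spofs(components: dict[str, int]) -> list[str]:
    """Return the components that have fewer than two instances."""
    return [name for name, count in components.items() if count < 2]


spofs = find_spofs(stack)
print(spofs)
```

For this inventory, the helper reports the load balancer, the app server, and the database, which matches the SPOFs identified in the example. Note that a count of two or more is only a necessary condition: redundant instances that share a zone still fail together in a zone outage.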
Note: You might accept the Google Cloud global load balancer as a SPOF considering the stringent controls in Google Cloud and the defense-in-depth measures that Google implements to prevent global outages. To help reduce the risk of failure, manage changes to the load balancer carefully. Also, avoid or minimize dependencies on non-data plane actions, such as creating a new load balancer.
Distribute resources and create redundancy
Depending on the reliability requirements of your application, you can choose from the following deployment architectures:
| Architecture | Workload recommendation |
|---|---|
| Multi-region | Workloads that are business-critical and where high availability is essential, such as retail and social media applications. |
| Multi-zone | Workloads that need resilience against zone outages but can tolerate some downtime caused by region outages. |
| Single-zone | Workloads that can tolerate downtime or can be deployed at another location when necessary with minimal effort. |
Cost, latency, and operational considerations
When you design a distributed architecture with redundant resources, besides the availability requirements of the application, you must also consider the effects on operational complexity, latency, and cost.
In a distributed architecture, you provision and manage a higher number of resources. The volume of cross-location network traffic is higher. You also store and replicate more data. As a result, the cost of your cloud resources in a distributed architecture is higher, and operating such deployments involves more complexity. For business-critical applications, the availability advantage of a distributed architecture might outweigh the increased cost and operational complexity.
For applications that aren't business-critical, the high availability that a distributed architecture provides might not be essential. Certain applications have other requirements that are more important than availability. For example, batch computing applications require low-latency and high-bandwidth network connections between the VMs. A single-zone architecture might be well suited for such applications, and it can also help you reduce data transfer costs.
Deployment architectures
This section presents the following architectural options to build infrastructure for your workloads in Google Cloud:
- Single-zone deployment
- Multi-zone deployment
- Multi-region deployment with regional load balancing
- Multi-region deployment with global load balancing
Single-zone deployment
The following diagram shows a single-zone application architecture with redundancy in every tier, to achieve higher availability of the functions performed by each component:
As shown in the preceding diagram, this example architecture includes the following components:
- A regional external HTTP/S load balancer to receive and respond to user requests.
- A zonal managed instance group (MIG) as the backend for the HTTP/S load balancer. The MIG has two Compute Engine VMs. Each VM hosts an instance of a web server.
- An internal load balancer to handle communication between the web server and the app server instances.
- A second zonal MIG as the backend for the internal load balancer. This MIG contains two Compute Engine VMs. Each VM hosts an instance of an application server.
- A Cloud SQL database instance (Enterprise edition) that the application writes data to and reads from. The database is replicated manually to a second Cloud SQL database instance in the same zone.
Aggregate availability: Single-zone deployment
The following table shows the availability of each tier in the preceding single-zone architecture diagram:
| Resource | SLA |
|---|---|
| External load balancer | 99.99% |
| Web tier: Compute Engine VMs in a single zone | 99.9% |
| Internal load balancer | 99.99% |
| Application tier: Compute Engine VMs in a single zone | 99.9% |
| Cloud SQL instance (Enterprise edition) | 99.95% |
You can expect the Google Cloud infrastructure resources that are listed in the preceding table to provide the following aggregate availability and estimated maximum monthly downtime:
- Aggregate availability: 0.9999 x 0.999 x 0.9999 x 0.999 x 0.9995 = 99.73%
- Estimated maximum monthly downtime: Approximately 1 hour and 57 minutes
This calculation considers only the infrastructure resources that are shown in the preceding architecture diagram. To assess the availability of an application in Google Cloud, you must also consider other factors, like the following:
- The internal design of the application
- The DevOps processes and tools used to build, deploy, and maintain the application, its dependencies, and the Google Cloud infrastructure
For more information, see Factors that affect application reliability.
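The aggregate availability and downtime figures for the single-zone architecture can be reproduced with a short Python sketch. It assumes that the tiers are in series (a request needs every tier), that tier failures are independent, and that a month has 30 days; the 30-day assumption is not stated in the text but reproduces the downtime figure given above. The function and constant names are illustrative.

```python
from math import prod

# Assumption: downtime is estimated over a 30-day month.
MINUTES_PER_MONTH = 30 * 24 * 60


def serial_availability(slas: list[float]) -> float:
    """Aggregate availability of tiers in series: every tier must be up."""
    return prod(slas)


def max_monthly_downtime_minutes(availability: float) -> float:
    """Estimated maximum downtime, in minutes, over a 30-day month."""
    return (1 - availability) * MINUTES_PER_MONTH


# SLAs from the single-zone table: external load balancer, web tier,
# internal load balancer, application tier, Cloud SQL.
single_zone = [0.9999, 0.999, 0.9999, 0.999, 0.9995]

availability = serial_availability(single_zone)
downtime = max_monthly_downtime_minutes(availability)
hours, minutes = divmod(round(downtime), 60)

print(f"Aggregate availability: {availability:.2%}")
print(f"Estimated maximum monthly downtime: {hours} h {minutes} min")
```

Because the availabilities are multiplied, the aggregate is always lower than the weakest tier. In this architecture, the two single-zone VM tiers at 99.9% account for most of the estimated downtime.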
Effects of outages, and guidance for recovery
In a single-zone deployment architecture, if any component fails, the application can process requests if each tier contains at least one functioning component with adequate capacity. For example, if a web server instance fails, the load balancer forwards user requests to the other web server instances. If a VM that hosts a web server or app server instance crashes, the MIG ensures that a new VM is created automatically. If the database crashes, you must manually activate the second database and update the app server instances to connect to the database.
A zone outage or region outage affects the Compute Engine VMs and the Cloud SQL database instances in a single-zone deployment. A zone outage doesn't affect the load balancer in this architecture because it is a regional resource. However, the load balancer can't distribute traffic, because there are no available backends. If a zone outage occurs, you must wait for Google to resolve the outage, and then verify that the application works as expected.
The next section describes an architectural approach that you can use to distribute resources across multiple zones, which helps to improve the resilience of the application to zone outages.
Multi-zone deployment
In a single-zone deployment, if a zone outage occurs, the application might not be able to serve requests until the issue is resolved. To help to improve the resilience of your application against zone outages, you can provision multiple instances of zonal resources (such as Compute Engine VMs) across two or more zones. For services that support region-scoped resources (such as Cloud Storage buckets), you can deploy regional resources.
The following diagram shows a highly available cross-zone architecture, with the components in each tier of the application stack distributed across two zones:
As shown in the preceding diagram, this example architecture includes the following components:
- A regional external HTTP/S load balancer receives and responds to user requests.
- A regional MIG is the backend for the HTTP/S load balancer. The MIG contains two Compute Engine VMs in different zones. Each VM hosts an instance of a web server.
- An internal load balancer handles communication between the web server and the app server instances.
- A second regional MIG is the backend for the internal load balancer. This MIG has two Compute Engine VMs in different zones. Each VM hosts an instance of an app server.
- A Cloud SQL instance (Enterprise edition) that's configured for HA is the database for the application. The primary database instance is replicated synchronously to a standby database instance.
Aggregate availability: Multi-zone deployment
The following table shows the availability of each tier in the preceding dual-zone architecture diagram:
| Resource | SLA |
|---|---|
| External load balancer | 99.99% |
| Web tier: Compute Engine VMs in separate zones | 99.99% |
| Internal load balancer | 99.99% |
| Application tier: Compute Engine VMs in separate zones | 99.99% |
| Cloud SQL instance (Enterprise edition) | 99.95% |
You can expect the Google Cloud infrastructure resources that are listed in the preceding table to provide the following aggregate availability and estimated maximum monthly downtime:
- Aggregate availability: 0.9999 x 0.9999 x 0.9999 x 0.9999 x 0.9995 = 99.91%
- Estimated maximum monthly downtime: Approximately 39 minutes
This calculation considers only the infrastructure resources that are shown in the preceding architecture diagram. To assess the availability of an application in Google Cloud, you must also consider other factors, like the following:
- The internal design of the application
- The DevOps processes and tools used to build, deploy, and maintain the application, its dependencies, and the Google Cloud infrastructure
For more information, see Factors that affect application reliability.
Effects of outages, and guidance for recovery
In a dual-zone deployment, if any component fails, the application can process requests if at least one functioning component with adequate capacity exists in each tier. For example, if a web server instance fails, the load balancer forwards user requests to the web server instance in the other zone. If a VM that hosts a web server or app server instance crashes, the MIG ensures that a new VM is created automatically. If the primary Cloud SQL database crashes, Cloud SQL automatically fails over to the standby database instance.
The following diagram shows the same architecture as the previous diagram and the effects of a zone outage on the availability of the application:
As shown in the preceding diagram, if an outage occurs at one of the zones, the load balancer in this architecture is not affected, because it is a regional resource. A zone outage might affect individual Compute Engine VMs and one of the Cloud SQL database instances. But the application remains available and responsive, because the VMs are in regional MIGs and the Cloud SQL database is configured for HA. The MIGs ensure that new VMs are created automatically to maintain the configured minimum number of VMs. If the primary Cloud SQL database instance is affected by a zone outage, Cloud SQL fails over automatically to the standby instance in the other zone. After Google resolves the outage, you must verify that the application runs as expected in all the zones where it's deployed.
For more information about region-specific considerations, see Geography and regions.
If both the zones in this architecture have an outage, then the application is unavailable. The load balancer continues to be available unless a region-wide outage occurs. However, the load balancer can't distribute traffic, because there are no available backends. If a multi-zone outage or region outage occurs, you must wait for Google to resolve the outage, and then verify that the application works as expected.
The next sections present architectural options to protect your application against multi-zone outages and region outages.
Multi-region deployment with regional load balancing
In a single-zone or multi-zone deployment, if a region outage occurs, the application can't serve requests until the issue is resolved. To protect your application against region outages, you can distribute the Google Cloud resources across two or more regions.
The following diagram shows a highly available cross-region architecture, with the components in each tier of the application stack distributed across multiple regions:
As shown in the preceding diagram, this example architecture includes the following components:
- A public Cloud DNS zone with a routing policy that steers traffic to two Google Cloud regions.
- A regional external HTTP/S load balancer in each region to receive and respond to user requests.
- The backend for each regional HTTP/S load balancer is a regional MIG. Each MIG contains two Compute Engine VMs in different zones. Each of these VMs hosts an instance of a web server.
- An internal load balancer in each region handles communication between the web server instances and the app server instances.
- A second pair of regional MIGs is the backend for the internal load balancers. Each of these MIGs contains two Compute Engine VMs in different zones. Each VM hosts an instance of an app server.
- The application writes data to and reads from a multi-region Spanner instance. The multi-region configuration that's used in this architecture (eur6) includes four read-write replicas. The read-write replicas are provisioned equally across two regions and in separate zones. The multi-region Spanner configuration also includes a witness replica in a third region.
Aggregate availability: Multi-region deployment with regional load balancing
In the multi-region deployment that's shown in the preceding diagram, the load balancers and the VMs are provisioned redundantly in two regions. The DNS zone is a global resource, and the Spanner instance is a multi-region resource.
To calculate the aggregate availability of the Google Cloud infrastructure that's shown in this architecture, we must first calculate the aggregate availability of the resources in each region, and then consider the resources that span multiple regions. Use the following process:
- Calculate the aggregate availability of the infrastructure resources per region; that is, excluding the DNS and database resources:

  | Resource | SLA |
  |---|---|
  | External load balancer | 99.99% |
  | Web tier: Compute Engine VMs in separate zones | 99.99% |
  | Internal load balancer | 99.99% |
  | Application tier: Compute Engine VMs in separate zones | 99.99% |

  Aggregate availability per region: 0.9999 x 0.9999 x 0.9999 x 0.9999 = 99.96%
- Calculate the aggregate availability of the infrastructure resources considering the dual-region redundancy of the load balancers and the Compute Engine VMs.

  The theoretical availability is 1-(1-0.9996)(1-0.9996) = 99.999984%. However, the actual availability that you can expect is limited to the target availability for multi-region deployments, which is 99.999%.
- Calculate the aggregate availability of all the infrastructure resources, including the Cloud DNS and Spanner resources:
  - Aggregate availability: 0.99999 x 1 x 0.99999 = 99.998%
  - Estimated maximum monthly downtime: Approximately 52 seconds
This calculation considers only the infrastructure resources that are shown in the preceding architecture diagram. To assess the availability of an application in Google Cloud, you must also consider other factors, like the following:
- The internal design of the application
- The DevOps processes and tools used to build, deploy, and maintain the application, its dependencies, and the Google Cloud infrastructure
For more information, see Factors that affect application reliability.
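The three-step calculation for the multi-region architecture with regional load balancing can be sketched in Python as follows. The sketch assumes that the two regions fail independently and that a month has 30 days (an assumption that reproduces the downtime figure in the text). The per-region availability is rounded to four decimal places, as it is in the text. Function and constant names are illustrative.

```python
from math import prod

# Target availability for multi-region deployments, as stated in the text.
MULTI_REGION_TARGET = 0.99999
# Assumption: downtime is estimated over a 30-day month.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60


def redundant_availability(per_region: float, regions: int) -> float:
    """Availability when at least one of several independent regions is up."""
    return 1 - (1 - per_region) ** regions


# Step 1: per-region stack in series (external LB, web tier, internal LB,
# app tier), rounded to 0.9996 as in the text.
per_region = round(prod([0.9999] * 4), 4)

# Step 2: dual-region redundancy, limited to the multi-region target.
theoretical = redundant_availability(per_region, regions=2)
capped = min(theoretical, MULTI_REGION_TARGET)

# Step 3: in series with the resources that span regions. The factors 1 and
# 0.99999 are the Cloud DNS and Spanner values used in the text's formula.
aggregate = capped * 1.0 * 0.99999
downtime_seconds = (1 - aggregate) * SECONDS_PER_MONTH

print(f"Theoretical dual-region availability: {theoretical:.6%}")
print(f"Aggregate availability: {aggregate:.3%}")
print(f"Estimated maximum monthly downtime: {round(downtime_seconds)} s")
```

The cap in step 2 matters: without it, the redundancy formula would suggest an availability that is higher than the target that the platform is designed to support.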
Effects of outages, and guidance for recovery
If any component in this multi-region deployment fails but there is at least one functioning component with adequate capacity in each tier, the application continues to work. For example, if a web server instance fails, the regional external HTTP/S load balancer forwards user requests to the other web server instances in the region. Similarly, if one of the app server instances crashes, the internal load balancers send requests to the other app server instances. If any of the VMs crash, the MIGs ensure that new VMs are created automatically to maintain the minimum configured number of VMs.
An outage at a single zone doesn't affect the load balancers, because they are regional resources and are resilient to zone outages. A zone outage might affect individual Compute Engine VMs. But the web server and app server instances remain available, because the VMs are part of regional MIGs. The MIGs ensure that new VMs are created automatically to maintain the minimum configured number of VMs. The Spanner instance in this architecture uses a multi-region configuration, which is resilient to zone outages.
For information about how multi-region replication works in Spanner, see Regional and multi-region configurations and Demystifying Spanner multi-region configurations.
The following diagram shows the same multi-region architecture as the previous diagram and the effects of a single-region outage on the availability of the application:
As shown in the preceding diagram, even if an outage occurs at both the zones in any region, the application remains available, because an independent application stack is deployed in each region. The DNS zone steers user requests to the region that's not affected by the outage. The multi-region Spanner instance is resilient to region outages. After Google resolves the outage, you must verify that the application runs as expected in the region that had the outage.
If any two of the regions in this architecture have outages, then the application is unavailable. Wait for Google to resolve the outages. Then, verify that the application runs as expected in all the regions where it's deployed.
For multi-region deployments, instead of using regional load balancers, you can consider using a global load balancer. The next section presents a multi-region deployment architecture that uses a global load balancer and describes the benefits and risks of that approach.
Multi-region deployment with global load balancing
The following diagram shows an alternative multi-region deployment that uses a global load balancer instead of regional load balancers:
As shown in the preceding diagram, this architecture uses a global external HTTP/S load balancer (with Cloud CDN enabled) to receive and respond to user requests. Each forwarding rule of the load balancer uses a single external IP address; you don't need to configure a separate DNS record for each region. The backends for the global external HTTP/S load balancer are two regional MIGs. The load balancer routes requests to the region that's closest to the users.
All the other components in this architecture are identical to the architecture shown in Multi-region deployment with regional load balancing.
Benefits and risks of global load balancing for multi-region deployments
To load-balance external traffic to an application that's distributed across multiple regions, you can use either a global load balancer or multiple regional load balancers.
The following are the benefits of an architecture that uses a global load balancer:
- You need to manage only a single load balancer.
- Global load balancers use a single anycast IP address to provide load balancing across Google Cloud regions.
- Global load balancers are resilient to region outages, and provide automatic cross-region failover.
- Global load balancers support the following features, which can help enhance the reliability of your deployments:
  - Edge caching using Cloud CDN
  - Ability to use highly durable Cloud Storage buckets as backends
  - Google Cloud Armor security policies
The following are the risks of an architecture that uses a global load balancer:
- An incorrect configuration change to the global load balancer might make the application unavailable to users. For example, while updating the frontend of the global load balancer, if you accidentally delete a forwarding rule, the load balancer stops receiving user requests. The effect of this risk is lower in the case of a multi-region architecture that uses regional load balancers, because even if the regional load balancer in one of the regions is affected by a configuration error, the load balancers in the other regions continue to work.
- An infrastructure outage that affects global resources might make the global load balancer unavailable.
To mitigate these risks, you must manage changes to the global load balancer carefully, and consider using defense-in-depth fallbacks where possible. For more information, see Recommendations to manage the risk of outages of global resources.
Aggregate availability: Multi-region deployment with global load balancing
In the multi-region deployment that's shown in the preceding diagram, the VMs and the internal load balancers are distributed redundantly across two regions. The external load balancer is a global resource, and the Spanner instance is a multi-region resource.
To calculate the aggregate availability of this deployment, we first calculate the aggregate availability of the resources in each region, and then consider the resources that span multiple regions.
- Calculate the aggregate availability of the infrastructure resources per region, excluding the external load balancer and the database:

  | Resource | SLA |
  |---|---|
  | Web tier: Compute Engine VMs in separate zones | 99.99% |
  | Internal load balancer | 99.99% |
  | Application tier: Compute Engine VMs in separate zones | 99.99% |

  Aggregate availability per region: 0.9999 x 0.9999 x 0.9999 = 99.97%
- Calculate the aggregate availability of the infrastructure resources considering the dual-region redundancy of the internal load balancer and the Compute Engine VMs.

  The theoretical availability is 1-(1-0.9997)(1-0.9997) = 99.999991%. However, the actual availability that you can expect is limited to the target availability for multi-region deployments, which is 99.999%.
- Calculate the aggregate availability of all the infrastructure resources, including the global load balancer and Spanner resources:
  - Aggregate availability: 0.99999 x 0.9999 x 0.99999 = 99.988%
  - Estimated maximum monthly downtime: Approximately 5 minutes and 11 seconds
This calculation considers only the infrastructure resources that are shown in the preceding architecture diagram. To assess the availability of an application in Google Cloud, you must also consider other factors, like the following:
- The internal design of the application
- The DevOps processes and tools used to build, deploy, and maintain the application, its dependencies, and the Google Cloud infrastructure
For more information, see Factors that affect application reliability.
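To see why the aggregate figure for this architecture is lower than for the architecture with regional load balancers, it can help to compare the serial factors of the two final calculations side by side. The following Python sketch uses the factors from the two formulas in this guide and reports which factor limits each aggregate. The dictionary labels and the `summarize` helper are illustrative names, and the 30-day month is an assumption.

```python
from math import prod

# Assumption: downtime is estimated over a 30-day month.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60

# Final serial factors for each multi-region architecture, taken from the
# two aggregate availability formulas in this guide.
regional_lb = {
    "dual-region stack (capped)": 0.99999,
    "Cloud DNS": 1.0,
    "Spanner": 0.99999,
}
global_lb = {
    "dual-region stack (capped)": 0.99999,
    "global external load balancer": 0.9999,
    "Spanner": 0.99999,
}


def summarize(factors: dict[str, float]) -> tuple[float, str, int]:
    """Return aggregate availability, the weakest factor, and downtime in seconds."""
    aggregate = prod(factors.values())
    weakest = min(factors, key=factors.get)
    downtime = round((1 - aggregate) * SECONDS_PER_MONTH)
    return aggregate, weakest, downtime


for name, factors in [("regional", regional_lb), ("global", global_lb)]:
    aggregate, weakest, downtime = summarize(factors)
    minutes, seconds = divmod(downtime, 60)
    print(f"{name}: {aggregate:.3%}, weakest: {weakest}, "
          f"downtime: {minutes} min {seconds} s")
```

In the global variant, the single global load balancer is the limiting factor in the calculation, because it is in series with the redundant regional stacks. This is an observation about the arithmetic only; the benefits listed earlier, such as automatic cross-region failover, are not captured by the calculation.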
Effects of outages, and guidance for recovery
If any component in this architecture fails, the application continues to work if at least one functioning component with adequate capacity exists in each tier. For example, if a web server instance fails, the global external HTTP/S load balancer forwards user requests to the other web server instances. If an app server instance crashes, the internal load balancers send the requests to the other app server instances. If any of the VMs crash, the MIGs ensure that new VMs are created automatically to maintain the minimum configured number of VMs.
If an outage occurs at one of the zones in any region, the load balancer is not affected. The global external HTTP/S load balancer is resilient to zone and region outages. The internal load balancers are regional resources; they're resilient to zone outages. A zone outage might affect individual Compute Engine VMs. But the web server and app server instances remain available, because the VMs are part of regional MIGs. The MIGs ensure that new VMs are created automatically to maintain the minimum configured number of VMs. The Spanner instance in this architecture uses a multi-region configuration, which is resilient to zone outages.
The following diagram shows the same multi-region architecture as the previous diagram and the effects of a single-region outage on the availability of the application:
As shown in the preceding diagram, even if an outage occurs at both the zones in any region, the application remains available, because an independent application stack is deployed in each region. The global external HTTP/S load balancer routes user requests to the application in the region that's not affected by the outage. The multi-region Spanner instance is resilient to region outages. After Google resolves the outage, you must verify that the application runs as expected in the region that had the outage.
For information about how multi-region replication works in Spanner, see Regional and multi-region configurations and Demystifying Spanner multi-region configurations.
If any two of the regions in this architecture have outages, then the application is unavailable. The global external HTTP/S load balancer is available, but it can't distribute traffic because there are no available backends. Wait for Google to resolve the outages. Then, verify that the application runs as expected in all the regions where it's deployed.
Multi-region deployments can help ensure high availability for your most critical business applications. To ensure business continuity during failure events, besides deploying the application across multiple regions, you must take certain additional steps. For example, you must perform capacity planning to ensure that either sufficient capacity is reserved in all the regions or the risks associated with emergency autoscaling are acceptable. You must also implement operational practices for DR testing, managing incidents, verifying application status after incidents, and performing retrospectives.
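As a rough illustration of the capacity-planning point above, the following Python sketch checks whether the capacity reserved in the remaining regions can carry peak demand after any single region is lost. The region names, capacity units, and the `survives_region_loss` helper are hypothetical; real capacity planning also needs to account for factors such as traffic growth and autoscaling limits.

```python
def survives_region_loss(reserved: dict[str, int], peak_demand: int) -> bool:
    """Check that peak demand still fits after any one region fails.

    `reserved` maps a region name to its reserved capacity, in the same
    (hypothetical) unit as `peak_demand`, for example VMs or requests/second.
    """
    total = sum(reserved.values())
    return all(total - capacity >= peak_demand for capacity in reserved.values())


# Each region alone can carry the full peak demand.
fully_reserved = survives_region_loss({"region-a": 10, "region-b": 10}, peak_demand=10)

# The regions share the load; losing either one leaves too little capacity,
# so this design relies on emergency autoscaling during a region outage.
shared_load = survives_region_loss({"region-a": 6, "region-b": 6}, peak_demand=10)

print(fully_reserved, shared_load)
```

A result of `False` doesn't necessarily mean that the design is wrong. As the text notes, you can instead decide that the risks associated with emergency autoscaling are acceptable.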
¹ For more information about region-specific considerations, see Geography and regions.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies.
Last updated 2024-11-20 UTC.