About high availability

MySQL | PostgreSQL | SQL Server

This page is an overview of the high availability (HA) configuration for Cloud SQL instances. To configure a new instance for HA, or to enable HA on an existing instance, see Enabling and disabling high availability on an instance.

HA configuration overview

The purpose of an HA configuration is to reduce downtime when a zone or instance becomes unavailable. This might happen during a zonal outage, or when there's a hardware issue. With HA, your data continues to be available to client applications.

The HA configuration provides data redundancy. A Cloud SQL instance configured for HA is also called a regional instance and has a primary and secondary zone within the configured region^*. Within a regional instance, the configuration is made up of a primary instance and a standby instance. Through synchronous replication to each zone's persistent disk, all writes made to the primary instance are replicated to disks in both zones before a transaction is reported as committed. In the event of an instance or zone failure, the standby instance becomes the new primary instance. Users are then rerouted to the new primary instance. This process is called a failover.
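The commit rule described above can be illustrated with a toy model. This is a minimal sketch, not how Cloud SQL is implemented: `ZoneDisk` and `commit` are hypothetical names, and undoing a partial write is an assumption made only to keep the model consistent.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ZoneDisk:
    """Hypothetical stand-in for the persistent disk in one zone."""
    zone: str
    available: bool = True
    records: List[str] = field(default_factory=list)

    def write(self, record: str) -> bool:
        # Acknowledge the write only if this zone's disk is reachable.
        if not self.available:
            return False
        self.records.append(record)
        return True


def commit(record: str, disks: List[ZoneDisk]) -> bool:
    """Report a commit only after every zone's disk acknowledges the write."""
    acknowledged: List[ZoneDisk] = []
    for disk in disks:
        if not disk.write(record):
            # Assumption for this toy model: undo partial writes so that
            # no disk holds a record that was never reported as committed.
            for written in acknowledged:
                written.records.pop()
            return False
        acknowledged.append(disk)
    return True
```

In this model, a transaction that is reported as committed is present on the disks in both zones, which is why the standby instance can take over without losing committed writes.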

Note: There are cases where Cloud SQL issues a restart instead of a failover. When this happens, you see Restart as an operation on the instance when you view the Operations and logs pane on the instance Overview page or on the instance Operations page. For example, the database can restart when a resource is exhausted, such as when an instance runs out of memory. To avoid downtime due to memory issues, see Optimize high memory consumption.

After a failover, the instance that received the failover continues to be the primary instance, even after the original instance comes back online. After the zone or instance that experienced an outage becomes available again, the original primary instance is destroyed and recreated. Then it becomes the new standby instance. If a failover occurs in the future, the new primary will fail over to the original instance in the original zone.

If you need to have the primary instance in the zone that had the outage, you can do a failback. A failback performs the same steps as the failover, only in the opposite direction, to reroute traffic back to the original instance. To perform a failback, use the procedure in Initiating failover.

Regional persistent disk support for a Cloud SQL HA configuration that has at least one dedicated CPU has full Service Level Agreement (SLA) coverage. An HA-configured instance costs twice as much as a standalone instance. This price includes CPU, RAM, and storage. For more information, see the pricing page.

^* For more information about region-specific considerations, see Geography and regions.

Diagram overview of the Cloud SQL HA configuration. Described in text below.

Read replicas

If availability is a consideration for your read replicas, you can enable HA on the replicas. When you promote such a replica to become a primary instance, it's already set up as a highly available instance.

During a zonal outage, traffic stops to read replicas in that zone. After the zone becomes available again, any read replicas in the zone resume replication from the primary instance. If read replicas are not located in a zone that is undergoing an outage, they connect to the standby instance when it becomes the primary instance.

As a best practice, consider putting some of your read replicas in a different zone from the primary and standby instances. For example, if you have a primary instance in zone A and a standby instance in zone B, put a read replica in zone C to improve your reliability. This practice ensures that read replicas continue to operate even if the zone for the primary instance goes down. You should also add business logic in the client application to send reads to the primary instance when read replicas are unavailable.
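The client-side fallback mentioned above might look like the following minimal sketch. It assumes each endpoint is wrapped in a callable that runs a query and raises `ConnectionError` when the endpoint is unreachable. `run_read` and its parameters are hypothetical names, not part of any Cloud SQL client library.

```python
from typing import Callable, Sequence

# A hypothetical "endpoint" is any callable that takes a query string,
# returns its result, and raises ConnectionError if it is unreachable.
Endpoint = Callable[[str], str]


def run_read(query: str, replicas: Sequence[Endpoint], primary: Endpoint) -> str:
    """Try each read replica in order, then fall back to the primary."""
    for replica in replicas:
        try:
            return replica(query)
        except ConnectionError:
            # This replica is unavailable (for example, its zone is down);
            # move on to the next one.
            continue
    # No replica answered, so send the read to the primary instance.
    return primary(query)
```

Trying replicas before the primary keeps read load off the primary during normal operation. The fallback applies only when no replica answers.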

Failover overview

If an HA-configured instance becomes unresponsive, Cloud SQL automatically switches to serving data from the standby instance. To see if a failover has occurred, check the failover history in your operation log.

Learn more about how to build queries in the Logs Explorer. If you need more detailed information about an operation, such as the user who performed the operation, you must enable audit logging.

The following diagrams show how failover affects your instance:

Normal: Diagram of healthy instance before failover
Failover: Diagram of instance when failover occurs
Post-Failover: Diagram of instance after failover
Failback: Diagram of instance after failback

Process

The following process occurs:

The primary instance or zone fails.
Each second, the heartbeat system detects whether the primary instance is healthy. If multiple heartbeats aren't detected, failover is initiated.
The standby instance now serves data upon reconnection.
Through a shared static IP address with the primary instance, the standby instance now serves data from the secondary zone.

Note: If failover occurs, read replicas outside of the outage zone don't change zones; they continue to serve data even if they are in a different zone than the primary instance.

Note: When a failover occurs, you can expect the instance to be unavailable for about sixty seconds. This duration might differ based on your Cloud SQL environment. See Initiating failover.
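The heartbeat check in the process above can be sketched as a small function. This is illustrative only: the page says that failover starts when "multiple" heartbeats aren't detected but doesn't give a number, so the `missed_threshold` default of 3 is an assumption. Counting only consecutive misses is also an assumption, and `first_failover_second` is a hypothetical name.

```python
from typing import Iterable, Optional


def first_failover_second(
    heartbeats: Iterable[bool],
    missed_threshold: int = 3,  # assumed value; the real threshold isn't documented here
) -> Optional[int]:
    """Return the 1-based second at which failover would be initiated.

    Each item in `heartbeats` represents one per-second health check:
    True if the primary answered, False if the heartbeat was missed.
    Returns None if the threshold is never reached.
    """
    consecutive_missed = 0
    for second, healthy in enumerate(heartbeats, start=1):
        # A healthy heartbeat resets the count; a missed one extends it.
        consecutive_missed = 0 if healthy else consecutive_missed + 1
        if consecutive_missed >= missed_threshold:
            return second
    return None
```

Requiring several missed heartbeats in a row, instead of reacting to one, avoids starting a failover because of a single transient hiccup.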

Requirements

For Cloud SQL to allow a failover, the configuration must meet the following requirements:

The primary instance must be in a normal operating state (not stopped, undergoing maintenance, or performing a long-running Cloud SQL instance operation such as a backup operation).
The secondary zone and standby instance must both be in a healthy state. When the standby instance is unresponsive, failover operations are blocked. After Cloud SQL repairs the standby instance and the secondary zone is available, Cloud SQL allows failover.

Note: If both the primary and standby instances are unresponsive, Cloud SQL does not allow failover.
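As a reading aid, the requirements above can be expressed as a single predicate. This is a sketch only: `failover_allowed` and its parameter names are hypothetical, and Cloud SQL's actual checks aren't exposed in this form.

```python
def failover_allowed(
    *,
    primary_stopped: bool,
    primary_in_maintenance: bool,
    primary_long_running_operation: bool,
    standby_responsive: bool,
    secondary_zone_healthy: bool,
) -> bool:
    """Return True if the documented failover requirements are all met."""
    # The primary must be in a normal operating state.
    primary_normal = not (
        primary_stopped
        or primary_in_maintenance
        or primary_long_running_operation
    )
    # The standby instance and the secondary zone must both be healthy.
    # This also covers the case where both instances are unresponsive.
    return primary_normal and standby_responsive and secondary_zone_healthy
```

The arguments are keyword-only so that each call site names the condition it passes.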

Backup and restore

Automated backups are highly recommended for high availability.

Recovery options for standalone instances

Cloud SQL doesn't recover standalone instances from a zonal outage automatically. To re-establish an instance that isn't configured for high availability in a healthy zone, you must recover it manually by using one of the following options:

Perform point-in-time recovery (PITR) on the instance to a new instance that you create. To use this option, you must have enabled PITR on the zonal instance prior to the zonal outage. The transaction logs for the instance must be stored in Cloud Storage. If the transaction logs are stored on disk, then you can switch them to Cloud Storage. For the steps, see Perform PITR on an unavailable instance.
If the instance has a read replica in a different zone, then you can promote that read replica to replace the standalone instance that's experiencing the zonal outage. To use this option, follow the steps in Promote a replica.

For both options, the following considerations apply:

Some recent transactions committed on the primary instance might not appear on the newly recovered instance. The interval of time where transactions might have been lost is the recovery point objective (RPO).
- For PITR recovery, the RPO is typically five minutes or less.
- For read replica promotion, the RPO varies based on the database workload. For more information on how to monitor and reduce replication lag, see Replication lag.
After you perform either of the restoration options, you must reconfigure any clients of the instances that experience the zonal outage because the recovered instances will have different IP addresses and connection names.

Applications and instances

There is no difference in working with non-HA and HA instances, so your application does not need to be configured in any particular way. When failover occurs, any existing connections to the primary instance and read replicas are closed, and it will take approximately 60 seconds for connections to the primary instance to be reestablished. Your application reconnects using the same connection string or IP address, so you do not need to update your application after failover.
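Because connections are closed during failover, applications generally benefit from reconnect logic. The following is a minimal sketch of retrying with exponential backoff. It assumes that your driver's connect call is wrapped in a zero-argument callable that raises `ConnectionError` on failure. `connect_with_retry`, its parameters, and the 90-second default budget are hypothetical choices, not Cloud SQL recommendations.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    max_wait_seconds: float = 90.0,  # assumed budget, a bit above the ~60 s window
    initial_delay: float = 1.0,
    max_delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call `connect` until it succeeds or the time budget is used up."""
    deadline = clock() + max_wait_seconds
    delay = initial_delay
    while True:
        try:
            # The same connection string or IP address is reused on each try.
            return connect()
        except ConnectionError:
            # Give up if waiting again would exceed the time budget.
            if clock() + delay > deadline:
                raise
            sleep(delay)
            # Double the delay after each failure, up to max_delay.
            delay = min(delay * 2, max_delay)
```

The `sleep` and `clock` parameters exist only so the function can be tested without real waiting. In application code you would normally leave them at their defaults.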

To see exactly how your applications are affected by failover, manually initiate failover.

Maintenance downtime

Maintenance events affect primary instances configured with HA in the same way as other instances. You can expect primary instances to be down for a brief period of time. For more information on how maintenance affects HA instances, see How maintenance works. To minimize impact to your service, change maintenance settings to control when downtime occurs.

What's next

Enable and disable high availability on an instance.
Initiate failover.
Learn more about managing your database connections.
Learn more about regions and zones in Cloud SQL.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-07-18 UTC.
