Replication lag Stay organized with collections Save and categorize content based on your preferences.
This page describes how to troubleshoot and fix replication lag for Cloud SQLread replicas.
Overview
Cloud SQL read replicas useSQL Server Always-On Availability Groups for replication. Changes are written to the transaction log in the primaryinstance. The primary instance forwards transactions to any secondary replicainstances, where the changes are applied. The availability mode used isAsynchronous-commit mode.Replication lag can happen in a few scenarios, such as:
- The primary instance can't send the changes fast enough to the replica.
- The replica can't receive the changes quickly enough.
- The replica can't apply the changes quickly enough.
network_lagmetric. For more information about the metric, seeMonitor replication lag.Ensure replica is adequately provisioned
A replica instance that is smaller than the primary instance (for example, withless vCPUs and memory) can experience replication lag. A smaller replica mightalso have different default configuration flags compared to a larger primaryinstance. We recommend that the replica instance is at least as large as theprimary instance to have enough resources to handle the replication load.
High CPU utilization on the replica can also cause replication lag. If thereplica's CPU utilization is high (for example, over 90%), consider increasingthe replica's CPU capacity.
Optimize queries and schema
This section suggests some common query and schema optimizations you can make toimprove replication performance.
Long-running queries in the read replica
Long-running queries in the replica might block replication for Cloud SQL.You might want to have separate replicas for online transaction processing(OLTP) and online analytical processing (OLAP) purposes and only sendlong-running queries to the OLAP replica.
Exclusive locks due to DDL
Data definition language (DDL) commands, such asALTER TABLE andCREATE INDEX, can cause replication lag in the replica due toexclusive locks. To avoid lock contention, consider scheduling DDL executionduring times when the query load is lower on the replicas.
CREATE INDEX,ALTER INDEX, andINDEXMAINTENANCE can cause replication lag due to the large number of transactionlog records that these statements can generate.Overloaded replica
If a read replica is receiving too many queries, replication could be blocked.Consider splitting the reads among multiple replicas to reduce the load on eachone.
To avoid query spikes, consider throttling replica read queries in yourapplication logic or in a proxy layer if you use one.
If there are spikes of activity on the primary instance, consider spreading outupdates.
Monolithic primary database
Consider sharding the primary database vertically (or horizontally) to preventone or more lagging tables from holding back all the other tables.
Monitor replication lag
You can use thereplica_lag andnetwork_lag metrics to monitor replicationlag and identify whether the cause of the lag is in the primary database,the network, or the replica.
| Metric | Description |
|---|---|
| Network lag ( cloudsql.googleapis.com) | The difference between the timestamp of the last received log entry on the replica and the last sent log entry on the primary. |
Verify replication
To verify that replication is working, check the value of thecloudsql.googleapis.com/database/replication/state metric on the primaryinstance. If the state isRunning, then replication is healthy.What's next:
Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-17 UTC.