|
| 1 | +#How to troubleshoot streaming replication lag |
| 2 | +Streaming replication in Postgres allows for continuous data replication from a primary server to standby servers, to ensure high availability and balance read-only workloads. However, replication lag can occur, leading to delays in data synchronization. This guide provides steps to troubleshoot and mitigate replication lag. |
| 3 | + |
| 4 | +##Identifying the lag |
| 5 | +To start investigation we need to understand where we actually have lag, on which stage of replication: |
| 6 | +- sending WAL stream to replica via network by`walsender` |
| 7 | +- receiving WAL stream on replica from network by`walreciever` |
| 8 | +- writing WAL on disk on replica by`walreciever` |
| 9 | +- applying (replaying) WAL as a recovery process |
| 10 | + |
| 11 | + Thus, streaming replication lag can be categorized into three types: |
| 12 | +-**Write Lag**: The delay between when a transaction is committed on the primary and when it is written to the WAL on the standby. |
| 13 | +-**Flush Lag**: The delay between when a transaction is written to the WAL on the standby and when it is flushed to the disk. |
| 14 | +-**Apply (Replay) Lag**: The delay between when a transaction is flushed to the disk and when it is applied to the database on the standby. |
| 15 | + |
| 16 | +###Analysis query |
| 17 | +To identify the lag, use the following SQL query: |
| 18 | +```sql |
| 19 | +select |
| 20 | + pid, |
| 21 | + client_addr, |
| 22 | + application_name, |
| 23 | + state, |
| 24 | + coalesce(pg_current_wal_lsn()- sent_lsn,0)AS sent_lag_bytes, |
| 25 | + coalesce(sent_lsn- write_lsn,0)AS write_lag_bytes, |
| 26 | + coalesce(write_lsn- flush_lsn,0)AS flush_lag_bytes, |
| 27 | + coalesce(flush_lsn- replay_lsn,0)AS replay_lag_bytes, |
| 28 | + coalesce(pg_current_wal_lsn()- replay_lsn,0)AS total_lag_bytes |
| 29 | +from pg_stat_replication; |
| 30 | +``` |
| 31 | + |
| 32 | +###Example |
| 33 | +We will get something like this: |
| 34 | +``` |
| 35 | +postgres=# select |
| 36 | + pid, |
| 37 | + client_addr, |
| 38 | + application_name, |
| 39 | + state, |
| 40 | + coalesce(pg_current_wal_lsn() - sent_lsn, 0) AS sent_lag_bytes, |
| 41 | + coalesce(sent_lsn - write_lsn, 0) AS write_lag_bytes, |
| 42 | + coalesce(write_lsn - flush_lsn, 0) AS flush_lag_bytes, |
| 43 | + coalesce(flush_lsn - replay_lsn, 0) AS replay_lag_bytes, |
| 44 | + coalesce(pg_current_wal_lsn() - replay_lsn, 0) AS total_lag_bytes |
| 45 | +from pg_stat_replication; |
| 46 | +
|
| 47 | + pid | client_addr | application_name | state | sent_lag_bytes | write_lag_bytes | flush_lag_bytes | replay_lag_bytes | total_lag_bytes |
| 48 | +---------+----------------+------------------+-----------+----------------+-----------------+-----------------+------------------+----------------- |
| 49 | + 3602908 | 10.122.224.101 | backupmachine1 | streaming | 0 | 728949184 | 0 | 0 | 0 |
| 50 | + 2490863 | 10.122.224.102 | backupmachine1 | streaming | 0 | 519580176 | 0 | 0 | 0 |
| 51 | + 2814582 | 10.122.224.103 | replica1 | streaming | 0 | 743384 | 0 | 1087208 | 1830592 |
| 52 | + 3596177 | 10.122.224.104 | replica2 | streaming | 0 | 2426856 | 0 | 4271952 | 6698808 |
| 53 | + 319473 | 10.122.224.105 | replica3 | streaming | 0 | 125080 | 162040 | 4186920 | 4474040 |
| 54 | +``` |
| 55 | + |
| 56 | +###How to read results |
| 57 | +Meaning of those`_lsn` |
| 58 | +-`sent_lsn`: How much WAL (lsn position) has already been sent over the network |
| 59 | +-`write_lsn`: How much WAL (lsn position) has been sent to the operating system (before flushing) |
| 60 | +-`flush_lsn`: How much WAL (lsn position) has been flushed to the disk (written on the disk) |
| 61 | +-`replay_lsn`: How much WAL (lsn position) has been applied (visible for queries) |
| 62 | + |
| 63 | +So lag is a gap between`pg_current_wal_lsn` and`replay_lsn` (`total_lag_bytes`, and it's a good idea to add it to monitoring, but for troubleshooting purposes we will need all 4 |
| 64 | + |
| 65 | +- Lag on`sent_lag_bytes` means we have issues with sending the data, i.e. CPU saturated`WALsender` or overloaded network socket on the primary side |
| 66 | +- Lag on`write_lag_bytes` means we have issues with receiving the data, i.e. CPU saturated`WALreceiver` or overloaded network socket on the replica side |
| 67 | +- Lag on`flush_lag_bytes` means we have issues with writing the data on the disk on replica side, i.e. CPU saturated or IO contention of`WALreceiver` |
| 68 | +- Lag`replay_lag_bytes` means we have issues with applying WAL on replica, usually CPU saturated or IO contention of postgres process |
| 69 | + |
| 70 | +Once we pinpointed the problem, we need to troubleshoot the process(es) on the OS level to find the bottleneck. |
| 71 | + |
| 72 | + |
| 73 | +##Possible bottlenecks |
| 74 | +TBD |
| 75 | + |
| 76 | +##Additional resources |
| 77 | +-[Streaming replication](https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION) (official Postgres docs) |
| 78 | +-[pg_stat_replication view](https://www.postgresql.org/docs/current/monitoring-stats.html#PG-STAT-REPLICATION-VIEW) (official Postgres docs) |
| 79 | +-[Replication configuration parameters](https://www.postgresql.org/docs/current/runtime-config-replication.html) (official Postgres docs) |