
Commit 985bd7d

Support clean switchover.

In replication, when we shutdown the master, walsender tries to send all the outstanding WAL records to the standby, and then to exit. This basically means that all the WAL records are fully synced between two servers after the clean shutdown of the master. So, after promoting the standby to new master, we can restart the stopped master as new standby without the need for a fresh backup from new master.

But there was one problem so far: though walsender tries to send all the outstanding WAL records, it doesn't wait for them to be replicated to the standby. Then, before receiving all the WAL records, walreceiver can detect the closure of connection and exit. We cannot guarantee that there is no missing WAL in the standby after clean shutdown of the master. In this case, backup from new master is required when restarting the stopped master as new standby.

This patch fixes this problem. It just changes walsender so that it waits for all the outstanding WAL records to be replicated to the standby before closing the replication connection.

Per discussion, this is a fix that needs to get backpatched rather than new feature. So, back-patch to 9.1 where enough infrastructure for this exists.

Patch by me, reviewed by Andres Freund.

1 parent 4f14c86, commit 985bd7d

`src/backend/replication/walsender.c`

Lines changed: 8 additions & 4 deletions

Original file line number	Diff line number	Diff line change
`@@ -27,7 +27,8 @@`
`27`	`27`	`* If the server is shut down, postmaster sends us SIGUSR2 after all`
`28`	`28`	`* regular backends have exited and the shutdown checkpoint has been written.`
`29`	`29`	`* This instruct walsender to send any outstanding WAL, including the`
`30`		`- * shutdown checkpoint record, and then exit.`
	`30`	`+ * shutdown checkpoint record, wait for it to be replicated to the standby,`
	`31`	`+ * and then exit.`
`31`	`32`	`*`
`32`	`33`	`*`
`33`	`34`	`* Portions Copyright (c) 2010-2013, PostgreSQL Global Development Group`
`@@ -1045,15 +1046,17 @@ WalSndLoop(void)`
`1045`	`1046`
`1046`	`1047`	`/*`
`1047`	`1048`	`* When SIGUSR2 arrives, we send any outstanding logs up to the`
`1048`		`- * shutdown checkpoint record (i.e., the latest record) and exit.`
	`1049`	`+ * shutdown checkpoint record (i.e., the latest record), wait`
	`1050`	`+ * for them to be replicated to the standby, and exit.`
`1049`	`1051`	`* This may be a normal termination at shutdown, or a promotion,`
`1050`	`1052`	`* the walsender is not sure which.`
`1051`	`1053`	`*/`
`1052`	`1054`	`if (walsender_ready_to_stop)`
`1053`	`1055`	`{`
`1054`	`1056`	`/* ... let's just be real sure we're caught up ... */`
`1055`	`1057`	`XLogSend(&caughtup);`
`1056`		`-if (caughtup && !pq_is_send_pending())`
	`1058`	`+if (caughtup && sentPtr == MyWalSnd->flush &&`
	`1059`	`+!pq_is_send_pending())`
`1057`	`1060`	`{`
`1058`	`1061`	`/* Inform the standby that XLOG streaming is done */`
`1059`	`1062`	`EndCommand("COPY 0", DestRemote);`
`@@ -1728,7 +1731,8 @@ WalSndLastCycleHandler(SIGNAL_ARGS)`
`1728`	`1731`	`/*`
`1729`	`1732`	`* If replication has not yet started, die like with SIGTERM. If`
`1730`	`1733`	`* replication is active, only set a flag and wake up the main loop. It`
`1731`		`- * will send any outstanding WAL, and then exit gracefully.`
	`1734`	`+ * will send any outstanding WAL, wait for it to be replicated to`
	`1735`	`+ * the standby, and then exit gracefully.`
`1732`	`1736`	`*/`
`1733`	`1737`	`if (!replication_active)`
`1734`	`1738`	`kill(MyProcPid, SIGTERM);`
