Commit 17b2d5e

Fix unconditional WAL receiver shutdown during stream-archive transition
Commit b4f584f (affecting v15~, later backpatched down to 13 as of
3635a0a) introduced an unconditional WAL receiver shutdown when
switching from streaming to archive WAL sources. This causes problems
during a timeline switch, when a WAL receiver enters WALRCV_WAITING
state but remains alive, waiting for instructions.

The unconditional shutdown can break some monitoring scenarios as the
WAL receiver gets repeatedly terminated and re-spawned, causing
pg_stat_wal_receiver.status to show a "streaming" instead of "waiting"
status, masking the fact that the WAL receiver is waiting for a new TLI
and a new LSN to be able to continue streaming.

This commit changes the WAL receiver behavior so as the shutdown becomes
conditional, with InstallXLogFileSegmentActive being always reset to
prevent the regression fixed by b4f584f: only terminate the WAL
receiver when it is actively streaming (WALRCV_STREAMING,
WALRCV_STARTING, or WALRCV_RESTARTING). When in WALRCV_WAITING state,
just reset InstallXLogFileSegmentActive flag to allow archive
restoration without killing the process. WALRCV_STOPPED and
WALRCV_STOPPING are not reachable states in this code path. For the
latter, the startup process is the one in charge of setting
WALRCV_STOPPING via ShutdownWalRcv(), waiting for the WAL receiver to
reach a WALRCV_STOPPED state after switching walRcvState, so
WaitForWALToBecomeAvailable() cannot be reached while a WAL receiver is
in a WALRCV_STOPPING state.

A regression test is added to check that a WAL receiver is not stopped
on timeline jump, that fails when the fix of this commit is reverted.

Reported-by: Ryan Bird <ryanzxg@gmail.com>
Author: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/19093-c4fff49a608f82a0@postgresql.org
Backpatch-through: 13
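
The state-dependent behavior described in the commit message can be sketched as a small decision function. This is a simplified model, not PostgreSQL code: the enum below only mirrors the state names from the message, and `TransitionAction`, `is_actively_streaming()` and `action_on_leaving_stream()` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Local model of the WAL receiver states named in the commit message. */
typedef enum
{
    WALRCV_STOPPED,
    WALRCV_STARTING,
    WALRCV_STREAMING,
    WALRCV_WAITING,
    WALRCV_RESTARTING,
    WALRCV_STOPPING
} WalRcvState;

/* Hypothetical: what the startup process does when leaving the stream source. */
typedef enum
{
    ACTION_SHUTDOWN_AND_RESET,  /* stop the WAL receiver, then reset the flag */
    ACTION_RESET_ONLY           /* keep the WAL receiver alive, reset the flag */
} TransitionAction;

/* States the commit message treats as "actively streaming". */
static bool
is_actively_streaming(WalRcvState state)
{
    return state == WALRCV_STREAMING ||
           state == WALRCV_STARTING ||
           state == WALRCV_RESTARTING;
}

/*
 * The flag is reset in both outcomes; only the shutdown is conditional.
 * Per the commit message, WALRCV_STOPPED and WALRCV_STOPPING are not
 * expected to be seen on this code path.
 */
TransitionAction
action_on_leaving_stream(WalRcvState state)
{
    /* Defensive range check on the model's own enum. */
    assert(state >= WALRCV_STOPPED && state <= WALRCV_STOPPING);

    return is_actively_streaming(state) ? ACTION_SHUTDOWN_AND_RESET
                                        : ACTION_RESET_ONLY;
}
```

In this model, `action_on_leaving_stream(WALRCV_WAITING)` yields `ACTION_RESET_ONLY`, which corresponds to the case the commit fixes: a receiver waiting on a timeline switch is left running.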
1 parent 8b18ed6, commit 17b2d5e

4 files changed: +31 -5 lines changed


‎src/backend/access/transam/xlog.c‎

Lines changed: 10 additions & 4 deletions
@@ -9519,10 +9519,7 @@ void
 XLogShutdownWalRcv(void)
 {
     ShutdownWalRcv();
-
-    LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-    XLogCtl->InstallXLogFileSegmentActive = false;
-    LWLockRelease(ControlFileLock);
+    ResetInstallXLogFileSegmentActive();
 }

 /* Enable WAL file recycling and preallocation. */
@@ -9534,6 +9531,15 @@ SetInstallXLogFileSegmentActive(void)
     LWLockRelease(ControlFileLock);
 }

+/* Disable WAL file recycling and preallocation. */
+void
+ResetInstallXLogFileSegmentActive(void)
+{
+    LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+    XLogCtl->InstallXLogFileSegmentActive = false;
+    LWLockRelease(ControlFileLock);
+}
+
 bool
 IsInstallXLogFileSegmentActive(void)
 {
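
The refactoring in this hunk pulls the lock-protected flag reset out into its own function so that it can be called without also stopping the WAL receiver. A minimal standalone sketch of that pattern follows, assuming a POSIX mutex as a stand-in for `ControlFileLock`; `DemoXLogCtl` and every `demo_*` function are hypothetical and are not part of PostgreSQL.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared state; not PostgreSQL's XLogCtl. */
typedef struct
{
    pthread_mutex_t lock;                   /* plays the role of ControlFileLock */
    bool            install_segment_active; /* plays the role of the install flag */
} DemoXLogCtl;

static DemoXLogCtl demo_ctl = {PTHREAD_MUTEX_INITIALIZER, false};

/* Store a new flag value while holding the lock. */
static void
demo_store_flag(bool value)
{
    int rc = pthread_mutex_lock(&demo_ctl.lock);

    assert(rc == 0);
    (void) rc;                  /* silence unused warning when NDEBUG is set */
    demo_ctl.install_segment_active = value;
    pthread_mutex_unlock(&demo_ctl.lock);
}

/* Counterpart of the "Set" function: enable the flag. */
void
demo_set_install_active(void)
{
    demo_store_flag(true);
}

/* Counterpart of the new "Reset" function: disable the flag. */
void
demo_reset_install_active(void)
{
    demo_store_flag(false);
}

/* Read the flag under the same lock. */
bool
demo_is_install_active(void)
{
    bool active;

    pthread_mutex_lock(&demo_ctl.lock);
    active = demo_ctl.install_segment_active;
    pthread_mutex_unlock(&demo_ctl.lock);
    return active;
}
```

Keeping the reset in one helper means a shutdown wrapper and a no-shutdown caller share the same locking discipline, which is the shape the diff gives `XLogShutdownWalRcv()` and `ResetInstallXLogFileSegmentActive()`.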

‎src/backend/access/transam/xlogrecovery.c‎

Lines changed: 12 additions & 1 deletion
@@ -3687,8 +3687,19 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
                      * Before we leave XLOG_FROM_STREAM state, make sure that
                      * walreceiver is not active, so that it won't overwrite
                      * WAL that we restore from archive.
+                     *
+                     * If walreceiver is actively streaming (or attempting to
+                     * connect), we must shut it down. However, if it's
+                     * already in WAITING state (e.g., due to timeline
+                     * divergence), we only need to reset the install flag to
+                     * allow archive restoration.
                      */
-                    XLogShutdownWalRcv();
+                    if (WalRcvStreaming())
+                        XLogShutdownWalRcv();
+                    else
+                    {
+                        ResetInstallXLogFileSegmentActive();
+                    }

                     /*
                      * Before we sleep, re-scan for possible new timelines if

‎src/include/access/xlog.h‎

Lines changed: 1 addition & 0 deletions
@@ -269,6 +269,7 @@ extern void SwitchIntoArchiveRecovery(XLogRecPtr EndRecPtr, TimeLineID replayTLI
 extern void ReachedEndOfBackup(XLogRecPtr EndRecPtr, TimeLineID tli);
 extern void SetInstallXLogFileSegmentActive(void);
 extern bool IsInstallXLogFileSegmentActive(void);
+extern void ResetInstallXLogFileSegmentActive(void);
 extern void XLogShutdownWalRcv(void);

 /*

‎src/test/recovery/t/004_timeline_switch.pl‎

Lines changed: 8 additions & 0 deletions
@@ -66,6 +66,14 @@
   $node_standby_2->safe_psql('postgres', "SELECT count(*) FROM tab_int");
 is($result, qq(2000), 'check content of standby 2');

+# Check the logs, WAL receiver should not have been stopped while
+# transitioning to its new timeline. There is no need to rely on an
+# offset in this check of the server logs: a new log file is used on
+# node restart when primary_conninfo is updated above.
+ok( !$node_standby_2->log_contains(
+        "FATAL: .* terminating walreceiver process due to administrator command"
+    ),
+    'WAL receiver should not be stopped across timeline jumps');

 # Ensure that a standby is able to follow a primary on a newer timeline
 # when WAL archiving is enabled.
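
The new test asserts that no server log line matches a termination pattern. As a rough illustration of what that pattern accepts, here is a small sketch using POSIX extended regular expressions. It is an assumption that POSIX matching behaves like the Perl match for this particular pattern, and `line_matches()` and `log_has_match()` are hypothetical helpers, not part of the PostgreSQL test framework.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if the line matches the extended regular expression. */
bool
line_matches(const char *pattern, const char *line)
{
    regex_t re;
    bool    matched;

    assert(pattern != NULL && line != NULL);

    /* Treat a pattern that fails to compile as "no match". */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;

    matched = (regexec(&re, line, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}

/* Return true if any of the given log lines matches the pattern. */
bool
log_has_match(const char *pattern, const char *const *lines, size_t nlines)
{
    for (size_t i = 0; i < nlines; i++)
    {
        if (line_matches(pattern, lines[i]))
            return true;
    }
    return false;
}
```

Passing the pattern from the test together with a set of log lines to `log_has_match()` and negating the result gives the same kind of check as the `ok( !... )` call: the test passes only when the termination message is absent.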
