Commit 8d68ee6

Prevent references to invalid relation pages after fresh promotion
If a standby crashes after promotion before having completed its first post-recovery checkpoint, then the minimal recovery point which marks the LSN position where the cluster is able to reach consistency may be set to a position older than the first end-of-recovery checkpoint while all the WAL available should be replayed. This leads to the instance thinking that it contains inconsistent pages, causing a PANIC and a hard instance crash even if all the WAL available has not been replayed for certain sets of records replayed. When in crash recovery, minRecoveryPoint is expected to always be set to InvalidXLogRecPtr, which forces the recovery to replay all the WAL available, so this commit makes sure that the local copy of minRecoveryPoint from the control file is initialized properly and stays as it is while crash recovery is performed. Once switching to archive recovery or if crash recovery finishes, then the local copy minRecoveryPoint can be safely updated.

Pavan Deolasee has reported and diagnosed the failure in the first place, and the base fix idea to rely on the local copy of minRecoveryPoint comes from Kyotaro Horiguchi, which has been expanded into a full-fledged patch by me. The test included in this commit has been written by Álvaro Herrera and Pavan Deolasee, which I have modified to make it faster and more reliable with sleep phases.

Backpatch down to all supported versions where the bug appears, aka 9.3 which is where the end-of-recovery checkpoint is not run by the startup process anymore. The test gets easily supported down to 10, still it has been tested on all branches.

Reported-by: Pavan Deolasee
Diagnosed-by: Pavan Deolasee
Reviewed-by: Pavan Deolasee, Kyotaro Horiguchi
Author: Michael Paquier, Kyotaro Horiguchi, Pavan Deolasee, Álvaro Herrera
Discussion: https://postgr.es/m/CABOikdPOewjNL=05K5CbNMxnNtXnQjhTx2F--4p4ruorCjukbA@mail.gmail.com
1 parent 2adadf0, commit 8d68ee6
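
The rule described in the commit message can be sketched as a small standalone model. This is an illustration only, not code from xlog.c: the `RecoveryModel` struct and the `model_*` helpers are hypothetical names, the types are simplified stand-ins, and the real UpdateMinRecoveryPoint() derives the new value from the replay position rather than directly from `lsn`. The sketch shows that in crash recovery the local copy starts out invalid and blocks updates, and that updates resume once recovery switches to archive recovery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the PostgreSQL types; not the real definitions. */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Hypothetical model of the state touched by the commit. */
typedef struct
{
    XLogRecPtr  control_min_recovery_point; /* value in the control file */
    XLogRecPtr  local_min_recovery_point;   /* startup process's local copy */
    bool        update_allowed;             /* plays the role of updateMinRecoveryPoint */
} RecoveryModel;

/* Initialize the local copy: valid only when starting in archive recovery. */
static void
model_init(RecoveryModel *m, XLogRecPtr control_value, bool in_archive_recovery)
{
    m->control_min_recovery_point = control_value;
    m->local_min_recovery_point =
        in_archive_recovery ? control_value : InvalidXLogRecPtr;
    m->update_allowed = true;
}

/*
 * Follows the control flow of the patched UpdateMinRecoveryPoint().
 * Returns true if the control file value was advanced.
 */
static bool
model_update(RecoveryModel *m, XLogRecPtr lsn, bool force)
{
    if (!m->update_allowed || (!force && lsn <= m->local_min_recovery_point))
        return false;

    /* Crash recovery: never touch the control file, stop checking. */
    if (m->local_min_recovery_point == InvalidXLogRecPtr)
    {
        m->update_allowed = false;
        return false;
    }

    /* Refresh the local copy, then advance if needed (simplified). */
    m->local_min_recovery_point = m->control_min_recovery_point;
    if (force || m->local_min_recovery_point < lsn)
    {
        m->control_min_recovery_point = lsn;
        m->local_min_recovery_point = lsn;
        return true;
    }
    return false;
}

/* Switching from crash recovery to archive recovery re-enables updates. */
static void
model_switch_to_archive_recovery(RecoveryModel *m)
{
    m->local_min_recovery_point = m->control_min_recovery_point;
    m->update_allowed = true;
}
```

Calling `model_init(&m, 100, false)` followed by `model_update(&m, 200, false)` leaves the control file value at 100, which corresponds to the stale value no longer being consulted or modified while crash recovery runs.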

src/backend/access/transam/xlog.c

Lines changed: 70 additions & 31 deletions
@@ -794,8 +794,14 @@ static XLogSource XLogReceiptSource = 0; /* XLOG_FROM_* code */
 static XLogRecPtr ReadRecPtr;   /* start of last record read */
 static XLogRecPtr EndRecPtr;    /* end+1 of last record read */
 
-static XLogRecPtr minRecoveryPoint; /* local copy of
-                                     * ControlFile->minRecoveryPoint */
+/*
+ * Local copies of equivalent fields in the control file. When running
+ * crash recovery, minRecoveryPoint is set to InvalidXLogRecPtr as we
+ * expect to replay all the WAL available, and updateMinRecoveryPoint is
+ * switched to false to prevent any updates while replaying records.
+ * Those values are kept consistent as long as crash recovery runs.
+ */
+static XLogRecPtr minRecoveryPoint;
 static TimeLineID minRecoveryPointTLI;
 static bool updateMinRecoveryPoint = true;
 
@@ -2532,20 +2538,26 @@ UpdateMinRecoveryPoint(XLogRecPtr lsn, bool force)
     if (!updateMinRecoveryPoint || (!force && lsn <= minRecoveryPoint))
         return;
 
+    /*
+     * An invalid minRecoveryPoint means that we need to recover all the WAL,
+     * i.e., we're doing crash recovery. We never modify the control file's
+     * value in that case, so we can short-circuit future checks here too. The
+     * local values of minRecoveryPoint and minRecoveryPointTLI should not be
+     * updated until crash recovery finishes.
+     */
+    if (XLogRecPtrIsInvalid(minRecoveryPoint))
+    {
+        updateMinRecoveryPoint = false;
+        return;
+    }
+
     LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
 
     /* update local copy */
     minRecoveryPoint = ControlFile->minRecoveryPoint;
     minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
 
-    /*
-     * An invalid minRecoveryPoint means that we need to recover all the WAL,
-     * i.e., we're doing crash recovery. We never modify the control file's
-     * value in that case, so we can short-circuit future checks here too.
-     */
-    if (minRecoveryPoint == 0)
-        updateMinRecoveryPoint = false;
-    else if (force || minRecoveryPoint < lsn)
+    if (force || minRecoveryPoint < lsn)
     {
         XLogRecPtr newMinRecoveryPoint;
         TimeLineID newMinRecoveryPointTLI;
@@ -2930,7 +2942,16 @@ XLogNeedsFlush(XLogRecPtr record)
      */
     if (RecoveryInProgress())
     {
-        /* Quick exit if already known updated */
+        /*
+         * An invalid minRecoveryPoint means that we need to recover all the
+         * WAL, i.e., we're doing crash recovery. We never modify the control
+         * file's value in that case, so we can short-circuit future checks
+         * here too.
+         */
+        if (XLogRecPtrIsInvalid(minRecoveryPoint))
+            updateMinRecoveryPoint = false;
+
+        /* Quick exit if already known to be updated or cannot be updated */
         if (record <= minRecoveryPoint || !updateMinRecoveryPoint)
             return false;
 
@@ -2944,20 +2965,8 @@ XLogNeedsFlush(XLogRecPtr record)
         minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
         LWLockRelease(ControlFileLock);
 
-        /*
-         * An invalid minRecoveryPoint means that we need to recover all the
-         * WAL, i.e., we're doing crash recovery. We never modify the control
-         * file's value in that case, so we can short-circuit future checks
-         * here too.
-         */
-        if (minRecoveryPoint == 0)
-            updateMinRecoveryPoint = false;
-
         /* check again */
-        if (record <= minRecoveryPoint || !updateMinRecoveryPoint)
-            return false;
-        else
-            return true;
+        return record > minRecoveryPoint;
     }
 
     /* Quick exit if already known flushed */
@@ -4099,6 +4108,12 @@ ReadRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, int emode,
                 minRecoveryPoint = ControlFile->minRecoveryPoint;
                 minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
 
+                /*
+                 * The startup process can update its local copy of
+                 * minRecoveryPoint from this point.
+                 */
+                updateMinRecoveryPoint = true;
+
                 UpdateControlFile();
                 LWLockRelease(ControlFileLock);
 
@@ -6578,9 +6593,26 @@ StartupXLOG(void)
         /* No need to hold ControlFileLock yet, we aren't up far enough */
         UpdateControlFile();
 
-        /* initialize our local copy of minRecoveryPoint */
-        minRecoveryPoint = ControlFile->minRecoveryPoint;
-        minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
+        /*
+         * Initialize our local copy of minRecoveryPoint. When doing crash
+         * recovery we want to replay up to the end of WAL. Particularly, in
+         * the case of a promoted standby minRecoveryPoint value in the
+         * control file is only updated after the first checkpoint. However,
+         * if the instance crashes before the first post-recovery checkpoint
+         * is completed then recovery will use a stale location causing the
+         * startup process to think that there are still invalid page
+         * references when checking for data consistency.
+         */
+        if (InArchiveRecovery)
+        {
+            minRecoveryPoint = ControlFile->minRecoveryPoint;
+            minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
+        }
+        else
+        {
+            minRecoveryPoint = InvalidXLogRecPtr;
+            minRecoveryPointTLI = 0;
+        }
 
         /*
          * Reset pgstat data, because it may be invalid after recovery.
@@ -7520,6 +7552,8 @@ CheckRecoveryConsistency(void)
     if (XLogRecPtrIsInvalid(minRecoveryPoint))
         return;
 
+    Assert(InArchiveRecovery);
+
     /*
      * assume that we are called in the startup process, and hence don't need
      * a lock to read lastReplayedEndRecPtr
@@ -9582,11 +9616,16 @@ xlog_redo(XLogReaderState *record)
          * Update minRecoveryPoint to ensure that if recovery is aborted, we
          * recover back up to this point before allowing hot standby again.
          * This is important if the max_* settings are decreased, to ensure
-         * you don't run queries against the WAL preceding the change.
+         * you don't run queries against the WAL preceding the change. The
+         * local copies cannot be updated as long as crash recovery is
+         * happening and we expect all the WAL to be replayed.
          */
-        minRecoveryPoint = ControlFile->minRecoveryPoint;
-        minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
-        if (minRecoveryPoint != 0 && minRecoveryPoint < lsn)
+        if (InArchiveRecovery)
+        {
+            minRecoveryPoint = ControlFile->minRecoveryPoint;
+            minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;
+        }
+        if (minRecoveryPoint != InvalidXLogRecPtr && minRecoveryPoint < lsn)
         {
             ControlFile->minRecoveryPoint = lsn;
             ControlFile->minRecoveryPointTLI = ThisTimeLineID;
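
The second XLogNeedsFlush() hunk collapses the final "check again" branch into a single comparison. The sketch below is a standalone check, not code from xlog.c, that the two forms agree under an assumed precondition: updates are still allowed when the tail is reached, and the value re-read from the control file is valid (non-zero). The function names are hypothetical and `XLogRecPtr` is a simplified stand-in type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the PostgreSQL type. */
typedef uint64_t XLogRecPtr;

/* Shape of the tail of the recovery branch before the commit. */
static bool
needs_flush_tail_old(XLogRecPtr record, XLogRecPtr min_recovery_point,
                     bool update_allowed)
{
    if (min_recovery_point == 0)
        update_allowed = false;

    if (record <= min_recovery_point || !update_allowed)
        return false;
    else
        return true;
}

/* Shape of the tail after the commit. */
static bool
needs_flush_tail_new(XLogRecPtr record, XLogRecPtr min_recovery_point)
{
    return record > min_recovery_point;
}

/*
 * Compare both forms for every valid (non-zero) minimum recovery point
 * and every record position up to 'limit', with updates allowed.
 * Returns the number of disagreements found.
 */
static int
count_disagreements(XLogRecPtr limit)
{
    int         mismatches = 0;

    for (XLogRecPtr min = 1; min <= limit; min++)
        for (XLogRecPtr rec = 0; rec <= limit; rec++)
            if (needs_flush_tail_old(rec, min, true) !=
                needs_flush_tail_new(rec, min))
                mismatches++;

    return mismatches;
}
```

Within this model the two forms only differ when the refreshed value is zero, which is the case the patched function handles earlier by checking XLogRecPtrIsInvalid(minRecoveryPoint) before the quick exit.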
