Commit b4a0223

Simplify and improve ProcessStandbyHSFeedbackMessage logic.
There's no need to clamp the standby's xmin to be greater than GetOldestXmin's result; if there were any such need this logic would be hopelessly inadequate anyway, because it fails to account for within-database versus cluster-wide values of GetOldestXmin. So get rid of that, and just rely on sanity-checking that the xmin is not wrapped around relative to the nextXid counter. Also, don't reset the walsender's xmin if the current feedback xmin is indeed out of range; that just creates more problems than we already had. Lastly, don't bother to take the ProcArrayLock; there's no need to do that to set xmin.

Also improve the comments about this in GetOldestXmin itself.
1 parent dce92c6, commit b4a0223
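
The commit message relies on checking that an xmin is "not wrapped around relative to the nextXid counter". As a rough illustration of what such a wraparound-aware comparison looks like, here is a minimal standalone sketch. The names `xid_t`, `FIRST_NORMAL_XID`, and `xid_precedes_or_equals` are hypothetical stand-ins for this example, not the identifiers used in the PostgreSQL tree, and the sketch assumes 32-bit XIDs compared modulo 2^32.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit transaction ID. */
typedef uint32_t xid_t;

/* Assumption for this sketch: XIDs below 3 are reserved ("permanent"). */
#define FIRST_NORMAL_XID ((xid_t) 3)

/*
 * Wraparound-aware "a <= b".
 *
 * Reserved XIDs are compared as plain unsigned integers.  For normal XIDs
 * the unsigned difference is reinterpreted as a signed 32-bit value, so
 * "a" counts as older than "b" when it lies less than 2^31 steps behind it
 * on the circular XID space.
 */
static bool
xid_precedes_or_equals(xid_t a, xid_t b)
{
    int32_t diff;

    if (a < FIRST_NORMAL_XID || b < FIRST_NORMAL_XID)
        return a <= b;

    diff = (int32_t) (a - b);
    return diff <= 0;
}
```

With this definition, an XID just below 2^32 is treated as older than a small XID issued after the counter wrapped, while a small XID compared against a counter more than 2^31 ahead of it is treated as being "in the future".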

2 files changed: +79 -76 lines changed

src/backend/replication/walsender.c

Lines changed: 48 additions & 57 deletions
@@ -642,76 +642,67 @@ static void
 ProcessStandbyHSFeedbackMessage(void)
 {
     StandbyHSFeedbackMessage reply;
-    TransactionId newxmin = InvalidTransactionId;
+    TransactionId nextXid;
+    uint32 nextEpoch;
 
-    pq_copymsgbytes(&reply_message, (char *) &reply, sizeof(StandbyHSFeedbackMessage));
+    /* Decipher the reply message */
+    pq_copymsgbytes(&reply_message, (char *) &reply,
+                    sizeof(StandbyHSFeedbackMessage));
 
     elog(DEBUG2, "hot standby feedback xmin %u epoch %u",
          reply.xmin,
          reply.epoch);
 
+    /* Ignore invalid xmin (can't actually happen with current walreceiver) */
+    if (!TransactionIdIsNormal(reply.xmin))
+        return;
+
     /*
-     * Update the WalSender's proc xmin to allow it to be visible to
-     * snapshots. This will hold back the removal of dead rows and thereby
-     * prevent the generation of cleanup conflicts on the standby server.
+     * Check that the provided xmin/epoch are sane, that is, not in the future
+     * and not so far back as to be already wrapped around. Ignore if not.
+     *
+     * Epoch of nextXid should be same as standby, or if the counter has
+     * wrapped, then one greater than standby.
      */
-    if (TransactionIdIsValid(reply.xmin))
-    {
-        TransactionId nextXid;
-        uint32 nextEpoch;
-        bool epochOK = false;
-
-        GetNextXidAndEpoch(&nextXid, &nextEpoch);
-
-        /*
-         * Epoch of oldestXmin should be same as standby or if the counter has
-         * wrapped, then one less than reply.
-         */
-        if (reply.xmin <= nextXid)
-        {
-            if (reply.epoch == nextEpoch)
-                epochOK = true;
-        }
-        else
-        {
-            if (nextEpoch > 0 && reply.epoch == nextEpoch - 1)
-                epochOK = true;
-        }
-
-        /*
-         * Feedback from standby must not go backwards, nor should it go
-         * forwards further than our most recent xid.
-         */
-        if (epochOK && TransactionIdPrecedesOrEquals(reply.xmin, nextXid))
-        {
-            if (!TransactionIdIsValid(MyProc->xmin))
-            {
-                TransactionId oldestXmin = GetOldestXmin(true, true);
+    GetNextXidAndEpoch(&nextXid, &nextEpoch);
 
-                if (TransactionIdPrecedes(oldestXmin, reply.xmin))
-                    newxmin = reply.xmin;
-                else
-                    newxmin = oldestXmin;
-            }
-            else
-            {
-                if (TransactionIdPrecedes(MyProc->xmin, reply.xmin))
-                    newxmin = reply.xmin;
-                else
-                    newxmin = MyProc->xmin; /* stay the same */
-            }
-        }
+    if (reply.xmin <= nextXid)
+    {
+        if (reply.epoch != nextEpoch)
+            return;
     }
+    else
+    {
+        if (reply.epoch + 1 != nextEpoch)
+            return;
+    }
+
+    if (!TransactionIdPrecedesOrEquals(reply.xmin, nextXid))
+        return; /* epoch OK, but it's wrapped around */
 
     /*
-     * Grab the ProcArrayLock to set xmin, or invalidate for bad reply
+     * Set the WalSender's xmin equal to the standby's requested xmin, so that
+     * the xmin will be taken into account by GetOldestXmin. This will hold
+     * back the removal of dead rows and thereby prevent the generation of
+     * cleanup conflicts on the standby server.
+     *
+     * There is a small window for a race condition here: although we just
+     * checked that reply.xmin precedes nextXid, the nextXid could have gotten
+     * advanced between our fetching it and applying the xmin below, perhaps
+     * far enough to make reply.xmin wrap around. In that case the xmin we
+     * set here would be "in the future" and have no effect. No point in
+     * worrying about this since it's too late to save the desired data
+     * anyway. Assuming that the standby sends us an increasing sequence of
+     * xmins, this could only happen during the first reply cycle, else our
+     * own xmin would prevent nextXid from advancing so far.
+     *
+     * We don't bother taking the ProcArrayLock here. Setting the xmin field
+     * is assumed atomic, and there's no real need to prevent a concurrent
+     * GetOldestXmin. (If we're moving our xmin forward, this is obviously
+     * safe, and if we're moving it backwards, well, the data is at risk
+     * already since a VACUUM could have just finished calling GetOldestXmin.)
      */
-    if (MyProc->xmin != newxmin)
-    {
-        LWLockAcquire(ProcArrayLock, LW_SHARED);
-        MyProc->xmin = newxmin;
-        LWLockRelease(ProcArrayLock);
-    }
+    MyProc->xmin = reply.xmin;
 }
 
 /* Main loop of walsender process */
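
To make the new acceptance rule easier to reason about in isolation, the following is a minimal standalone sketch of the xmin/epoch sanity check described in the comments above. It is not the walsender code itself: `xid_t`, `FIRST_NORMAL_XID`, `xid_precedes_or_equals`, and `feedback_xmin_is_sane` are hypothetical names for this illustration, and the primary's next XID and epoch are passed in as plain arguments rather than fetched from shared state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit transaction ID. */
typedef uint32_t xid_t;

/* Assumption for this sketch: XIDs below 3 are reserved ("permanent"). */
#define FIRST_NORMAL_XID ((xid_t) 3)

/* Wraparound-aware "a <= b" for XIDs compared modulo 2^32. */
static bool
xid_precedes_or_equals(xid_t a, xid_t b)
{
    if (a < FIRST_NORMAL_XID || b < FIRST_NORMAL_XID)
        return a <= b;
    return (int32_t) (a - b) <= 0;
}

/*
 * Return true if the standby's (xmin, epoch) pair is plausible relative to
 * the primary's (next_xid, next_epoch): not in the future, and not so far
 * in the past that it has already wrapped around.
 */
static bool
feedback_xmin_is_sane(xid_t xmin, uint32_t epoch,
                      xid_t next_xid, uint32_t next_epoch)
{
    /* Reserved XIDs carry no usable information; ignore them. */
    if (xmin < FIRST_NORMAL_XID)
        return false;

    if (xmin <= next_xid)
    {
        /* Numerically behind the counter: must be in the same epoch. */
        if (epoch != next_epoch)
            return false;
    }
    else
    {
        /* Numerically ahead: the counter must have wrapped exactly once. */
        if (epoch + 1 != next_epoch)
            return false;
    }

    /* Epoch is consistent; reject values more than 2^31 XIDs behind. */
    return xid_precedes_or_equals(xmin, next_xid);
}
```

For example, `feedback_xmin_is_sane(4000000000u, 4, 100u, 5)` accepts an xmin from just before the counter wrapped, whereas the same xmin reported with epoch 5 is rejected as lying in the future. A caller following the commit's approach would simply leave its current xmin untouched whenever the check fails.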

src/backend/storage/ipc/procarray.c

Lines changed: 31 additions & 19 deletions
@@ -997,22 +997,32 @@ TransactionIdIsActive(TransactionId xid)
  * This is also used to determine where to truncate pg_subtrans. allDbs
  * must be TRUE for that case, and ignoreVacuum FALSE.
  *
- * Note: it's possible for the calculated value to move backwards on repeated
- * calls. The calculated value is conservative, so that anything older is
- * definitely not considered as running by anyone anymore, but the exact
- * value calculated depends on a number of things. For example, if allDbs is
- * TRUE and there are no transactions running in the current database,
- * GetOldestXmin() returns latestCompletedXid. If a transaction begins after
- * that, its xmin will include in-progress transactions in other databases
- * that started earlier, so another call will return an lower value. The
- * return value is also adjusted with vacuum_defer_cleanup_age, so increasing
- * that setting on the fly is an easy way to have GetOldestXmin() move
- * backwards.
- *
  * Note: we include all currently running xids in the set of considered xids.
  * This ensures that if a just-started xact has not yet set its snapshot,
  * when it does set the snapshot it cannot set xmin less than what we compute.
  * See notes in src/backend/access/transam/README.
+ *
+ * Note: despite the above, it's possible for the calculated value to move
+ * backwards on repeated calls. The calculated value is conservative, so that
+ * anything older is definitely not considered as running by anyone anymore,
+ * but the exact value calculated depends on a number of things. For example,
+ * if allDbs is FALSE and there are no transactions running in the current
+ * database, GetOldestXmin() returns latestCompletedXid. If a transaction
+ * begins after that, its xmin will include in-progress transactions in other
+ * databases that started earlier, so another call will return a lower value.
+ * Nonetheless it is safe to vacuum a table in the current database with the
+ * first result. There are also replication-related effects: a walsender
+ * process can set its xmin based on transactions that are no longer running
+ * in the master but are still being replayed on the standby, thus possibly
+ * making the GetOldestXmin reading go backwards. In this case there is a
+ * possibility that we lose data that the standby would like to have, but
+ * there is little we can do about that --- data is only protected if the
+ * walsender runs continuously while queries are executed on the standby.
+ * (The Hot Standby code deals with such cases by failing standby queries
+ * that needed to access already-removed data, so there's no integrity bug.)
+ * The return value is also adjusted with vacuum_defer_cleanup_age, so
+ * increasing that setting on the fly is another easy way to make
+ * GetOldestXmin() move backwards, with no consequences for data integrity.
  */
 TransactionId
 GetOldestXmin(bool allDbs, bool ignoreVacuum)
@@ -1045,7 +1055,7 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
 
         if (allDbs ||
             proc->databaseId == MyDatabaseId ||
-            proc->databaseId == 0) /* include WalSender */
+            proc->databaseId == 0) /* always include WalSender */
         {
             /* Fetch xid just once - see GetNewTransactionId */
             TransactionId xid = proc->xid;
@@ -1091,16 +1101,18 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
         LWLockRelease(ProcArrayLock);
 
         /*
-         * Compute the cutoff XID, being careful not to generate a "permanent"
-         * XID. We need do this only on the primary, never on standby.
+         * Compute the cutoff XID by subtracting vacuum_defer_cleanup_age,
+         * being careful not to generate a "permanent" XID.
          *
          * vacuum_defer_cleanup_age provides some additional "slop" for the
          * benefit of hot standby queries on slave servers. This is quick and
          * dirty, and perhaps not all that useful unless the master has a
-         * predictable transaction rate, but it's what we've got. Note that
-         * we are assuming vacuum_defer_cleanup_age isn't large enough to
-         * cause wraparound --- so guc.c should limit it to no more than the
-         * xidStopLimit threshold in varsup.c.
+         * predictable transaction rate, but it offers some protection when
+         * there's no walsender connection. Note that we are assuming
+         * vacuum_defer_cleanup_age isn't large enough to cause wraparound ---
+         * so guc.c should limit it to no more than the xidStopLimit threshold
+         * in varsup.c. Also note that we intentionally don't apply
+         * vacuum_defer_cleanup_age on standby servers.
          */
         result -= vacuum_defer_cleanup_age;
         if (!TransactionIdIsNormal(result))
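
The hunk is cut off right after the `TransactionIdIsNormal` test, so the rest of the statement is not visible here. As a rough illustration of what "being careful not to generate a permanent XID" could mean, here is a minimal standalone sketch. It assumes that the fix-up is to substitute the first normal XID. `xid_t`, `FIRST_NORMAL_XID`, and `apply_defer_cleanup_age` are hypothetical names for this example, not identifiers taken from the diff.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit transaction ID. */
typedef uint32_t xid_t;

/* Assumption for this sketch: XIDs below 3 are reserved ("permanent"). */
#define FIRST_NORMAL_XID ((xid_t) 3)

/*
 * Move the cutoff XID back by defer_age transactions.
 *
 * The subtraction is done modulo 2^32.  If the result lands on one of the
 * reserved XIDs, substitute the first normal XID so that callers never see
 * a permanent XID as a cutoff.  (Assumed behaviour for illustration.)
 */
static xid_t
apply_defer_cleanup_age(xid_t oldest_xmin, uint32_t defer_age)
{
    xid_t result = oldest_xmin - defer_age;

    if (result < FIRST_NORMAL_XID)
        result = FIRST_NORMAL_XID;

    return result;
}
```

Under these assumptions, `apply_defer_cleanup_age(1000, 100)` yields 900, while `apply_defer_cleanup_age(5, 3)` would land on reserved XID 2 and is therefore bumped to 3. A subtraction that crosses zero, such as `apply_defer_cleanup_age(5, 10)`, wraps to a large normal XID, which is why the comment asks for `vacuum_defer_cleanup_age` to be bounded.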
