Commit 215ac4a

Truncate pg_multixact/'s contents during crash recovery

Commit 9dc842f of 8.2 era prevented MultiXact truncation during crash recovery, because there was no guarantee that enough state had been set up, and because it wasn't deemed to be a good idea to remove data during crash recovery anyway. Since then, due to Hot-Standby, streaming replication and PITR, the amount of time a cluster can spend doing crash recovery has increased significantly, to the point that a cluster may even never come out of it. This has made not truncating the content of pg_multixact/ not defensible anymore.

To fix, take care to set up enough state for multixact truncation before crash recovery starts (easy since checkpoints contain the required information), and move the current end-of-recovery actions to a new TrimMultiXact() function, analogous to TrimCLOG().

At some later point, this should probably be done similarly to the way clog.c is doing it, which is to just WAL log truncations, but we can't do that for the back branches.

Back-patch to 9.0. 8.4 also has the problem, but since there's no hot standby there, it's much less pressing. In 9.2 and earlier, this patch is simpler than in newer branches, because multixact access during recovery isn't required. Add appropriate checks to make sure that's not happening.

Andres Freund

1 parent f5f92bd, commit 215ac4a
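
The ordering described in the commit message can be illustrated with a small toy model. This is only a sketch, not PostgreSQL code: the ModelState struct, the model_* functions and the 2048-entries-per-page constant are all assumptions made for the example. It shows state being set up before replay begins, truncation being possible at each restartpoint while recovery is still running, and the remaining work being deferred to a single end-of-recovery step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for this sketch: 8192-byte pages holding 4-byte offsets. */
#define MODEL_OFFSETS_PER_PAGE 2048u

/* Hypothetical stand-in for the shared multixact state. */
typedef struct ModelState
{
    uint32_t next_multi;    /* next multixact ID to assign */
    uint32_t oldest_multi;  /* oldest multixact ID still needed */
    uint32_t latest_page;   /* newest page of the offsets SLRU */
    uint32_t truncations;   /* how many times we truncated */
    bool     started;       /* early startup step has run */
    bool     trimmed;       /* end-of-recovery step has run */
} ModelState;

/* Early step: runs before WAL replay, using values from the checkpoint. */
static void
model_startup(ModelState *s, uint32_t next_multi, uint32_t oldest_multi)
{
    s->next_multi = next_multi;
    s->oldest_multi = oldest_multi;
    s->latest_page = next_multi / MODEL_OFFSETS_PER_PAGE;
    s->started = true;
}

/*
 * Restartpoint during recovery: truncation is only attempted once the
 * early step has run. Returns false if no state has been set up yet.
 * Wraparound of multixact IDs is ignored here for brevity.
 */
static bool
model_restartpoint(ModelState *s, uint32_t ckpt_oldest_multi)
{
    if (!s->started)
        return false;
    if (ckpt_oldest_multi > s->oldest_multi)
    {
        s->oldest_multi = ckpt_oldest_multi;
        s->truncations++;
    }
    return true;
}

/* End-of-recovery step: runs exactly once, after replay has finished. */
static void
model_trim(ModelState *s)
{
    assert(s->started);
    s->latest_page = s->next_multi / MODEL_OFFSETS_PER_PAGE;
    s->trimmed = true;
}
```

In this model a cluster that never leaves recovery still truncates at every restartpoint, which is the behaviour the commit is after; before the change, the equivalent of model_startup() only ran at the very end.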

3 files changed: +70 -14 lines changed

src/backend/access/transam/multixact.c

Lines changed: 42 additions & 11 deletions
@@ -1768,14 +1768,37 @@ MaybeExtendOffsetSlru(void)
  *
  * StartupXLOG has already established nextMXact/nextOffset by calling
  * MultiXactSetNextMXact and/or MultiXactAdvanceNextMXact, and the oldestMulti
- * info from pg_control and/or MultiXactAdvanceOldest.  Note that we may
- * already have replayed WAL data into the SLRU files.
- *
- * We don't need any locks here, really; the SLRU locks are taken
- * only because slru.c expects to be called with locks held.
+ * info from pg_control and/or MultiXactAdvanceOldest, but we haven't yet
+ * replayed WAL.
  */
 void
 StartupMultiXact(void)
+{
+    MultiXactId multi = MultiXactState->nextMXact;
+    MultiXactOffset offset = MultiXactState->nextOffset;
+    int         pageno;
+
+    /*
+     * Initialize offset's idea of the latest page number.
+     */
+    pageno = MultiXactIdToOffsetPage(multi);
+    MultiXactOffsetCtl->shared->latest_page_number = pageno;
+
+    /*
+     * Initialize member's idea of the latest page number.
+     */
+    pageno = MXOffsetToMemberPage(offset);
+    MultiXactMemberCtl->shared->latest_page_number = pageno;
+}
+
+/*
+ * This must be called ONCE at the end of startup/recovery.
+ *
+ * We don't need any locks here, really; the SLRU locks are taken only because
+ * slru.c expects to be called with locks held.
+ */
+void
+TrimMultiXact(void)
 {
     MultiXactId multi = MultiXactState->nextMXact;
     MultiXactOffset offset = MultiXactState->nextOffset;
@@ -1785,7 +1808,9 @@ StartupMultiXact(void)
 
     /*
      * During a binary upgrade, make sure that the offsets SLRU is large
-     * enough to contain the next value that would be created.
+     * enough to contain the next value that would be created. It's fine to do
+     * this here and not in StartupMultiXact() since binary upgrades should
+     * never need crash recovery.
      */
     if (IsBinaryUpgrade)
         MaybeExtendOffsetSlru();
@@ -1794,7 +1819,7 @@ StartupMultiXact(void)
     LWLockAcquire(MultiXactOffsetControlLock, LW_EXCLUSIVE);
 
     /*
-     * Initialize our idea of the latest page number.
+     * (Re-)Initialize our idea of the latest page number.
      */
     pageno = MultiXactIdToOffsetPage(multi);
     MultiXactOffsetCtl->shared->latest_page_number = pageno;
@@ -1824,7 +1849,7 @@ StartupMultiXact(void)
     LWLockAcquire(MultiXactMemberControlLock, LW_EXCLUSIVE);
 
     /*
-     * Initialize our idea of the latest page number.
+     * (Re-)Initialize our idea of the latest page number.
      */
     pageno = MXOffsetToMemberPage(offset);
     MultiXactMemberCtl->shared->latest_page_number = pageno;
@@ -2258,9 +2283,15 @@ SlruScanDirCbFindEarliest(SlruCtl ctl, char *filename, int segpage, void *data)
  * Remove all MultiXactOffset and MultiXactMember segments before the oldest
  * ones still of interest.
  *
- * This is called by vacuum after it has successfully advanced a database's
- * datminmxid value; the cutoff value we're passed is the minimum of all
- * databases' datminmxid values.
+ * On a primary, this is called by vacuum after it has successfully advanced a
+ * database's datminmxid value; the cutoff value we're passed is the minimum of
+ * all databases' datminmxid values.
+ *
+ * During crash recovery, it's called from CreateRestartPoint() instead.  We
+ * rely on the fact that xlog_redo() will already have called
+ * MultiXactAdvanceOldest().  Our latest_page_number will already have been
+ * initialized by StartupMultiXact() and kept up to date as new pages are
+ * zeroed.
  */
 void
 TruncateMultiXact(MultiXactId oldestMXact)
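
The truncation described in the comment above works on whole SLRU segments rather than individual entries. As a rough illustration only (this is not the PostgreSQL source), the sketch below maps a multixact ID to its offsets page and segment, then uses a wraparound-aware comparison to decide whether a segment lies entirely before the cutoff. The layout constants (8192-byte pages, 4-byte offsets, 32 pages per segment) and every sketch_* name are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout constants for this sketch. */
#define SKETCH_BLCKSZ            8192u
#define SKETCH_OFFSETS_PER_PAGE  (SKETCH_BLCKSZ / 4u)   /* 4-byte offsets */
#define SKETCH_PAGES_PER_SEGMENT 32u
#define SKETCH_MULTIS_PER_SEGMENT \
    (SKETCH_OFFSETS_PER_PAGE * SKETCH_PAGES_PER_SEGMENT)
#define SKETCH_SEGMENT_COUNT     65536u                 /* 2^32 / 65536 */

/* Page of the offsets area that holds the entry for a multixact ID. */
static uint32_t
sketch_multi_to_page(uint32_t multi)
{
    return multi / SKETCH_OFFSETS_PER_PAGE;
}

/* Segment file that contains a given page. */
static uint32_t
sketch_page_to_segment(uint32_t page)
{
    return page / SKETCH_PAGES_PER_SEGMENT;
}

/*
 * Wraparound-aware "a is older than b": IDs are compared modulo 2^32, so a
 * value just below the wrap point counts as older than a small value.
 */
static bool
sketch_multi_precedes(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * A segment is removable when it starts before the segment that holds the
 * oldest multixact ID still of interest. The cutoff is rounded down to a
 * segment boundary so the segment containing oldest_multi is always kept.
 */
static bool
sketch_segment_is_removable(uint32_t segment, uint32_t oldest_multi)
{
    uint32_t cutoff_segment;
    uint32_t seg_first_multi;
    uint32_t cutoff_first_multi;

    assert(segment < SKETCH_SEGMENT_COUNT);

    cutoff_segment = sketch_page_to_segment(sketch_multi_to_page(oldest_multi));
    seg_first_multi = segment * SKETCH_MULTIS_PER_SEGMENT;
    cutoff_first_multi = cutoff_segment * SKETCH_MULTIS_PER_SEGMENT;

    return sketch_multi_precedes(seg_first_multi, cutoff_first_multi);
}
```

With these assumed constants, a cutoff of 200000 falls on page 97 in segment 3, so segments 0 through 2 would be removable and segment 3 kept. The modulo comparison is also why leftover segments matter: once the ID counter wraps, stale files can be mistaken for current data, which is the hazard the restartpoint comment in xlog.c mentions.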

src/backend/access/transam/xlog.c

Lines changed: 27 additions & 3 deletions
@@ -5195,6 +5195,14 @@ StartupXLOG(void)
     XLogCtl->ckptXidEpoch = checkPoint.nextXidEpoch;
     XLogCtl->ckptXid = checkPoint.nextXid;
 
+    /*
+     * Startup MultiXact.  We need to do this early for two reasons: one
+     * is that we might try to access multixacts when we do tuple freezing,
+     * and the other is we need its state initialized because we attempt
+     * truncation during restartpoints.
+     */
+    StartupMultiXact();
+
     /*
      * Initialize unlogged LSN. On a clean shutdown, it's restored from the
      * control file. On recovery, all unlogged relations are blown away, so
@@ -5395,8 +5403,9 @@ StartupXLOG(void)
             ProcArrayInitRecovery(ShmemVariableCache->nextXid);
 
             /*
-             * Startup commit log and subtrans only. Other SLRUs are not
-             * maintained during recovery and need not be started yet.
+             * Startup commit log and subtrans only. MultiXact has already
+             * been started up and other SLRUs are not maintained during
+             * recovery and need not be started yet.
              */
             StartupCLOG();
             StartupSUBTRANS(oldestActiveXID);
@@ -6061,8 +6070,8 @@ StartupXLOG(void)
     /*
      * Perform end of recovery actions for any SLRUs that need it.
      */
-    StartupMultiXact();
     TrimCLOG();
+    TrimMultiXact();
 
     /* Reload shared-memory state for prepared transactions */
     RecoverPreparedTransactions();
@@ -7459,6 +7468,21 @@ CreateRestartPoint(int flags)
     }
     LWLockRelease(ControlFileLock);
 
+    /*
+     * Due to an historical accident multixact truncations are not WAL-logged,
+     * but just performed everytime the mxact horizon is increased. So, unless
+     * we explicitly execute truncations on a standby it will never clean out
+     * /pg_multixact which obviously is bad, both because it uses space and
+     * because we can wrap around into pre-existing data...
+     *
+     * We can only do the truncation here, after the UpdateControlFile()
+     * above, because we've now safely established a restart point, that
+     * guarantees we will not need need to access those multis.
+     *
+     * It's probably worth improving this.
+     */
+    TruncateMultiXact(lastCheckPoint.oldestMulti);
+
     /*
      * Delete old log files (those no longer needed even for previous
      * checkpoint/restartpoint) to prevent the disk holding the xlog from

src/include/access/multixact.h

Lines changed: 1 addition & 0 deletions
@@ -98,6 +98,7 @@ extern Size MultiXactShmemSize(void);
 extern void MultiXactShmemInit(void);
 extern void BootStrapMultiXact(void);
 extern void StartupMultiXact(void);
+extern void TrimMultiXact(void);
 extern void ShutdownMultiXact(void);
 extern void SetMultiXactIdLimit(MultiXactId oldest_datminmxid,
                     Oid oldest_datoid);
