Commit 566372b

Prevent concurrent SimpleLruTruncate() for any given SLRU.

The SimpleLruTruncate() header comment states the new coding rule. To
achieve this, add locktype "frozenid" and two LWLocks. This closes a
rare opportunity for data loss, which manifested as "apparent
wraparound" or "could not access status of transaction" errors. Data
loss is more likely in pg_multixact, due to released branches' thin
margin between multiStopLimit and multiWrapLimit. If a user's physical
replication primary logged ": apparent wraparound" messages, the user
should rebuild standbys of that primary regardless of symptoms. At less
risk is a cluster having emitted "not accepting commands" errors or
"must be vacuumed" warnings at some point. One can test a cluster for
this data loss by running VACUUM FREEZE in every database. Back-patch
to 9.5 (all supported versions).

Discussion: https://postgr.es/m/20190218073103.GA1434723@rfd.leadboat.com

1 parent d4d443b, commit 566372b

11 files changed, +117 -13 lines changed

doc/src/sgml/catalogs.sgml

Lines changed: 3 additions & 1 deletion
@@ -10226,7 +10226,8 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
 and general database objects (identified by class OID and object OID,
 in the same way as in <structname>pg_description</structname> or
 <structname>pg_depend</structname>). Also, the right to extend a
-relation is represented as a separate lockable object.
+relation is represented as a separate lockable object, as is the right to
+update <structname>pg_database</structname>.<structfield>datfrozenxid</structfield>.
 Also, <quote>advisory</quote> locks can be taken on numbers that have
 user-defined meanings.
 </para>
@@ -10254,6 +10255,7 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
 Type of the lockable object:
 <literal>relation</literal>,
 <literal>extend</literal>,
+<literal>frozenid</literal>,
 <literal>page</literal>,
 <literal>tuple</literal>,
 <literal>transactionid</literal>,

doc/src/sgml/monitoring.sgml

Lines changed: 16 additions & 0 deletions
@@ -1742,6 +1742,12 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
 <entry><literal>extend</literal></entry>
 <entry>Waiting to extend a relation.</entry>
 </row>
+<row>
+<entry><literal>frozenid</literal></entry>
+<entry>Waiting to
+update <structname>pg_database</structname>.<structfield>datfrozenxid</structfield>
+and <structname>pg_database</structname>.<structfield>datminmxid</structfield>.</entry>
+</row>
 <row>
 <entry><literal>object</literal></entry>
 <entry>Waiting to acquire a lock on a non-relation database object.</entry>
@@ -1910,6 +1916,11 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
 <entry><literal>NotifyQueue</literal></entry>
 <entry>Waiting to read or update <command>NOTIFY</command> messages.</entry>
 </row>
+<row>
+<entry><literal>NotifyQueueTail</literal></entry>
+<entry>Waiting to update limit on <command>NOTIFY</command> message
+storage.</entry>
+</row>
 <row>
 <entry><literal>NotifySLRU</literal></entry>
 <entry>Waiting to access the <command>NOTIFY</command> message SLRU
@@ -2086,6 +2097,11 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
 <entry><literal>WALWrite</literal></entry>
 <entry>Waiting for WAL buffers to be written to disk.</entry>
 </row>
+<row>
+<entry><literal>WrapLimitsVacuum</literal></entry>
+<entry>Waiting to update limits on transaction id and multixact
+consumption.</entry>
+</row>
 <row>
 <entry><literal>XactBuffer</literal></entry>
 <entry>Waiting for I/O on a transaction status SLRU buffer.</entry>

src/backend/access/transam/slru.c

Lines changed: 8 additions & 0 deletions
@@ -1191,6 +1191,14 @@ SimpleLruFlush(SlruCtl ctl, bool allow_redirtied)
 
 /*
  * Remove all segments before the one holding the passed page number
+ *
+ * All SLRUs prevent concurrent calls to this function, either with an LWLock
+ * or by calling it only as part of a checkpoint. Mutual exclusion must begin
+ * before computing cutoffPage. Mutual exclusion must end after any limit
+ * update that would permit other backends to write fresh data into the
+ * segment immediately preceding the one containing cutoffPage. Otherwise,
+ * when the SLRU is quite full, SimpleLruTruncate() might delete that segment
+ * after it has accrued freshly-written data.
  */
 void
 SimpleLruTruncate(SlruCtl ctl, int cutoffPage)

src/backend/access/transam/subtrans.c

Lines changed: 2 additions & 2 deletions
@@ -349,8 +349,8 @@ ExtendSUBTRANS(TransactionId newestXact)
 /*
  * Remove all SUBTRANS segments before the one holding the passed transaction ID
  *
- * This is normally called during checkpoint, with oldestXact being the
- * oldest TransactionXmin of any running transaction.
+ * oldestXact is the oldest TransactionXmin of any running transaction. This
+ * is called only during checkpoint.
  */
 void
 TruncateSUBTRANS(TransactionId oldestXact)

src/backend/commands/async.c

Lines changed: 27 additions & 10 deletions
@@ -244,19 +244,22 @@ typedef struct QueueBackendStatus
 /*
  * Shared memory state for LISTEN/NOTIFY (excluding its SLRU stuff)
  *
- * The AsyncQueueControl structure is protected by the NotifyQueueLock.
+ * The AsyncQueueControl structure is protected by the NotifyQueueLock and
+ * NotifyQueueTailLock.
  *
- * When holding the lock in SHARED mode, backends may only inspect their own
- * entries as well as the head and tail pointers. Consequently we can allow a
- * backend to update its own record while holding only SHARED lock (since no
- * other backend will inspect it).
+ * When holding NotifyQueueLock in SHARED mode, backends may only inspect
+ * their own entries as well as the head and tail pointers. Consequently we
+ * can allow a backend to update its own record while holding only SHARED lock
+ * (since no other backend will inspect it).
  *
- * When holding the lock in EXCLUSIVE mode, backends can inspect the entries
- * of other backends and also change the head and tail pointers.
+ * When holding NotifyQueueLock in EXCLUSIVE mode, backends can inspect the
+ * entries of other backends and also change the head pointer. When holding
+ * both NotifyQueueLock and NotifyQueueTailLock in EXCLUSIVE mode, backends
+ * can change the tail pointer.
  *
  * NotifySLRULock is used as the control lock for the pg_notify SLRU buffers.
- * In order to avoid deadlocks, whenever we need both locks, we always first
- * get NotifyQueueLock and then NotifySLRULock.
+ * In order to avoid deadlocks, whenever we need multiple locks, we first get
+ * NotifyQueueTailLock, then NotifyQueueLock, and lastly NotifySLRULock.
  *
  * Each backend uses the backend[] array entry with index equal to its
  * BackendId (which can range from 1 to MaxBackends). We rely on this to make
@@ -2177,6 +2180,10 @@ asyncQueueAdvanceTail(void)
 	int			newtailpage;
 	int			boundary;
 
+	/* Restrict task to one backend per cluster; see SimpleLruTruncate(). */
+	LWLockAcquire(NotifyQueueTailLock, LW_EXCLUSIVE);
+
+	/* Compute the new tail. */
 	LWLockAcquire(NotifyQueueLock, LW_EXCLUSIVE);
 	min = QUEUE_HEAD;
 	for (BackendId i = QUEUE_FIRST_LISTENER; i > 0; i = QUEUE_NEXT_LISTENER(i))
@@ -2185,7 +2192,6 @@ asyncQueueAdvanceTail(void)
 		min = QUEUE_POS_MIN(min, QUEUE_BACKEND_POS(i));
 	}
 	oldtailpage = QUEUE_POS_PAGE(QUEUE_TAIL);
-	QUEUE_TAIL = min;
 	LWLockRelease(NotifyQueueLock);
 
 	/*
@@ -2205,6 +2211,17 @@ asyncQueueAdvanceTail(void)
 		 */
 		SimpleLruTruncate(NotifyCtl, newtailpage);
 	}
+
+	/*
+	 * Advertise the new tail. This changes asyncQueueIsFull()'s verdict for
+	 * the segment immediately prior to the new tail, allowing fresh data into
+	 * that segment.
+	 */
+	LWLockAcquire(NotifyQueueLock, LW_EXCLUSIVE);
+	QUEUE_TAIL = min;
+	LWLockRelease(NotifyQueueLock);
+
+	LWLockRelease(NotifyQueueTailLock);
 }
 
 /*
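
The reordering in asyncQueueAdvanceTail() can be hard to see across three hunks, so here is a minimal standalone sketch of the same shape: compute the new tail, truncate, and only then advertise the tail, all under an outer lock. It is not PostgreSQL code. ToyQueue, toy_truncate() and the other toy_ names are hypothetical, pthread mutexes stand in for LWLocks, and page numbers are assumed to grow linearly with no wraparound.

```c
#include <assert.h>
#include <pthread.h>

#define TOY_MAX_LISTENERS 4
#define TOY_PAGES_PER_SEGMENT 32

/* Hypothetical toy queue; field names are illustrative only. */
typedef struct ToyQueue
{
	pthread_mutex_t tail_lock;	/* stands in for NotifyQueueTailLock */
	pthread_mutex_t queue_lock;	/* stands in for NotifyQueueLock */
	int			head;			/* next page to be written */
	int			tail;			/* advertised oldest page still needed */
	int			listener_pos[TOY_MAX_LISTENERS];	/* -1 means unused slot */
	int			oldest_segment;	/* first segment not yet truncated */
} ToyQueue;

void
toy_queue_init(ToyQueue *q)
{
	int			i;

	pthread_mutex_init(&q->tail_lock, NULL);
	pthread_mutex_init(&q->queue_lock, NULL);
	q->head = 0;
	q->tail = 0;
	q->oldest_segment = 0;
	for (i = 0; i < TOY_MAX_LISTENERS; i++)
		q->listener_pos[i] = -1;
}

/* Stand-in for SimpleLruTruncate(): forget segments before cutoff_page's. */
void
toy_truncate(ToyQueue *q, int cutoff_page)
{
	int			segment;

	assert(cutoff_page >= 0);
	segment = cutoff_page / TOY_PAGES_PER_SEGMENT;
	if (segment > q->oldest_segment)
		q->oldest_segment = segment;
}

void
toy_advance_tail(ToyQueue *q)
{
	int			min;
	int			i;

	/* Lock order: tail lock first, then queue lock. */
	pthread_mutex_lock(&q->tail_lock);

	/* Step 1: compute the new tail, but do not publish it yet. */
	pthread_mutex_lock(&q->queue_lock);
	min = q->head;
	for (i = 0; i < TOY_MAX_LISTENERS; i++)
	{
		if (q->listener_pos[i] >= 0 && q->listener_pos[i] < min)
			min = q->listener_pos[i];
	}
	pthread_mutex_unlock(&q->queue_lock);

	/* Step 2: truncate while writers still see the old tail. */
	toy_truncate(q, min);

	/* Step 3: publish the new tail only after truncation has finished. */
	pthread_mutex_lock(&q->queue_lock);
	q->tail = min;
	pthread_mutex_unlock(&q->queue_lock);

	pthread_mutex_unlock(&q->tail_lock);
}
```

Holding the outer mutex across all three steps is what keeps a second caller from computing its own cutoff while the first is still truncating. Dropping the inner mutex during step 2 lets ordinary readers and writers proceed against the old tail in the meantime.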

src/backend/commands/vacuum.c

Lines changed: 13 additions & 0 deletions
@@ -1361,6 +1361,14 @@ vac_update_datfrozenxid(void)
 	bool		bogus = false;
 	bool		dirty = false;
 
+	/*
+	 * Restrict this task to one backend per database. This avoids race
+	 * conditions that would move datfrozenxid or datminmxid backward. It
+	 * avoids calling vac_truncate_clog() with a datfrozenxid preceding a
+	 * datfrozenxid passed to an earlier vac_truncate_clog() call.
+	 */
+	LockDatabaseFrozenIds(ExclusiveLock);
+
 	/*
 	 * Initialize the "min" calculation with
 	 * GetOldestNonRemovableTransactionId(), which is a reasonable
@@ -1551,6 +1559,9 @@ vac_truncate_clog(TransactionId frozenXID,
 	bool		bogus = false;
 	bool		frozenAlreadyWrapped = false;
 
+	/* Restrict task to one backend per cluster; see SimpleLruTruncate(). */
+	LWLockAcquire(WrapLimitsVacuumLock, LW_EXCLUSIVE);
+
 	/* init oldest datoids to sync with my frozenXID/minMulti values */
 	oldestxid_datoid = MyDatabaseId;
 	minmulti_datoid = MyDatabaseId;
@@ -1660,6 +1671,8 @@ vac_truncate_clog(TransactionId frozenXID,
 	 */
 	SetTransactionIdLimit(frozenXID, oldestxid_datoid);
 	SetMultiXactIdLimit(minMulti, minmulti_datoid, false);
+
+	LWLockRelease(WrapLimitsVacuumLock);
 }
 
 
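
The comment added to vac_update_datfrozenxid() says the per-database lock keeps datfrozenxid from moving backward. A rough standalone illustration of that idea follows, under loud assumptions: ToyDatabase and every toy_ function are hypothetical, a pthread mutex stands in for the heavyweight "frozenid" lock, and toy_xid_precedes() is a simplified circular comparison that ignores special transaction IDs.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NRELS 3

/* Hypothetical per-database state; not a PostgreSQL structure. */
typedef struct ToyDatabase
{
	pthread_mutex_t frozen_ids_lock;	/* stands in for the "frozenid" lock */
	uint32_t	rel_frozen_xid[TOY_NRELS];	/* per-relation frozen XIDs */
	uint32_t	dat_frozen_xid;	/* database-wide value being maintained */
} ToyDatabase;

void
toy_database_init(ToyDatabase *db, const uint32_t *rels, uint32_t dat)
{
	int			i;

	assert(db != NULL && rels != NULL);
	pthread_mutex_init(&db->frozen_ids_lock, NULL);
	for (i = 0; i < TOY_NRELS; i++)
		db->rel_frozen_xid[i] = rels[i];
	db->dat_frozen_xid = dat;
}

/* Simplified circular "a is older than b"; special XIDs are ignored. */
bool
toy_xid_precedes(uint32_t a, uint32_t b)
{
	return (int32_t) (a - b) < 0;
}

/* Returns true if the database-wide value advanced. */
bool
toy_update_dat_frozen_xid(ToyDatabase *db)
{
	uint32_t	oldest;
	bool		advanced = false;
	int			i;

	/* One caller at a time: the scan and the update form one unit. */
	pthread_mutex_lock(&db->frozen_ids_lock);

	oldest = db->rel_frozen_xid[0];
	for (i = 1; i < TOY_NRELS; i++)
	{
		if (toy_xid_precedes(db->rel_frozen_xid[i], oldest))
			oldest = db->rel_frozen_xid[i];
	}

	/* Only ever move forward. */
	if (toy_xid_precedes(db->dat_frozen_xid, oldest))
	{
		db->dat_frozen_xid = oldest;
		advanced = true;
	}

	pthread_mutex_unlock(&db->frozen_ids_lock);
	return advanced;
}
```

Without the mutex, two callers could scan at different times and the one holding the older minimum could store it last. Serializing the scan together with the store removes that interleaving.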

src/backend/storage/lmgr/lmgr.c

Lines changed: 20 additions & 0 deletions
@@ -460,6 +460,21 @@ UnlockRelationForExtension(Relation relation, LOCKMODE lockmode)
 	LockRelease(&tag, lockmode, false);
 }
 
+/*
+ *		LockDatabaseFrozenIds
+ *
+ * This allows one backend per database to execute vac_update_datfrozenxid().
+ */
+void
+LockDatabaseFrozenIds(LOCKMODE lockmode)
+{
+	LOCKTAG		tag;
+
+	SET_LOCKTAG_DATABASE_FROZEN_IDS(tag, MyDatabaseId);
+
+	(void) LockAcquire(&tag, lockmode, false, false);
+}
+
 /*
  *		LockPage
  *
@@ -1098,6 +1113,11 @@ DescribeLockTag(StringInfo buf, const LOCKTAG *tag)
 							 tag->locktag_field2,
 							 tag->locktag_field1);
 			break;
+		case LOCKTAG_DATABASE_FROZEN_IDS:
+			appendStringInfo(buf,
+							 _("pg_database.datfrozenxid of database %u"),
+							 tag->locktag_field1);
+			break;
 		case LOCKTAG_PAGE:
 			appendStringInfo(buf,
 							 _("page %u of relation %u of database %u"),

src/backend/storage/lmgr/lwlocknames.txt

Lines changed: 3 additions & 0 deletions
@@ -50,3 +50,6 @@ MultiXactTruncationLock	41
 OldSnapshotTimeMapLock	42
 LogicalRepWorkerLock	43
 XactTruncationLock	44
+# 45 was XactTruncationLock until removal of BackendRandomLock
+WrapLimitsVacuumLock	46
+NotifyQueueTailLock	47

src/backend/utils/adt/lockfuncs.c

Lines changed: 12 additions & 0 deletions
@@ -29,6 +29,7 @@
 const char *const LockTagTypeNames[] = {
 	"relation",
 	"extend",
+	"frozenid",
 	"page",
 	"tuple",
 	"transactionid",
@@ -254,6 +255,17 @@ pg_lock_status(PG_FUNCTION_ARGS)
 				nulls[8] = true;
 				nulls[9] = true;
 				break;
+			case LOCKTAG_DATABASE_FROZEN_IDS:
+				values[1] = ObjectIdGetDatum(instance->locktag.locktag_field1);
+				nulls[2] = true;
+				nulls[3] = true;
+				nulls[4] = true;
+				nulls[5] = true;
+				nulls[6] = true;
+				nulls[7] = true;
+				nulls[8] = true;
+				nulls[9] = true;
+				break;
 			case LOCKTAG_PAGE:
 				values[1] = ObjectIdGetDatum(instance->locktag.locktag_field1);
 				values[2] = ObjectIdGetDatum(instance->locktag.locktag_field2);

src/include/storage/lmgr.h

Lines changed: 3 additions & 0 deletions
@@ -59,6 +59,9 @@ extern bool ConditionalLockRelationForExtension(Relation relation,
 												LOCKMODE lockmode);
 extern int	RelationExtensionLockWaiterCount(Relation relation);
 
+/* Lock to recompute pg_database.datfrozenxid in the current database */
+extern void LockDatabaseFrozenIds(LOCKMODE lockmode);
+
 /* Lock a page (currently only used within indexes) */
 extern void LockPage(Relation relation, BlockNumber blkno, LOCKMODE lockmode);
 extern bool ConditionalLockPage(Relation relation, BlockNumber blkno, LOCKMODE lockmode);

src/include/storage/lock.h

Lines changed: 10 additions & 0 deletions
@@ -138,6 +138,7 @@ typedef enum LockTagType
 {
 	LOCKTAG_RELATION,			/* whole relation */
 	LOCKTAG_RELATION_EXTEND,	/* the right to extend a relation */
+	LOCKTAG_DATABASE_FROZEN_IDS,	/* pg_database.datfrozenxid */
 	LOCKTAG_PAGE,				/* one page of a relation */
 	LOCKTAG_TUPLE,				/* one physical tuple */
 	LOCKTAG_TRANSACTION,		/* transaction (for waiting for xact done) */
@@ -194,6 +195,15 @@ typedef struct LOCKTAG
 	 (locktag).locktag_type = LOCKTAG_RELATION_EXTEND, \
 	 (locktag).locktag_lockmethodid = DEFAULT_LOCKMETHOD)
 
+/* ID info for frozen IDs is DB OID */
+#define SET_LOCKTAG_DATABASE_FROZEN_IDS(locktag,dboid) \
+	((locktag).locktag_field1 = (dboid), \
+	 (locktag).locktag_field2 = 0, \
+	 (locktag).locktag_field3 = 0, \
+	 (locktag).locktag_field4 = 0, \
+	 (locktag).locktag_type = LOCKTAG_DATABASE_FROZEN_IDS, \
+	 (locktag).locktag_lockmethodid = DEFAULT_LOCKMETHOD)
+
 /* ID info for a page is RELATION info + BlockNumber */
 #define SET_LOCKTAG_PAGE(locktag,dboid,reloid,blocknum) \
 	((locktag).locktag_field1 = (dboid), \
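
The new lock tag carries only a database OID, with the remaining fields zeroed, so two backends in the same database build identical tags and therefore contend for the same lock. The sketch below mirrors that idea in a standalone form. ToyLockTag and the toy_/TOY_ names are hypothetical stand-ins, not the real LOCKTAG definitions, and toy_describe_locktag() only imitates the message text that this commit adds to DescribeLockTag().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical subset of lock tag types, for illustration only. */
typedef enum ToyLockTagType
{
	TOY_LOCKTAG_RELATION,
	TOY_LOCKTAG_RELATION_EXTEND,
	TOY_LOCKTAG_DATABASE_FROZEN_IDS
} ToyLockTagType;

/* Toy tag: four ID fields plus a type discriminator. */
typedef struct ToyLockTag
{
	uint32_t	field1;
	uint32_t	field2;
	uint32_t	field3;
	uint16_t	field4;
	uint8_t		type;
} ToyLockTag;

/* Same shape as the macro in the diff: the DB OID is the only ID info. */
#define TOY_SET_LOCKTAG_DATABASE_FROZEN_IDS(tag, dboid) \
	((tag).field1 = (dboid), \
	 (tag).field2 = 0, \
	 (tag).field3 = 0, \
	 (tag).field4 = 0, \
	 (tag).type = TOY_LOCKTAG_DATABASE_FROZEN_IDS)

/* Field-by-field comparison, so struct padding never matters. */
bool
toy_locktag_equal(const ToyLockTag *a, const ToyLockTag *b)
{
	return a->field1 == b->field1 &&
		a->field2 == b->field2 &&
		a->field3 == b->field3 &&
		a->field4 == b->field4 &&
		a->type == b->type;
}

/* Writes a description into buf; returns snprintf()'s result. */
int
toy_describe_locktag(char *buf, size_t buflen, const ToyLockTag *tag)
{
	assert(buf != NULL && tag != NULL);

	switch (tag->type)
	{
		case TOY_LOCKTAG_DATABASE_FROZEN_IDS:
			return snprintf(buf, buflen,
							"pg_database.datfrozenxid of database %u",
							(unsigned) tag->field1);
		default:
			return snprintf(buf, buflen,
							"unrecognized locktag type %d",
							(int) tag->type);
	}
}
```

Tags built for the same database OID compare equal, and tags for different OIDs do not, which is the property that gives one lock per database.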
