Commit 592a589

Prevent concurrent SimpleLruTruncate() for any given SLRU.

The SimpleLruTruncate() header comment states the new coding rule. To achieve this, add locktype "frozenid" and two LWLocks. This closes a rare opportunity for data loss, which manifested as "apparent wraparound" or "could not access status of transaction" errors. Data loss is more likely in pg_multixact, due to released branches' thin margin between multiStopLimit and multiWrapLimit. If a user's physical replication primary logged ": apparent wraparound" messages, the user should rebuild standbys of that primary regardless of symptoms. At less risk is a cluster having emitted "not accepting commands" errors or "must be vacuumed" warnings at some point. One can test a cluster for this data loss by running VACUUM FREEZE in every database. Back-patch to 9.5 (all supported versions).

Discussion: https://postgr.es/m/20190218073103.GA1434723@rfd.leadboat.com
1 parent b538e83, commit 592a589

11 files changed, +117 -13 lines changed

‎doc/src/sgml/catalogs.sgml

Lines changed: 3 additions & 1 deletion
@@ -10215,7 +10215,8 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
 and general database objects (identified by class OID and object OID,
 in the same way as in <structname>pg_description</structname> or
 <structname>pg_depend</structname>). Also, the right to extend a
-relation is represented as a separate lockable object.
+relation is represented as a separate lockable object, as is the right to
+update <structname>pg_database</structname>.<structfield>datfrozenxid</structfield>.
 Also, <quote>advisory</quote> locks can be taken on numbers that have
 user-defined meanings.
 </para>
@@ -10243,6 +10244,7 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
 Type of the lockable object:
 <literal>relation</literal>,
 <literal>extend</literal>,
+<literal>frozenid</literal>,
 <literal>page</literal>,
 <literal>tuple</literal>,
 <literal>transactionid</literal>,

‎doc/src/sgml/monitoring.sgml

Lines changed: 16 additions & 0 deletions
@@ -1738,6 +1738,12 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
 <entry><literal>extend</literal></entry>
 <entry>Waiting to extend a relation.</entry>
 </row>
+<row>
+<entry><literal>frozenid</literal></entry>
+<entry>Waiting to
+update <structname>pg_database</structname>.<structfield>datfrozenxid</structfield>
+and <structname>pg_database</structname>.<structfield>datminmxid</structfield>.</entry>
+</row>
 <row>
 <entry><literal>object</literal></entry>
 <entry>Waiting to acquire a lock on a non-relation database object.</entry>
@@ -1906,6 +1912,11 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
 <entry><literal>NotifyQueue</literal></entry>
 <entry>Waiting to read or update <command>NOTIFY</command> messages.</entry>
 </row>
+<row>
+<entry><literal>NotifyQueueTail</literal></entry>
+<entry>Waiting to update limit on <command>NOTIFY</command> message
+storage.</entry>
+</row>
 <row>
 <entry><literal>NotifySLRU</literal></entry>
 <entry>Waiting to access the <command>NOTIFY</command> message SLRU
@@ -2082,6 +2093,11 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser
 <entry><literal>WALWrite</literal></entry>
 <entry>Waiting for WAL buffers to be written to disk.</entry>
 </row>
+<row>
+<entry><literal>WrapLimitsVacuum</literal></entry>
+<entry>Waiting to update limits on transaction id and multixact
+consumption.</entry>
+</row>
 <row>
 <entry><literal>XactBuffer</literal></entry>
 <entry>Waiting for I/O on a transaction status SLRU buffer.</entry>

‎src/backend/access/transam/slru.c

Lines changed: 8 additions & 0 deletions
@@ -1208,6 +1208,14 @@ SimpleLruFlush(SlruCtl ctl, bool allow_redirtied)
 
 /*
  * Remove all segments before the one holding the passed page number
+ *
+ * All SLRUs prevent concurrent calls to this function, either with an LWLock
+ * or by calling it only as part of a checkpoint. Mutual exclusion must begin
+ * before computing cutoffPage. Mutual exclusion must end after any limit
+ * update that would permit other backends to write fresh data into the
+ * segment immediately preceding the one containing cutoffPage. Otherwise,
+ * when the SLRU is quite full, SimpleLruTruncate() might delete that segment
+ * after it has accrued freshly-written data.
  */
 void
 SimpleLruTruncate(SlruCtl ctl, int cutoffPage)
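
The coding rule in the header comment above can be illustrated with a toy model. The following is only a sketch under stated assumptions, not PostgreSQL code: `ToySlru`, `toy_slru_init`, `toy_truncate`, and `TOY_NSEGMENTS` are hypothetical names, the segment space is a tiny ring, and a C11 `atomic_flag` spinlock stands in for the LWLock. What it tries to show is the span of the mutual exclusion: it starts before the cutoff is computed and ends only after the limit that readmits writers has been updated.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NSEGMENTS 8			/* hypothetical: tiny circular segment space */

typedef struct ToySlru
{
	atomic_flag truncate_lock;	/* stands in for the per-SLRU LWLock */
	bool		present[TOY_NSEGMENTS]; /* which segment files exist */
	int			oldest;			/* oldest segment still needed */
	int			last_writable;	/* last segment writers may enter */
} ToySlru;

void
toy_slru_init(ToySlru *s)
{
	atomic_flag_clear(&s->truncate_lock);
	for (int i = 0; i < TOY_NSEGMENTS; i++)
		s->present[i] = true;
	s->oldest = 0;
	s->last_writable = TOY_NSEGMENTS - 1;
}

/* needed_from is assumed to be nonnegative. */
void
toy_truncate(ToySlru *s, int needed_from)
{
	/* Mutual exclusion begins before the cutoff is computed. */
	while (atomic_flag_test_and_set(&s->truncate_lock))
		;						/* spin; a real system would sleep instead */

	int			cutoff = needed_from % TOY_NSEGMENTS;

	/* Remove every segment from the old oldest up to, not including, cutoff. */
	for (int seg = s->oldest; seg != cutoff; seg = (seg + 1) % TOY_NSEGMENTS)
		s->present[seg] = false;
	s->oldest = cutoff;

	/* Only now readmit writers to the segment just before the cutoff. */
	s->last_writable = (cutoff + TOY_NSEGMENTS - 1) % TOY_NSEGMENTS;

	/* Mutual exclusion ends after the limit update. */
	atomic_flag_clear(&s->truncate_lock);
}
```

If the lock were released before the `last_writable` update, a second truncation working from a stale cutoff could remove a segment that writers had already been readmitted to, which is the hazard the comment describes.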

‎src/backend/access/transam/subtrans.c

Lines changed: 2 additions & 2 deletions
@@ -349,8 +349,8 @@ ExtendSUBTRANS(TransactionId newestXact)
 /*
  * Remove all SUBTRANS segments before the one holding the passed transaction ID
  *
- * This is normally called during checkpoint, with oldestXact being the
- * oldest TransactionXmin of any running transaction.
+ * oldestXact is the oldest TransactionXmin of any running transaction. This
+ * is called only during checkpoint.
  */
 void
 TruncateSUBTRANS(TransactionId oldestXact)

‎src/backend/commands/async.c

Lines changed: 27 additions & 10 deletions
@@ -244,19 +244,22 @@ typedef struct QueueBackendStatus
 /*
  * Shared memory state for LISTEN/NOTIFY (excluding its SLRU stuff)
  *
- * The AsyncQueueControl structure is protected by the NotifyQueueLock.
+ * The AsyncQueueControl structure is protected by the NotifyQueueLock and
+ * NotifyQueueTailLock.
  *
- * When holding the lock in SHARED mode, backends may only inspect their own
- * entries as well as the head and tail pointers. Consequently we can allow a
- * backend to update its own record while holding only SHARED lock (since no
- * other backend will inspect it).
+ * When holding NotifyQueueLock in SHARED mode, backends may only inspect
+ * their own entries as well as the head and tail pointers. Consequently we
+ * can allow a backend to update its own record while holding only SHARED lock
+ * (since no other backend will inspect it).
  *
- * When holding the lock in EXCLUSIVE mode, backends can inspect the entries
- * of other backends and also change the head and tail pointers.
+ * When holding NotifyQueueLock in EXCLUSIVE mode, backends can inspect the
+ * entries of other backends and also change the head pointer. When holding
+ * both NotifyQueueLock and NotifyQueueTailLock in EXCLUSIVE mode, backends
+ * can change the tail pointer.
  *
  * NotifySLRULock is used as the control lock for the pg_notify SLRU buffers.
- * In order to avoid deadlocks, whenever we need both locks, we always first
- * get NotifyQueueLock and then NotifySLRULock.
+ * In order to avoid deadlocks, whenever we need multiple locks, we first get
+ * NotifyQueueTailLock, then NotifyQueueLock, and lastly NotifySLRULock.
  *
  * Each backend uses the backend[] array entry with index equal to its
  * BackendId (which can range from 1 to MaxBackends). We rely on this to make
@@ -2177,6 +2180,10 @@ asyncQueueAdvanceTail(void)
 	int			newtailpage;
 	int			boundary;
 
+	/* Restrict task to one backend per cluster; see SimpleLruTruncate(). */
+	LWLockAcquire(NotifyQueueTailLock, LW_EXCLUSIVE);
+
+	/* Compute the new tail. */
 	LWLockAcquire(NotifyQueueLock, LW_EXCLUSIVE);
 	min = QUEUE_HEAD;
 	for (BackendId i = QUEUE_FIRST_LISTENER; i > 0; i = QUEUE_NEXT_LISTENER(i))
@@ -2185,7 +2192,6 @@ asyncQueueAdvanceTail(void)
 		min = QUEUE_POS_MIN(min, QUEUE_BACKEND_POS(i));
 	}
 	oldtailpage = QUEUE_POS_PAGE(QUEUE_TAIL);
-	QUEUE_TAIL = min;
 	LWLockRelease(NotifyQueueLock);
 
 	/*
@@ -2205,6 +2211,17 @@ asyncQueueAdvanceTail(void)
 		 */
 		SimpleLruTruncate(NotifyCtl, newtailpage);
 	}
+
+	/*
+	 * Advertise the new tail. This changes asyncQueueIsFull()'s verdict for
+	 * the segment immediately prior to the new tail, allowing fresh data into
+	 * that segment.
+	 */
+	LWLockAcquire(NotifyQueueLock, LW_EXCLUSIVE);
+	QUEUE_TAIL = min;
+	LWLockRelease(NotifyQueueLock);
+
+	LWLockRelease(NotifyQueueTailLock);
 }
 
 /*
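
The reordered steps in asyncQueueAdvanceTail() and the lock ordering stated in the async.c header comment can be sketched in a self-contained toy. Everything here is an assumption made for illustration: `ToyQueue`, `toy_acquire`, `toy_release`, and `toy_advance_tail` are hypothetical names, the "locks" are plain flags with an ordering check rather than real LWLocks, positions are plain integers, and the truncation is unconditional where the real function truncates only at certain page boundaries.

```c
#include <assert.h>

/* Lock ranks follow the documented order: tail lock, queue lock, SLRU lock. */
enum ToyLock
{
	TOY_TAIL_LOCK = 0,			/* stands in for NotifyQueueTailLock */
	TOY_QUEUE_LOCK = 1,			/* stands in for NotifyQueueLock */
	TOY_SLRU_LOCK = 2,			/* stands in for NotifySLRULock */
	TOY_NLOCKS
};

#define TOY_MAX_LISTENERS 4

typedef struct ToyQueue
{
	int			held[TOY_NLOCKS];	/* 1 while the toy lock is held */
	int			order_violations;	/* count of out-of-order acquisitions */
	int			head;
	int			tail;
	int			nlisteners;
	int			listener_pos[TOY_MAX_LISTENERS];
	int			truncated_before;	/* positions below this were removed */
} ToyQueue;

void
toy_acquire(ToyQueue *q, enum ToyLock lock)
{
	/* Taking a lock while holding it or a later-ranked one breaks the order. */
	for (int i = (int) lock; i < TOY_NLOCKS; i++)
		if (q->held[i])
			q->order_violations++;
	q->held[lock] = 1;
}

void
toy_release(ToyQueue *q, enum ToyLock lock)
{
	q->held[lock] = 0;
}

void
toy_advance_tail(ToyQueue *q)
{
	int			min;

	/* One tail advancer at a time, for the whole operation. */
	toy_acquire(q, TOY_TAIL_LOCK);

	/* Compute the new tail under the queue lock, but do not publish it. */
	toy_acquire(q, TOY_QUEUE_LOCK);
	min = q->head;
	for (int i = 0; i < q->nlisteners; i++)
		if (q->listener_pos[i] < min)
			min = q->listener_pos[i];
	toy_release(q, TOY_QUEUE_LOCK);

	/* Stand-in for SimpleLruTruncate(): drop everything before min. */
	q->truncated_before = min;

	/* Advertise the new tail only after truncation has finished. */
	toy_acquire(q, TOY_QUEUE_LOCK);
	q->tail = min;
	toy_release(q, TOY_QUEUE_LOCK);

	toy_release(q, TOY_TAIL_LOCK);
}
```

With a head at 10 and listeners at 7, 4, and 9, the toy publishes a tail of 4 only after the truncation step, and the ordering check records no violations.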

‎src/backend/commands/vacuum.c

Lines changed: 13 additions & 0 deletions
@@ -1344,6 +1344,14 @@ vac_update_datfrozenxid(void)
 	bool		bogus = false;
 	bool		dirty = false;
 
+	/*
+	 * Restrict this task to one backend per database. This avoids race
+	 * conditions that would move datfrozenxid or datminmxid backward. It
+	 * avoids calling vac_truncate_clog() with a datfrozenxid preceding a
+	 * datfrozenxid passed to an earlier vac_truncate_clog() call.
+	 */
+	LockDatabaseFrozenIds(ExclusiveLock);
+
 	/*
 	 * Initialize the "min" calculation with GetOldestXmin, which is a
 	 * reasonable approximation to the minimum relfrozenxid for not-yet-
@@ -1533,6 +1541,9 @@ vac_truncate_clog(TransactionId frozenXID,
 	bool		bogus = false;
 	bool		frozenAlreadyWrapped = false;
 
+	/* Restrict task to one backend per cluster; see SimpleLruTruncate(). */
+	LWLockAcquire(WrapLimitsVacuumLock, LW_EXCLUSIVE);
+
 	/* init oldest datoids to sync with my frozenXID/minMulti values */
 	oldestxid_datoid = MyDatabaseId;
 	minmulti_datoid = MyDatabaseId;
@@ -1642,6 +1653,8 @@ vac_truncate_clog(TransactionId frozenXID,
 	 */
 	SetTransactionIdLimit(frozenXID, oldestxid_datoid);
 	SetMultiXactIdLimit(minMulti, minmulti_datoid, false);
+
+	LWLockRelease(WrapLimitsVacuumLock);
 }
 
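
The comment added to vac_update_datfrozenxid() speaks of races that "would move datfrozenxid or datminmxid backward". Because 32-bit transaction IDs wrap around, "backward" has to be judged modulo 2^32. The sketch below is a simplified illustration with hypothetical names (`toy_xid_precedes`, `toy_advance_frozenxid`); it ignores the special handling of permanent XIDs and is not the logic of vac_update_datfrozenxid() itself. It relies on the usual two's-complement behavior when converting an out-of-range unsigned value to `int32_t`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Wraparound-aware "a is older than b" for 32-bit IDs: a precedes b when the
 * modulo-2^32 difference, read as a signed number, is negative.
 */
bool
toy_xid_precedes(uint32_t a, uint32_t b)
{
	return (int32_t) (a - b) < 0;
}

/*
 * Replace *stored with candidate only if that moves it forward.  In the
 * commit, serializing the callers is what prevents a slow backend from
 * writing an older value over a newer one; this helper shows only the
 * direction check.
 */
bool
toy_advance_frozenxid(uint32_t *stored, uint32_t candidate)
{
	if (toy_xid_precedes(*stored, candidate))
	{
		*stored = candidate;
		return true;
	}
	return false;
}
```

Under this comparison a stored value of 4294967290 counts as older than 5, so advancing to 5 succeeds, while a later attempt to go back to 4294967290 is refused.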

‎src/backend/storage/lmgr/lmgr.c

Lines changed: 20 additions & 0 deletions
@@ -460,6 +460,21 @@ UnlockRelationForExtension(Relation relation, LOCKMODE lockmode)
 	LockRelease(&tag, lockmode, false);
 }
 
+/*
+ *		LockDatabaseFrozenIds
+ *
+ * This allows one backend per database to execute vac_update_datfrozenxid().
+ */
+void
+LockDatabaseFrozenIds(LOCKMODE lockmode)
+{
+	LOCKTAG		tag;
+
+	SET_LOCKTAG_DATABASE_FROZEN_IDS(tag, MyDatabaseId);
+
+	(void) LockAcquire(&tag, lockmode, false, false);
+}
+
 /*
  *		LockPage
  *
@@ -1098,6 +1113,11 @@ DescribeLockTag(StringInfo buf, const LOCKTAG *tag)
 							 tag->locktag_field2,
 							 tag->locktag_field1);
 			break;
+		case LOCKTAG_DATABASE_FROZEN_IDS:
+			appendStringInfo(buf,
+							 _("pg_database.datfrozenxid of database %u"),
+							 tag->locktag_field1);
+			break;
 		case LOCKTAG_PAGE:
 			appendStringInfo(buf,
 							 _("page %u of relation %u of database %u"),

‎src/backend/storage/lmgr/lwlocknames.txt

Lines changed: 3 additions & 0 deletions
@@ -50,3 +50,6 @@ MultiXactTruncationLock				41
 OldSnapshotTimeMapLock				42
 LogicalRepWorkerLock				43
 XactTruncationLock					44
+# 45 was XactTruncationLock until removal of BackendRandomLock
+WrapLimitsVacuumLock				46
+NotifyQueueTailLock					47

‎src/backend/utils/adt/lockfuncs.c

Lines changed: 12 additions & 0 deletions
@@ -29,6 +29,7 @@
 const char *const LockTagTypeNames[] = {
 	"relation",
 	"extend",
+	"frozenid",
 	"page",
 	"tuple",
 	"transactionid",
@@ -254,6 +255,17 @@ pg_lock_status(PG_FUNCTION_ARGS)
 				nulls[8] = true;
 				nulls[9] = true;
 				break;
+			case LOCKTAG_DATABASE_FROZEN_IDS:
+				values[1] = ObjectIdGetDatum(instance->locktag.locktag_field1);
+				nulls[2] = true;
+				nulls[3] = true;
+				nulls[4] = true;
+				nulls[5] = true;
+				nulls[6] = true;
+				nulls[7] = true;
+				nulls[8] = true;
+				nulls[9] = true;
+				break;
 			case LOCKTAG_PAGE:
 				values[1] = ObjectIdGetDatum(instance->locktag.locktag_field1);
 				values[2] = ObjectIdGetDatum(instance->locktag.locktag_field2);
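
The commit inserts "frozenid" into LockTagTypeNames[] at the same position where LOCKTAG_DATABASE_FROZEN_IDS enters the LockTagType enum in lock.h, since the array is indexed by the enum value. A minimal standalone sketch of that pairing follows; the `Toy`-prefixed names are hypothetical, the enum is cut down to four members, and the C11 `_Static_assert` is an addition for illustration, not something this diff shows.

```c
#include <assert.h>
#include <string.h>

/* Cut-down stand-in for the LockTagType enum. */
typedef enum ToyLockTagType
{
	TOY_LOCKTAG_RELATION,
	TOY_LOCKTAG_RELATION_EXTEND,
	TOY_LOCKTAG_DATABASE_FROZEN_IDS,	/* new member, inserted mid-enum */
	TOY_LOCKTAG_PAGE,
	TOY_LOCKTAG_LAST = TOY_LOCKTAG_PAGE
} ToyLockTagType;

/* Names must appear in exactly the same order as the enum members. */
const char *const ToyLockTagTypeNames[] = {
	"relation",
	"extend",
	"frozenid",
	"page"
};

/* Catch a length mismatch at compile time. */
_Static_assert(sizeof(ToyLockTagTypeNames) / sizeof(ToyLockTagTypeNames[0])
			   == TOY_LOCKTAG_LAST + 1,
			   "name array length must match the enum");

const char *
toy_locktag_name(ToyLockTagType type)
{
	return ToyLockTagTypeNames[type];
}
```

The length check cannot detect two names swapped within the array, so the ordering still has to be kept in step by hand, as the two hunks in this commit do.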

‎src/include/storage/lmgr.h

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -59,6 +59,9 @@ extern bool ConditionalLockRelationForExtension(Relation relation,
5959
LOCKMODElockmode);
6060
externintRelationExtensionLockWaiterCount(Relationrelation);
6161

62+
/* Lock to recompute pg_database.datfrozenxid in the current database */
63+
externvoidLockDatabaseFrozenIds(LOCKMODElockmode);
64+
6265
/* Lock a page (currently only used within indexes) */
6366
externvoidLockPage(Relationrelation,BlockNumberblkno,LOCKMODElockmode);
6467
externboolConditionalLockPage(Relationrelation,BlockNumberblkno,LOCKMODElockmode);

‎src/include/storage/lock.h

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -138,6 +138,7 @@ typedef enum LockTagType
138138
{
139139
LOCKTAG_RELATION,/* whole relation */
140140
LOCKTAG_RELATION_EXTEND,/* the right to extend a relation */
141+
LOCKTAG_DATABASE_FROZEN_IDS,/* pg_database.datfrozenxid */
141142
LOCKTAG_PAGE,/* one page of a relation */
142143
LOCKTAG_TUPLE,/* one physical tuple */
143144
LOCKTAG_TRANSACTION,/* transaction (for waiting for xact done) */
@@ -194,6 +195,15 @@ typedef struct LOCKTAG
194195
(locktag).locktag_type = LOCKTAG_RELATION_EXTEND, \
195196
(locktag).locktag_lockmethodid = DEFAULT_LOCKMETHOD)
196197

198+
/* ID info for frozen IDs is DB OID */
199+
#defineSET_LOCKTAG_DATABASE_FROZEN_IDS(locktag,dboid) \
200+
((locktag).locktag_field1 = (dboid), \
201+
(locktag).locktag_field2 = 0, \
202+
(locktag).locktag_field3 = 0, \
203+
(locktag).locktag_field4 = 0, \
204+
(locktag).locktag_type = LOCKTAG_DATABASE_FROZEN_IDS, \
205+
(locktag).locktag_lockmethodid = DEFAULT_LOCKMETHOD)
206+
197207
/* ID info for a page is RELATION info + BlockNumber */
198208
#defineSET_LOCKTAG_PAGE(locktag,dboid,reloid,blocknum) \
199209
((locktag).locktag_field1 = (dboid), \
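
The new SET_LOCKTAG_DATABASE_FROZEN_IDS macro fills the tag with the database OID and zeroes the remaining ID fields, so every backend in one database builds an identical tag and therefore contends for the same lock. The following standalone sketch mimics that shape. The struct layout, the `Toy`/`TOY_` names, and the numeric constants are assumptions for illustration and are not copied from lock.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for LOCKTAG; field widths are an assumption. */
typedef struct ToyLockTag
{
	uint32_t	field1;			/* database OID for this tag type */
	uint32_t	field2;
	uint32_t	field3;
	uint16_t	field4;
	uint8_t		type;
	uint8_t		lockmethodid;
} ToyLockTag;

#define TOY_LOCKTAG_DATABASE_FROZEN_IDS 2	/* hypothetical value */
#define TOY_DEFAULT_LOCKMETHOD 1			/* hypothetical value */

/* Same shape as the macro in the diff: one comma expression, all fields set. */
#define TOY_SET_LOCKTAG_DATABASE_FROZEN_IDS(locktag,dboid) \
	((locktag).field1 = (dboid), \
	 (locktag).field2 = 0, \
	 (locktag).field3 = 0, \
	 (locktag).field4 = 0, \
	 (locktag).type = TOY_LOCKTAG_DATABASE_FROZEN_IDS, \
	 (locktag).lockmethodid = TOY_DEFAULT_LOCKMETHOD)

/* Two tags name the same lockable object when every field matches. */
bool
toy_locktag_equal(const ToyLockTag *a, const ToyLockTag *b)
{
	return a->field1 == b->field1 &&
		a->field2 == b->field2 &&
		a->field3 == b->field3 &&
		a->field4 == b->field4 &&
		a->type == b->type &&
		a->lockmethodid == b->lockmethodid;
}
```

Because unused fields are explicitly zeroed, two tags built for the same database compare equal regardless of what the structs held before, while tags for different databases differ in `field1`.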
