Commit 0a51e70

Don't take ProcArrayLock while exiting a transaction that has no XID; there is
no need for serialization against snapshot-taking because the xact doesn't
affect anyone else's snapshot anyway. Per discussion. Also, move various
info about the interlocking of transactions and snapshots out of code comments
and into a hopefully-more-cohesive discussion in access/transam/README.
Also, remove a couple of now-obsolete comments about having to force some WAL
to be written to persuade RecordTransactionCommit to do its thing.

1 parent 85e79a4 commit 0a51e70

File tree

6 files changed, +216 -144 lines changed

‎src/backend/access/heap/heapam.c

Lines changed: 3 additions & 4 deletions
@@ -8,7 +8,7 @@
  *
  *
  * IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/access/heap/heapam.c,v 1.238 2007/09/05 18:10:47 tgl Exp $
+ * $PostgreSQL: pgsql/src/backend/access/heap/heapam.c,v 1.239 2007/09/07 20:59:26 tgl Exp $
  *
  *
  * INTERFACE ROUTINES
@@ -1546,9 +1546,8 @@ UpdateXmaxHintBits(HeapTupleHeader tuple, Buffer buffer, TransactionId xid)
  * If use_wal is false, the new tuple is not logged in WAL, even for a
  * non-temp relation. Safe usage of this behavior requires that we arrange
  * that all new tuples go into new pages not containing any tuples from other
- * transactions, that the relation gets fsync'd before commit, and that the
- * transaction emits at least one WAL record to ensure RecordTransactionCommit
- * will decide to WAL-log the commit. (See also heap_sync() comments)
+ * transactions, and that the relation gets fsync'd before commit.
+ * (See also heap_sync() comments)
  *
  * use_fsm is passed directly to RelationGetBufferForTuple, which see for
  * more info.

‎src/backend/access/transam/README

Lines changed: 105 additions & 1 deletion
@@ -1,4 +1,4 @@
-$PostgreSQL: pgsql/src/backend/access/transam/README,v 1.7 2007/09/05 18:10:47 tgl Exp $
+$PostgreSQL: pgsql/src/backend/access/transam/README,v 1.8 2007/09/07 20:59:26 tgl Exp $
 
 The Transaction System
 ----------------------
@@ -221,6 +221,110 @@ InvalidSubTransactionId.) Note that subtransactions do not have their
 own VXIDs; they use the parent top transaction's VXID.
 
 
+Interlocking transaction begin, transaction end, and snapshots
+--------------------------------------------------------------
+
+We try hard to minimize the amount of overhead and lock contention involved
+in the frequent activities of beginning/ending a transaction and taking a
+snapshot. Unfortunately, we must have some interlocking for this, because
+we must ensure consistency about the commit order of transactions.
+For example, suppose an UPDATE in xact A is blocked by xact B's prior
+update of the same row, and xact B is doing commit while xact C gets a
+snapshot. Xact A can complete and commit as soon as B releases its locks.
+If xact C's GetSnapshotData sees xact B as still running, then it had
+better see xact A as still running as well, or it will be able to see two
+tuple versions - one deleted by xact B and one inserted by xact A. Another
+reason why this would be bad is that C would see (in the row inserted by A)
+earlier changes by B, and it would be inconsistent for C not to see any
+of B's changes elsewhere in the database.
+
+Formally, the correctness requirement is "if A sees B as committed,
+and B sees C as committed, then A must see C as committed".
+
+What we actually enforce is strict serialization of commits and rollbacks
+with snapshot-taking: we do not allow any transaction to exit the set of
+running transactions while a snapshot is being taken. (This rule is
+stronger than necessary for consistency, but is relatively simple to
+enforce, and it assists with some other issues as explained below.) The
+implementation of this is that GetSnapshotData takes the ProcArrayLock in
+shared mode (so that multiple backends can take snapshots in parallel),
+but xact.c must take the ProcArrayLock in exclusive mode while clearing
+MyProc->xid at transaction end (either commit or abort).
+
+GetSnapshotData must in fact acquire ProcArrayLock before it calls
+ReadNewTransactionId. Otherwise it would be possible for a transaction A
+postdating the xmax to commit, and then an existing transaction B that saw
+A as committed to commit, before GetSnapshotData is able to acquire
+ProcArrayLock and finish taking its snapshot. This would violate the
+consistency requirement, because A would be still running and B not
+according to this snapshot.
+
+In short, then, the rule is that no transaction may exit the set of
+currently-running transactions between the time we fetch xmax and the time
+we finish building our snapshot. However, this restriction only applies
+to transactions that have an XID --- read-only transactions can end without
+acquiring ProcArrayLock, since they don't affect anyone else's snapshot.
+
+Transaction start, per se, doesn't have any interlocking with these
+considerations, since we no longer assign an XID immediately at transaction
+start. But when we do decide to allocate an XID, we must require
+GetNewTransactionId to store the new XID into the shared ProcArray before
+releasing XidGenLock. This ensures that when GetSnapshotData calls
+ReadNewTransactionId (which also takes XidGenLock), all active XIDs before
+the returned value of nextXid are already present in the ProcArray and
+can't be missed by GetSnapshotData. Unfortunately, we can't have
+GetNewTransactionId take ProcArrayLock to do this, else it could deadlock
+against GetSnapshotData. Therefore, we simply let GetNewTransactionId
+store into MyProc->xid without any lock. We are thereby relying on
+fetch/store of an XID to be atomic, else other backends might see a
+partially-set XID. (NOTE: for multiprocessors that need explicit memory
+access fence instructions, this means that acquiring/releasing XidGenLock
+is just as necessary as acquiring/releasing ProcArrayLock for
+GetSnapshotData to ensure it sees up-to-date xid fields.) This also means
+that readers of the ProcArray xid fields must be careful to fetch a value
+only once, rather than assume they can read it multiple times and get the
+same answer each time.
+
+Another important activity that uses the shared ProcArray is GetOldestXmin,
+which must determine a lower bound for the oldest xmin of any active MVCC
+snapshot, system-wide. Each individual backend advertises the smallest
+xmin of its own snapshots in MyProc->xmin, or zero if it currently has no
+live snapshots (eg, if it's between transactions or hasn't yet set a
+snapshot for a new transaction). GetOldestXmin takes the MIN() of the
+valid xmin fields. It does this with only shared lock on ProcArrayLock,
+which means there is a potential race condition against other backends
+doing GetSnapshotData concurrently: we must be certain that a concurrent
+backend that is about to set its xmin does not compute an xmin less than
+what GetOldestXmin returns. We ensure that by including all the active
+XIDs into the MIN() calculation, along with the valid xmins. The rule that
+transactions can't exit without taking exclusive ProcArrayLock ensures that
+concurrent holders of shared ProcArrayLock will compute the same minimum of
+currently-active XIDs: no xact, in particular not the oldest, can exit
+while we hold shared ProcArrayLock. So GetOldestXmin's view of the minimum
+active XID will be the same as that of any concurrent GetSnapshotData, and
+so it can't produce an overestimate. If there is no active transaction at
+all, GetOldestXmin returns the result of ReadNewTransactionId. Note that
+two concurrent executions of GetOldestXmin might not see the same result
+from ReadNewTransactionId --- but if there is a difference, the intervening
+execution(s) of GetNewTransactionId must have stored their XIDs into the
+ProcArray, so the later execution of GetOldestXmin will see them and
+compute the same global xmin anyway.
+
+GetSnapshotData also performs an oldest-xmin calculation (which had better
+match GetOldestXmin's) and stores that into RecentGlobalXmin, which is used
+for some tuple age cutoff checks where a fresh call of GetOldestXmin seems
+too expensive. Note that while it is certain that two concurrent
+executions of GetSnapshotData will compute the same xmin for their own
+snapshots, as argued above, it is not certain that they will arrive at the
+same estimate of RecentGlobalXmin. This is because we allow XID-less
+transactions to clear their MyProc->xmin asynchronously (without taking
+ProcArrayLock), so one execution might see what had been the oldest xmin,
+and another not. This is OK since RecentGlobalXmin need only be a valid
+lower bound. As noted above, we are already assuming that fetch/store
+of the xid fields is atomic, so assuming it for xmin as well is no extra
+risk.
+
+
 pg_clog and pg_subtrans
 -----------------------
 

‎src/backend/access/transam/xact.c

Lines changed: 86 additions & 65 deletions
@@ -10,7 +10,7 @@
  *
  *
  * IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/access/transam/xact.c,v 1.248 2007/09/05 18:10:47 tgl Exp $
+ * $PostgreSQL: pgsql/src/backend/access/transam/xact.c,v 1.249 2007/09/07 20:59:26 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -747,6 +747,8 @@ AtSubStart_ResourceOwner(void)
 
 /*
  * RecordTransactionCommit
+ *
+ * This is exported only to support an ugly hack in VACUUM FULL.
  */
 void
 RecordTransactionCommit(void)
@@ -1552,46 +1554,53 @@ CommitTransaction(void)
      */
     RecordTransactionCommit();
 
-    /*----------
+    PG_TRACE1(transaction__commit, MyProc->lxid);
+
+    /*
      * Let others know about no transaction in progress by me. Note that
      * this must be done _before_ releasing locks we hold and _after_
      * RecordTransactionCommit.
      *
-     * LWLockAcquire(ProcArrayLock) is required; consider this example:
-     *      UPDATE with xid 0 is blocked by xid 1's UPDATE.
-     *      xid 1 is doing commit while xid 2 gets snapshot.
-     * If xid 2's GetSnapshotData sees xid 1 as running then it must see
-     * xid 0 as running as well, or it will be able to see two tuple versions
-     * - one deleted by xid 1 and one inserted by xid 0. See notes in
-     * GetSnapshotData.
-     *
      * Note: MyProc may be null during bootstrap.
-     *----------
      */
     if (MyProc != NULL)
     {
-        /*
-         * Lock ProcArrayLock because that's what GetSnapshotData uses.
-         * You might assume that we can skip this step if we had no
-         * transaction id assigned, because the failure case outlined
-         * in GetSnapshotData cannot happen in that case. This is true,
-         * but we *still* need the lock guarantee that two concurrent
-         * computations of the *oldest* xmin will get the same result.
-         */
-        LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
-        MyProc->xid = InvalidTransactionId;
-        MyProc->lxid = InvalidLocalTransactionId;
-        MyProc->xmin = InvalidTransactionId;
-        MyProc->inVacuum = false;    /* must be cleared with xid/xmin */
+        if (TransactionIdIsValid(MyProc->xid))
+        {
+            /*
+             * We must lock ProcArrayLock while clearing MyProc->xid, so
+             * that we do not exit the set of "running" transactions while
+             * someone else is taking a snapshot. See discussion in
+             * src/backend/access/transam/README.
+             */
+            LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
 
-        /* Clear the subtransaction-XID cache too while holding the lock */
-        MyProc->subxids.nxids = 0;
-        MyProc->subxids.overflowed = false;
+            MyProc->xid = InvalidTransactionId;
+            MyProc->lxid = InvalidLocalTransactionId;
+            MyProc->xmin = InvalidTransactionId;
+            MyProc->inVacuum = false;    /* must be cleared with xid/xmin */
 
-        LWLockRelease(ProcArrayLock);
-    }
+            /* Clear the subtransaction-XID cache too while holding the lock */
+            MyProc->subxids.nxids = 0;
+            MyProc->subxids.overflowed = false;
 
-    PG_TRACE1(transaction__commit, s->transactionId);
+            LWLockRelease(ProcArrayLock);
+        }
+        else
+        {
+            /*
+             * If we have no XID, we don't need to lock, since we won't
+             * affect anyone else's calculation of a snapshot. We might
+             * change their estimate of global xmin, but that's OK.
+             */
+            MyProc->lxid = InvalidLocalTransactionId;
+            MyProc->xmin = InvalidTransactionId;
+            MyProc->inVacuum = false;    /* must be cleared with xid/xmin */
+
+            Assert(MyProc->subxids.nxids == 0);
+            Assert(MyProc->subxids.overflowed == false);
+        }
+    }
 
     /*
      * This is all post-commit cleanup. Note that if an error is raised here,
@@ -1815,28 +1824,21 @@ PrepareTransaction(void)
      * Let others know about no transaction in progress by me. This has to be
      * done *after* the prepared transaction has been marked valid, else
      * someone may think it is unlocked and recyclable.
+     *
+     * We can skip locking ProcArrayLock here, because this action does not
+     * actually change anyone's view of the set of running XIDs: our entry
+     * is duplicate with the gxact that has already been inserted into the
+     * ProcArray.
      */
-
-    /*
-     * Lock ProcArrayLock because that's what GetSnapshotData uses.
-     * You might assume that we can skip this step if we have no
-     * transaction id assigned, because the failure case outlined
-     * in GetSnapshotData cannot happen in that case. This is true,
-     * but we *still* need the lock guarantee that two concurrent
-     * computations of the *oldest* xmin will get the same result.
-     */
-    LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
     MyProc->xid = InvalidTransactionId;
     MyProc->lxid = InvalidLocalTransactionId;
     MyProc->xmin = InvalidTransactionId;
     MyProc->inVacuum = false;    /* must be cleared with xid/xmin */
 
-    /* Clear the subtransaction-XID cache too while holding the lock */
+    /* Clear the subtransaction-XID cache too */
     MyProc->subxids.nxids = 0;
     MyProc->subxids.overflowed = false;
 
-    LWLockRelease(ProcArrayLock);
-
     /*
      * This is all post-transaction cleanup. Note that if an error is raised
      * here, it's too late to abort the transaction. This should be just
@@ -1987,36 +1989,55 @@ AbortTransaction(void)
      */
     RecordTransactionAbort(false);
 
+    PG_TRACE1(transaction__abort, MyProc->lxid);
+
     /*
      * Let others know about no transaction in progress by me. Note that this
      * must be done _before_ releasing locks we hold and _after_
      * RecordTransactionAbort.
+     *
+     * Note: MyProc may be null during bootstrap.
      */
     if (MyProc != NULL)
     {
-        /*
-         * Lock ProcArrayLock because that's what GetSnapshotData uses.
-         * You might assume that we can skip this step if we have no
-         * transaction id assigned, because the failure case outlined
-         * in GetSnapshotData cannot happen in that case. This is true,
-         * but we *still* need the lock guarantee that two concurrent
-         * computations of the *oldest* xmin will get the same result.
-         */
-        LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
-        MyProc->xid = InvalidTransactionId;
-        MyProc->lxid = InvalidLocalTransactionId;
-        MyProc->xmin = InvalidTransactionId;
-        MyProc->inVacuum = false;    /* must be cleared with xid/xmin */
-        MyProc->inCommit = false;    /* be sure this gets cleared */
-
-        /* Clear the subtransaction-XID cache too while holding the lock */
-        MyProc->subxids.nxids = 0;
-        MyProc->subxids.overflowed = false;
-
-        LWLockRelease(ProcArrayLock);
-    }
+        if (TransactionIdIsValid(MyProc->xid))
+        {
+            /*
+             * We must lock ProcArrayLock while clearing MyProc->xid, so
+             * that we do not exit the set of "running" transactions while
+             * someone else is taking a snapshot. See discussion in
+             * src/backend/access/transam/README.
+             */
+            LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
 
-    PG_TRACE1(transaction__abort, s->transactionId);
+            MyProc->xid = InvalidTransactionId;
+            MyProc->lxid = InvalidLocalTransactionId;
+            MyProc->xmin = InvalidTransactionId;
+            MyProc->inVacuum = false;    /* must be cleared with xid/xmin */
+            MyProc->inCommit = false;    /* be sure this gets cleared */
+
+            /* Clear the subtransaction-XID cache too while holding the lock */
+            MyProc->subxids.nxids = 0;
+            MyProc->subxids.overflowed = false;
+
+            LWLockRelease(ProcArrayLock);
+        }
+        else
+        {
+            /*
+             * If we have no XID, we don't need to lock, since we won't
+             * affect anyone else's calculation of a snapshot. We might
+             * change their estimate of global xmin, but that's OK.
+             */
+            MyProc->lxid = InvalidLocalTransactionId;
+            MyProc->xmin = InvalidTransactionId;
+            MyProc->inVacuum = false;    /* must be cleared with xid/xmin */
+            MyProc->inCommit = false;    /* be sure this gets cleared */
+
+            Assert(MyProc->subxids.nxids == 0);
+            Assert(MyProc->subxids.overflowed == false);
+        }
+    }
 
     /*
      * Post-abort cleanup. See notes in CommitTransaction() concerning

‎src/backend/commands/copy.c

Lines changed: 1 addition & 8 deletions
@@ -8,7 +8,7 @@
  *
  *
  * IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/commands/copy.c,v 1.285 2007/06/20 02:02:49 neilc Exp $
+ * $PostgreSQL: pgsql/src/backend/commands/copy.c,v 1.286 2007/09/07 20:59:26 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -1678,13 +1678,6 @@ CopyFrom(CopyState cstate)
      * rd_newRelfilenodeSubid can be cleared before the end of the transaction.
      * However this is OK since at worst we will fail to make the optimization.
      *
-     * When skipping WAL it's entirely possible that COPY itself will write no
-     * WAL records at all. This is of concern because RecordTransactionCommit
-     * might decide it doesn't need to log our eventual commit, which we
-     * certainly need it to do. However, we need no special action here for
-     * that, because if we have a new table or new relfilenode then there
-     * must have been a WAL-logged pg_class update earlier in the transaction.
-     *
      * Also, if the target file is new-in-transaction, we assume that checking
      * FSM for free space is a waste of time, even if we must use WAL because
      * of archiving. This could possibly be wrong, but it's unlikely.

‎src/backend/executor/execMain.c

Lines changed: 1 addition & 7 deletions
@@ -26,7 +26,7 @@
  *
  *
  * IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/executor/execMain.c,v 1.296 2007/08/15 21:39:50 tgl Exp $
+ * $PostgreSQL: pgsql/src/backend/executor/execMain.c,v 1.297 2007/09/07 20:59:26 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -2635,12 +2635,6 @@ OpenIntoRel(QueryDesc *queryDesc)
 
     /*
      * We can skip WAL-logging the insertions, unless PITR is in use.
-     *
-     * Note that for a non-temp INTO table, this is safe only because we know
-     * that the catalog changes above will have been WAL-logged, and so
-     * RecordTransactionCommit will think it needs to WAL-log the eventual
-     * transaction commit. Else the commit might be lost, even though all the
-     * data is safely fsync'd ...
      */
     estate->es_into_relation_use_wal = XLogArchivingActive();
     estate->es_into_relation_descriptor = intoRelationDesc;
