Commit 82a83ec

Fix race condition in preparing a transaction for two-phase commit.

To lock a prepared transaction's shared memory entry, we used to mark it with the XID of the backend. When the XID was no longer active according to the proc array, the entry was implicitly considered as not locked anymore. However, when preparing a transaction, the backend's proc array entry was cleared before transferring the locks (and some other state) to the prepared transaction's dummy PGPROC entry, so there was a window where another backend could finish the transaction before it was in fact fully prepared.

To fix, rewrite the locking mechanism of global transaction entries. Instead of an XID, just have a simple locked-or-not flag in each entry (we store the locking backend's backend id rather than a simple boolean, but that's just for debugging purposes). The backend is responsible for explicitly unlocking the entry, and to make sure that that happens, install a callback to unlock it on abort or process exit.

Backpatch to all supported versions.

1 parent 5e79847, commit 82a83ec
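
The locking scheme described in the commit message can be sketched in isolation. The following is a minimal, single-process illustration, not the PostgreSQL code: `Entry`, `Backend`, `mark_as_preparing`, `mark_prepared`, `post_prepare`, `lock_entry` and `at_abort` are hypothetical stand-ins for `GlobalTransactionData`, `MyLockedGxact`, and the functions touched by the diff, and the real code additionally holds `TwoPhaseStateLock` around these updates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the PostgreSQL definitions. */
#define INVALID_BACKEND (-1)
#define MAX_ENTRIES 4
#define GID_SIZE 32

typedef struct {
    bool in_use;           /* slot holds a global transaction */
    bool valid;            /* true once the prepare has completed */
    int  locking_backend;  /* INVALID_BACKEND when nobody holds the entry */
    char gid[GID_SIZE];
} Entry;

typedef struct {
    int    id;
    Entry *locked;         /* plays the role of MyLockedGxact */
} Backend;

static Entry entries[MAX_ENTRIES];

/* Step 1: reserve an entry that is locked by the caller and not yet valid. */
static Entry *mark_as_preparing(Backend *me, const char *gid)
{
    Entry *free_slot = NULL;

    for (int i = 0; i < MAX_ENTRIES; i++) {
        if (entries[i].in_use && strcmp(entries[i].gid, gid) == 0)
            return NULL;                    /* GID already in use */
        if (!entries[i].in_use && free_slot == NULL)
            free_slot = &entries[i];
    }
    if (free_slot == NULL)
        return NULL;                        /* out of slots */

    free_slot->in_use = true;
    free_slot->valid = false;
    free_slot->locking_backend = me->id;
    strncpy(free_slot->gid, gid, GID_SIZE - 1);
    free_slot->gid[GID_SIZE - 1] = '\0';
    me->locked = free_slot;
    return free_slot;
}

/* Step 2: the entry becomes visible as prepared, but stays locked. */
static void mark_prepared(Entry *e)
{
    e->valid = true;
}

/* Explicit unlock once all state has been handed over to the entry. */
static void post_prepare(Backend *me)
{
    assert(me->locked != NULL);
    me->locked->locking_backend = INVALID_BACKEND;
    me->locked = NULL;
}

/* Step 3: lock a prepared entry for COMMIT PREPARED or ROLLBACK PREPARED. */
static Entry *lock_entry(Backend *me, const char *gid)
{
    for (int i = 0; i < MAX_ENTRIES; i++) {
        Entry *e = &entries[i];

        if (!e->in_use || !e->valid || strcmp(e->gid, gid) != 0)
            continue;
        if (e->locking_backend != INVALID_BACKEND)
            return NULL;                    /* busy: someone else holds it */
        e->locking_backend = me->id;
        me->locked = e;
        return e;
    }
    return NULL;                            /* no such prepared transaction */
}

/* Abort hook: drop a half-prepared entry, otherwise just unlock it. */
static void at_abort(Backend *me)
{
    if (me->locked == NULL)
        return;
    if (!me->locked->valid)
        me->locked->in_use = false;         /* GID and slot become reusable */
    else
        me->locked->locking_backend = INVALID_BACKEND;
    me->locked = NULL;
}
```

In this sketch a second backend calling `lock_entry` between `mark_prepared` and `post_prepare` is turned away, which is the window the commit closes: the entry is released only by an explicit unlock, never implicitly because the preparing backend's XID stopped being active.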

3 files changed: +144 -47 lines changed

‎src/backend/access/transam/twophase.c

Lines changed: 125 additions & 44 deletions
@@ -56,6 +56,7 @@
 #include "pg_trace.h"
 #include "pgstat.h"
 #include "storage/fd.h"
+#include "storage/ipc.h"
 #include "storage/procarray.h"
 #include "storage/sinvaladt.h"
 #include "storage/smgr.h"
@@ -82,25 +83,25 @@ int max_prepared_xacts = 0;
  *
  * The lifecycle of a global transaction is:
  *
- * 1. After checking that the requested GID is not in use, set up an
- *    entry in the TwoPhaseState->prepXacts array with the correct XID and GID,
- *    with locking_xid = my own XID and valid = false.
+ * 1. After checking that the requested GID is not in use, set up an entry in
+ *    the TwoPhaseState->prepXacts array with the correct GID and valid = false,
+ *    and mark it as locked by my backend.
  *
  * 2. After successfully completing prepare, set valid = true and enter the
  *    contained PGPROC into the global ProcArray.
  *
- * 3. To begin COMMIT PREPARED or ROLLBACK PREPARED, check that the entry
- *    is valid and its locking_xid is no longer active, then store my current
- *    XID into locking_xid.  This prevents concurrent attempts to commit or
- *    rollback the same prepared xact.
+ * 3. To begin COMMIT PREPARED or ROLLBACK PREPARED, check that the entry is
+ *    valid and not locked, then mark the entry as locked by storing my current
+ *    backend ID into locking_backend.  This prevents concurrent attempts to
+ *    commit or rollback the same prepared xact.
  *
  * 4. On completion of COMMIT PREPARED or ROLLBACK PREPARED, remove the entry
  *    from the ProcArray and the TwoPhaseState->prepXacts array and return it to
  *    the freelist.
  *
  * Note that if the preparing transaction fails between steps 1 and 2, the
- * entry will remain in prepXacts until recycled.  We can detect recyclable
- * entries by checking for valid = false and locking_xid no longer active.
+ * entry must be removed so that the GID and the GlobalTransaction struct
+ * can be reused.  See AtAbort_Twophase().
  *
  * typedef struct GlobalTransactionData *GlobalTransaction appears in
  * twophase.h
@@ -114,8 +115,8 @@ typedef struct GlobalTransactionData
     TimestampTz prepared_at;        /* time of preparation */
     XLogRecPtr  prepare_lsn;        /* XLOG offset of prepare record */
     Oid         owner;              /* ID of user that executed the xact */
-    TransactionId locking_xid;      /* top-level XID of backend working on xact */
-    bool        valid;              /* TRUE if fully prepared */
+    BackendId   locking_backend;    /* backend currently working on the xact */
+    bool        valid;              /* TRUE if PGPROC entry is in proc array */
     char        gid[GIDSIZE];       /* The GID assigned to the prepared xact */
 } GlobalTransactionData;

@@ -140,6 +141,12 @@ typedef struct TwoPhaseStateData

 static TwoPhaseStateData *TwoPhaseState;

+/*
+ * Global transaction entry currently locked by us, if any.
+ */
+static GlobalTransaction MyLockedGxact = NULL;
+
+static bool twophaseExitRegistered = false;

 static void RecordTransactionCommitPrepared(TransactionId xid,
                                 int nchildren,
@@ -156,6 +163,7 @@ static void RecordTransactionAbortPrepared(TransactionId xid,
                                RelFileNode *rels);
 static void ProcessRecords(char *bufptr, TransactionId xid,
                const TwoPhaseCallback callbacks[]);
+static void RemoveGXact(GlobalTransaction gxact);


 /*
@@ -225,6 +233,74 @@ TwoPhaseShmemInit(void)
         Assert(found);
 }

+/*
+ * Exit hook to unlock the global transaction entry we're working on.
+ */
+static void
+AtProcExit_Twophase(int code, Datum arg)
+{
+    /* same logic as abort */
+    AtAbort_Twophase();
+}
+
+/*
+ * Abort hook to unlock the global transaction entry we're working on.
+ */
+void
+AtAbort_Twophase(void)
+{
+    if (MyLockedGxact == NULL)
+        return;
+
+    /*
+     * What to do with the locked global transaction entry?  If we were in
+     * the process of preparing the transaction, but haven't written the WAL
+     * record and state file yet, the transaction must not be considered as
+     * prepared.  Likewise, if we are in the process of finishing an
+     * already-prepared transaction, and fail after having already written
+     * the 2nd phase commit or rollback record to the WAL, the transaction
+     * should not be considered as prepared anymore.  In those cases, just
+     * remove the entry from shared memory.
+     *
+     * Otherwise, the entry must be left in place so that the transaction
+     * can be finished later, so just unlock it.
+     *
+     * If we abort during prepare, after having written the WAL record, we
+     * might not have transfered all locks and other state to the prepared
+     * transaction yet.  Likewise, if we abort during commit or rollback,
+     * after having written the WAL record, we might not have released
+     * all the resources held by the transaction yet.  In those cases, the
+     * in-memory state can be wrong, but it's too late to back out.
+     */
+    if (!MyLockedGxact->valid)
+    {
+        RemoveGXact(MyLockedGxact);
+    }
+    else
+    {
+        LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
+
+        MyLockedGxact->locking_backend = InvalidBackendId;
+
+        LWLockRelease(TwoPhaseStateLock);
+    }
+    MyLockedGxact = NULL;
+}
+
+/*
+ * This is called after we have finished transfering state to the prepared
+ * PGXACT entry.
+ */
+void
+PostPrepare_Twophase()
+{
+    LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
+    MyLockedGxact->locking_backend = InvalidBackendId;
+    LWLockRelease(TwoPhaseStateLock);
+
+    MyLockedGxact = NULL;
+}
+

 /*
  * MarkAsPreparing
@@ -254,29 +330,15 @@ MarkAsPreparing(TransactionId xid, const char *gid,
                  errmsg("prepared transactions are disabled"),
                  errhint("Set max_prepared_transactions to a nonzero value.")));

-    LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
-
-    /*
-     * First, find and recycle any gxacts that failed during prepare. We do
-     * this partly to ensure we don't mistakenly say their GIDs are still
-     * reserved, and partly so we don't fail on out-of-slots unnecessarily.
-     */
-    for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
+    /* on first call, register the exit hook */
+    if (!twophaseExitRegistered)
     {
-        gxact = TwoPhaseState->prepXacts[i];
-        if (!gxact->valid && !TransactionIdIsActive(gxact->locking_xid))
-        {
-            /* It's dead Jim ... remove from the active array */
-            TwoPhaseState->numPrepXacts--;
-            TwoPhaseState->prepXacts[i] = TwoPhaseState->prepXacts[TwoPhaseState->numPrepXacts];
-            /* and put it back in the freelist */
-            gxact->proc.links.next = (SHM_QUEUE *) TwoPhaseState->freeGXacts;
-            TwoPhaseState->freeGXacts = gxact;
-            /* Back up index count too, so we don't miss scanning one */
-            i--;
-        }
+        on_shmem_exit(AtProcExit_Twophase, 0);
+        twophaseExitRegistered = true;
     }

+    LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
+
     /* Check for conflicting GID */
     for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
     {
@@ -330,14 +392,20 @@ MarkAsPreparing(TransactionId xid, const char *gid,
     gxact->prepare_lsn.xlogid = 0;
     gxact->prepare_lsn.xrecoff = 0;
     gxact->owner = owner;
-    gxact->locking_xid = xid;
+    gxact->locking_backend = MyBackendId;
     gxact->valid = false;
     strcpy(gxact->gid, gid);

     /* And insert it into the active array */
     Assert(TwoPhaseState->numPrepXacts < max_prepared_xacts);
     TwoPhaseState->prepXacts[TwoPhaseState->numPrepXacts++] = gxact;

+    /*
+     * Remember that we have this GlobalTransaction entry locked for us.
+     * If we abort after this, we must release it.
+     */
+    MyLockedGxact = gxact;
+
     LWLockRelease(TwoPhaseStateLock);

     return gxact;
@@ -397,6 +465,13 @@ LockGXact(const char *gid, Oid user)
 {
     int         i;

+    /* on first call, register the exit hook */
+    if (!twophaseExitRegistered)
+    {
+        on_shmem_exit(AtProcExit_Twophase, 0);
+        twophaseExitRegistered = true;
+    }
+
     LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);

     for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
@@ -410,15 +485,11 @@ LockGXact(const char *gid, Oid user)
             continue;

         /* Found it, but has someone else got it locked? */
-        if (TransactionIdIsValid(gxact->locking_xid))
-        {
-            if (TransactionIdIsActive(gxact->locking_xid))
-                ereport(ERROR,
-                        (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-                         errmsg("prepared transaction with identifier \"%s\" is busy",
-                                gid)));
-            gxact->locking_xid = InvalidTransactionId;
-        }
+        if (gxact->locking_backend != InvalidBackendId)
+            ereport(ERROR,
+                    (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                     errmsg("prepared transaction with identifier \"%s\" is busy",
+                            gid)));

         if (user != gxact->owner && !superuser_arg(user))
             ereport(ERROR,
@@ -439,7 +510,8 @@ LockGXact(const char *gid, Oid user)
                      errhint("Connect to the database where the transaction was prepared to finish it.")));

         /* OK for me to lock it */
-        gxact->locking_xid = GetTopTransactionId();
+        gxact->locking_backend = MyBackendId;
+        MyLockedGxact = gxact;

         LWLockRelease(TwoPhaseStateLock);

@@ -1060,6 +1132,13 @@ EndPrepare(GlobalTransaction gxact)
      */
     MyProc->inCommit = false;

+    /*
+     * Remember that we have this GlobalTransaction entry locked for us.  If
+     * we crash after this point, it's too late to abort, but we must unlock
+     * it so that the prepared transaction can be committed or rolled back.
+     */
+    MyLockedGxact = gxact;
+
     END_CRIT_SECTION();

     records.tail = records.head = NULL;
@@ -1294,8 +1373,9 @@ FinishPreparedTransaction(const char *gid, bool isCommit)

     /*
      * In case we fail while running the callbacks, mark the gxact invalid so
-     * no one else will try to commit/rollback, and so it can be recycled
-     * properly later.  It is still locked by our XID so it won't go away yet.
+     * no one else will try to commit/rollback, and so it will be recycled
+     * if we fail after this point.  It is still locked by our backend so it
+     * won't go away yet.
      *
      * (We assume it's safe to do this without taking TwoPhaseStateLock.)
      */
@@ -1357,6 +1437,7 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
     RemoveTwoPhaseFile(xid, true);

     RemoveGXact(gxact);
+    MyLockedGxact = NULL;

     pfree(buf);
 }
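
Both MarkAsPreparing() and LockGXact() register the exit hook lazily, on first use, guarded by a static flag so that it is installed at most once per process. A minimal standalone sketch of that register-once pattern follows. It uses the standard C `atexit()` as a stand-in for PostgreSQL's `on_shmem_exit()`, and `locked_slot`, `registrations`, `ensure_exit_hook`, `lock_slot` and `release_locked_slot` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static bool exit_hook_registered = false;
static int  registrations = 0;   /* counts successful atexit() calls */
static int *locked_slot = NULL;  /* hypothetical "entry locked by us" */

/* Cleanup shared by the abort path and the process-exit path. */
static void release_locked_slot(void)
{
    if (locked_slot == NULL)
        return;                  /* nothing locked, nothing to do */
    *locked_slot = 0;            /* explicit unlock */
    locked_slot = NULL;
}

/* Install the exit hook on first use only. */
static void ensure_exit_hook(void)
{
    if (!exit_hook_registered && atexit(release_locked_slot) == 0) {
        exit_hook_registered = true;
        registrations++;
    }
}

/* Lock a slot and remember it so the hook can release it later. */
static void lock_slot(int *slot)
{
    ensure_exit_hook();
    *slot = 1;
    locked_slot = slot;
}
```

Because the cleanup function returns early when nothing is locked, it is harmless for the hook to run in a process that has already released its entry, which mirrors the `MyLockedGxact == NULL` check at the top of AtAbort_Twophase().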

‎src/backend/access/transam/xact.c

Lines changed: 16 additions & 3 deletions
@@ -2062,9 +2062,13 @@ PrepareTransaction(void)
     ProcArrayClearTransaction(MyProc);

     /*
-     * This is all post-transaction cleanup.  Note that if an error is raised
-     * here, it's too late to abort the transaction.  This should be just
-     * noncritical resource releasing.  See notes in CommitTransaction.
+     * In normal commit-processing, this is all non-critical post-transaction
+     * cleanup.  When the transaction is prepared, however, it's important that
+     * the locks and other per-backend resources are transfered to the
+     * prepared transaction's PGPROC entry.  Note that if an error is raised
+     * here, it's too late to abort the transaction.  XXX: This probably should
+     * be in a critical section, to force a PANIC if any of this fails, but
+     * that cure could be worse than the disease.
      */

     CallXactCallbacks(XACT_EVENT_PREPARE);
@@ -2101,6 +2105,14 @@ PrepareTransaction(void)
                          RESOURCE_RELEASE_AFTER_LOCKS,
                          true, true);

+    /*
+     * Allow another backend to finish the transaction.  After
+     * PostPrepare_Twophase(), the transaction is completely detached from
+     * our backend.  The rest is just non-critical cleanup of backend-local
+     * state.
+     */
+    PostPrepare_Twophase();
+
     /* Check we've released all catcache entries */
     AtEOXact_CatCache(true);

@@ -2211,6 +2223,7 @@ AbortTransaction(void)
     AtEOXact_LargeObject(false);
     AtAbort_Notify();
     AtEOXact_RelationMap(false);
+    AtAbort_Twophase();

     /*
      * Advertise the fact that we aborted in pg_clog (assuming that we got as

‎src/include/access/twophase.h

Lines changed: 3 additions & 0 deletions
@@ -31,6 +31,9 @@ extern int max_prepared_xacts;
 extern Size TwoPhaseShmemSize(void);
 extern void TwoPhaseShmemInit(void);

+extern void AtAbort_Twophase(void);
+extern void PostPrepare_Twophase(void);
+
 extern PGPROC *TwoPhaseGetDummyProc(TransactionId xid);
 extern BackendId TwoPhaseGetDummyBackendId(TransactionId xid);
