Commit 1e74e31

WAL-log inplace update before revealing it to other sessions.

A buffer lock won't stop a reader having already checked tuple
visibility. If a vac_update_datfrozenxid() and then a crash happened
during inplace update of a relfrozenxid value, datfrozenxid could
overtake relfrozenxid. That could lead to "could not access status of
transaction" errors.

Back-patch to v12 (all supported versions). In v14 and earlier, this
also back-patches the assertion removal from commit 7fcf2fa.

Discussion: https://postgr.es/m/20240620012908.92.nmisch@google.com

1 parent 0ea9d40, commit 1e74e31
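
The failure described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not PostgreSQL code: `ToyCluster` and `toy_crash_leaves_dat_ahead()` are hypothetical names, XIDs are compared as plain integers (no wraparound handling), and the model assumes that a WAL record inserted earlier survives a crash whenever a later record does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy state: one table, one database. */
typedef struct ToyCluster
{
    uint32_t buf_relfrozenxid;  /* value other sessions see in the buffer */
    uint32_t wal_relfrozenxid;  /* value WAL replay would restore */
    uint32_t wal_datfrozenxid;  /* value WAL replay would restore */
} ToyCluster;

/*
 * Simulate session R (updating relfrozenxid in place) being interrupted
 * halfway by session D (raising datfrozenxid), followed by a crash.
 * Returns 1 if datfrozenxid has overtaken relfrozenxid after recovery.
 */
static int
toy_crash_leaves_dat_ahead(int wal_first)
{
    ToyCluster c = {100, 100, 100};
    const uint32_t new_relfrozenxid = 200;

    assert(wal_first == 0 || wal_first == 1);

    /* Session R: first half of the inplace update. */
    if (wal_first)
        c.wal_relfrozenxid = new_relfrozenxid;  /* log first */
    else
        c.buf_relfrozenxid = new_relfrozenxid;  /* mutate buffer first */

    /* Session D: trusts the buffer contents and logs datfrozenxid. */
    if (c.buf_relfrozenxid > c.wal_datfrozenxid)
        c.wal_datfrozenxid = c.buf_relfrozenxid;

    /* Crash before R's second half; recovery keeps only the WAL values. */
    return c.wal_datfrozenxid > c.wal_relfrozenxid;
}
```

In this model, `toy_crash_leaves_dat_ahead(0)` reproduces the old ordering (buffer first) and reports the inconsistency, while `toy_crash_leaves_dat_ahead(1)` models the ordering this commit adopts and does not.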

3 files changed, +46 -18 lines changed

‎src/backend/access/heap/README.tuplock

Lines changed: 1 addition & 3 deletions
@@ -203,6 +203,4 @@ Inplace updates create an exception to the rule that tuple data won't change
 under a reader holding a pin. A reader of a heap_fetch() result tuple may
 witness a torn read. Current inplace-updated fields are aligned and are no
 wider than four bytes, and current readers don't need consistency across
-fields. Hence, they get by with just fetching each field once. XXX such a
-caller may also read a value that has not reached WAL; see
-systable_inplace_update_finish().
+fields. Hence, they get by with just fetching each field once.
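
The "fetch each field once" rule from the README text can be sketched as follows. This is an illustration under assumptions rather than code from the tree: `ToyClassRow` and `toy_xid_precedes_cutoff()` are hypothetical names, the `volatile` qualifier is just one way to keep a compiler from re-reading the field, and the comparison ignores XID wraparound.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a catalog row with inplace-updated fields. */
typedef struct ToyClassRow
{
    uint32_t relfrozenxid;  /* aligned, four bytes wide */
    uint32_t relminmxid;    /* aligned, four bytes wide */
} ToyClassRow;

/*
 * Read the inplace-updated field exactly once into a local, then use only
 * the local.  A concurrent inplace update may change the field at any
 * time, so reading it twice could yield two different values.
 */
static int
toy_xid_precedes_cutoff(const volatile ToyClassRow *row, uint32_t cutoff)
{
    uint32_t frozen = row->relfrozenxid;    /* the single fetch */

    assert(row != 0);

    /* Both tests use the same snapshot of the field. */
    return frozen != 0 && frozen < cutoff;
}
```

A caller would pass a pointer to the pinned tuple's data and rely on the returned decision instead of consulting the field again.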

‎src/backend/access/heap/heapam.c

Lines changed: 45 additions & 13 deletions
@@ -6104,13 +6104,18 @@ heap_inplace_update_and_unlock(Relation relation,
     HeapTupleHeader htup = oldtup->t_data;
     uint32      oldlen;
     uint32      newlen;
+    char       *dst;
+    char       *src;

     Assert(ItemPointerEquals(&oldtup->t_self, &tuple->t_self));
     oldlen = oldtup->t_len - htup->t_hoff;
     newlen = tuple->t_len - tuple->t_data->t_hoff;
     if (oldlen != newlen || htup->t_hoff != tuple->t_data->t_hoff)
         elog(ERROR, "wrong tuple length");

+    dst = (char *) htup + htup->t_hoff;
+    src = (char *) tuple->t_data + tuple->t_data->t_hoff;
+
     /*
      * Construct shared cache inval if necessary. Note that because we only
      * pass the new version of the tuple, this mustn't be used for any
@@ -6129,15 +6134,15 @@ heap_inplace_update_and_unlock(Relation relation,
      */
     PreInplace_Inval();

-    /* NO EREPORT(ERROR) from here till changes are logged */
-    START_CRIT_SECTION();
-
-    memcpy((char *) htup + htup->t_hoff,
-           (char *) tuple->t_data + tuple->t_data->t_hoff,
-           newlen);
-
     /*----------
-     * XXX A crash here can allow datfrozenxid() to get ahead of relfrozenxid:
+     * NO EREPORT(ERROR) from here till changes are complete
+     *
+     * Our buffer lock won't stop a reader having already pinned and checked
+     * visibility for this tuple. Hence, we write WAL first, then mutate the
+     * buffer. Like in MarkBufferDirtyHint() or RecordTransactionCommit(),
+     * checkpoint delay makes that acceptable. With the usual order of
+     * changes, a crash after memcpy() and before XLogInsert() could allow
+     * datfrozenxid to overtake relfrozenxid:
      *
      * ["D" is a VACUUM (ONLY_DATABASE_STATS)]
      * ["R" is a VACUUM tbl]
@@ -6147,31 +6152,57 @@ heap_inplace_update_and_unlock(Relation relation,
      * D: raise pg_database.datfrozenxid, XLogInsert(), finish
      * [crash]
      * [recovery restores datfrozenxid w/o relfrozenxid]
+     *
+     * Like in MarkBufferDirtyHint() subroutine XLogSaveBufferForHint(), copy
+     * the buffer to the stack before logging. Here, that facilitates a FPI
+     * of the post-mutation block before we accept other sessions seeing it.
      */
-
-    MarkBufferDirty(buffer);
+    Assert(!MyProc->delayChkpt);
+    START_CRIT_SECTION();
+    MyProc->delayChkpt = true;

     /* XLOG stuff */
     if (RelationNeedsWAL(relation))
     {
         xl_heap_inplace xlrec;
+        PGAlignedBlock copied_buffer;
+        char       *origdata = (char *) BufferGetBlock(buffer);
+        Page        page = BufferGetPage(buffer);
+        uint16      lower = ((PageHeader) page)->pd_lower;
+        uint16      upper = ((PageHeader) page)->pd_upper;
+        uintptr_t   dst_offset_in_block;
+        RelFileNode rnode;
+        ForkNumber  forkno;
+        BlockNumber blkno;
         XLogRecPtr  recptr;

         xlrec.offnum = ItemPointerGetOffsetNumber(&tuple->t_self);

         XLogBeginInsert();
         XLogRegisterData((char *) &xlrec, SizeOfHeapInplace);

-        XLogRegisterBuffer(0, buffer, REGBUF_STANDARD);
-        XLogRegisterBufData(0, (char *) htup + htup->t_hoff, newlen);
+        /* register block matching what buffer will look like after changes */
+        memcpy(copied_buffer.data, origdata, lower);
+        memcpy(copied_buffer.data + upper, origdata + upper, BLCKSZ - upper);
+        dst_offset_in_block = dst - origdata;
+        memcpy(copied_buffer.data + dst_offset_in_block, src, newlen);
+        BufferGetTag(buffer, &rnode, &forkno, &blkno);
+        Assert(forkno == MAIN_FORKNUM);
+        XLogRegisterBlock(0, &rnode, forkno, blkno, copied_buffer.data,
+                          REGBUF_STANDARD);
+        XLogRegisterBufData(0, src, newlen);

         /* inplace updates aren't decoded atm, don't log the origin */

         recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_INPLACE);

-        PageSetLSN(BufferGetPage(buffer), recptr);
+        PageSetLSN(page, recptr);
     }

+    memcpy(dst, src, newlen);
+
+    MarkBufferDirty(buffer);
+
     LockBuffer(buffer, BUFFER_LOCK_UNLOCK);

     /*
@@ -6184,6 +6215,7 @@ heap_inplace_update_and_unlock(Relation relation,
      */
     AtInplace_Inval();

+    MyProc->delayChkpt = false;
     END_CRIT_SECTION();
     UnlockTuple(relation, &tuple->t_self, InplaceUpdateTupleLock);
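
The way the patch assembles the post-update page image before touching the shared buffer can be sketched in isolation. This is a simplified illustration, not the PostgreSQL implementation: `ToyPageHeader`, `TOY_BLCKSZ`, and `toy_build_post_update_image()` are hypothetical, and the toy header holds only the two offsets the copy needs. As in the diff, the "hole" between `pd_lower` and `pd_upper` is not copied, so the caller should not rely on those bytes of the copy.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLCKSZ 8192

/* Hypothetical, simplified page header: only the offsets used below. */
typedef struct ToyPageHeader
{
    uint16_t pd_lower;  /* end of the line pointer area */
    uint16_t pd_upper;  /* start of the tuple data area */
} ToyPageHeader;

/*
 * Fill "copy" with what "page" will look like once "len" bytes from "src"
 * are written at "dst_offset".  The live page itself is left untouched, so
 * other readers cannot observe the new bytes yet.
 */
static void
toy_build_post_update_image(char *copy, const char *page,
                            size_t dst_offset, const char *src, size_t len)
{
    ToyPageHeader hdr;

    /* Read the header via memcpy to avoid alignment assumptions. */
    memcpy(&hdr, page, sizeof(hdr));
    assert(hdr.pd_lower <= hdr.pd_upper && hdr.pd_upper <= TOY_BLCKSZ);
    assert(dst_offset >= hdr.pd_upper && dst_offset + len <= TOY_BLCKSZ);

    /* Copy everything except the hole between pd_lower and pd_upper. */
    memcpy(copy, page, hdr.pd_lower);
    memcpy(copy + hdr.pd_upper, page + hdr.pd_upper,
           TOY_BLCKSZ - hdr.pd_upper);

    /* Overlay the new tuple bytes onto the copy only. */
    memcpy(copy + dst_offset, src, len);
}
```

In the patch, the analogous copy is what gets registered with XLogRegisterBlock(), and the real memcpy() into the shared buffer happens only after XLogInsert() returns.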

‎src/backend/access/transam/xloginsert.c

Lines changed: 0 additions & 2 deletions
@@ -269,8 +269,6 @@ XLogRegisterBlock(uint8 block_id, RelFileNode *rnode, ForkNumber forknum,
 {
     registered_buffer *regbuf;

-    /* This is currently only used to WAL-log a full-page image of a page */
-    Assert(flags & REGBUF_FORCE_IMAGE);
     Assert(begininsert_called);

     if (block_id >= max_registered_block_id)
