Commit 51ff46d

For inplace update durability, make heap_update() callers wait.

The previous commit fixed some ways of losing an inplace update. It remained possible to lose one when a backend working toward a heap_update() copied a tuple into memory just before inplace update of that tuple. In catalogs eligible for inplace update, use LOCKTAG_TUPLE to govern admission to the steps of copying an old tuple, modifying it, and issuing heap_update(). This includes MERGE commands. To avoid changing most of the pg_class DDL, don't require LOCKTAG_TUPLE when holding a relation lock sufficient to exclude inplace updaters.

Back-patch to v12 (all supported versions). In v13 and v12, "UPDATE pg_class" or "UPDATE pg_database" can still lose an inplace update. The v14+ UPDATE fix needs commit 86dc900, and it wasn't worth reimplementing that fix without such infrastructure.

Reviewed by Nitin Motiani and (in earlier versions) Heikki Linnakangas.

Discussion: https://postgr.es/m/20231027214946.79.nmisch@google.com

1 parent 63f0198 commit 51ff46d

19 files changed: +488 -49 lines changed


‎src/backend/access/heap/README.tuplock

Lines changed: 42 additions & 0 deletions
@@ -154,6 +154,48 @@ The following infomask bits are applicable:
 We currently never set the HEAP_XMAX_COMMITTED when the HEAP_XMAX_IS_MULTI bit
 is set.

+Locking to write inplace-updated tables
+---------------------------------------
+
+If IsInplaceUpdateRelation() returns true for a table, the table is a system
+catalog that receives systable_inplace_update_begin() calls. Preparing a
+heap_update() of these tables follows additional locking rules, to ensure we
+don't lose the effects of an inplace update. In particular, consider a moment
+when a backend has fetched the old tuple to modify, not yet having called
+heap_update(). Another backend's inplace update starting then can't conclude
+until the heap_update() places its new tuple in a buffer. We enforce that
+using locktags as follows. While DDL code is the main audience, the executor
+follows these rules to make e.g. "MERGE INTO pg_class" safer. Locking rules
+are per-catalog:
+
+pg_class systable_inplace_update_begin() callers: before the call, acquire a
+lock on the relation in mode ShareUpdateExclusiveLock or stricter. If the
+update targets a row of RELKIND_INDEX (but not RELKIND_PARTITIONED_INDEX),
+that lock must be on the table. Locking the index rel is not necessary.
+(This allows VACUUM to overwrite per-index pg_class while holding a lock on
+the table alone.) systable_inplace_update_begin() acquires and releases
+LOCKTAG_TUPLE in InplaceUpdateTupleLock, an alias for ExclusiveLock, on each
+tuple it overwrites.
+
+pg_class heap_update() callers: before copying the tuple to modify, take a
+lock on the tuple, a ShareUpdateExclusiveLock on the relation, or a
+ShareRowExclusiveLock or stricter on the relation.
+
+SearchSysCacheLocked1() is one convenient way to acquire the tuple lock.
+Most heap_update() callers already hold a suitable lock on the relation for
+other reasons and can skip the tuple lock. If you do acquire the tuple
+lock, release it immediately after the update.
+
+
+pg_database: before copying the tuple to modify, all updaters of pg_database
+rows acquire LOCKTAG_TUPLE. (Few updaters acquire LOCKTAG_OBJECT on the
+database OID, so it wasn't worth extending that as a second option.)
+
+Ideally, DDL might want to perform permissions checks before LockTuple(), as
+we do with RangeVarGetRelidExtended() callbacks. We typically don't bother.
+LOCKTAG_TUPLE acquirers release it after each row, so the potential
+inconvenience is lower.
+
 Reading inplace-updated columns
 -------------------------------

‎src/backend/access/heap/heapam.c

Lines changed: 149 additions & 1 deletion
@@ -52,6 +52,8 @@
 #include "access/xloginsert.h"
 #include "access/xlogutils.h"
 #include "catalog/catalog.h"
+#include "catalog/pg_database.h"
+#include "catalog/pg_database_d.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -79,6 +81,12 @@ static XLogRecPtr log_heap_update(Relation reln, Buffer oldbuf,
 								Buffer newbuf, HeapTuple oldtup,
 								HeapTuple newtup, HeapTuple old_key_tuple,
 								bool all_visible_cleared, bool new_all_visible_cleared);
+#ifdef USE_ASSERT_CHECKING
+static void check_lock_if_inplace_updateable_rel(Relation relation,
+												 ItemPointer otid,
+												 HeapTuple newtup);
+static void check_inplace_rel_lock(HeapTuple oldtup);
+#endif
 static Bitmapset *HeapDetermineColumnsInfo(Relation relation,
 										   Bitmapset *interesting_cols,
 										   Bitmapset *external_cols,
@@ -123,6 +131,8 @@ static HeapTuple ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool ke
  * heavyweight lock mode and MultiXactStatus values to use for any particular
  * tuple lock strength.
  *
+ * These interact with InplaceUpdateTupleLock, an alias for ExclusiveLock.
+ *
  * Don't look at lockstatus/updstatus directly! Use get_mxact_status_for_lock
  * instead.
  */
@@ -3051,6 +3061,10 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
 				(errcode(ERRCODE_INVALID_TRANSACTION_STATE),
 				 errmsg("cannot update tuples during a parallel operation")));

+#ifdef USE_ASSERT_CHECKING
+	check_lock_if_inplace_updateable_rel(relation, otid, newtup);
+#endif
+
 	/*
 	 * Fetch the list of attributes to be checked for various operations.
 	 *
@@ -3915,6 +3929,128 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
 	return TM_Ok;
 }

+#ifdef USE_ASSERT_CHECKING
+/*
+ * Confirm adequate lock held during heap_update(), per rules from
+ * README.tuplock section "Locking to write inplace-updated tables".
+ */
+static void
+check_lock_if_inplace_updateable_rel(Relation relation,
+									 ItemPointer otid,
+									 HeapTuple newtup)
+{
+	/* LOCKTAG_TUPLE acceptable for any catalog */
+	switch (RelationGetRelid(relation))
+	{
+		case RelationRelationId:
+		case DatabaseRelationId:
+			{
+				LOCKTAG tuptag;
+
+				SET_LOCKTAG_TUPLE(tuptag,
+								  relation->rd_lockInfo.lockRelId.dbId,
+								  relation->rd_lockInfo.lockRelId.relId,
+								  ItemPointerGetBlockNumber(otid),
+								  ItemPointerGetOffsetNumber(otid));
+				if (LockHeldByMe(&tuptag, InplaceUpdateTupleLock))
+					return;
+			}
+			break;
+		default:
+			Assert(!IsInplaceUpdateRelation(relation));
+			return;
+	}
+
+	switch (RelationGetRelid(relation))
+	{
+		case RelationRelationId:
+			{
+				/* LOCKTAG_TUPLE or LOCKTAG_RELATION ok */
+				Form_pg_class classForm = (Form_pg_class) GETSTRUCT(newtup);
+				Oid relid = classForm->oid;
+				Oid dbid;
+				LOCKTAG tag;
+
+				if (IsSharedRelation(relid))
+					dbid = InvalidOid;
+				else
+					dbid = MyDatabaseId;
+
+				if (classForm->relkind == RELKIND_INDEX)
+				{
+					Relation irel = index_open(relid, AccessShareLock);
+
+					SET_LOCKTAG_RELATION(tag, dbid, irel->rd_index->indrelid);
+					index_close(irel, AccessShareLock);
+				}
+				else
+					SET_LOCKTAG_RELATION(tag, dbid, relid);
+
+				if (!LockHeldByMe(&tag, ShareUpdateExclusiveLock) &&
+					!LockOrStrongerHeldByMe(&tag, ShareRowExclusiveLock))
+					elog(WARNING,
+						 "missing lock for relation \"%s\" (OID %u, relkind %c) @ TID (%u,%u)",
+						 NameStr(classForm->relname),
+						 relid,
+						 classForm->relkind,
+						 ItemPointerGetBlockNumber(otid),
+						 ItemPointerGetOffsetNumber(otid));
+			}
+			break;
+		case DatabaseRelationId:
+			{
+				/* LOCKTAG_TUPLE required */
+				Form_pg_database dbForm = (Form_pg_database) GETSTRUCT(newtup);
+
+				elog(WARNING,
+					 "missing lock on database \"%s\" (OID %u) @ TID (%u,%u)",
+					 NameStr(dbForm->datname),
+					 dbForm->oid,
+					 ItemPointerGetBlockNumber(otid),
+					 ItemPointerGetOffsetNumber(otid));
+			}
+			break;
+	}
+}
+
+/*
+ * Confirm adequate relation lock held, per rules from README.tuplock section
+ * "Locking to write inplace-updated tables".
+ */
+static void
+check_inplace_rel_lock(HeapTuple oldtup)
+{
+	Form_pg_class classForm = (Form_pg_class) GETSTRUCT(oldtup);
+	Oid relid = classForm->oid;
+	Oid dbid;
+	LOCKTAG tag;
+
+	if (IsSharedRelation(relid))
+		dbid = InvalidOid;
+	else
+		dbid = MyDatabaseId;
+
+	if (classForm->relkind == RELKIND_INDEX)
+	{
+		Relation irel = index_open(relid, AccessShareLock);
+
+		SET_LOCKTAG_RELATION(tag, dbid, irel->rd_index->indrelid);
+		index_close(irel, AccessShareLock);
+	}
+	else
+		SET_LOCKTAG_RELATION(tag, dbid, relid);
+
+	if (!LockOrStrongerHeldByMe(&tag, ShareUpdateExclusiveLock))
+		elog(WARNING,
+			 "missing lock for relation \"%s\" (OID %u, relkind %c) @ TID (%u,%u)",
+			 NameStr(classForm->relname),
+			 relid,
+			 classForm->relkind,
+			 ItemPointerGetBlockNumber(&oldtup->t_self),
+			 ItemPointerGetOffsetNumber(&oldtup->t_self));
+}
+#endif
+
 /*
  * Check if the specified attribute's values are the same. Subroutine for
  * HeapDetermineColumnsInfo.
@@ -5928,15 +6064,21 @@ heap_inplace_lock(Relation relation,
 	TM_Result result;
 	bool ret;

+#ifdef USE_ASSERT_CHECKING
+	if (RelationGetRelid(relation) == RelationRelationId)
+		check_inplace_rel_lock(oldtup_ptr);
+#endif
+
 	Assert(BufferIsValid(buffer));

+	LockTuple(relation, &oldtup.t_self, InplaceUpdateTupleLock);
 	LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);

 	/*----------
 	 * Interpret HeapTupleSatisfiesUpdate() like heap_update() does, except:
 	 *
 	 * - wait unconditionally
-	 * - no tuple locks
+	 * - already locked tuple above, since inplace needs that unconditionally
 	 * - don't recheck header after wait: simpler to defer to next iteration
 	 * - don't try to continue even if the updater aborts: likewise
 	 * - no crosscheck
@@ -6020,7 +6162,10 @@ heap_inplace_lock(Relation relation,
 	 * don't bother optimizing that.
 	 */
 	if (!ret)
+	{
+		UnlockTuple(relation, &oldtup.t_self, InplaceUpdateTupleLock);
 		InvalidateCatalogSnapshot();
+	}
 	return ret;
 }

@@ -6029,6 +6174,8 @@ heap_inplace_lock(Relation relation,
  *
  * The tuple cannot change size, and therefore its header fields and null
  * bitmap (if any) don't change either.
+ *
+ * Since we hold LOCKTAG_TUPLE, no updater has a local copy of this tuple.
  */
 void
 heap_inplace_update_and_unlock(Relation relation,
@@ -6112,6 +6259,7 @@ heap_inplace_unlock(Relation relation,
 					HeapTuple oldtup, Buffer buffer)
 {
 	LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
+	UnlockTuple(relation, &oldtup->t_self, InplaceUpdateTupleLock);
 }

 /*

‎src/backend/access/index/genam.c

Lines changed: 3 additions & 1 deletion
@@ -755,7 +755,9 @@ systable_endscan_ordered(SysScanDesc sysscan)
  *
  * Overwriting violates both MVCC and transactional safety, so the uses of
  * this function in Postgres are extremely limited. Nonetheless we find some
- * places to use it. Standard flow:
+ * places to use it. See README.tuplock section "Locking to write
+ * inplace-updated tables" and later sections for expectations of readers and
+ * writers of a table that gets inplace updates. Standard flow:
  *
  * ... [any slow preparation not requiring oldtup] ...
  * systable_inplace_update_begin([...], &tup, &inplace_state);

‎src/backend/catalog/aclchk.c

Lines changed: 7 additions & 2 deletions
@@ -71,6 +71,7 @@
 #include "nodes/makefuncs.h"
 #include "parser/parse_func.h"
 #include "parser/parse_type.h"
+#include "storage/lmgr.h"
 #include "utils/acl.h"
 #include "utils/aclchk_internal.h"
 #include "utils/builtins.h"
@@ -1827,7 +1828,7 @@ ExecGrant_Relation(InternalGrant *istmt)
 		HeapTuple tuple;
 		ListCell *cell_colprivs;

-		tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relOid));
+		tuple = SearchSysCacheLocked1(RELOID, ObjectIdGetDatum(relOid));
 		if (!HeapTupleIsValid(tuple))
 			elog(ERROR, "cache lookup failed for relation %u", relOid);
 		pg_class_tuple = (Form_pg_class) GETSTRUCT(tuple);
@@ -2039,6 +2040,7 @@ ExecGrant_Relation(InternalGrant *istmt)
 										 values, nulls, replaces);

 			CatalogTupleUpdate(relation, &newtuple->t_self, newtuple);
+			UnlockTuple(relation, &tuple->t_self, InplaceUpdateTupleLock);

 			/* Update initial privileges for extensions */
 			recordExtensionInitPriv(relOid, RelationRelationId, 0, new_acl);
@@ -2051,6 +2053,8 @@ ExecGrant_Relation(InternalGrant *istmt)
 			pfree(new_acl);
 		}
+		else
+			UnlockTuple(relation, &tuple->t_self, InplaceUpdateTupleLock);

 		/*
 		 * Handle column-level privileges, if any were specified or implied.
@@ -2164,7 +2168,7 @@ ExecGrant_common(InternalGrant *istmt, Oid classid, AclMode default_privs,
 		Oid *oldmembers;
 		Oid *newmembers;

-		tuple = SearchSysCache1(cacheid, ObjectIdGetDatum(objectid));
+		tuple = SearchSysCacheLocked1(cacheid, ObjectIdGetDatum(objectid));
 		if (!HeapTupleIsValid(tuple))
 			elog(ERROR, "cache lookup failed for %s %u", get_object_class_descr(classid), objectid);

@@ -2240,6 +2244,7 @@ ExecGrant_common(InternalGrant *istmt, Oid classid, AclMode default_privs,
 									 nulls, replaces);

 		CatalogTupleUpdate(relation, &newtuple->t_self, newtuple);
+		UnlockTuple(relation, &tuple->t_self, InplaceUpdateTupleLock);

 		/* Update initial privileges for extensions */
 		recordExtensionInitPriv(objectid, classid, 0, new_acl);

‎src/backend/catalog/catalog.c

Lines changed: 9 additions & 0 deletions
@@ -140,6 +140,15 @@ IsCatalogRelationOid(Oid relid)
 /*
  * IsInplaceUpdateRelation
  *		True iff core code performs inplace updates on the relation.
+ *
+ *		This is used for assertions and for making the executor follow the
+ *		locking protocol described at README.tuplock section "Locking to write
+ *		inplace-updated tables". Extensions may inplace-update other heap
+ *		tables, but concurrent SQL UPDATE on the same table may overwrite
+ *		those modifications.
+ *
+ *		The executor can assume these are not partitions or partitioned and
+ *		have no triggers.
  */
 bool
 IsInplaceUpdateRelation(Relation relation)
