Commit 0ac5ad5

Improve concurrency of foreign key locking
This patch introduces two additional lock modes for tuples: "SELECT FOR KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each other, in contrast with already existing "SELECT FOR SHARE" and "SELECT FOR UPDATE". UPDATE commands that do not modify the values stored in the columns that are part of the key of the tuple now grab a SELECT FOR NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently with tuple locks of the FOR KEY SHARE variety.

Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this means the concurrency improvement applies to them, which is the whole point of this patch.

The added tuple lock semantics require some rejiggering of the multixact module, so that the locking level that each transaction is holding can be stored alongside its Xid. Also, multixacts now need to persist across server restarts and crashes, because they can now represent not only tuple locks, but also tuple updates. This means we need more careful tracking of lifetime of pg_multixact SLRU files; since they now persist longer, we require more infrastructure to figure out when they can be removed. pg_upgrade also needs to be careful to copy pg_multixact files over from the old server to the new, or at least part of multixact.c state, depending on the versions of the old and new servers.

Tuple time qualification rules (HeapTupleSatisfies routines) need to be careful not to consider tuples with the "is multi" infomask bit set as being only locked; they might need to look up MultiXact values (i.e. possibly do pg_multixact I/O) to find out the Xid that updated a tuple, whereas they previously were assured to only use information readily available from the tuple header. This is considered acceptable, because the extra I/O would involve cases that would previously cause some commands to block waiting for concurrent transactions to finish.

Another important change is the fact that locking tuples that have previously been updated causes the future versions to be marked as locked, too; this is essential for correctness of foreign key checks. This causes additional WAL-logging, also (there was previously a single WAL record for a locked tuple; now there are as many as updated copies of the tuple there exist.)

With all this in place, contention related to tuples being checked by foreign key rules should be much reduced.

As a bonus, the old behavior that a subtransaction grabbing a stronger tuple lock than the parent (sub)transaction held on a given tuple and later aborting caused the weaker lock to be lost, has been fixed.

Many new spec files were added for isolation tester framework, to ensure overall behavior is sane. There's probably room for several more tests.

There were several reviewers of this patch; in particular, Noah Misch and Andres Freund spent considerable time in it. Original idea for the patch came from Simon Riggs, after a problem report by Joel Jacobson. Most code is from me, with contributions from Marti Raudsepp, Alexander Shulgin, Noah Misch and Andres Freund.

This patch was discussed in several pgsql-hackers threads; the most important start at the following message-ids:

    AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com
    1290721684-sup-3951@alvh.no-ip.org
    1294953201-sup-2099@alvh.no-ip.org
    1320343602-sup-2290@alvh.no-ip.org
    1339690386-sup-8927@alvh.no-ip.org
    4FE5FF020200002500048A3D@gw.wicourts.gov
    4FEAB90A0200002500048B7D@gw.wicourts.gov
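
The interaction between the two new lock modes and the two existing ones can be summarized as a small conflict table. The following is a minimal sketch in C, assuming the row-level lock conflict rules as given in the PostgreSQL documentation on explicit locking; the enum, table, and function names are hypothetical and are not the identifiers this patch uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative names only; not the identifiers used in the PostgreSQL source. */
typedef enum
{
    LOCK_KEY_SHARE = 0,     /* SELECT FOR KEY SHARE */
    LOCK_SHARE,             /* SELECT FOR SHARE */
    LOCK_NO_KEY_UPDATE,     /* SELECT FOR NO KEY UPDATE */
    LOCK_UPDATE             /* SELECT FOR UPDATE */
} RowLockMode;

/* conflicts[held][requested] is true when the requester has to wait. */
static const bool conflicts[4][4] = {
    /* held KEY SHARE     */ {false, false, false, true},
    /* held SHARE         */ {false, false, true,  true},
    /* held NO KEY UPDATE */ {false, true,  true,  true},
    /* held UPDATE        */ {true,  true,  true,  true},
};

/* Report whether a requested row lock conflicts with one already held. */
bool
row_locks_conflict(RowLockMode held, RowLockMode requested)
{
    assert(held >= LOCK_KEY_SHARE && held <= LOCK_UPDATE);
    assert(requested >= LOCK_KEY_SHARE && requested <= LOCK_UPDATE);
    return conflicts[held][requested];
}
```

Under these rules, an UPDATE that leaves key columns untouched (NO KEY UPDATE) does not wait for a foreign key check that holds KEY SHARE on the same row, which is the case the commit message targets.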
1 parent f925c79, commit 0ac5ad5

106 files changed: +6023 -1487 lines changed


‎contrib/file_fdw/output/file_fdw.source

Lines changed: 1 addition & 1 deletion
@@ -191,7 +191,7 @@ ERROR: cannot change foreign table "agg_csv"
 DELETE FROM agg_csv WHERE a = 100;
 ERROR: cannot change foreign table "agg_csv"
 SELECT * FROM agg_csv FOR UPDATE OF agg_csv;
-ERROR: SELECT FOR UPDATE/SHARE cannot be used with foreign table "agg_csv"
+ERROR: SELECT FOR UPDATE/SHARE/KEY UPDATE/KEY SHARE cannot be used with foreign table "agg_csv"
 LINE 1: SELECT * FROM agg_csv FOR UPDATE OF agg_csv;
                       ^
 -- but this should be ignored

‎contrib/pageinspect/heapfuncs.c

Lines changed: 1 addition & 1 deletion
@@ -163,7 +163,7 @@ heap_page_items(PG_FUNCTION_ARGS)
             tuphdr = (HeapTupleHeader) PageGetItem(page, id);
 
             values[4] = UInt32GetDatum(HeapTupleHeaderGetXmin(tuphdr));
-            values[5] = UInt32GetDatum(HeapTupleHeaderGetXmax(tuphdr));
+            values[5] = UInt32GetDatum(HeapTupleHeaderGetRawXmax(tuphdr));
             values[6] = UInt32GetDatum(HeapTupleHeaderGetRawCommandId(tuphdr)); /* shared with xvac */
             values[7] = PointerGetDatum(&tuphdr->t_ctid);
             values[8] = UInt32GetDatum(tuphdr->t_infomask2);

‎contrib/pg_upgrade/controldata.c

Lines changed: 46 additions & 0 deletions
@@ -40,6 +40,9 @@ get_control_data(ClusterInfo *cluster, bool live_check)
     bool        got_xid = false;
     bool        got_oid = false;
     bool        got_nextxlogfile = false;
+    bool        got_multi = false;
+    bool        got_mxoff = false;
+    bool        got_oldestmulti = false;
     bool        got_log_id = false;
     bool        got_log_seg = false;
     bool        got_tli = false;
@@ -246,6 +249,39 @@ get_control_data(ClusterInfo *cluster, bool live_check)
             cluster->controldata.chkpnt_nxtoid = str2uint(p);
             got_oid = true;
         }
+        else if ((p = strstr(bufin, "Latest checkpoint's NextMultiXactId:")) != NULL)
+        {
+            p = strchr(p, ':');
+
+            if (p == NULL || strlen(p) <= 1)
+                pg_log(PG_FATAL, "%d: controldata retrieval problem\n", __LINE__);
+
+            p++;                /* removing ':' char */
+            cluster->controldata.chkpnt_nxtmulti = str2uint(p);
+            got_multi = true;
+        }
+        else if ((p = strstr(bufin, "Latest checkpoint's oldestMultiXid:")) != NULL)
+        {
+            p = strchr(p, ':');
+
+            if (p == NULL || strlen(p) <= 1)
+                pg_log(PG_FATAL, "%d: controldata retrieval problem\n", __LINE__);
+
+            p++;                /* removing ':' char */
+            cluster->controldata.chkpnt_oldstMulti = str2uint(p);
+            got_oldestmulti = true;
+        }
+        else if ((p = strstr(bufin, "Latest checkpoint's NextMultiOffset:")) != NULL)
+        {
+            p = strchr(p, ':');
+
+            if (p == NULL || strlen(p) <= 1)
+                pg_log(PG_FATAL, "%d: controldata retrieval problem\n", __LINE__);
+
+            p++;                /* removing ':' char */
+            cluster->controldata.chkpnt_nxtmxoff = str2uint(p);
+            got_mxoff = true;
+        }
         else if ((p = strstr(bufin, "Maximum data alignment:")) != NULL)
         {
             p = strchr(p, ':');
@@ -433,6 +469,7 @@ get_control_data(ClusterInfo *cluster, bool live_check)
 
     /* verify that we got all the mandatory pg_control data */
     if (!got_xid || !got_oid ||
+        !got_multi || !got_mxoff || !got_oldestmulti ||
         (!live_check && !got_nextxlogfile) ||
         !got_tli ||
         !got_align || !got_blocksz || !got_largesz || !got_walsz ||
@@ -448,6 +485,15 @@ get_control_data(ClusterInfo *cluster, bool live_check)
         if (!got_oid)
             pg_log(PG_REPORT, " latest checkpoint next OID\n");
 
+        if (!got_multi)
+            pg_log(PG_REPORT, " latest checkpoint next MultiXactId\n");
+
+        if (!got_mxoff)
+            pg_log(PG_REPORT, " latest checkpoint next MultiXactOffset\n");
+
+        if (!got_oldestmulti)
+            pg_log(PG_REPORT, " latest checkpoint oldest MultiXactId\n");
+
         if (!live_check && !got_nextxlogfile)
             pg_log(PG_REPORT, " first WAL segment after reset\n");
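
The three new branches in get_control_data() share one pattern: find a label in a line of pg_controldata output, step past the colon, and convert the rest to an unsigned number. The following is a minimal standalone sketch of that pattern, not code from the patch. parse_control_field is a hypothetical helper, strtoul stands in for pg_upgrade's str2uint, and it returns false where the original calls pg_log(PG_FATAL, ...).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look for "label" in one line of pg_controldata-style output and parse the
 * unsigned number that follows the colon.  Returns true and stores the number
 * in *value on success; returns false if the label is absent or nothing
 * follows the colon.  Hypothetical helper, for illustration only.
 */
bool
parse_control_field(const char *line, const char *label, unsigned long *value)
{
    const char *p;

    assert(line != NULL && label != NULL && value != NULL);

    /* Locate the label anywhere in the line. */
    p = strstr(line, label);
    if (p == NULL)
        return false;

    /* Find the first ':' at or after the start of the label. */
    p = strchr(p, ':');
    if (p == NULL || strlen(p) <= 1)
        return false;

    p++;                        /* skip the ':' character */
    *value = strtoul(p, NULL, 10);      /* strtoul skips leading blanks */
    return true;
}
```

A caller would run this once per label over each line of output and set the matching got_* flag when it returns true.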
453499

‎contrib/pg_upgrade/pg_upgrade.c

Lines changed: 46 additions & 0 deletions
@@ -382,6 +382,52 @@ copy_clog_xlog_xid(void)
               new_cluster.pgdata);
     check_ok();
 
+    /*
+     * If both new and old are after the pg_multixact change commit, copy those
+     * files too. If the old server is before that change and the new server
+     * is after, then we don't copy anything but we need to reset pg_control so
+     * that the new server doesn't attempt to read multis older than the cutoff
+     * value.
+     */
+    if (old_cluster.controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER &&
+        new_cluster.controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER)
+    {
+        copy_subdir_files("pg_multixact/offsets");
+        copy_subdir_files("pg_multixact/members");
+        prep_status("Setting next multixact ID and offset for new cluster");
+        /*
+         * we preserve all files and contents, so we must preserve both "next"
+         * counters here and the oldest multi present on system.
+         */
+        exec_prog(UTILITY_LOG_FILE, NULL, true,
+                  "\"%s/pg_resetxlog\" -O %u -m %u,%u \"%s\"",
+                  new_cluster.bindir,
+                  old_cluster.controldata.chkpnt_nxtmxoff,
+                  old_cluster.controldata.chkpnt_nxtmulti,
+                  old_cluster.controldata.chkpnt_oldstMulti,
+                  new_cluster.pgdata);
+        check_ok();
+    }
+    else if (new_cluster.controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER)
+    {
+        prep_status("Setting oldest multixact ID on new cluster");
+        /*
+         * We don't preserve files in this case, but it's important that the
+         * oldest multi is set to the latest value used by the old system, so
+         * that multixact.c returns the empty set for multis that might be
+         * present on disk. We set next multi to the value following that; it
+         * might end up wrapped around (i.e. 0) if the old cluster had
+         * next=MaxMultiXactId, but multixact.c can cope with that just fine.
+         */
+        exec_prog(UTILITY_LOG_FILE, NULL, true,
+                  "\"%s/pg_resetxlog\" -m %u,%u \"%s\"",
+                  new_cluster.bindir,
+                  old_cluster.controldata.chkpnt_nxtmulti + 1,
+                  old_cluster.controldata.chkpnt_nxtmulti,
+                  new_cluster.pgdata);
+        check_ok();
+    }
+
     /* now reset the wal archives in the new cluster */
     prep_status("Resetting WAL archives");
     exec_prog(UTILITY_LOG_FILE, NULL, true,
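
The branch added to copy_clog_xlog_xid() chooses between three outcomes based on the two catalog versions. The sketch below isolates that decision and the pg_resetxlog options it produces, so the wraparound case in the second branch is easy to see. It is a rough illustration under stated assumptions, not the patch's code: multixact_reset_args and its return codes are hypothetical, and it only formats the option string rather than running anything.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MULTIXACT_FORMATCHANGE_CAT_VER 201301231

/*
 * Format the pg_resetxlog multixact options into buf.
 * Returns 2 when both clusters use the new format (files are copied and all
 * three counters preserved), 1 when only the new cluster does (counters are
 * reset, no files copied), and 0 when nothing needs to be done.
 * Hypothetical helper, for illustration only.
 */
int
multixact_reset_args(uint32_t old_cat_ver, uint32_t new_cat_ver,
                     uint32_t next_multi, uint32_t next_offset,
                     uint32_t oldest_multi, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);
    memset(buf, 0, buflen);

    if (old_cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER &&
        new_cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER)
    {
        /* Keep next offset, next multi, and oldest multi from the old cluster. */
        snprintf(buf, buflen, "-O %u -m %u,%u",
                 (unsigned int) next_offset,
                 (unsigned int) next_multi,
                 (unsigned int) oldest_multi);
        return 2;
    }
    else if (new_cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER)
    {
        /* Oldest multi becomes the old "next"; the new "next" follows it. */
        uint32_t following = (uint32_t) (next_multi + 1u);  /* may wrap to 0 */

        snprintf(buf, buflen, "-m %u,%u",
                 (unsigned int) following,
                 (unsigned int) next_multi);
        return 1;
    }

    return 0;
}
```

For example, with an old cluster at catversion 201107031 and next multi 100, the sketch yields "-m 101,100"; with next multi at the maximum 32-bit value, the first number wraps to 0, which the source comment says multixact.c can cope with.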

‎contrib/pg_upgrade/pg_upgrade.h

Lines changed: 7 additions & 0 deletions
@@ -108,6 +108,10 @@ extern char *output_files[];
  */
 #define VISIBILITY_MAP_CRASHSAFE_CAT_VER 201107031
 
+/*
+ * pg_multixact format changed in this catversion:
+ */
+#define MULTIXACT_FORMATCHANGE_CAT_VER 201301231
 
 /*
  * Each relation is represented by a relinfo structure.
@@ -182,6 +186,9 @@ typedef struct
     uint32      chkpnt_tli;
     uint32      chkpnt_nxtxid;
     uint32      chkpnt_nxtoid;
+    uint32      chkpnt_nxtmulti;
+    uint32      chkpnt_nxtmxoff;
+    uint32      chkpnt_oldstMulti;
     uint32      align;
     uint32      blocksz;
     uint32      largesz;

‎contrib/pgrowlocks/Makefile

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ MODULE_big = pgrowlocks
 OBJS = pgrowlocks.o
 
 EXTENSION = pgrowlocks
-DATA = pgrowlocks--1.0.sql pgrowlocks--unpackaged--1.0.sql
+DATA = pgrowlocks--1.1.sql pgrowlocks--1.0--1.1.sql pgrowlocks--unpackaged--1.0.sql
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config

contrib/pgrowlocks/pgrowlocks--1.0--1.1.sql

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+/* contrib/pgrowlocks/pgrowlocks--1.0--1.1.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION pgrowlocks" to load this file. \quit
+
+ALTER EXTENSION pgrowlocks DROP FUNCTION pgrowlocks(text);
+DROP FUNCTION pgrowlocks(text);
+CREATE FUNCTION pgrowlocks(IN relname text,
+    OUT locked_row TID,     -- row TID
+    OUT locker XID,         -- locking XID
+    OUT multi bool,         -- multi XID?
+    OUT xids xid[],         -- multi XIDs
+    OUT modes text[],       -- multi XID statuses
+    OUT pids INTEGER[])     -- locker's process id
+RETURNS SETOF record
+AS 'MODULE_PATHNAME', 'pgrowlocks'
+LANGUAGE C STRICT;

contrib/pgrowlocks/pgrowlocks--1.0.sql renamed to contrib/pgrowlocks/pgrowlocks--1.1.sql

Lines changed: 2 additions & 2 deletions
@@ -1,14 +1,14 @@
-/* contrib/pgrowlocks/pgrowlocks--1.0.sql */
+/* contrib/pgrowlocks/pgrowlocks--1.1.sql */
 
 -- complain if script is sourced in psql, rather than via CREATE EXTENSION
 \echo Use "CREATE EXTENSION pgrowlocks" to load this file. \quit
 
 CREATE FUNCTION pgrowlocks(IN relname text,
     OUT locked_row TID,     -- row TID
-    OUT lock_type TEXT,     -- lock type
     OUT locker XID,         -- locking XID
     OUT multi bool,         -- multi XID?
     OUT xids xid[],         -- multi XIDs
+    OUT modes text[],       -- multi XID statuses
     OUT pids INTEGER[])     -- locker's process id
 RETURNS SETOF record
 AS 'MODULE_PATHNAME', 'pgrowlocks'
