Commit dc7420c

snapshot scalability: Don't compute global horizons while building snapshots.
To make GetSnapshotData() more scalable, it cannot look at each proc's
xmin: While snapshot contents do not need to change whenever a read-only
transaction commits or a snapshot is released, a proc's xmin is modified in
those cases. The frequency of xmin modifications leads to, particularly on
higher core count systems, many cache misses inside GetSnapshotData(), despite
the data underlying a snapshot not changing. That is the most
significant source of GetSnapshotData() scaling poorly on larger systems.

Without accessing xmins, GetSnapshotData() cannot calculate accurate horizons /
thresholds as it has so far. But we don't really have to: The horizons don't
actually change that much between GetSnapshotData() calls. Nor are the horizons
actually used every time a snapshot is built.

The trick this commit introduces is to delay computation of accurate horizons
until their use, and to use horizon boundaries to determine whether accurate
horizons need to be computed.

The use of RecentGlobal[Data]Xmin to decide whether a row version could be
removed has been replaced with new GlobalVisTest* functions. These use two
thresholds to determine whether a row can be pruned:
1) definitely_needed, indicating that rows deleted by XIDs >= definitely_needed
   are definitely still visible.
2) maybe_needed, indicating that rows deleted by XIDs < maybe_needed can
   definitely be removed.
GetSnapshotData() updates definitely_needed to be the xmin of the computed
snapshot.

When testing whether a row can be removed (with GlobalVisTestIsRemovableXid())
and the tested XID falls in between the two (i.e. XID >= maybe_needed && XID <
definitely_needed) the boundaries can be recomputed to be more accurate. As it
is not cheap to compute accurate boundaries, we limit the number of times that
happens in short succession. As the boundaries used by
GlobalVisTestIsRemovableXid() are never reset (with maybe_needed updated by
GetSnapshotData()), it is likely that further tests can benefit from an earlier
computation of accurate horizons.

To avoid regressing performance when old_snapshot_threshold is set (as that
requires an accurate horizon to be computed), heap_page_prune_opt() doesn't
unconditionally call TransactionIdLimitedForOldSnapshots() anymore. Both the
computation of the limited horizon, and the triggering of errors (with
SetOldSnapshotThresholdTimestamp()) are now only done when necessary to remove
tuples.

This commit just removes the accesses to PGXACT->xmin from
GetSnapshotData(), but other members of PGXACT residing in the same
cache line are accessed. Therefore this in itself does not result in a
significant improvement. Subsequent commits will take advantage of the
fact that GetSnapshotData() now does not need to access xmins anymore.

Note: This contains a workaround in heap_page_prune_opt() to keep the
snapshot_too_old tests working. While that workaround is ugly, the tests
currently are not meaningful, and it seems best to address them separately.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/20200301083601.ews6hz5dduc3w2se@alap3.anarazel.de
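
The two-threshold test described in the commit message can be sketched in isolation. What follows is a minimal illustration, not the code from this commit: `Xid`, `HorizonBounds`, `bounds_is_removable()` and `demo_accurate_horizon()` are hypothetical names, the expensive horizon computation is replaced by a callback, and the rate limiting of recomputations mentioned above is left out. The comparison uses the same modular arithmetic idea as TransactionIdPrecedes(), ignoring special XIDs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Xid;			/* stand-in for TransactionId (assumption) */

/* Modular "a logically precedes b"; special XIDs are ignored in this sketch. */
bool
xid_precedes(Xid a, Xid b)
{
	return (int32_t) (a - b) < 0;
}

typedef struct HorizonBounds
{
	Xid			maybe_needed;		/* XIDs before this are surely removable */
	Xid			definitely_needed;	/* XIDs at or after this are surely needed */
	int			recomputations;		/* how often the expensive path was taken */
} HorizonBounds;

/* Hypothetical callback standing in for the expensive accurate computation. */
typedef Xid (*AccurateHorizonFn) (void);

/* Example callback for demonstration only: pretends the horizon is 150. */
Xid
demo_accurate_horizon(void)
{
	return 150;
}

bool
bounds_is_removable(HorizonBounds *b, Xid xid, AccurateHorizonFn compute)
{
	assert(!xid_precedes(b->definitely_needed, b->maybe_needed));

	/* Cheap answers first: outside the uncertain range nothing is computed. */
	if (xid_precedes(xid, b->maybe_needed))
		return true;
	if (!xid_precedes(xid, b->definitely_needed))
		return false;

	/* Uncertain range: pay for an accurate horizon and remember the result. */
	Xid			accurate = compute();

	b->recomputations++;
	if (xid_precedes(b->maybe_needed, accurate))
		b->maybe_needed = accurate;
	if (xid_precedes(b->definitely_needed, b->maybe_needed))
		b->definitely_needed = b->maybe_needed;

	return xid_precedes(xid, b->maybe_needed);
}
```

Because the bounds are kept rather than reset, a later test for an XID below the raised maybe_needed is answered without calling the callback again, which is the benefit the message attributes to earlier computations.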
1 parent 1f42d35, commit dc7420c

38 files changed: 1462 additions, 566 deletions


‎contrib/amcheck/verify_nbtree.c

Lines changed: 4 additions & 4 deletions
@@ -434,10 +434,10 @@ bt_check_every_level(Relation rel, Relation heaprel, bool heapkeyspace,
 						RelationGetRelationName(rel));
 
 	/*
-	 * RecentGlobalXmin assertion matches index_getnext_tid(). See note on
-	 * RecentGlobalXmin/B-Tree page deletion.
+	 * This assertion matches the one in index_getnext_tid(). See page
+	 * recycling/"visible to everyone" notes in nbtree README.
 	 */
-	Assert(TransactionIdIsValid(RecentGlobalXmin));
+	Assert(TransactionIdIsValid(RecentXmin));
 
 	/*
 	 * Initialize state for entire verification operation
@@ -1581,7 +1581,7 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 	 * does not occur until no possible index scan could land on the page.
 	 * Index scans can follow links with nothing more than their snapshot as
 	 * an interlock and be sure of at least that much. (See page
-	 * recycling/RecentGlobalXmin notes in nbtree README.)
+	 * recycling/"visible to everyone" notes in nbtree README.)
 	 *
 	 * Furthermore, it's okay if we follow a rightlink and find a half-dead or
 	 * dead (ignorable) page one or more times. There will either be a

‎contrib/pg_visibility/pg_visibility.c

Lines changed: 8 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -563,17 +563,14 @@ collect_corrupt_items(Oid relid, bool all_visible, bool all_frozen)
 	BufferAccessStrategy bstrategy = GetAccessStrategy(BAS_BULKREAD);
 	TransactionId OldestXmin = InvalidTransactionId;
 
-	if (all_visible)
-	{
-		/* Don't pass rel; that will fail in recovery. */
-		OldestXmin = GetOldestXmin(NULL, PROCARRAY_FLAGS_VACUUM);
-	}
-
 	rel = relation_open(relid, AccessShareLock);
 
 	/* Only some relkinds have a visibility map */
 	check_relation_relkind(rel);
 
+	if (all_visible)
+		OldestXmin = GetOldestNonRemovableTransactionId(rel);
+
 	nblocks = RelationGetNumberOfBlocks(rel);
 
 	/*
@@ -679,11 +676,12 @@ collect_corrupt_items(Oid relid, bool all_visible, bool all_frozen)
 				 * From a concurrency point of view, it sort of sucks to
 				 * retake ProcArrayLock here while we're holding the buffer
 				 * exclusively locked, but it should be safe against
-				 * deadlocks, because surely GetOldestXmin() should never take
-				 * a buffer lock. And this shouldn't happen often, so it's
-				 * worth being careful so as to avoid false positives.
+				 * deadlocks, because surely
+				 * GetOldestNonRemovableTransactionId() should never take a
+				 * buffer lock. And this shouldn't happen often, so it's worth
+				 * being careful so as to avoid false positives.
 				 */
-				RecomputedOldestXmin = GetOldestXmin(NULL, PROCARRAY_FLAGS_VACUUM);
+				RecomputedOldestXmin = GetOldestNonRemovableTransactionId(rel);
 
 				if (!TransactionIdPrecedes(OldestXmin, RecomputedOldestXmin))
 					record_corrupt_item(items, &tuple.t_self);

‎contrib/pgstattuple/pgstatapprox.c

Lines changed: 1 addition & 1 deletion
@@ -71,7 +71,7 @@ statapprox_heap(Relation rel, output_type *stat)
 	BufferAccessStrategy bstrategy;
 	TransactionId OldestXmin;
 
-	OldestXmin = GetOldestXmin(rel, PROCARRAY_FLAGS_VACUUM);
+	OldestXmin = GetOldestNonRemovableTransactionId(rel);
 	bstrategy = GetAccessStrategy(BAS_BULKREAD);
 
 	nblocks = RelationGetNumberOfBlocks(rel);

‎src/backend/access/gin/ginvacuum.c

Lines changed: 26 additions & 0 deletions
@@ -793,3 +793,29 @@ ginvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 
 	return stats;
 }
+
+/*
+ * Return whether Page can safely be recycled.
+ */
+bool
+GinPageIsRecyclable(Page page)
+{
+	TransactionId delete_xid;
+
+	if (PageIsNew(page))
+		return true;
+
+	if (!GinPageIsDeleted(page))
+		return false;
+
+	delete_xid = GinPageGetDeleteXid(page);
+
+	if (!TransactionIdIsValid(delete_xid))
+		return true;
+
+	/*
+	 * If no backend still could view delete_xid as in running, all scans
+	 * concurrent with ginDeletePage() must have finished.
+	 */
+	return GlobalVisCheckRemovableXid(NULL, delete_xid);
+}

‎src/backend/access/gist/gistutil.c

Lines changed: 3 additions & 5 deletions
@@ -891,15 +891,13 @@ gistPageRecyclable(Page page)
 		 * As long as that can happen, we must keep the deleted page around as
 		 * a tombstone.
 		 *
-		 * Compare the deletion XID with RecentGlobalXmin. If deleteXid <
-		 * RecentGlobalXmin, then no scan that's still in progress could have
+		 * For that check if the deletion XID could still be visible to
+		 * anyone. If not, then no scan that's still in progress could have
 		 * seen its downlink, and we can recycle it.
 		 */
 		FullTransactionId deletexid_full = GistPageGetDeleteXid(page);
-		FullTransactionId recentxmin_full = GetFullRecentGlobalXmin();
 
-		if (FullTransactionIdPrecedes(deletexid_full, recentxmin_full))
-			return true;
+		return GlobalVisIsRemovableFullXid(NULL, deletexid_full);
 	}
 	return false;
 }

‎src/backend/access/gist/gistxlog.c

Lines changed: 5 additions & 5 deletions
@@ -387,11 +387,11 @@ gistRedoPageReuse(XLogReaderState *record)
 	 * PAGE_REUSE records exist to provide a conflict point when we reuse
 	 * pages in the index via the FSM. That's all they do though.
 	 *
-	 * latestRemovedXid was the page's deleteXid. The deleteXid <
-	 * RecentGlobalXmin test in gistPageRecyclable() conceptually mirrors the
-	 * pgxact->xmin > limitXmin test in GetConflictingVirtualXIDs().
-	 * Consequently, one XID value achieves the same exclusion effect on
-	 * primary and standby.
+	 * latestRemovedXid was the page's deleteXid. The
+	 * GlobalVisIsRemovableFullXid(deleteXid) test in gistPageRecyclable()
+	 * conceptually mirrors the pgxact->xmin > limitXmin test in
+	 * GetConflictingVirtualXIDs(). Consequently, one XID value achieves the
+	 * same exclusion effect on primary and standby.
 	 */
 	if (InHotStandby)
 	{

‎src/backend/access/heap/heapam.c

Lines changed: 11 additions & 4 deletions
@@ -1517,6 +1517,7 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 	bool		at_chain_start;
 	bool		valid;
 	bool		skip;
+	GlobalVisState *vistest = NULL;
 
 	/* If this is not the first call, previous call returned a (live!) tuple */
 	if (all_dead)
@@ -1527,7 +1528,8 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 	at_chain_start = first_call;
 	skip = !first_call;
 
-	Assert(TransactionIdIsValid(RecentGlobalXmin));
+	/* XXX: we should assert that a snapshot is pushed or registered */
+	Assert(TransactionIdIsValid(RecentXmin));
 	Assert(BufferGetBlockNumber(buffer) == blkno);
 
 	/* Scan through possible multiple members of HOT-chain */
@@ -1616,9 +1618,14 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 		 * Note: if you change the criterion here for what is "dead", fix the
 		 * planner's get_actual_variable_range() function to match.
 		 */
-		if (all_dead && *all_dead &&
-			!HeapTupleIsSurelyDead(heapTuple, RecentGlobalXmin))
-			*all_dead = false;
+		if (all_dead && *all_dead)
+		{
+			if (!vistest)
+				vistest = GlobalVisTestFor(relation);
+
+			if (!HeapTupleIsSurelyDead(heapTuple, vistest))
+				*all_dead = false;
+		}
 
 		/*
 		 * Check to see if HOT chain continues past this tuple; if so fetch

‎src/backend/access/heap/heapam_handler.c

Lines changed: 12 additions & 12 deletions
@@ -1203,7 +1203,7 @@ heapam_index_build_range_scan(Relation heapRelation,
 
 	/* okay to ignore lazy VACUUMs here */
 	if (!IsBootstrapProcessingMode() && !indexInfo->ii_Concurrent)
-		OldestXmin = GetOldestXmin(heapRelation, PROCARRAY_FLAGS_VACUUM);
+		OldestXmin = GetOldestNonRemovableTransactionId(heapRelation);
 
 	if (!scan)
 	{
@@ -1244,6 +1244,17 @@ heapam_index_build_range_scan(Relation heapRelation,
 
 	hscan = (HeapScanDesc) scan;
 
+	/*
+	 * Must have called GetOldestNonRemovableTransactionId() if using
+	 * SnapshotAny. Shouldn't have for an MVCC snapshot. (It's especially
+	 * worth checking this for parallel builds, since ambuild routines that
+	 * support parallel builds must work these details out for themselves.)
+	 */
+	Assert(snapshot == SnapshotAny || IsMVCCSnapshot(snapshot));
+	Assert(snapshot == SnapshotAny ? TransactionIdIsValid(OldestXmin) :
+		   !TransactionIdIsValid(OldestXmin));
+	Assert(snapshot == SnapshotAny || !anyvisible);
+
 	/* Publish number of blocks to scan */
 	if (progress)
 	{
@@ -1263,17 +1274,6 @@ heapam_index_build_range_scan(Relation heapRelation,
 									 nblocks);
 	}
 
-	/*
-	 * Must call GetOldestXmin() with SnapshotAny. Should never call
-	 * GetOldestXmin() with MVCC snapshot. (It's especially worth checking
-	 * this for parallel builds, since ambuild routines that support parallel
-	 * builds must work these details out for themselves.)
-	 */
-	Assert(snapshot == SnapshotAny || IsMVCCSnapshot(snapshot));
-	Assert(snapshot == SnapshotAny ? TransactionIdIsValid(OldestXmin) :
-		   !TransactionIdIsValid(OldestXmin));
-	Assert(snapshot == SnapshotAny || !anyvisible);
-
 	/* set our scan endpoints */
 	if (!allow_sync)
 		heap_setscanlimits(scan, start_blockno, numblocks);

‎src/backend/access/heap/heapam_visibility.c

Lines changed: 73 additions & 26 deletions
@@ -1154,19 +1154,56 @@ HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
  *	we mainly want to know is if a tuple is potentially visible to *any*
  *	running transaction. If so, it can't be removed yet by VACUUM.
  *
- * OldestXmin is a cutoff XID (obtained from GetOldestXmin()). Tuples
- * deleted by XIDs >= OldestXmin are deemed "recently dead"; they might
- * still be visible to some open transaction, so we can't remove them,
- * even if we see that the deleting transaction has committed.
+ * OldestXmin is a cutoff XID (obtained from
+ * GetOldestNonRemovableTransactionId()). Tuples deleted by XIDs >=
+ * OldestXmin are deemed "recently dead"; they might still be visible to some
+ * open transaction, so we can't remove them, even if we see that the deleting
+ * transaction has committed.
  */
 HTSV_Result
 HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
 						 Buffer buffer)
+{
+	TransactionId dead_after = InvalidTransactionId;
+	HTSV_Result res;
+
+	res = HeapTupleSatisfiesVacuumHorizon(htup, buffer, &dead_after);
+
+	if (res == HEAPTUPLE_RECENTLY_DEAD)
+	{
+		Assert(TransactionIdIsValid(dead_after));
+
+		if (TransactionIdPrecedes(dead_after, OldestXmin))
+			res = HEAPTUPLE_DEAD;
+	}
+	else
+		Assert(!TransactionIdIsValid(dead_after));
+
+	return res;
+}
+
+/*
+ * Work horse for HeapTupleSatisfiesVacuum and similar routines.
+ *
+ * In contrast to HeapTupleSatisfiesVacuum this routine, when encountering a
+ * tuple that could still be visible to some backend, stores the xid that
+ * needs to be compared with the horizon in *dead_after, and returns
+ * HEAPTUPLE_RECENTLY_DEAD. The caller then can perform the comparison with
+ * the horizon. This is e.g. useful when comparing with different horizons.
+ *
+ * Note: HEAPTUPLE_DEAD can still be returned here, e.g. if the inserting
+ * transaction aborted.
+ */
+HTSV_Result
+HeapTupleSatisfiesVacuumHorizon(HeapTuple htup, Buffer buffer, TransactionId *dead_after)
 {
 	HeapTupleHeader tuple = htup->t_data;
 
 	Assert(ItemPointerIsValid(&htup->t_self));
 	Assert(htup->t_tableOid != InvalidOid);
+	Assert(dead_after != NULL);
+
+	*dead_after = InvalidTransactionId;
 
 	/*
 	 * Has inserting transaction committed?
@@ -1323,17 +1360,15 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
 		else if (TransactionIdDidCommit(xmax))
 		{
 			/*
-			 * The multixact might still be running due to lockers. If the
-			 * updater is below the xid horizon, we have to return DEAD
-			 * regardless -- otherwise we could end up with a tuple where the
-			 * updater has to be removed due to the horizon, but is not pruned
-			 * away. It's not a problem to prune that tuple, because any
-			 * remaining lockers will also be present in newer tuple versions.
+			 * The multixact might still be running due to lockers. Need to
+			 * allow for pruning if below the xid horizon regardless --
+			 * otherwise we could end up with a tuple where the updater has to
+			 * be removed due to the horizon, but is not pruned away. It's
+			 * not a problem to prune that tuple, because any remaining
+			 * lockers will also be present in newer tuple versions.
 			 */
-			if (!TransactionIdPrecedes(xmax, OldestXmin))
-				return HEAPTUPLE_RECENTLY_DEAD;
-
-			return HEAPTUPLE_DEAD;
+			*dead_after = xmax;
+			return HEAPTUPLE_RECENTLY_DEAD;
 		}
 		else if (!MultiXactIdIsRunning(HeapTupleHeaderGetRawXmax(tuple), false))
 		{
@@ -1372,14 +1407,11 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
 	}
 
 	/*
-	 * Deleter committed, but perhaps it was recent enough that some open
-	 * transactions could still see the tuple.
+	 * Deleter committed, allow caller to check if it was recent enough that
+	 * some open transactions could still see the tuple.
 	 */
-	if (!TransactionIdPrecedes(HeapTupleHeaderGetRawXmax(tuple), OldestXmin))
-		return HEAPTUPLE_RECENTLY_DEAD;
-
-	/* Otherwise, it's dead and removable */
-	return HEAPTUPLE_DEAD;
+	*dead_after = HeapTupleHeaderGetRawXmax(tuple);
+	return HEAPTUPLE_RECENTLY_DEAD;
 }
 
 
@@ -1393,14 +1425,28 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
  *
  *	This is an interface to HeapTupleSatisfiesVacuum that's callable via
  *	HeapTupleSatisfiesSnapshot, so it can be used through a Snapshot.
- *	snapshot->xmin must have been set up with the xmin horizon to use.
+ *	snapshot->vistest must have been set up with the horizon to use.
  */
 static bool
 HeapTupleSatisfiesNonVacuumable(HeapTuple htup, Snapshot snapshot,
 								Buffer buffer)
 {
-	return HeapTupleSatisfiesVacuum(htup, snapshot->xmin, buffer)
-		!= HEAPTUPLE_DEAD;
+	TransactionId dead_after = InvalidTransactionId;
+	HTSV_Result res;
+
+	res = HeapTupleSatisfiesVacuumHorizon(htup, buffer, &dead_after);
+
+	if (res == HEAPTUPLE_RECENTLY_DEAD)
+	{
+		Assert(TransactionIdIsValid(dead_after));
+
+		if (GlobalVisTestIsRemovableXid(snapshot->vistest, dead_after))
+			res = HEAPTUPLE_DEAD;
+	}
+	else
+		Assert(!TransactionIdIsValid(dead_after));
+
+	return res != HEAPTUPLE_DEAD;
 }
 
 
@@ -1418,7 +1464,7 @@ HeapTupleSatisfiesNonVacuumable(HeapTuple htup, Snapshot snapshot,
  *	if the tuple is removable.
  */
 bool
-HeapTupleIsSurelyDead(HeapTuple htup, TransactionId OldestXmin)
+HeapTupleIsSurelyDead(HeapTuple htup, GlobalVisState *vistest)
 {
 	HeapTupleHeader tuple = htup->t_data;
 
@@ -1459,7 +1505,8 @@ HeapTupleIsSurelyDead(HeapTuple htup, TransactionId OldestXmin)
 		return false;
 
 	/* Deleter committed, so tuple is dead if the XID is old enough. */
-	return TransactionIdPrecedes(HeapTupleHeaderGetRawXmax(tuple), OldestXmin);
+	return GlobalVisTestIsRemovableXid(vistest,
+									   HeapTupleHeaderGetRawXmax(tuple));
 }
 
 /*
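
The refactoring in heapam_visibility.c separates "which XID decides this tuple's fate" from "how does that XID compare with a horizon". A minimal standalone sketch of that split follows. It assumes a toy tuple with only a committed-deleter flag and an xmax, and every name in it (`ToyTuple`, `classify_without_horizon()`, `classify_with_cutoff()`, `TupleStatus`) is hypothetical; it does not reproduce the hint-bit, multixact or in-progress handling of the real routines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;			/* stand-in for TransactionId (assumption) */
#define INVALID_XID ((Xid) 0)

typedef enum
{
	TUPLE_LIVE,
	TUPLE_RECENTLY_DEAD,
	TUPLE_DEAD
} TupleStatus;

/* Toy tuple: only the facts this sketch needs. */
typedef struct ToyTuple
{
	bool		deleter_committed;
	Xid			xmax;
} ToyTuple;

/* Modular "a logically precedes b"; special XIDs are ignored in this sketch. */
bool
xid_precedes(Xid a, Xid b)
{
	return (int32_t) (a - b) < 0;
}

/*
 * Horizon-free classification: for a tuple whose deleter committed, report
 * RECENTLY_DEAD and hand the deciding XID back through *dead_after.
 */
TupleStatus
classify_without_horizon(const ToyTuple *tup, Xid *dead_after)
{
	assert(dead_after != NULL);
	*dead_after = INVALID_XID;

	if (!tup->deleter_committed)
		return TUPLE_LIVE;

	*dead_after = tup->xmax;
	return TUPLE_RECENTLY_DEAD;
}

/* One possible caller: compare against a fixed cutoff XID. */
TupleStatus
classify_with_cutoff(const ToyTuple *tup, Xid oldest_xmin)
{
	Xid			dead_after;
	TupleStatus res = classify_without_horizon(tup, &dead_after);

	if (res == TUPLE_RECENTLY_DEAD && xid_precedes(dead_after, oldest_xmin))
		res = TUPLE_DEAD;

	return res;
}
```

A second caller could pass the returned XID to a lazily computed horizon test instead of a fixed cutoff, which is how the diff above uses the same work horse from both HeapTupleSatisfiesVacuum() and HeapTupleSatisfiesNonVacuumable().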
