Commit fccebe4

Use SnapshotDirty rather than an active snapshot to probe index endpoints.
If there are lots of uncommitted tuples at the end of the index range, get_actual_variable_range() ends up fetching each one and doing an MVCC visibility check on it, until it finally hits a visible tuple. This is bad enough in isolation, considering that we don't need an exact answer, only an approximate one. But because the tuples are not yet committed, each visibility check does a TransactionIdIsInProgress() test, which involves scanning the ProcArray. When multiple sessions do this concurrently, the ensuing contention results in horrid performance loss. 20X overall throughput loss on not-too-complicated queries is easy to demonstrate in the back branches (though someone's made it noticeably less bad in HEAD).

We can dodge the problem fairly effectively by using SnapshotDirty rather than a normal MVCC snapshot. This will cause the index probe to take uncommitted tuples as good, so that we incur only one tuple fetch and test even if there are many such tuples. The extent to which this degrades the estimate is debatable: it's possible the result is actually a more accurate prediction than before, if the endmost tuple has become committed by the time we actually execute the query being planned. In any case, it's not very likely that it makes the estimate a lot worse.

SnapshotDirty will still reject tuples that are known committed dead, so we won't give bogus answers if an invalid outlier has been deleted but not yet vacuumed from the index. (Because btrees know how to mark such tuples dead in the index, we shouldn't have a big performance problem in the case that there are many of them at the end of the range.) This consideration motivates not using SnapshotAny, which was also considered as a fix.

Note: the back branches were using SnapshotNow instead of an MVCC snapshot, but the problem and solution are the same.

Per performance complaints from Bartlomiej Romanski, Josh Berkus, and others. Back-patch to 9.0, where the issue was introduced (by commit 40608e7).
1 parent cf6aa68 · commit fccebe4

File tree

1 file changed: +21 −5 lines


src/backend/utils/adt/selfuncs.c

Lines changed: 21 additions & 5 deletions
@@ -133,7 +133,6 @@
 #include "utils/pg_locale.h"
 #include "utils/rel.h"
 #include "utils/selfuncs.h"
-#include "utils/snapmgr.h"
 #include "utils/spccache.h"
 #include "utils/syscache.h"
 #include "utils/timestamp.h"
@@ -4962,6 +4961,7 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
 			HeapTuple	tup;
 			Datum		values[INDEX_MAX_KEYS];
 			bool		isnull[INDEX_MAX_KEYS];
+			SnapshotData SnapshotDirty;
 
 			estate = CreateExecutorState();
 			econtext = GetPerTupleExprContext(estate);
@@ -4984,6 +4984,7 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
 			slot = MakeSingleTupleTableSlot(RelationGetDescr(heapRel));
 			econtext->ecxt_scantuple = slot;
 			get_typlenbyval(vardata->atttype, &typLen, &typByVal);
+			InitDirtySnapshot(SnapshotDirty);
 
 			/* set up an IS NOT NULL scan key so that we ignore nulls */
 			ScanKeyEntryInitialize(&scankeys[0],
@@ -5000,8 +5001,23 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
 			/* If min is requested ... */
 			if (min)
 			{
-				index_scan = index_beginscan(heapRel, indexRel,
-											 GetActiveSnapshot(), 1, 0);
+				/*
+				 * In principle, we should scan the index with our current
+				 * active snapshot, which is the best approximation we've got
+				 * to what the query will see when executed.  But that won't
+				 * be exact if a new snap is taken before running the query,
+				 * and it can be very expensive if a lot of uncommitted rows
+				 * exist at the end of the index (because we'll laboriously
+				 * fetch each one and reject it).  What seems like a good
+				 * compromise is to use SnapshotDirty.  That will accept
+				 * uncommitted rows, and thus avoid fetching multiple heap
+				 * tuples in this scenario.  On the other hand, it will reject
+				 * known-dead rows, and thus not give a bogus answer when the
+				 * extreme value has been deleted; that case motivates not
+				 * using SnapshotAny here.
+				 */
+				index_scan = index_beginscan(heapRel, indexRel, &SnapshotDirty,
+											 1, 0);
 				index_rescan(index_scan, scankeys, 1, NULL, 0);
 
 				/* Fetch first tuple in sortop's direction */
@@ -5032,8 +5048,8 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
 			/* If max is requested, and we didn't find the index is empty */
 			if (max && have_data)
 			{
-				index_scan = index_beginscan(heapRel, indexRel,
-											 GetActiveSnapshot(), 1, 0);
+				index_scan = index_beginscan(heapRel, indexRel, &SnapshotDirty,
+											 1, 0);
 				index_rescan(index_scan, scankeys, 1, NULL, 0);
 
 				/* Fetch first tuple in reverse direction */

