Commit d364e88

Fix ancient thinko in mergejoin cost estimation.

"rescanratio" was computed as 1 + rescanned-tuples / total-inner-tuples, which is sensible if it's to be multiplied by total-inner-tuples or a cost value corresponding to scanning all the inner tuples. But in reality it was (mostly) multiplied by inner_rows or a related cost, numbers that take into account the possibility of stopping short of scanning the whole inner relation thanks to a limited key range in the outer relation. This'd still make sense if we could expect that stopping short would result in a proportional decrease in the number of tuples that have to be rescanned. It does not, however. The argument that establishes the validity of our estimate for that number is independent of whether we scan all of the inner relation or stop short, and experimentation also shows that stopping short doesn't reduce the number of rescanned tuples. So the correct calculation is 1 + rescanned-tuples / inner_rows, and we should be sure to multiply that by inner_rows or a corresponding cost value.

Most of the time this doesn't make much difference, but if we have both a high rescan rate (due to lots of duplicate values) and an outer key range much smaller than the inner key range, then the error can be significant, leading to a large underestimate of the cost associated with rescanning.

Per report from Vijaykumar Jain. This thinko appears to go all the way back to the introduction of the rescan estimation logic in commit 70fba70, so back-patch to all supported branches.

Discussion: https://postgr.es/m/CAE7uO5hMb_TZYJcZmLAgO6iD68AkEK6qCe7i=vZUkCpoKns+EQ@mail.gmail.com

1 parent f94cec6, commit d364e88
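
As a rough illustration of the arithmetic described in the commit message, the sketch below compares the two ratio formulas. The helper names and every number used with them are hypothetical and do not appear in costsize.c; only the two ratio expressions are taken from the diff.

```c
#include <assert.h>

/*
 * Hypothetical helper: the pre-patch ratio, which divides by the size of the
 * whole inner relation (inner_path_rows).
 */
static double
rescanratio_old(double rescannedtuples, double inner_path_rows)
{
    assert(inner_path_rows > 0);
    return 1.0 + rescannedtuples / inner_path_rows;
}

/*
 * Hypothetical helper: the patched ratio, which divides by the number of
 * inner tuples the merge is actually expected to scan (inner_rows).
 */
static double
rescanratio_new(double rescannedtuples, double inner_rows)
{
    assert(inner_rows > 0);
    return 1.0 + rescannedtuples / inner_rows;
}

/*
 * Hypothetical helper: tuple fetches implied by multiplying a ratio by
 * inner_rows, which is how the ratio is (mostly) applied.
 */
static double
inner_tuples_fetched(double inner_rows, double rescanratio)
{
    return inner_rows * rescanratio;
}
```

Taking made-up inputs of inner_path_rows = 1,000,000, inner_rows = 10,000 and rescannedtuples = 500,000, the old ratio is 1.5 and implies 15,000 fetches, which is fewer than the rescans alone. The patched ratio is 51 and implies 510,000 fetches, which is exactly inner_rows + rescannedtuples.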

1 file changed, +8 -3 lines changed

src/backend/optimizer/path/costsize.c

Lines changed: 8 additions & 3 deletions
@@ -2941,8 +2941,13 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
         if (rescannedtuples < 0)
             rescannedtuples = 0;
     }
-    /* We'll inflate various costs this much to account for rescanning */
-    rescanratio = 1.0 + (rescannedtuples / inner_path_rows);
+
+    /*
+     * We'll inflate various costs this much to account for rescanning. Note
+     * that this is to be multiplied by something involving inner_rows, or
+     * another number related to the portion of the inner rel we'll scan.
+     */
+    rescanratio = 1.0 + (rescannedtuples / inner_rows);
 
     /*
      * Decide whether we want to materialize the inner input to shield it from
@@ -2969,7 +2974,7 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
      * of the generated Material node.
      */
     mat_inner_cost = inner_run_cost +
-        cpu_operator_cost * inner_path_rows * rescanratio;
+        cpu_operator_cost * inner_rows * rescanratio;
 
     /*
      * If we don't need mark/restore at all, we don't need materialization.
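
The second hunk can be checked with a small sketch of the patched mat_inner_cost expression. The function name and the input values are hypothetical; 0.0025 is PostgreSQL's documented default for cpu_operator_cost, and inner_run_cost = 200 is an arbitrary placeholder.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the patched expression
 *     inner_run_cost + cpu_operator_cost * inner_rows * rescanratio
 * with rescanratio = 1.0 + rescannedtuples / inner_rows computed inline.
 */
static double
mat_inner_cost_sketch(double inner_run_cost, double cpu_operator_cost,
                      double inner_rows, double rescannedtuples)
{
    double rescanratio;

    assert(inner_rows > 0 && rescannedtuples >= 0);
    rescanratio = 1.0 + rescannedtuples / inner_rows;
    return inner_run_cost + cpu_operator_cost * inner_rows * rescanratio;
}
```

Because inner_rows * (1 + rescannedtuples / inner_rows) equals inner_rows + rescannedtuples, the per-tuple charge covers each scanned tuple once plus each rescanned tuple once. For example, mat_inner_cost_sketch(200.0, 0.0025, 10000.0, 500000.0) works out to 200 + 0.0025 * 510,000 = 1475, up to floating-point rounding.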
