Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commit4a29eab

Browse files
committed
Remove pessimistic cost penalization from Incremental Sort
When incremental sorts were added in v13 a 1.5x pessimism factor was addedto the cost modal. Seemingly this was done because the cost modal onlyhas an estimate of the total number of input rows and the number ofpresorted groups. It assumes that the input rows will be evenlydistributed throughout the presorted groups. The 1.5x pessimism factorwas added to slightly reduce the likelihood of incremental sorts beingused in the hope to avoid performance regressions where an incrementalsort plan was picked and turned out slower due to a large skew in thenumber of rows in the presorted groups.An additional quirk with the path generation code meant that we couldconsider both a sort and an incremental sort on paths with presorted keys.This meant that with the pessimism factor, it was possible that we optedto perform a sort rather than an incremental sort when the given path hadpresorted keys.Here we remove the 1.5x pessimism factor to allow incremental sorts tohave a fairer chance at being chosen against a full sort.Previously we would generally create a sort path on the cheapest inputpath (if that wasn't sorted already) and incremental sort paths on anypath which had presorted keys. This meant that if the cheapest input pathwasn't completely sorted but happened to have presorted keys, we wouldcreate a full sort path *and* an incremental sort path on that input path.Here we change this logic so that if there are presorted keys, we onlycreate an incremental sort path, and create sort paths only when a fullsort is required.Both the removal of the cost pessimism factor and the changes made to thepath generation make it more likely that incremental sorts will now bechosen. That, of course, as with teaching the planner any new tricks,means an increased likelihood that the planner will perform an incrementalsort when it's not the best method. Our standard escape hatch for thesecases is an enable_* GUC. enable_incremental_sort already exists forthis.This came out of a report by Pavel Luzanov where he mentioned that themaster branch was choosing to perform a Seq Scan -> Sort -> GroupAggregate for his query with an ORDER BY aggregate function. The v15 planfor his query performed an Index Scan -> Group Aggregate, of course, theaggregate performed the final sort internally in nodeAgg.c for theaggregate's ORDER BY. The ideal plan would have been to use the index,which provided partially sorted input then use an incremental sort toprovide the aggregate with the sorted input. This was not being chosendue to the pessimism in the incremental sort cost modal, so here we removethat and rationalize the path generation so that sort and incremental sortplans don't have to needlessly compete. We assume that it's senselessto ever use a full sort on a given input path where an incremental sortcan be performed.Reported-by: Pavel LuzanovReviewed-by: Richard GuoDiscussion:https://postgr.es/m/9f61ddbf-2989-1536-b31e-6459370a6baa%40postgrespro.ru
1 parent8b6b043 commit4a29eab

File tree

8 files changed

+220
-419
lines changed

8 files changed

+220
-419
lines changed

‎contrib/postgres_fdw/expected/postgres_fdw.out

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2796,6 +2796,7 @@ select c2/2, sum(c2) * (c2/2) from ft1 group by c2/2 order by c2/2;
27962796
(5 rows)
27972797

27982798
-- Aggregates in subquery are pushed down.
2799+
set enable_incremental_sort = off;
27992800
explain (verbose, costs off)
28002801
select count(x.a), sum(x.a) from (select c2 a, sum(c1) b from ft1 group by c2, sqrt(c1) order by 1, 2) x;
28012802
QUERY PLAN
@@ -2814,6 +2815,7 @@ select count(x.a), sum(x.a) from (select c2 a, sum(c1) b from ft1 group by c2, s
28142815
1000 | 4500
28152816
(1 row)
28162817

2818+
reset enable_incremental_sort;
28172819
-- Aggregate is still pushed down by taking unshippable expression out
28182820
explain (verbose, costs off)
28192821
select c2 * (random() <= 1)::int as sum1, sum(c1) * c2 as sum2 from ft1 group by c2 order by 1, 2;

‎contrib/postgres_fdw/sql/postgres_fdw.sql

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -750,9 +750,11 @@ select c2/2, sum(c2) * (c2/2) from ft1 group by c2/2 order by c2/2;
750750
select c2/2,sum(c2)* (c2/2)from ft1group by c2/2order by c2/2;
751751

752752
-- Aggregates in subquery are pushed down.
753+
set enable_incremental_sort= off;
753754
explain (verbose, costs off)
754755
selectcount(x.a),sum(x.a)from (select c2 a,sum(c1) bfrom ft1group by c2, sqrt(c1)order by1,2) x;
755756
selectcount(x.a),sum(x.a)from (select c2 a,sum(c1) bfrom ft1group by c2, sqrt(c1)order by1,2) x;
757+
reset enable_incremental_sort;
756758

757759
-- Aggregate is still pushed down by taking unshippable expression out
758760
explain (verbose, costs off)

‎src/backend/optimizer/path/allpaths.c

Lines changed: 38 additions & 56 deletions
Original file line numberDiff line numberDiff line change
@@ -3207,70 +3207,52 @@ generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_r
32073207
continue;
32083208

32093209
/*
3210-
* Consider regular sort for the cheapest partial path (for each
3211-
* useful pathkeys). We know the path is not sorted, because we'd
3212-
* not get here otherwise.
3210+
* Try at least sorting the cheapest path and also try
3211+
* incrementally sorting any path which is partially sorted
3212+
* already (no need to deal with paths which have presorted keys
3213+
* when incremental sort is disabled unless it's the cheapest
3214+
* input path).
3215+
*/
3216+
if (subpath!=cheapest_partial_path&&
3217+
(presorted_keys==0|| !enable_incremental_sort))
3218+
continue;
3219+
3220+
/*
3221+
* Consider regular sort for any path that's not presorted or if
3222+
* incremental sort is disabled. We've no need to consider both
3223+
* sort and incremental sort on the same path. We assume that
3224+
* incremental sort is always faster when there are presorted
3225+
* keys.
32133226
*
32143227
* This is not redundant with the gather paths created in
32153228
* generate_gather_paths, because that doesn't generate ordered
32163229
* output. Here we add an explicit sort to match the useful
32173230
* ordering.
32183231
*/
3219-
if (cheapest_partial_path==subpath)
3232+
if (presorted_keys==0|| !enable_incremental_sort)
32203233
{
3221-
Path*tmp;
3222-
3223-
tmp= (Path*)create_sort_path(root,
3224-
rel,
3225-
subpath,
3226-
useful_pathkeys,
3227-
-1.0);
3228-
3229-
rows=tmp->rows*tmp->parallel_workers;
3230-
3231-
path=create_gather_merge_path(root,rel,
3232-
tmp,
3233-
rel->reltarget,
3234-
tmp->pathkeys,
3235-
NULL,
3236-
rowsp);
3237-
3238-
add_path(rel,&path->path);
3239-
3240-
/* Fall through */
3241-
}
3242-
3243-
/*
3244-
* Consider incremental sort, but only when the subpath is already
3245-
* partially sorted on a pathkey prefix.
3246-
*/
3247-
if (enable_incremental_sort&&presorted_keys>0)
3248-
{
3249-
Path*tmp;
3250-
3251-
/*
3252-
* We should have already excluded pathkeys of length 1
3253-
* because then presorted_keys > 0 would imply is_sorted was
3254-
* true.
3255-
*/
3256-
Assert(list_length(useful_pathkeys)!=1);
3257-
3258-
tmp= (Path*)create_incremental_sort_path(root,
3259-
rel,
3260-
subpath,
3261-
useful_pathkeys,
3262-
presorted_keys,
3263-
-1);
3264-
3265-
path=create_gather_merge_path(root,rel,
3266-
tmp,
3267-
rel->reltarget,
3268-
tmp->pathkeys,
3269-
NULL,
3270-
rowsp);
3271-
3272-
add_path(rel,&path->path);
3234+
subpath= (Path*)create_sort_path(root,
3235+
rel,
3236+
subpath,
3237+
useful_pathkeys,
3238+
-1.0);
3239+
rows=subpath->rows*subpath->parallel_workers;
32733240
}
3241+
else
3242+
subpath= (Path*)create_incremental_sort_path(root,
3243+
rel,
3244+
subpath,
3245+
useful_pathkeys,
3246+
presorted_keys,
3247+
-1);
3248+
path=create_gather_merge_path(root,rel,
3249+
subpath,
3250+
rel->reltarget,
3251+
subpath->pathkeys,
3252+
NULL,
3253+
rowsp);
3254+
3255+
add_path(rel,&path->path);
32743256
}
32753257
}
32763258
}

‎src/backend/optimizer/path/costsize.c

Lines changed: 18 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -1959,8 +1959,8 @@ cost_incremental_sort(Path *path,
19591959
doubleinput_tuples,intwidth,Costcomparison_cost,intsort_mem,
19601960
doublelimit_tuples)
19611961
{
1962-
Coststartup_cost=0,
1963-
run_cost=0,
1962+
Coststartup_cost,
1963+
run_cost,
19641964
input_run_cost=input_total_cost-input_startup_cost;
19651965
doublegroup_tuples,
19661966
input_groups;
@@ -1969,10 +1969,9 @@ cost_incremental_sort(Path *path,
19691969
group_input_run_cost;
19701970
List*presortedExprs=NIL;
19711971
ListCell*l;
1972-
inti=0;
19731972
boolunknown_varno= false;
19741973

1975-
Assert(presorted_keys!=0);
1974+
Assert(presorted_keys>0&&presorted_keys<list_length(pathkeys));
19761975

19771976
/*
19781977
* We want to be sure the cost of a sort is never estimated as zero, even
@@ -2025,12 +2024,11 @@ cost_incremental_sort(Path *path,
20252024
/* expression not containing any Vars with "varno 0" */
20262025
presortedExprs=lappend(presortedExprs,member->em_expr);
20272026

2028-
i++;
2029-
if (i >=presorted_keys)
2027+
if (foreach_current_index(l)+1 >=presorted_keys)
20302028
break;
20312029
}
20322030

2033-
/* Estimate number of groups with equal presorted keys. */
2031+
/* Estimatethenumber of groups with equal presorted keys. */
20342032
if (!unknown_varno)
20352033
input_groups=estimate_num_groups(root,presortedExprs,input_tuples,
20362034
NULL,NULL);
@@ -2039,40 +2037,40 @@ cost_incremental_sort(Path *path,
20392037
group_input_run_cost=input_run_cost /input_groups;
20402038

20412039
/*
2042-
* Estimate average cost of sorting of one group where presorted keys are
2043-
* equal. Incremental sort is sensitive to distribution of tuples to the
2044-
* groups, where we're relying on quite rough assumptions. Thus, we're
2045-
* pessimistic about incremental sort performance and increase its average
2046-
* group size by half.
2040+
* Estimate the average cost of sorting of one group where presorted keys
2041+
* are equal.
20472042
*/
20482043
cost_tuplesort(&group_startup_cost,&group_run_cost,
2049-
1.5*group_tuples,width,comparison_cost,sort_mem,
2044+
group_tuples,width,comparison_cost,sort_mem,
20502045
limit_tuples);
20512046

20522047
/*
20532048
* Startup cost of incremental sort is the startup cost of its first group
20542049
* plus the cost of its input.
20552050
*/
2056-
startup_cost+=group_startup_cost
2057-
+input_startup_cost+group_input_run_cost;
2051+
startup_cost=group_startup_cost+input_startup_cost+
2052+
group_input_run_cost;
20582053

20592054
/*
20602055
* After we started producing tuples from the first group, the cost of
20612056
* producing all the tuples is given by the cost to finish processing this
20622057
* group, plus the total cost to process the remaining groups, plus the
20632058
* remaining cost of input.
20642059
*/
2065-
run_cost+=group_run_cost
2066-
+ (group_run_cost+group_startup_cost)* (input_groups-1)
2067-
+group_input_run_cost* (input_groups-1);
2060+
run_cost=group_run_cost+ (group_run_cost+group_startup_cost)*
2061+
(input_groups-1)+group_input_run_cost* (input_groups-1);
20682062

20692063
/*
20702064
* Incremental sort adds some overhead by itself. Firstly, it has to
20712065
* detect the sort groups. This is roughly equal to one extra copy and
2072-
* comparison per tuple. Secondly, it has to reset the tuplesort context
2073-
* for every group.
2066+
* comparison per tuple.
20742067
*/
20752068
run_cost+= (cpu_tuple_cost+comparison_cost)*input_tuples;
2069+
2070+
/*
2071+
* Additionally, we charge double cpu_tuple_cost for each input group to
2072+
* account for the tuplesort_reset that's performed after each group.
2073+
*/
20762074
run_cost+=2.0*cpu_tuple_cost*input_groups;
20772075

20782076
path->rows=input_tuples;

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp