Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commit4e5fbb3

Browse files
committed
Change the division of labor between grouping_planner and query_planner
so that the latter estimates the number of groups that grouping willproduce. This is needed because it is primarily query_planner thatmakes the decision between fast-start and fast-finish plans, and in theoriginal coding it was unable to make more than a crude rule-of-thumbchoice when the query involved grouping. This revision helps us makesaner choices for queries like SELECT ... GROUP BY ... LIMIT, as in arecent example from Mark Kirkwood. Also move the responsibility forcanonicalizing sort_pathkeys and group_pathkeys into query_planner;this information has to be available anyway to support the first change,and doing it this way lets us get rid of compare_noncanonical_pathkeysentirely.
1 parent9e56c5a commit4e5fbb3

File tree

7 files changed

+143
-205
lines changed

7 files changed

+143
-205
lines changed

‎src/backend/nodes/outfuncs.c

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
*
99
*
1010
* IDENTIFICATION
11-
* $PostgreSQL: pgsql/src/backend/nodes/outfuncs.c,v 1.259 2005/08/01 20:31:08 tgl Exp $
11+
* $PostgreSQL: pgsql/src/backend/nodes/outfuncs.c,v 1.260 2005/08/27 22:13:43 tgl Exp $
1212
*
1313
* NOTES
1414
* Every node type that can appear in stored rules' parsetrees *must*
@@ -1169,6 +1169,9 @@ _outPlannerInfo(StringInfo str, PlannerInfo *node)
11691169
WRITE_NODE_FIELD(full_join_clauses);
11701170
WRITE_NODE_FIELD(in_info_list);
11711171
WRITE_NODE_FIELD(query_pathkeys);
1172+
WRITE_NODE_FIELD(group_pathkeys);
1173+
WRITE_NODE_FIELD(sort_pathkeys);
1174+
WRITE_FLOAT_FIELD(tuple_fraction,"%.4f");
11721175
WRITE_BOOL_FIELD(hasJoinRTEs);
11731176
WRITE_BOOL_FIELD(hasOuterJoins);
11741177
WRITE_BOOL_FIELD(hasHavingQual);

‎src/backend/optimizer/path/pathkeys.c

Lines changed: 1 addition & 67 deletions
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@
1111
* Portions Copyright (c) 1994, Regents of the University of California
1212
*
1313
* IDENTIFICATION
14-
* $PostgreSQL: pgsql/src/backend/optimizer/path/pathkeys.c,v 1.71 2005/07/28 22:27:00 tgl Exp $
14+
* $PostgreSQL: pgsql/src/backend/optimizer/path/pathkeys.c,v 1.72 2005/08/27 22:13:43 tgl Exp $
1515
*
1616
*-------------------------------------------------------------------------
1717
*/
@@ -800,54 +800,6 @@ compare_pathkeys(List *keys1, List *keys2)
800800
returnPATHKEYS_BETTER2;/* key2 is longer */
801801
}
802802

803-
/*
804-
* compare_noncanonical_pathkeys
805-
* Compare two pathkeys to see if they are equivalent, and if not whether
806-
* one is "better" than the other. This is used when we must compare
807-
* non-canonicalized pathkeys.
808-
*
809-
* A pathkey can be considered better than another if it is a superset:
810-
* it contains all the keys of the other plus more.For example, either
811-
* ((A) (B)) or ((A B)) is better than ((A)).
812-
*
813-
* Currently, the only user of this routine is grouping_planner(),
814-
* and it will only pass single-element sublists (from
815-
* make_pathkeys_for_sortclauses). Therefore we don't have to do the
816-
* full two-way-subset-inclusion test on each pair of sublists that is
817-
* implied by the above statement. Instead we just verify they are
818-
* singleton lists and then do an equal(). This could be improved if
819-
* necessary.
820-
*/
821-
PathKeysComparison
822-
compare_noncanonical_pathkeys(List*keys1,List*keys2)
823-
{
824-
ListCell*key1,
825-
*key2;
826-
827-
forboth(key1,keys1,key2,keys2)
828-
{
829-
List*subkey1= (List*)lfirst(key1);
830-
List*subkey2= (List*)lfirst(key2);
831-
832-
Assert(list_length(subkey1)==1);
833-
Assert(list_length(subkey2)==1);
834-
if (!equal(subkey1,subkey2))
835-
returnPATHKEYS_DIFFERENT;/* no need to keep looking */
836-
}
837-
838-
/*
839-
* If we reached the end of only one list, the other is longer and
840-
* therefore not a subset.(We assume the additional sublist(s) of
841-
* the other list are not NIL --- no pathkey list should ever have a
842-
* NIL sublist.)
843-
*/
844-
if (key1==NULL&&key2==NULL)
845-
returnPATHKEYS_EQUAL;
846-
if (key1!=NULL)
847-
returnPATHKEYS_BETTER1;/* key1 is longer */
848-
returnPATHKEYS_BETTER2;/* key2 is longer */
849-
}
850-
851803
/*
852804
* pathkeys_contained_in
853805
* Common special case of compare_pathkeys: we just want to know
@@ -867,24 +819,6 @@ pathkeys_contained_in(List *keys1, List *keys2)
867819
return false;
868820
}
869821

870-
/*
871-
* noncanonical_pathkeys_contained_in
872-
* The same, when we don't have canonical pathkeys.
873-
*/
874-
bool
875-
noncanonical_pathkeys_contained_in(List*keys1,List*keys2)
876-
{
877-
switch (compare_noncanonical_pathkeys(keys1,keys2))
878-
{
879-
casePATHKEYS_EQUAL:
880-
casePATHKEYS_BETTER2:
881-
return true;
882-
default:
883-
break;
884-
}
885-
return false;
886-
}
887-
888822
/*
889823
* get_cheapest_path_for_pathkeys
890824
* Find the cheapest path (according to the specified criterion) that

‎src/backend/optimizer/plan/planmain.c

Lines changed: 98 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,7 @@
1414
*
1515
*
1616
* IDENTIFICATION
17-
* $PostgreSQL: pgsql/src/backend/optimizer/plan/planmain.c,v 1.86 2005/07/02 23:00:41 tgl Exp $
17+
* $PostgreSQL: pgsql/src/backend/optimizer/plan/planmain.c,v 1.87 2005/08/27 22:13:43 tgl Exp $
1818
*
1919
*-------------------------------------------------------------------------
2020
*/
@@ -25,9 +25,11 @@
2525
#include"optimizer/pathnode.h"
2626
#include"optimizer/paths.h"
2727
#include"optimizer/planmain.h"
28+
#include"optimizer/tlist.h"
29+
#include"utils/selfuncs.h"
2830

2931

30-
/*--------------------
32+
/*
3133
* query_planner
3234
* Generate a path (that is, a simplified plan) for a basic query,
3335
* which may involve joins but not any fancier features.
@@ -51,6 +53,8 @@
5153
* *cheapest_path receives the overall-cheapest path for the query
5254
* *sorted_path receives the cheapest presorted path for the query,
5355
*if any (NULL if there is no useful presorted path)
56+
* *num_groups receives the estimated number of groups, or 1 if query
57+
*does not use grouping
5458
*
5559
* Note: the PlannerInfo node also includes a query_pathkeys field, which is
5660
* both an input and an output of query_planner(). The input value signals
@@ -61,17 +65,21 @@
6165
* PlannerInfo field and not a passed parameter is that the low-level routines
6266
* in indxpath.c need to see it.)
6367
*
68+
* Note: the PlannerInfo node also includes group_pathkeys and sort_pathkeys,
69+
* which like query_pathkeys need to be canonicalized once the info is
70+
* available.
71+
*
6472
* tuple_fraction is interpreted as follows:
6573
* 0: expect all tuples to be retrieved (normal case)
6674
* 0 < tuple_fraction < 1: expect the given fraction of tuples available
6775
*from the plan to be retrieved
6876
* tuple_fraction >= 1: tuple_fraction is the absolute number of tuples
6977
*expected to be retrieved (ie, a LIMIT specification)
70-
*--------------------
7178
*/
7279
void
7380
query_planner(PlannerInfo*root,List*tlist,doubletuple_fraction,
74-
Path**cheapest_path,Path**sorted_path)
81+
Path**cheapest_path,Path**sorted_path,
82+
double*num_groups)
7583
{
7684
Query*parse=root->parse;
7785
List*constant_quals;
@@ -82,6 +90,8 @@ query_planner(PlannerInfo *root, List *tlist, double tuple_fraction,
8290
/* Make tuple_fraction accessible to lower-level routines */
8391
root->tuple_fraction=tuple_fraction;
8492

93+
*num_groups=1;/* default result */
94+
8595
/*
8696
* If the query has an empty join tree, then it's something easy like
8797
* "SELECT 2+2;" or "INSERT ... VALUES()".Fall through quickly.
@@ -156,9 +166,12 @@ query_planner(PlannerInfo *root, List *tlist, double tuple_fraction,
156166
/*
157167
* We should now have all the pathkey equivalence sets built, so it's
158168
* now possible to convert the requested query_pathkeys to canonical
159-
* form.
169+
* form. Also canonicalize the groupClause and sortClause pathkeys
170+
* for use later.
160171
*/
161172
root->query_pathkeys=canonicalize_pathkeys(root,root->query_pathkeys);
173+
root->group_pathkeys=canonicalize_pathkeys(root,root->group_pathkeys);
174+
root->sort_pathkeys=canonicalize_pathkeys(root,root->sort_pathkeys);
162175

163176
/*
164177
* Ready to do the primary planning.
@@ -169,12 +182,87 @@ query_planner(PlannerInfo *root, List *tlist, double tuple_fraction,
169182
elog(ERROR,"failed to construct the join relation");
170183

171184
/*
172-
* Now that we have an estimate of the final rel's size, we can
173-
* convert a tuple_fraction specified as an absolute count (ie, a
174-
* LIMIT option) into a fraction of the total tuples.
185+
* If there's grouping going on, estimate the number of result groups.
186+
* We couldn't do this any earlier because it depends on relation size
187+
* estimates that were set up above.
188+
*
189+
* Then convert tuple_fraction to fractional form if it is absolute,
190+
* and adjust it based on the knowledge that grouping_planner will be
191+
* doing grouping or aggregation work with our result.
192+
*
193+
* This introduces some undesirable coupling between this code and
194+
* grouping_planner, but the alternatives seem even uglier; we couldn't
195+
* pass back completed paths without making these decisions here.
175196
*/
176-
if (tuple_fraction >=1.0)
177-
tuple_fraction /=final_rel->rows;
197+
if (parse->groupClause)
198+
{
199+
List*groupExprs;
200+
201+
groupExprs=get_sortgrouplist_exprs(parse->groupClause,
202+
parse->targetList);
203+
*num_groups=estimate_num_groups(root,
204+
groupExprs,
205+
final_rel->rows);
206+
207+
/*
208+
* In GROUP BY mode, an absolute LIMIT is relative to the number
209+
* of groups not the number of tuples. If the caller gave us
210+
* a fraction, keep it as-is. (In both cases, we are effectively
211+
* assuming that all the groups are about the same size.)
212+
*/
213+
if (tuple_fraction >=1.0)
214+
tuple_fraction /=*num_groups;
215+
216+
/*
217+
* If both GROUP BY and ORDER BY are specified, we will need two
218+
* levels of sort --- and, therefore, certainly need to read all
219+
* the tuples --- unless ORDER BY is a subset of GROUP BY.
220+
*/
221+
if (parse->groupClause&&parse->sortClause&&
222+
!pathkeys_contained_in(root->sort_pathkeys,root->group_pathkeys))
223+
tuple_fraction=0.0;
224+
}
225+
elseif (parse->hasAggs||root->hasHavingQual)
226+
{
227+
/*
228+
* Ungrouped aggregate will certainly want to read all the tuples,
229+
* and it will deliver a single result row (so leave *num_groups 1).
230+
*/
231+
tuple_fraction=0.0;
232+
}
233+
elseif (parse->distinctClause)
234+
{
235+
/*
236+
* Since there was no grouping or aggregation, it's reasonable to
237+
* assume the UNIQUE filter has effects comparable to GROUP BY.
238+
* Return the estimated number of output rows for use by caller.
239+
* (If DISTINCT is used with grouping, we ignore its effects for
240+
* rowcount estimation purposes; this amounts to assuming the grouped
241+
* rows are distinct already.)
242+
*/
243+
List*distinctExprs;
244+
245+
distinctExprs=get_sortgrouplist_exprs(parse->distinctClause,
246+
parse->targetList);
247+
*num_groups=estimate_num_groups(root,
248+
distinctExprs,
249+
final_rel->rows);
250+
251+
/*
252+
* Adjust tuple_fraction the same way as for GROUP BY, too.
253+
*/
254+
if (tuple_fraction >=1.0)
255+
tuple_fraction /=*num_groups;
256+
}
257+
else
258+
{
259+
/*
260+
* Plain non-grouped, non-aggregated query: an absolute tuple
261+
* fraction can be divided by the number of tuples.
262+
*/
263+
if (tuple_fraction >=1.0)
264+
tuple_fraction /=final_rel->rows;
265+
}
178266

179267
/*
180268
* Pick out the cheapest-total path and the cheapest presorted path

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp