Commit b9fc2d0

Improve planner's handling of set-returning functions in grouping columns.
Improve query_is_distinct_for() to accept SRFs in the targetlist when we can prove distinctness from a DISTINCT clause. In that case the de-duplication will surely happen after SRF expansion, so the proof still works. Continue to punt in the case where we'd try to prove distinctness from GROUP BY (or, in the future, source relations). To do that, we'd have to determine whether the SRFs were in the grouping columns or elsewhere in the tlist, and it still doesn't seem worth the trouble. But this trivial change allows us to recognize that "SELECT DISTINCT unnest(foo) FROM ..." produces unique-ified output, which seems worth having.

Also, fix estimate_num_groups() to consider the possibility of SRFs in the grouping columns. Its failure to do so was masked before v10 because grouping_planner() scaled up plan rowcount estimates by the estimated SRF multiplier after performing grouping. That doesn't happen anymore, which is more correct, but it means we need an adjustment in the estimate for the number of groups. Failure to do this leads to an underestimate for the number of output rows of subqueries like "SELECT DISTINCT unnest(foo)" compared to what 9.6 and earlier estimated, thus breaking plan choices in some cases.

Per report from Dmitry Shalashov. Back-patch to v10 to avoid degraded plan choices compared to previous releases.

Discussion: https://postgr.es/m/CAKPeCUGAeHgoh5O=SvcQxREVkoX7UdeJUMj1F5=aBNvoTa+O8w@mail.gmail.com
1 parent a1187c4, commit b9fc2d0

2 files changed: +41 -14 lines changed


‎src/backend/optimizer/plan/analyzejoins.c

Lines changed: 14 additions & 14 deletions
@@ -744,8 +744,8 @@ rel_is_distinct_for(PlannerInfo *root, RelOptInfo *rel, List *clause_list)
 bool
 query_supports_distinctness(Query *query)
 {
-	/* we don't cope with SRFs, see comment below */
-	if (query->hasTargetSRFs)
+	/* SRFs break distinctness except with DISTINCT, see below */
+	if (query->hasTargetSRFs && query->distinctClause == NIL)
 		return false;
 
 	/* check for features we can prove distinctness with */
@@ -786,21 +786,11 @@ query_is_distinct_for(Query *query, List *colnos, List *opids)
 
 	Assert(list_length(colnos) == list_length(opids));
 
-	/*
-	 * A set-returning function in the query's targetlist can result in
-	 * returning duplicate rows, if the SRF is evaluated after the
-	 * de-duplication step; so we play it safe and say "no" if there are any
-	 * SRFs. (We could be certain that it's okay if SRFs appear only in the
-	 * specified columns, since those must be evaluated before de-duplication;
-	 * but it doesn't presently seem worth the complication to check that.)
-	 */
-	if (query->hasTargetSRFs)
-		return false;
-
 	/*
 	 * DISTINCT (including DISTINCT ON) guarantees uniqueness if all the
 	 * columns in the DISTINCT clause appear in colnos and operator semantics
-	 * match.
+	 * match. This is true even if there are SRFs in the DISTINCT columns or
+	 * elsewhere in the tlist.
 	 */
 	if (query->distinctClause)
 	{
@@ -819,6 +809,16 @@ query_is_distinct_for(Query *query, List *colnos, List *opids)
 			return true;
 	}
 
+	/*
+	 * Otherwise, a set-returning function in the query's targetlist can
+	 * result in returning duplicate rows, despite any grouping that might
+	 * occur before tlist evaluation. (If all tlist SRFs are within GROUP BY
+	 * columns, it would be safe because they'd be expanded before grouping.
+	 * But it doesn't currently seem worth the effort to check for that.)
+	 */
+	if (query->hasTargetSRFs)
+		return false;
+
 	/*
 	 * Similarly, GROUP BY without GROUPING SETS guarantees uniqueness if all
 	 * the grouped columns appear in colnos and operator semantics match.
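
The reordering in the hunks above can be sketched as a small standalone model. This is only an illustration, not the PostgreSQL code: QuerySketch and sketch_is_distinct_for are hypothetical names, the boolean fields stand in for the list checks the real function performs against colnos and opids, and the real function's remaining proof paths beyond GROUP BY are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few Query properties the checks consult. */
typedef struct QuerySketch
{
	bool		hasTargetSRFs;		/* tlist contains a set-returning function */
	bool		hasDistinctClause;	/* models query->distinctClause != NIL */
	bool		distinctColsCovered;	/* all DISTINCT columns appear in colnos */
	bool		hasGroupClause;		/* models a GROUP BY without GROUPING SETS */
	bool		groupColsCovered;	/* all grouped columns appear in colnos */
} QuerySketch;

/* Simplified model of the check order after this commit. */
bool
sketch_is_distinct_for(const QuerySketch *q)
{
	assert(q != NULL);

	/* DISTINCT is applied after SRF expansion, so SRFs do not spoil it. */
	if (q->hasDistinctClause && q->distinctColsCovered)
		return true;

	/* Past this point an SRF could emit duplicates after grouping. */
	if (q->hasTargetSRFs)
		return false;

	/* GROUP BY proof, reachable only when the tlist has no SRFs. */
	if (q->hasGroupClause && q->groupColsCovered)
		return true;

	return false;
}
```

In this model, a query shaped like "SELECT DISTINCT unnest(foo) FROM ..." (SRF present, DISTINCT columns covered) is accepted, while the same SRF combined with only a GROUP BY proof is still rejected.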

‎src/backend/utils/adt/selfuncs.c

Lines changed: 27 additions & 0 deletions
@@ -3270,6 +3270,7 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
 					List **pgset)
 {
 	List	   *varinfos = NIL;
+	double		srf_multiplier = 1.0;
 	double		numdistinct;
 	ListCell   *l;
 	int			i;
@@ -3303,6 +3304,7 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
 	foreach(l, groupExprs)
 	{
 		Node	   *groupexpr = (Node *) lfirst(l);
+		double		this_srf_multiplier;
 		VariableStatData vardata;
 		List	   *varshere;
 		ListCell   *l2;
@@ -3311,6 +3313,21 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
 		if (pgset && !list_member_int(*pgset, i++))
 			continue;
 
+		/*
+		 * Set-returning functions in grouping columns are a bit problematic.
+		 * The code below will effectively ignore their SRF nature and come up
+		 * with a numdistinct estimate as though they were scalar functions.
+		 * We compensate by scaling up the end result by the largest SRF
+		 * rowcount estimate. (This will be an overestimate if the SRF
+		 * produces multiple copies of any output value, but it seems best to
+		 * assume the SRF's outputs are distinct. In any case, it's probably
+		 * pointless to worry too much about this without much better
+		 * estimates for SRF output rowcounts than we have today.)
+		 */
+		this_srf_multiplier = expression_returns_set_rows(groupexpr);
+		if (srf_multiplier < this_srf_multiplier)
+			srf_multiplier = this_srf_multiplier;
+
 		/* Short-circuit for expressions returning boolean */
 		if (exprType(groupexpr) == BOOLOID)
 		{
@@ -3376,9 +3393,15 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
 	 */
 	if (varinfos == NIL)
 	{
+		/* Apply SRF multiplier as we would do in the long path */
+		numdistinct *= srf_multiplier;
+		/* Round off */
+		numdistinct = ceil(numdistinct);
 		/* Guard against out-of-range answers */
 		if (numdistinct > input_rows)
 			numdistinct = input_rows;
+		if (numdistinct < 1.0)
+			numdistinct = 1.0;
 		return numdistinct;
 	}
 
@@ -3547,6 +3570,10 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
 		varinfos = newvarinfos;
 	} while (varinfos != NIL);
 
+	/* Now we can account for the effects of any SRFs */
+	numdistinct *= srf_multiplier;
+
+	/* Round off */
 	numdistinct = ceil(numdistinct);
 
 	/* Guard against out-of-range answers */
