Commit df3a66e

Improve planner's handling of set-returning functions in grouping columns.
Improve query_is_distinct_for() to accept SRFs in the targetlist when we can prove distinctness from a DISTINCT clause. In that case the de-duplication will surely happen after SRF expansion, so the proof still works. Continue to punt in the case where we'd try to prove distinctness from GROUP BY (or, in the future, source relations). To do that, we'd have to determine whether the SRFs were in the grouping columns or elsewhere in the tlist, and it still doesn't seem worth the trouble. But this trivial change allows us to recognize that "SELECT DISTINCT unnest(foo) FROM ..." produces unique-ified output, which seems worth having.

Also, fix estimate_num_groups() to consider the possibility of SRFs in the grouping columns. Its failure to do so was masked before v10 because grouping_planner() scaled up plan rowcount estimates by the estimated SRF multiplier after performing grouping. That doesn't happen anymore, which is more correct, but it means we need an adjustment in the estimate for the number of groups. Failure to do this leads to an underestimate for the number of output rows of subqueries like "SELECT DISTINCT unnest(foo)" compared to what 9.6 and earlier estimated, thus breaking plan choices in some cases.

Per report from Dmitry Shalashov. Back-patch to v10 to avoid degraded plan choices compared to previous releases.

Discussion: https://postgr.es/m/CAKPeCUGAeHgoh5O=SvcQxREVkoX7UdeJUMj1F5=aBNvoTa+O8w@mail.gmail.com
1 parent b10967e, commit df3a66e

2 files changed, +41 -14 lines changed

src/backend/optimizer/plan/analyzejoins.c

Lines changed: 14 additions & 14 deletions
@@ -744,8 +744,8 @@ rel_is_distinct_for(PlannerInfo *root, RelOptInfo *rel, List *clause_list)
 bool
 query_supports_distinctness(Query *query)
 {
-	/* we don't cope with SRFs, see comment below */
-	if (query->hasTargetSRFs)
+	/* SRFs break distinctness except with DISTINCT, see below */
+	if (query->hasTargetSRFs && query->distinctClause == NIL)
 		return false;

 	/* check for features we can prove distinctness with */
@@ -786,21 +786,11 @@ query_is_distinct_for(Query *query, List *colnos, List *opids)

 	Assert(list_length(colnos) == list_length(opids));

-	/*
-	 * A set-returning function in the query's targetlist can result in
-	 * returning duplicate rows, if the SRF is evaluated after the
-	 * de-duplication step; so we play it safe and say "no" if there are any
-	 * SRFs. (We could be certain that it's okay if SRFs appear only in the
-	 * specified columns, since those must be evaluated before de-duplication;
-	 * but it doesn't presently seem worth the complication to check that.)
-	 */
-	if (query->hasTargetSRFs)
-		return false;
-
 	/*
 	 * DISTINCT (including DISTINCT ON) guarantees uniqueness if all the
 	 * columns in the DISTINCT clause appear in colnos and operator semantics
-	 * match.
+	 * match. This is true even if there are SRFs in the DISTINCT columns or
+	 * elsewhere in the tlist.
 	 */
 	if (query->distinctClause)
 	{
@@ -819,6 +809,16 @@ query_is_distinct_for(Query *query, List *colnos, List *opids)
 			return true;
 	}

+	/*
+	 * Otherwise, a set-returning function in the query's targetlist can
+	 * result in returning duplicate rows, despite any grouping that might
+	 * occur before tlist evaluation. (If all tlist SRFs are within GROUP BY
+	 * columns, it would be safe because they'd be expanded before grouping.
+	 * But it doesn't currently seem worth the effort to check for that.)
+	 */
+	if (query->hasTargetSRFs)
+		return false;
+
 	/*
 	 * Similarly, GROUP BY without GROUPING SETS guarantees uniqueness if all
 	 * the grouped columns appear in colnos and operator semantics match.
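
The effect of reordering the checks in analyzejoins.c can be illustrated with a small standalone model. This is only a sketch, not the planner code: ToyQuery and both toy_ functions are hypothetical names, and the boolean fields stand in for the real list walks over distinctClause and groupClause. The real query_supports_distinctness() also accepts other features (aggregates, set operations, and so on) that are left out here.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the Query fields consulted here. */
typedef struct ToyQuery
{
	bool		has_distinct_clause;	/* models query->distinctClause != NIL */
	bool		distinct_cols_covered;	/* all DISTINCT columns are in colnos */
	bool		has_target_srfs;		/* models query->hasTargetSRFs */
	bool		has_group_clause;		/* GROUP BY without GROUPING SETS */
	bool		group_cols_covered;		/* all grouped columns are in colnos */
} ToyQuery;

/* Models the early exit in query_supports_distinctness() after the patch. */
bool
toy_supports_distinctness(const ToyQuery *q)
{
	/* SRFs break distinctness except with DISTINCT */
	if (q->has_target_srfs && !q->has_distinct_clause)
		return false;
	/* simplification: only DISTINCT and GROUP BY are modeled */
	return q->has_distinct_clause || q->has_group_clause;
}

/* Models the order of checks in query_is_distinct_for() after the patch. */
bool
toy_is_distinct_for(const ToyQuery *q)
{
	/* DISTINCT runs after SRF expansion, so SRFs cannot spoil the proof */
	if (q->has_distinct_clause && q->distinct_cols_covered)
		return true;

	/* Any other proof is abandoned once SRFs are present */
	if (q->has_target_srfs)
		return false;

	if (q->has_group_clause && q->group_cols_covered)
		return true;

	return false;
}
```

Under this model, a query shaped like "SELECT DISTINCT unnest(foo) FROM ..." (DISTINCT present and covered, SRFs present) is reported as distinct, while a GROUP BY query with an SRF in its targetlist is still rejected.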

src/backend/utils/adt/selfuncs.c

Lines changed: 27 additions & 0 deletions
@@ -3361,6 +3361,7 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
 					List **pgset)
 {
 	List	   *varinfos = NIL;
+	double		srf_multiplier = 1.0;
 	double		numdistinct;
 	ListCell   *l;
 	int			i;
@@ -3394,6 +3395,7 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
 	foreach(l, groupExprs)
 	{
 		Node	   *groupexpr = (Node *) lfirst(l);
+		double		this_srf_multiplier;
 		VariableStatData vardata;
 		List	   *varshere;
 		ListCell   *l2;
@@ -3402,6 +3404,21 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
 		if (pgset && !list_member_int(*pgset, i++))
 			continue;

+		/*
+		 * Set-returning functions in grouping columns are a bit problematic.
+		 * The code below will effectively ignore their SRF nature and come up
+		 * with a numdistinct estimate as though they were scalar functions.
+		 * We compensate by scaling up the end result by the largest SRF
+		 * rowcount estimate. (This will be an overestimate if the SRF
+		 * produces multiple copies of any output value, but it seems best to
+		 * assume the SRF's outputs are distinct. In any case, it's probably
+		 * pointless to worry too much about this without much better
+		 * estimates for SRF output rowcounts than we have today.)
+		 */
+		this_srf_multiplier = expression_returns_set_rows(groupexpr);
+		if (srf_multiplier < this_srf_multiplier)
+			srf_multiplier = this_srf_multiplier;
+
 		/* Short-circuit for expressions returning boolean */
 		if (exprType(groupexpr) == BOOLOID)
 		{
@@ -3467,9 +3484,15 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
 	 */
 	if (varinfos == NIL)
 	{
+		/* Apply SRF multiplier as we would do in the long path */
+		numdistinct *= srf_multiplier;
+		/* Round off */
+		numdistinct = ceil(numdistinct);
 		/* Guard against out-of-range answers */
 		if (numdistinct > input_rows)
 			numdistinct = input_rows;
+		if (numdistinct < 1.0)
+			numdistinct = 1.0;
 		return numdistinct;
 	}

@@ -3638,6 +3661,10 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
 		varinfos = newvarinfos;
 	} while (varinfos != NIL);

+	/* Now we can account for the effects of any SRFs */
+	numdistinct *= srf_multiplier;
+
+	/* Round off */
 	numdistinct = ceil(numdistinct);

 	/* Guard against out-of-range answers */
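
The arithmetic added to estimate_num_groups() can be tried outside the planner with a rough sketch like the one below. The function names max_srf_multiplier, finish_group_estimate and round_up are hypothetical; the srf_rows array stands in for the per-expression results of expression_returns_set_rows(), where 1.0 is assumed to mean "no SRF"; and round_up is a local stand-in for ceil() (valid only for values that fit in a long long) so the fragment builds without linking libm.

```c
/* Local stand-in for ceil(); assumes x fits in a long long. */
static double
round_up(double x)
{
	double		t = (double) (long long) x; /* truncates toward zero */

	return (t < x) ? t + 1.0 : t;
}

/* Keep the largest SRF row estimate seen across the grouping expressions. */
double
max_srf_multiplier(const double *srf_rows, int n)
{
	double		srf_multiplier = 1.0;

	for (int i = 0; i < n; i++)
	{
		if (srf_multiplier < srf_rows[i])
			srf_multiplier = srf_rows[i];
	}
	return srf_multiplier;
}

/* Scale, round off, then clamp the estimate to the range [1, input_rows]. */
double
finish_group_estimate(double numdistinct, double srf_multiplier,
					  double input_rows)
{
	numdistinct *= srf_multiplier;
	numdistinct = round_up(numdistinct);
	if (numdistinct > input_rows)
		numdistinct = input_rows;
	if (numdistinct < 1.0)
		numdistinct = 1.0;
	return numdistinct;
}
```

Taking the maximum instead of the product of the per-expression estimates follows the patch's comment about scaling by the largest SRF rowcount estimate. The upper clamp still applies afterwards, so the scaled estimate can never exceed the number of input rows.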
