Commit be3b265

Improve SELECT DISTINCT to consider hash aggregation, as well as sort/uniq,
as methods for implementing the DISTINCT step.  This eliminates the former
performance gap between DISTINCT and GROUP BY, and also makes it possible
to do SELECT DISTINCT on datatypes that only support hashing not sorting.

SELECT DISTINCT ON is still always implemented by sorting; it would take
executor changes to support hashing that, and it's not clear it's worth
the trouble.

This is a release-note-worthy incompatibility from previous PG versions,
since SELECT DISTINCT can no longer be counted on to deliver sorted output
without explicitly saying ORDER BY.  (Anyone who can't cope with that
can consider turning off enable_hashagg.)

Several regression test queries needed to have ORDER BY added to preserve
stable output order.  I fixed the ones that manifested here, but there
might be some other cases that show up on other platforms.
1 parent 4abd7b4, commit be3b265
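To illustrate the two strategies the commit message contrasts, here is a minimal standalone sketch in C. It is not PostgreSQL code: `distinct_sort`, `distinct_hash`, and `cmp_int` are hypothetical names, the data is plain `int`, and the hash table is a toy open-addressing set. The sort/uniq variant always yields sorted output, while the hash variant emits values in table-slot order, which is why callers that need ordering must ask for it explicitly.

```c
#include <assert.h>
#include <stdlib.h>

/* qsort comparator for ints (hypothetical helper, not from PostgreSQL). */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/*
 * Sort/uniq strategy: sort in place, then drop adjacent duplicates.
 * Returns the number of distinct values; vals[0..result-1] is sorted.
 */
size_t distinct_sort(int *vals, size_t n)
{
    size_t out = 1;

    if (n == 0)
        return 0;
    qsort(vals, n, sizeof(int), cmp_int);
    for (size_t i = 1; i < n; i++)
        if (vals[i] != vals[out - 1])
            vals[out++] = vals[i];
    return out;
}

/*
 * Hash strategy: insert every value into an open-addressing set, then
 * scan the slots.  Needs only equality and a hash function, no ordering.
 * Output order follows slot order and is NOT guaranteed to be sorted.
 * Returns the number of distinct values written to out (0 if allocation
 * fails, an arbitrary simplification for this sketch).
 */
size_t distinct_hash(const int *vals, size_t n, int *out)
{
    size_t cap = 8;
    size_t count = 0;
    int *slots;
    unsigned char *used;

    assert(out != NULL);
    while (cap < 2 * n)
        cap *= 2;               /* power of two, load factor <= 0.5 */
    slots = malloc(cap * sizeof(int));
    used = calloc(cap, 1);
    if (slots == NULL || used == NULL)
    {
        free(slots);
        free(used);
        return 0;
    }
    for (size_t i = 0; i < n; i++)
    {
        size_t h = ((unsigned int) vals[i] * 2654435761u) & (cap - 1);

        /* linear probing until we find the value or an empty slot */
        while (used[h] && slots[h] != vals[i])
            h = (h + 1) & (cap - 1);
        if (!used[h])
        {
            used[h] = 1;
            slots[h] = vals[i];
        }
    }
    for (size_t h = 0; h < cap; h++)
        if (used[h])
            out[count++] = slots[h];
    free(slots);
    free(used);
    return count;
}
```

Both functions return the same set of values for the same input. Only `distinct_sort` makes any promise about their order, which mirrors the incompatibility the commit message warns about.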

13 files changed: +396 -111 lines changed

‎src/backend/nodes/outfuncs.c

Lines changed: 2 additions & 1 deletion
@@ -8,7 +8,7 @@
  *
  *
  * IDENTIFICATION
- *    $PostgreSQL: pgsql/src/backend/nodes/outfuncs.c,v 1.329 2008/08/02 21:31:59 tgl Exp $
+ *    $PostgreSQL: pgsql/src/backend/nodes/outfuncs.c,v 1.330 2008/08/05 02:43:17 tgl Exp $
  *
  * NOTES
  *    Every node type that can appear in stored rules' parsetrees *must*
@@ -1334,6 +1334,7 @@ _outPlannerInfo(StringInfo str, PlannerInfo *node)
     WRITE_NODE_FIELD(append_rel_list);
     WRITE_NODE_FIELD(query_pathkeys);
     WRITE_NODE_FIELD(group_pathkeys);
+    WRITE_NODE_FIELD(distinct_pathkeys);
     WRITE_NODE_FIELD(sort_pathkeys);
     WRITE_FLOAT_FIELD(total_table_pages, "%.0f");
     WRITE_FLOAT_FIELD(tuple_fraction, "%.4f");

‎src/backend/optimizer/plan/planmain.c

Lines changed: 13 additions & 7 deletions
@@ -14,7 +14,7 @@
  *
  *
  * IDENTIFICATION
- *    $PostgreSQL: pgsql/src/backend/optimizer/plan/planmain.c,v 1.108 2008/08/03 19:10:52 tgl Exp $
+ *    $PostgreSQL: pgsql/src/backend/optimizer/plan/planmain.c,v 1.109 2008/08/05 02:43:17 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -66,9 +66,9 @@
  * PlannerInfo field and not a passed parameter is that the low-level routines
  * in indxpath.c need to see it.)
  *
- * Note: the PlannerInfo node also includes group_pathkeys and sort_pathkeys,
- * which like query_pathkeys need to be canonicalized once the info is
- * available.
+ * Note: the PlannerInfo node also includes group_pathkeys, distinct_pathkeys,
+ * and sort_pathkeys, which like query_pathkeys need to be canonicalized once
+ * the info is available.
  *
  * tuple_fraction is interpreted as follows:
  *    0: expect all tuples to be retrieved (normal case)
@@ -120,6 +120,8 @@ query_planner(PlannerInfo *root, List *tlist,
                                                      root->query_pathkeys);
         root->group_pathkeys = canonicalize_pathkeys(root,
                                                      root->group_pathkeys);
+        root->distinct_pathkeys = canonicalize_pathkeys(root,
+                                                        root->distinct_pathkeys);
         root->sort_pathkeys = canonicalize_pathkeys(root,
                                                     root->sort_pathkeys);
         return;
@@ -237,10 +239,12 @@ query_planner(PlannerInfo *root, List *tlist,
     /*
      * We have completed merging equivalence sets, so it's now possible to
      * convert the requested query_pathkeys to canonical form.  Also
-     * canonicalize the groupClause and sortClause pathkeys for use later.
+     * canonicalize the groupClause, distinctClause and sortClause pathkeys
+     * for use later.
      */
     root->query_pathkeys = canonicalize_pathkeys(root, root->query_pathkeys);
     root->group_pathkeys = canonicalize_pathkeys(root, root->group_pathkeys);
+    root->distinct_pathkeys = canonicalize_pathkeys(root, root->distinct_pathkeys);
     root->sort_pathkeys = canonicalize_pathkeys(root, root->sort_pathkeys);
 
     /*
@@ -286,9 +290,11 @@ query_planner(PlannerInfo *root, List *tlist,
             /*
              * If both GROUP BY and ORDER BY are specified, we will need two
              * levels of sort --- and, therefore, certainly need to read all the
-             * tuples --- unless ORDER BY is a subset of GROUP BY.
+             * tuples --- unless ORDER BY is a subset of GROUP BY.  Likewise if
+             * we have both DISTINCT and GROUP BY.
              */
-            if (!pathkeys_contained_in(root->sort_pathkeys, root->group_pathkeys))
+            if (!pathkeys_contained_in(root->sort_pathkeys, root->group_pathkeys) ||
+                !pathkeys_contained_in(root->distinct_pathkeys, root->group_pathkeys))
                 tuple_fraction = 0.0;
         }
         else if (parse->hasAggs || root->hasHavingQual)
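
The last hunk's rule can be sketched in isolation. The code below is a simplified model, not the planner's implementation: pathkeys are represented as arrays of integer IDs rather than canonical PathKey nodes, and `keys_contained_in` and `adjust_tuple_fraction` are hypothetical names. Containment is modelled as a leading-prefix test, which is an assumption about how `pathkeys_contained_in` behaves (the source comment only says "subset"). In the GROUP BY case, if either the ORDER BY keys or the DISTINCT keys are not contained in the grouping keys, a second sort or uniq step is needed, so the planner must expect to read every tuple and `tuple_fraction` is reset to 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified stand-in for pathkeys_contained_in(): true when keys1 is a
 * leading prefix of keys2, i.e. data sorted by keys2 is also sorted by
 * keys1.  (Assumption: prefix semantics; integer IDs replace PathKeys.)
 */
bool keys_contained_in(const int *keys1, size_t n1,
                       const int *keys2, size_t n2)
{
    if (n1 > n2)
        return false;
    for (size_t i = 0; i < n1; i++)
        if (keys1[i] != keys2[i])
            return false;
    return true;
}

/*
 * Model of the GROUP BY branch after this commit: keep the caller's
 * tuple_fraction only if both the sort keys and the distinct keys are
 * covered by the grouping order; otherwise return 0.0, meaning "expect
 * all tuples to be retrieved".
 */
double adjust_tuple_fraction(double tuple_fraction,
                             const int *sort_keys, size_t nsort,
                             const int *distinct_keys, size_t ndistinct,
                             const int *group_keys, size_t ngroup)
{
    assert(tuple_fraction >= 0.0);
    if (!keys_contained_in(sort_keys, nsort, group_keys, ngroup) ||
        !keys_contained_in(distinct_keys, ndistinct, group_keys, ngroup))
        return 0.0;
    return tuple_fraction;
}
```

An empty key list is trivially contained, so a query with GROUP BY but neither ORDER BY nor DISTINCT keeps its original fraction in this model.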
