
Commit db0d67d

Optimize order of GROUP BY keys

When evaluating a query with a multi-column GROUP BY clause using sort, the cost may be heavily dependent on the order in which the keys are compared when building the groups. Grouping does not imply any ordering, so we're allowed to compare the keys in arbitrary order, and a Hash Agg leverages this. But for Group Agg, we simply compared keys in the order as specified in the query. This commit explores alternative ordering of the keys, trying to find a cheaper one.

In principle, we might generate grouping paths for all permutations of the keys, and leave the rest to the optimizer. But that might get very expensive, so we try to pick only a couple interesting orderings based on both local and global information.

When planning the grouping path, we explore statistics (number of distinct values, cost of the comparison function) for the keys and reorder them to minimize comparison costs. Intuitively, it may be better to perform more expensive comparisons (for complex data types etc.) last, because maybe the cheaper comparisons will be enough. Similarly, the higher the cardinality of a key, the lower the probability we’ll need to compare more keys. The patch generates and costs various orderings, picking the cheapest ones.

The ordering of group keys may interact with other parts of the query, some of which may not be known while planning the grouping. E.g. there may be an explicit ORDER BY clause, or some other ordering-dependent operation, higher up in the query, and using the same ordering may allow using either incremental sort or even eliminate the sort entirely.

The patch generates orderings and picks those minimizing the comparison cost (for various pathkeys), and then adds orderings that might be useful for operations higher up in the plan (ORDER BY, etc.). Finally, it always keeps the ordering specified in the query, on the assumption the user might have additional insights.

This introduces a new GUC enable_group_by_reordering, so that the optimization may be disabled if needed.

The original patch was proposed by Teodor Sigaev, and later improved and reworked by Dmitry Dolgov. Reviews by a number of people, including me, Andrey Lepikhov, Claudio Freire, Ibrar Ahmed and Zhihong Yu.

Author: Dmitry Dolgov, Teodor Sigaev, Tomas Vondra
Reviewed-by: Tomas Vondra, Andrey Lepikhov, Claudio Freire, Ibrar Ahmed, Zhihong Yu
Discussion: https://postgr.es/m/7c79e6a5-8597-74e8-0671-1c39d124c9d6%40sigaev.ru
Discussion: https://postgr.es/m/CA%2Bq6zcW_4o2NC0zutLkOJPsFt80megSpX_dVRo6GK9PC-Jx_Ag%40mail.gmail.com
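The intuition in the commit message (cheap comparisons and high-cardinality keys first) can be illustrated with a toy model. The sketch below is not the planner's actual cost function. It assumes that values are uniform and independent, so two rows match on a key with probability 1/ndistinct, and that a key is compared only when all earlier keys were equal. The function names, the key names and the statistics are hypothetical.

```python
from itertools import permutations


def expected_comparison_cost(ordering, stats):
    """Expected cost of comparing two rows key by key, in the given order.

    stats maps key -> (cost of one comparison, number of distinct values).
    Assumption: a key is reached only if every earlier key compared equal,
    which happens with probability 1/ndistinct per earlier key.
    """
    total = 0.0
    p_reach = 1.0  # probability that this key needs to be compared at all
    for key in ordering:
        cmp_cost, ndistinct = stats[key]
        total += p_reach * cmp_cost
        p_reach /= ndistinct
    return total


def cheapest_ordering(keys, stats):
    """Brute force over all permutations; only viable for a few keys."""
    return min(permutations(keys),
               key=lambda o: expected_comparison_cost(o, stats))


def candidate_orderings(query_order, stats):
    """Keep the cheapest ordering plus the one written in the query."""
    candidates = [cheapest_ordering(query_order, stats), tuple(query_order)]
    return list(dict.fromkeys(candidates))  # de-duplicate, keep order


# Hypothetical statistics: (comparison cost, number of distinct values).
stats = {
    "country": (1.0, 10),
    "user_id": (1.0, 100000),
    "description": (20.0, 50000),  # expensive comparison, e.g. long text
}
query_order = ["description", "country", "user_id"]
best = cheapest_ordering(query_order, stats)
candidates = candidate_orderings(query_order, stats)
```

With these made-up numbers the model prefers `user_id` first (cheap and nearly unique) and `description` last (expensive). The brute-force search is exactly the "all permutations" approach the commit message calls too expensive in general, so it serves here only to make the cost trade-off visible.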

1 parent 606948b, commit db0d67d

File tree

24 files changed: +1882 -494 lines changed

contrib/postgres_fdw/expected
- postgres_fdw.out
doc/src/sgml
- config.sgml
src
- backend
  - optimizer
    - path
    - plan
      - planner.c
    - util
      - pathnode.c
  - utils
    - adt
      - selfuncs.c
    - misc
      - guc.c
      - postgresql.conf.sample
- include
  - nodes
    - nodes.h
    - pathnodes.h
  - optimizer
    - cost.h
    - paths.h
  - utils
    - selfuncs.h
- test/regress
  - expected
  - sql
    - aggregates.sql
    - incremental_sort.sql

`‎contrib/postgres_fdw/expected/postgres_fdw.out‎`

Lines changed: 6 additions & 9 deletions

Original file line number	Diff line number	Diff line change
`@@ -2741,16 +2741,13 @@ select c2 * (random() <= 1)::int as c2 from ft2 group by c2 * (random() <= 1)::i`
`2741`	`2741`	`-- GROUP BY clause in various forms, cardinal, alias and constant expression`
`2742`	`2742`	`explain (verbose, costs off)`
`2743`	`2743`	`select count(c2) w, c2 x, 5 y, 7.0 z from ft1 group by 2, y, 9.0::int order by 2;`
`2744`		`- QUERY PLAN`
`2745`		`----------------------------------------------------------------------------------------`
`2746`		`-Sort`
	`2744`	`+QUERY PLAN`
	`2745`	`+------------------------------------------------------------------------------------------------------------`
	`2746`	`+Foreign Scan`
`2747`	`2747`	`Output: (count(c2)), c2, 5, 7.0, 9`
`2748`		`- Sort Key: ft1.c2`
`2749`		`- -> Foreign Scan`
`2750`		`- Output: (count(c2)), c2, 5, 7.0, 9`
`2751`		`- Relations: Aggregate on (public.ft1)`
`2752`		`- Remote SQL: SELECT count(c2), c2, 5, 7.0, 9 FROM "S 1"."T 1" GROUP BY 2, 3, 5`
`2753`		`-(7 rows)`
	`2748`	`+ Relations: Aggregate on (public.ft1)`
	`2749`	`+ Remote SQL: SELECT count(c2), c2, 5, 7.0, 9 FROM "S 1"."T 1" GROUP BY 2, 3, 5 ORDER BY c2 ASC NULLS LAST`
	`2750`	`+(4 rows)`
`2754`	`2751`
`2755`	`2752`	`select count(c2) w, c2 x, 5 y, 7.0 z from ft1 group by 2, y, 9.0::int order by 2;`
`2756`	`2753`	`w \| x \| y \| z`

`‎doc/src/sgml/config.sgml‎`

Lines changed: 14 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -4967,6 +4967,20 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="`
`4967`	`4967`	`</listitem>`
`4968`	`4968`	`</varlistentry>`
`4969`	`4969`
	`4970`	`+ <varlistentry id="guc-enable-groupby-reordering" xreflabel="enable_group_by_reordering">`
	`4971`	`+ <term><varname>enable_group_by_reordering</varname> (<type>boolean</type>)`
	`4972`	`+ <indexterm>`
	`4973`	`+ <primary><varname>enable_group_by_reordering</varname> configuration parameter</primary>`
	`4974`	`+ </indexterm>`
	`4975`	`+ </term>`
	`4976`	`+ <listitem>`
	`4977`	`+ <para>`
	`4978`	`+ Enables or disables reodering of keys in <literal>GROUP BY</literal>`
	`4979`	`+ clause. The default is <literal>on</literal>.`
	`4980`	`+ </para>`
	`4981`	`+ </listitem>`
	`4982`	`+ </varlistentry>`
	`4983`	`+`
`4970`	`4984`	`<varlistentry id="guc-enable-hashagg" xreflabel="enable_hashagg">`
`4971`	`4985`	`<term><varname>enable_hashagg</varname> (<type>boolean</type>)`
`4972`	`4986`	`<indexterm>`
