NotificationsYou must be signed in to change notification settings
Fork5
Star26

Commite1fad50

committed

Revise generation of hashjoin paths: generate one path per

hashjoinable clause, not one path for a randomly-chosen element of eachset of clauses with the same join operator. That is, if you wrote SELECT ... WHERE t1.f1 = t2.f2 and t1.f3 = t2.f4,and both '=' ops were the same opcode (say, all four fields are int4),then the system would either consider hashing on f1=f2 or on f3=f4,but it would *not* consider both possibilities. Boo hiss.Also, revise estimation of hashjoin costs to include a penalty when theinner join var has a high disbursion --- ie, the most common value ispretty common. This tends to lead to badly skewed hash bucket occupancyand way more comparisons than you'd expect on average.I imagine that the cost calculation still needs tweaking, but at leastit generates a more reasonable plan than before on George Young's example.

1 parentb7883d7 commite1fad50Copy full SHA for e1fad50

File tree

5 files changed

+201

-118

lines changed

src
- backend/optimizer
  - path
    - costsize.c
    - joinpath.c
  - util
    - pathnode.c
- include/optimizer
  - cost.h
  - pathnode.h

5 files changed

+201

-118

lines changed

`‎src/backend/optimizer/path/costsize.c`

Lines changed: 47 additions & 29 deletions

Original file line number	Diff line number	Diff line change
`@@ -3,18 +3,30 @@`
`3`	`3`	`* costsize.c`
`4`	`4`	`* Routines to compute (and set) relation sizes and path costs`
`5`	`5`	`*`
`6`		`- * Copyright (c) 1994, Regents of the University of California`
	`6`	`+ * Path costs are measured in units of disk accesses: one page fetch`
	`7`	`+ * has cost 1. The other primitive unit is the CPU time required to`
	`8`	`+ * process one tuple, which we set at "_cpu_page_weight_" of a page`
	`9`	`+ * fetch. Obviously, the CPU time per tuple depends on the query`
	`10`	`+ * involved, but the relative CPU and disk speeds of a given platform`
	`11`	`+ * are so variable that we are lucky if we can get useful numbers`
	`12`	`+ * at all. _cpu_page_weight_ is user-settable, in case a particular`
	`13`	`+ * user is clueful enough to have a better-than-default estimate`
	`14`	`+ * of the ratio for his platform. There is also _cpu_index_page_weight_,`
	`15`	`+ * the cost to process a tuple of an index during an index scan.`
`7`	`16`	`*`
	`17`	`+ *`
	`18`	`+ * Copyright (c) 1994, Regents of the University of California`
`8`	`19`	`*`
`9`	`20`	`* IDENTIFICATION`
`10`		`- * $Header: /cvsroot/pgsql/src/backend/optimizer/path/costsize.c,v 1.43 1999/07/16 04:59:14 momjian Exp $`
	`21`	`+ * $Header: /cvsroot/pgsql/src/backend/optimizer/path/costsize.c,v 1.44 1999/08/06 04:00:15 tgl Exp $`
`11`	`22`	`*`
`12`	`23`	`*-------------------------------------------------------------------------`
`13`	`24`	`*/`
`14`	`25`
`15`	`26`	`#include<math.h>`
`16`	`27`
`17`	`28`	`#include"postgres.h"`
	`29`	`+`
`18`	`30`	`#ifdefHAVE_LIMITS_H`
`19`	`31`	`#include<limits.h>`
`20`	`32`	`#ifndefMAXINT`
`@@ -26,25 +38,24 @@`
`26`	`38`	`#endif`
`27`	`39`	`#endif`
`28`	`40`
`29`		`-`
	`41`	`+#include"miscadmin.h"`
`30`	`42`	`#include"optimizer/cost.h"`
`31`	`43`	`#include"optimizer/internal.h"`
`32`	`44`	`#include"optimizer/tlist.h"`
`33`	`45`	`#include"utils/lsyscache.h"`
`34`	`46`
`35`		`-externintNBuffers;`
`36`	`47`
	`48`	`+staticintcompute_targetlist_width(List*targetlist);`
`37`	`49`	`staticintcompute_attribute_width(TargetEntry*tlistentry);`
`38`	`50`	`staticdoublerelation_byte_size(inttuples,intwidth);`
`39`	`51`	`staticdoublebase_log(doublex,doubleb);`
`40`		`-staticintcompute_targetlist_width(List*targetlist);`
	`52`	`+`
`41`	`53`
`42`	`54`	`int_disable_cost_=30000000;`
`43`	`55`
`44`	`56`	`bool_enable_seqscan_= true;`
`45`	`57`	`bool_enable_indexscan_= true;`
`46`	`58`	`bool_enable_sort_= true;`
`47`		`-bool_enable_hash_= true;`
`48`	`59`	`bool_enable_nestloop_= true;`
`49`	`60`	`bool_enable_mergejoin_= true;`
`50`	`61`	`bool_enable_hashjoin_= true;`
`@@ -316,61 +327,68 @@ cost_mergejoin(Cost outercost,`
`316`	`327`	`}`
`317`	`328`
`318`	`329`	`/*`
`319`		`- * cost_hashjoin--XXX HASH`
	`330`	`+ * cost_hashjoin`
	`331`	`+ *`
`320`	`332`	`* 'outercost' and 'innercost' are the (disk+cpu) costs of scanning the`
`321`	`333`	`*outer and inner relations`
`322`		`- * 'outerkeys' and 'innerkeys' are lists of the keys to be used`
`323`		`- *to hash the outer and inner relations`
`324`	`334`	`* 'outersize' and 'innersize' are the number of tuples in the outer`
`325`	`335`	`*and inner relations`
`326`	`336`	`* 'outerwidth' and 'innerwidth' are the (typical) widths (in bytes)`
`327`	`337`	`*of the tuples of the outer and inner relations`
	`338`	`+ * 'innerdisbursion' is an estimate of the disbursion statistic`
	`339`	`+ *for the inner hash key.`
`328`	`340`	`*`
`329`	`341`	`* Returns a flonum.`
`330`	`342`	`*/`
`331`	`343`	`Cost`
`332`	`344`	`cost_hashjoin(Costoutercost,`
`333`	`345`	`Costinnercost,`
`334`		`-List*outerkeys,`
`335`		`-List*innerkeys,`
`336`	`346`	`intoutersize,`
`337`	`347`	`intinnersize,`
`338`	`348`	`intouterwidth,`
`339`		`-intinnerwidth)`
	`349`	`+intinnerwidth,`
	`350`	`+Costinnerdisbursion)`
`340`	`351`	`{`
`341`	`352`	`Costtemp=0;`
`342`		`-intouterpages=page_size(outersize,outerwidth);`
`343`		`-intinnerpages=page_size(innersize,innerwidth);`
	`353`	`+doubleouterbytes=relation_byte_size(outersize,outerwidth);`
	`354`	`+doubleinnerbytes=relation_byte_size(innersize,innerwidth);`
	`355`	`+longhashtablebytes=SortMem*1024L;`
`344`	`356`
`345`	`357`	`if (!_enable_hashjoin_)`
`346`	`358`	`temp+=_disable_cost_;`
`347`	`359`
`348`		`-/*`
`349`		`- * Bias against putting larger relation on inside.`
`350`		`- *`
`351`		`- * Code used to use "outerpages < innerpages" but that has poor`
`352`		`- * resolution when both relations are small.`
`353`		`- */`
`354`		`-if (relation_byte_size(outersize,outerwidth)<`
`355`		`-relation_byte_size(innersize,innerwidth))`
`356`		`-temp+=_disable_cost_;`
`357`		`-`
`358`	`360`	`/* cost of source data */`
`359`	`361`	`temp+=outercost+innercost;`
`360`	`362`
`361`	`363`	`/* cost of computing hash function: must do it once per tuple */`
`362`	`364`	`temp+=_cpu_page_weight_* (outersize+innersize);`
`363`	`365`
`364`		`-/* cost of main-memory hashtable */`
`365`		`-temp+= (innerpages<NBuffers) ?innerpages :NBuffers;`
	`366`	`+/* the number of tuple comparisons needed is the number of outer`
	`367`	`+ * tuples times the typical hash bucket size, which we estimate`
	`368`	`+ * conservatively as the inner disbursion times the inner tuple`
	`369`	`+ * count. The cost per comparison is set at _cpu_index_page_weight_;`
	`370`	`+ * is that reasonable, or do we need another basic parameter?`
	`371`	`+ */`
	`372`	`+temp+=_cpu_index_page_weight_outersize`
	`373`	`+(innersize*innerdisbursion);`
`366`	`374`
`367`	`375`	`/*`
`368`	`376`	`* if inner relation is too big then we will need to "batch" the join,`
`369`	`377`	`* which implies writing and reading most of the tuples to disk an`
`370`		`- * extra time.`
	`378`	`+ * extra time. Charge one cost unit per page of I/O.`
	`379`	`+ */`
	`380`	`+if (innerbytes>hashtablebytes)`
	`381`	`+temp+=2* (page_size(outersize,outerwidth)+`
	`382`	`+page_size(innersize,innerwidth));`
	`383`	`+`
	`384`	`+/*`
	`385`	`+ * Bias against putting larger relation on inside. We don't want`
	`386`	`+ * an absolute prohibition, though, since larger relation might have`
	`387`	`+ * better disbursion --- and we can't trust the size estimates`
	`388`	`+ * unreservedly, anyway.`
`371`	`389`	`*/`
`372`		`-if (innerpages>NBuffers)`
`373`		`-temp+=2* (outerpages+innerpages);`
	`390`	`+if (innerbytes>outerbytes)`
	`391`	`+temp=1.1;/ is this an OK fudge factor? */`
`374`	`392`
`375`	`393`	`Assert(temp >=0);`
`376`	`394`

0 commit comments

Comments

(0)

Movatterモバイル変換

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Commite1fad50

File tree

5 files changed

5 files changed

`‎src/backend/optimizer/path/costsize.c`

0 commit comments