Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commite1fad50

Browse files
committed
Revise generation of hashjoin paths: generate one path per
hashjoinable clause, not one path for a randomly-chosen element of eachset of clauses with the same join operator. That is, if you wrote SELECT ... WHERE t1.f1 = t2.f2 and t1.f3 = t2.f4,and both '=' ops were the same opcode (say, all four fields are int4),then the system would either consider hashing on f1=f2 or on f3=f4,but it would *not* consider both possibilities. Boo hiss.Also, revise estimation of hashjoin costs to include a penalty when theinner join var has a high disbursion --- ie, the most common value ispretty common. This tends to lead to badly skewed hash bucket occupancyand way more comparisons than you'd expect on average.I imagine that the cost calculation still needs tweaking, but at leastit generates a more reasonable plan than before on George Young's example.
1 parentb7883d7 commite1fad50

File tree

5 files changed

+201
-118
lines changed

5 files changed

+201
-118
lines changed

‎src/backend/optimizer/path/costsize.c

Lines changed: 47 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -3,18 +3,30 @@
33
* costsize.c
44
* Routines to compute (and set) relation sizes and path costs
55
*
6-
* Copyright (c) 1994, Regents of the University of California
6+
* Path costs are measured in units of disk accesses: one page fetch
7+
* has cost 1. The other primitive unit is the CPU time required to
8+
* process one tuple, which we set at "_cpu_page_weight_" of a page
9+
* fetch. Obviously, the CPU time per tuple depends on the query
10+
* involved, but the relative CPU and disk speeds of a given platform
11+
* are so variable that we are lucky if we can get useful numbers
12+
* at all. _cpu_page_weight_ is user-settable, in case a particular
13+
* user is clueful enough to have a better-than-default estimate
14+
* of the ratio for his platform. There is also _cpu_index_page_weight_,
15+
* the cost to process a tuple of an index during an index scan.
716
*
17+
*
18+
* Copyright (c) 1994, Regents of the University of California
819
*
920
* IDENTIFICATION
10-
* $Header: /cvsroot/pgsql/src/backend/optimizer/path/costsize.c,v 1.43 1999/07/16 04:59:14 momjian Exp $
21+
* $Header: /cvsroot/pgsql/src/backend/optimizer/path/costsize.c,v 1.44 1999/08/06 04:00:15 tgl Exp $
1122
*
1223
*-------------------------------------------------------------------------
1324
*/
1425

1526
#include<math.h>
1627

1728
#include"postgres.h"
29+
1830
#ifdefHAVE_LIMITS_H
1931
#include<limits.h>
2032
#ifndefMAXINT
@@ -26,25 +38,24 @@
2638
#endif
2739
#endif
2840

29-
41+
#include"miscadmin.h"
3042
#include"optimizer/cost.h"
3143
#include"optimizer/internal.h"
3244
#include"optimizer/tlist.h"
3345
#include"utils/lsyscache.h"
3446

35-
externintNBuffers;
3647

48+
staticintcompute_targetlist_width(List*targetlist);
3749
staticintcompute_attribute_width(TargetEntry*tlistentry);
3850
staticdoublerelation_byte_size(inttuples,intwidth);
3951
staticdoublebase_log(doublex,doubleb);
40-
staticintcompute_targetlist_width(List*targetlist);
52+
4153

4254
int_disable_cost_=30000000;
4355

4456
bool_enable_seqscan_= true;
4557
bool_enable_indexscan_= true;
4658
bool_enable_sort_= true;
47-
bool_enable_hash_= true;
4859
bool_enable_nestloop_= true;
4960
bool_enable_mergejoin_= true;
5061
bool_enable_hashjoin_= true;
@@ -316,61 +327,68 @@ cost_mergejoin(Cost outercost,
316327
}
317328

318329
/*
319-
* cost_hashjoin--XXX HASH
330+
* cost_hashjoin
331+
*
320332
* 'outercost' and 'innercost' are the (disk+cpu) costs of scanning the
321333
*outer and inner relations
322-
* 'outerkeys' and 'innerkeys' are lists of the keys to be used
323-
*to hash the outer and inner relations
324334
* 'outersize' and 'innersize' are the number of tuples in the outer
325335
*and inner relations
326336
* 'outerwidth' and 'innerwidth' are the (typical) widths (in bytes)
327337
*of the tuples of the outer and inner relations
338+
* 'innerdisbursion' is an estimate of the disbursion statistic
339+
*for the inner hash key.
328340
*
329341
* Returns a flonum.
330342
*/
331343
Cost
332344
cost_hashjoin(Costoutercost,
333345
Costinnercost,
334-
List*outerkeys,
335-
List*innerkeys,
336346
intoutersize,
337347
intinnersize,
338348
intouterwidth,
339-
intinnerwidth)
349+
intinnerwidth,
350+
Costinnerdisbursion)
340351
{
341352
Costtemp=0;
342-
intouterpages=page_size(outersize,outerwidth);
343-
intinnerpages=page_size(innersize,innerwidth);
353+
doubleouterbytes=relation_byte_size(outersize,outerwidth);
354+
doubleinnerbytes=relation_byte_size(innersize,innerwidth);
355+
longhashtablebytes=SortMem*1024L;
344356

345357
if (!_enable_hashjoin_)
346358
temp+=_disable_cost_;
347359

348-
/*
349-
* Bias against putting larger relation on inside.
350-
*
351-
* Code used to use "outerpages < innerpages" but that has poor
352-
* resolution when both relations are small.
353-
*/
354-
if (relation_byte_size(outersize,outerwidth)<
355-
relation_byte_size(innersize,innerwidth))
356-
temp+=_disable_cost_;
357-
358360
/* cost of source data */
359361
temp+=outercost+innercost;
360362

361363
/* cost of computing hash function: must do it once per tuple */
362364
temp+=_cpu_page_weight_* (outersize+innersize);
363365

364-
/* cost of main-memory hashtable */
365-
temp+= (innerpages<NBuffers) ?innerpages :NBuffers;
366+
/* the number of tuple comparisons needed is the number of outer
367+
* tuples times the typical hash bucket size, which we estimate
368+
* conservatively as the inner disbursion times the inner tuple
369+
* count. The cost per comparison is set at _cpu_index_page_weight_;
370+
* is that reasonable, or do we need another basic parameter?
371+
*/
372+
temp+=_cpu_index_page_weight_*outersize*
373+
(innersize*innerdisbursion);
366374

367375
/*
368376
* if inner relation is too big then we will need to "batch" the join,
369377
* which implies writing and reading most of the tuples to disk an
370-
* extra time.
378+
* extra time. Charge one cost unit per page of I/O.
379+
*/
380+
if (innerbytes>hashtablebytes)
381+
temp+=2* (page_size(outersize,outerwidth)+
382+
page_size(innersize,innerwidth));
383+
384+
/*
385+
* Bias against putting larger relation on inside. We don't want
386+
* an absolute prohibition, though, since larger relation might have
387+
* better disbursion --- and we can't trust the size estimates
388+
* unreservedly, anyway.
371389
*/
372-
if (innerpages>NBuffers)
373-
temp+=2* (outerpages+innerpages);
390+
if (innerbytes>outerbytes)
391+
temp*=1.1;/* is this an OK fudge factor? */
374392

375393
Assert(temp >=0);
376394

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp