Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commite3ffd05

Browse files
committed
Weaken the planner's tests for relevant joinclauses.
We should be willing to cross-join two small relations if that allows usto use an inner indexscan on a large relation (that is, the potentialindexqual for the large table requires both smaller relations). Thisworked in simple cases but fell apart as soon as there was a join clauseto a fourth relation, because the existence of any two-relation join clausecaused the planner to not consider clauseless joins between other baserelations. The added regression test shows an example case adapted froma recent complaint from Benoit Delbosc.Adjust have_relevant_joinclause, have_relevant_eclass_joinclause, andhas_relevant_eclass_joinclause to consider that a join clause mentioningthree or more relations is sufficient grounds for joining any subset ofthose relations, even if we have to do so via a cartesian join. Since suchclauses are relatively uncommon, this shouldn't affect planning speed ontypical queries; in fact it should help a bit, because the latter twofunctions in particular get significantly simpler.Although this is arguably a bug fix, I'm not going to risk back-patchingit, since it might have currently-unforeseen consequences.
1 parentc0cc526 commite3ffd05

File tree

5 files changed

+110
-93
lines changed

5 files changed

+110
-93
lines changed

‎src/backend/optimizer/path/equivclass.c

Lines changed: 19 additions & 77 deletions
Original file line numberDiff line numberDiff line change
@@ -2035,7 +2035,7 @@ get_parent_relid(PlannerInfo *root, RelOptInfo *rel)
20352035
/*
20362036
* have_relevant_eclass_joinclause
20372037
*Detect whether there is an EquivalenceClass that could produce
2038-
*a joinclausebetween the two given relations.
2038+
*a joinclauseinvolving the two given relations.
20392039
*
20402040
* This is essentially a very cut-down version of
20412041
* generate_join_implied_equalities().Note it's OK to occasionally say "yes"
@@ -2051,9 +2051,6 @@ have_relevant_eclass_joinclause(PlannerInfo *root,
20512051
foreach(lc1,root->eq_classes)
20522052
{
20532053
EquivalenceClass*ec= (EquivalenceClass*)lfirst(lc1);
2054-
boolhas_rel1;
2055-
boolhas_rel2;
2056-
ListCell*lc2;
20572054

20582055
/*
20592056
* Won't generate joinclauses if single-member (this test covers the
@@ -2063,45 +2060,27 @@ have_relevant_eclass_joinclause(PlannerInfo *root,
20632060
continue;
20642061

20652062
/*
2063+
* We do not need to examine the individual members of the EC, because
2064+
* all that we care about is whether each rel overlaps the relids of
2065+
* at least one member, and a test on ec_relids is sufficient to prove
2066+
* that. (As with have_relevant_joinclause(), it is not necessary
2067+
* that the EC be able to form a joinclause relating exactly the two
2068+
* given rels, only that it be able to form a joinclause mentioning
2069+
* both, and this will surely be true if both of them overlap
2070+
* ec_relids.)
2071+
*
20662072
* Note we don't test ec_broken; if we did, we'd need a separate code
2067-
* path to look through ec_sources. Checking themembers anyway is OK
2068-
* as a possibly-overoptimistic heuristic.
2073+
* path to look through ec_sources. Checking themembership anyway is
2074+
*OKas a possibly-overoptimistic heuristic.
20692075
*
20702076
* We don't test ec_has_const either, even though a const eclass won't
20712077
* generate real join clauses.This is because if we had "WHERE a.x =
20722078
* b.y and a.x = 42", it is worth considering a join between a and b,
20732079
* since the join result is likely to be small even though it'll end
20742080
* up being an unqualified nestloop.
20752081
*/
2076-
2077-
/* Needn't scan if it couldn't contain members from each rel */
2078-
if (!bms_overlap(rel1->relids,ec->ec_relids)||
2079-
!bms_overlap(rel2->relids,ec->ec_relids))
2080-
continue;
2081-
2082-
/* Scan the EC to see if it has member(s) in each rel */
2083-
has_rel1=has_rel2= false;
2084-
foreach(lc2,ec->ec_members)
2085-
{
2086-
EquivalenceMember*cur_em= (EquivalenceMember*)lfirst(lc2);
2087-
2088-
if (cur_em->em_is_const||cur_em->em_is_child)
2089-
continue;/* ignore consts and children here */
2090-
if (bms_is_subset(cur_em->em_relids,rel1->relids))
2091-
{
2092-
has_rel1= true;
2093-
if (has_rel2)
2094-
break;
2095-
}
2096-
if (bms_is_subset(cur_em->em_relids,rel2->relids))
2097-
{
2098-
has_rel2= true;
2099-
if (has_rel1)
2100-
break;
2101-
}
2102-
}
2103-
2104-
if (has_rel1&&has_rel2)
2082+
if (bms_overlap(rel1->relids,ec->ec_relids)&&
2083+
bms_overlap(rel2->relids,ec->ec_relids))
21052084
return true;
21062085
}
21072086

@@ -2112,7 +2091,7 @@ have_relevant_eclass_joinclause(PlannerInfo *root,
21122091
/*
21132092
* has_relevant_eclass_joinclause
21142093
*Detect whether there is an EquivalenceClass that could produce
2115-
*a joinclausebetween the given relation and anything else.
2094+
*a joinclauseinvolving the given relation and anything else.
21162095
*
21172096
* This is the same as have_relevant_eclass_joinclause with the other rel
21182097
* implicitly defined as "everything else in the query".
@@ -2125,9 +2104,6 @@ has_relevant_eclass_joinclause(PlannerInfo *root, RelOptInfo *rel1)
21252104
foreach(lc1,root->eq_classes)
21262105
{
21272106
EquivalenceClass*ec= (EquivalenceClass*)lfirst(lc1);
2128-
boolhas_rel1;
2129-
boolhas_rel2;
2130-
ListCell*lc2;
21312107

21322108
/*
21332109
* Won't generate joinclauses if single-member (this test covers the
@@ -2137,45 +2113,11 @@ has_relevant_eclass_joinclause(PlannerInfo *root, RelOptInfo *rel1)
21372113
continue;
21382114

21392115
/*
2140-
* Note we don't test ec_broken; if we did, we'd need a separate code
2141-
* path to look through ec_sources. Checking the members anyway is OK
2142-
* as a possibly-overoptimistic heuristic.
2143-
*
2144-
* We don't test ec_has_const either, even though a const eclass won't
2145-
* generate real join clauses.This is because if we had "WHERE a.x =
2146-
* b.y and a.x = 42", it is worth considering a join between a and b,
2147-
* since the join result is likely to be small even though it'll end
2148-
* up being an unqualified nestloop.
2116+
* Per the comment in have_relevant_eclass_joinclause, it's sufficient
2117+
* to find an EC that mentions both this rel and some other rel.
21492118
*/
2150-
2151-
/* Needn't scan if it couldn't contain members from each rel */
2152-
if (!bms_overlap(rel1->relids,ec->ec_relids)||
2153-
bms_is_subset(ec->ec_relids,rel1->relids))
2154-
continue;
2155-
2156-
/* Scan the EC to see if it has member(s) in each rel */
2157-
has_rel1=has_rel2= false;
2158-
foreach(lc2,ec->ec_members)
2159-
{
2160-
EquivalenceMember*cur_em= (EquivalenceMember*)lfirst(lc2);
2161-
2162-
if (cur_em->em_is_const||cur_em->em_is_child)
2163-
continue;/* ignore consts and children here */
2164-
if (bms_is_subset(cur_em->em_relids,rel1->relids))
2165-
{
2166-
has_rel1= true;
2167-
if (has_rel2)
2168-
break;
2169-
}
2170-
if (!bms_overlap(cur_em->em_relids,rel1->relids))
2171-
{
2172-
has_rel2= true;
2173-
if (has_rel1)
2174-
break;
2175-
}
2176-
}
2177-
2178-
if (has_rel1&&has_rel2)
2119+
if (bms_overlap(rel1->relids,ec->ec_relids)&&
2120+
!bms_is_subset(ec->ec_relids,rel1->relids))
21792121
return true;
21802122
}
21812123

‎src/backend/optimizer/path/joinrels.c

Lines changed: 5 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -88,13 +88,9 @@ join_search_one_level(PlannerInfo *root, int level)
8888
has_join_restriction(root,old_rel))
8989
{
9090
/*
91-
* Note that if all available join clauses for this rel require
92-
* more than one other rel, we will fail to make any joins against
93-
* it here. In most cases that's OK; it'll be considered by
94-
* "bushy plan" join code in a higher-level pass where we have
95-
* those other rels collected into a join rel.
96-
*
97-
* See also the last-ditch case below.
91+
* There are relevant join clauses or join order restrictions,
92+
* so consider joins between this rel and (only) those rels it is
93+
* linked to by a clause or restriction.
9894
*/
9995
make_rels_by_clause_joins(root,
10096
old_rel,
@@ -160,8 +156,8 @@ join_search_one_level(PlannerInfo *root, int level)
160156
{
161157
/*
162158
* OK, we can build a rel of the right level from this
163-
* pair of rels. Do so if there is at least oneusable
164-
* join clause ora relevantjoin restriction.
159+
* pair of rels. Do so if there is at least onerelevant
160+
* join clause or join order restriction.
165161
*/
166162
if (have_relevant_joinclause(root,old_rel,new_rel)||
167163
have_join_order_restriction(root,old_rel,new_rel))

‎src/backend/optimizer/util/joininfo.c

Lines changed: 17 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -21,34 +21,46 @@
2121

2222
/*
2323
* have_relevant_joinclause
24-
*Detect whether there is a joinclause thatcan be used to join
24+
*Detect whether there is a joinclause thatinvolves
2525
*the two given relations.
26+
*
27+
* Note: the joinclause does not have to be evaluatable with only these two
28+
* relations. This is intentional. For example consider
29+
*SELECT * FROM a, b, c WHERE a.x = (b.y + c.z)
30+
* If a is much larger than the other tables, it may be worthwhile to
31+
* cross-join b and c and then use an inner indexscan on a.x. Therefore
32+
* we should consider this joinclause as reason to join b to c, even though
33+
* it can't be applied at that join step.
2634
*/
2735
bool
2836
have_relevant_joinclause(PlannerInfo*root,
2937
RelOptInfo*rel1,RelOptInfo*rel2)
3038
{
3139
boolresult= false;
32-
Relidsjoin_relids;
3340
List*joininfo;
41+
Relidsother_relids;
3442
ListCell*l;
3543

36-
join_relids=bms_union(rel1->relids,rel2->relids);
37-
3844
/*
3945
* We could scan either relation's joininfo list; may as well use the
4046
* shorter one.
4147
*/
4248
if (list_length(rel1->joininfo) <=list_length(rel2->joininfo))
49+
{
4350
joininfo=rel1->joininfo;
51+
other_relids=rel2->relids;
52+
}
4453
else
54+
{
4555
joininfo=rel2->joininfo;
56+
other_relids=rel1->relids;
57+
}
4658

4759
foreach(l,joininfo)
4860
{
4961
RestrictInfo*rinfo= (RestrictInfo*)lfirst(l);
5062

51-
if (bms_is_subset(rinfo->required_relids,join_relids))
63+
if (bms_overlap(other_relids,rinfo->required_relids))
5264
{
5365
result= true;
5466
break;
@@ -62,8 +74,6 @@ have_relevant_joinclause(PlannerInfo *root,
6274
if (!result&&rel1->has_eclass_joins&&rel2->has_eclass_joins)
6375
result=have_relevant_eclass_joinclause(root,rel1,rel2);
6476

65-
bms_free(join_relids);
66-
6777
returnresult;
6878
}
6979

‎src/test/regress/expected/join.out

Lines changed: 51 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2666,6 +2666,57 @@ select * from int4_tbl a full join int4_tbl b on false;
26662666
-2147483647 |
26672667
(10 rows)
26682668

2669+
--
2670+
-- test for ability to use a cartesian join when necessary
2671+
--
2672+
explain (costs off)
2673+
select * from
2674+
tenk1 join int4_tbl on f1 = twothousand,
2675+
int4(sin(1)) q1,
2676+
int4(sin(0)) q2
2677+
where q1 = thousand or q2 = thousand;
2678+
QUERY PLAN
2679+
-----------------------------------------------------------------------------
2680+
Hash Join
2681+
Hash Cond: (tenk1.twothousand = int4_tbl.f1)
2682+
-> Nested Loop
2683+
Join Filter: ((q1.q1 = tenk1.thousand) OR (q2.q2 = tenk1.thousand))
2684+
-> Nested Loop
2685+
-> Function Scan on q1
2686+
-> Function Scan on q2
2687+
-> Bitmap Heap Scan on tenk1
2688+
Recheck Cond: ((q1.q1 = thousand) OR (q2.q2 = thousand))
2689+
-> BitmapOr
2690+
-> Bitmap Index Scan on tenk1_thous_tenthous
2691+
Index Cond: (q1.q1 = thousand)
2692+
-> Bitmap Index Scan on tenk1_thous_tenthous
2693+
Index Cond: (q2.q2 = thousand)
2694+
-> Hash
2695+
-> Seq Scan on int4_tbl
2696+
(16 rows)
2697+
2698+
explain (costs off)
2699+
select * from
2700+
tenk1 join int4_tbl on f1 = twothousand,
2701+
int4(sin(1)) q1,
2702+
int4(sin(0)) q2
2703+
where thousand = (q1 + q2);
2704+
QUERY PLAN
2705+
--------------------------------------------------------------
2706+
Hash Join
2707+
Hash Cond: (tenk1.twothousand = int4_tbl.f1)
2708+
-> Nested Loop
2709+
-> Nested Loop
2710+
-> Function Scan on q1
2711+
-> Function Scan on q2
2712+
-> Bitmap Heap Scan on tenk1
2713+
Recheck Cond: (thousand = (q1.q1 + q2.q2))
2714+
-> Bitmap Index Scan on tenk1_thous_tenthous
2715+
Index Cond: (thousand = (q1.q1 + q2.q2))
2716+
-> Hash
2717+
-> Seq Scan on int4_tbl
2718+
(12 rows)
2719+
26692720
--
26702721
-- test join removal
26712722
--

‎src/test/regress/sql/join.sql

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -689,6 +689,24 @@ order by 1,2;
689689
select*from int4_tbl a fulljoin int4_tbl bon true;
690690
select*from int4_tbl a fulljoin int4_tbl bon false;
691691

692+
--
693+
-- test for ability to use a cartesian join when necessary
694+
--
695+
696+
explain (costs off)
697+
select*from
698+
tenk1join int4_tblon f1= twothousand,
699+
int4(sin(1)) q1,
700+
int4(sin(0)) q2
701+
where q1= thousandor q2= thousand;
702+
703+
explain (costs off)
704+
select*from
705+
tenk1join int4_tblon f1= twothousand,
706+
int4(sin(1)) q1,
707+
int4(sin(0)) q2
708+
where thousand= (q1+ q2);
709+
692710
--
693711
-- test join removal
694712
--

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp