Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commitfdacbf9

Browse files
committed
Fix planning of star-schema-style queries.
Part of the intent of the parameterized-path mechanism was to handlestar-schema queries efficiently, but some overly-restrictive searchlimiting logic added in commite2fa76dprevented such cases from working as desired. Fix that and add aregression test about it. Per gripe from Marc Cousin.This is arguably a bug rather than a new feature, so back-patch to 9.2where parameterized paths were introduced.
1 parent60ff5de commitfdacbf9

File tree

4 files changed

+86
-17
lines changed

4 files changed

+86
-17
lines changed

‎src/backend/optimizer/README

Lines changed: 27 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -702,17 +702,37 @@ intermediate layers of joins, for example:
702702
-> Index Scan using C_Z_IDX on C
703703
Index Condition: C.Z = A.X
704704

705-
If all joins are plain inner joins then this is unnecessary, because
706-
it's always possible to reorder the joins so that a parameter is used
705+
If all joins are plain inner joins then this isusuallyunnecessary,
706+
becauseit's possible to reorder the joins so that a parameter is used
707707
immediately below the nestloop node that provides it. But in the
708-
presence of outer joins, join reordering may not be possible, and then
709-
this option can be critical. Before version 9.2, Postgres used ad-hoc
710-
methods for planning and executing such queries, and those methods could
711-
not handle passing parameters down through multiple join levels.
708+
presence of outer joins, such join reordering may not be possible.
709+
710+
Also, the bottom-level scan might require parameters from more than one
711+
other relation. In principle we could join the other relations first
712+
so that all the parameters are supplied from a single nestloop level.
713+
But if those other relations have no join clause in common (which is
714+
common in star-schema queries for instance), the planner won't consider
715+
joining them directly to each other. In such a case we need to be able
716+
to create a plan like
717+
718+
NestLoop
719+
-> Seq Scan on SmallTable1 A
720+
NestLoop
721+
-> Seq Scan on SmallTable2 B
722+
NestLoop
723+
-> Index Scan using XYIndex on LargeTable C
724+
Index Condition: C.X = A.AID and C.Y = B.BID
725+
726+
so we should be willing to pass down A.AID through a join even though
727+
there is no join order constraint forcing the plan to look like this.
728+
729+
Before version 9.2, Postgres used ad-hoc methods for planning and
730+
executing nestloop queries of this kind, and those methods could not
731+
handle passing parameters down through multiple join levels.
712732

713733
To plan such queries, we now use a notion of a "parameterized path",
714734
which is a path that makes use of a join clause to a relation that's not
715-
scanned by the path. In the examplejust above, we would construct a
735+
scanned by the path. In the exampletwo above, we would construct a
716736
path representing the possibility of doing this:
717737

718738
-> Index Scan using C_Z_IDX on C

‎src/backend/optimizer/path/joinpath.c

Lines changed: 31 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -117,13 +117,14 @@ add_paths_to_joinrel(PlannerInfo *root,
117117
/*
118118
* Decide whether it's sensible to generate parameterized paths for this
119119
* joinrel, and if so, which relations such paths should require. There
120-
* is no need to create a parameterized result path unless there is a join
121-
* order restriction that prevents joining one of our input rels directly
122-
* to the parameter source rel instead of joining to the other input rel.
123-
* This restriction reduces the number of parameterized paths we have to
124-
* deal with at higher join levels, without compromising the quality of
125-
* the resulting plan. We express the restriction as a Relids set that
126-
* must overlap the parameterization of any proposed join path.
120+
* is usually no need to create a parameterized result path unless there
121+
* is a join order restriction that prevents joining one of our input rels
122+
* directly to the parameter source rel instead of joining to the other
123+
* input rel. (But see exception in try_nestloop_path.) This restriction
124+
* reduces the number of parameterized paths we have to deal with at
125+
* higher join levels, without compromising the quality of the resulting
126+
* plan. We express the restriction as a Relids set that must overlap the
127+
* parameterization of any proposed join path.
127128
*/
128129
foreach(lc,root->join_info_list)
129130
{
@@ -291,9 +292,29 @@ try_nestloop_path(PlannerInfo *root,
291292
if (required_outer&&
292293
!bms_overlap(required_outer,param_source_rels))
293294
{
294-
/* Waste no memory when we reject a path here */
295-
bms_free(required_outer);
296-
return;
295+
/*
296+
* We override the param_source_rels heuristic to accept nestloop
297+
* paths in which the outer rel satisfies some but not all of the
298+
* inner path's parameterization. This is necessary to get good plans
299+
* for star-schema scenarios, in which a parameterized path for a
300+
* large table may require parameters from multiple small tables that
301+
* will not get joined directly to each other. We can handle that by
302+
* stacking nestloops that have the small tables on the outside; but
303+
* this breaks the rule the param_source_rels heuristic is based on,
304+
* namely that parameters should not be passed down across joins
305+
* unless there's a join-order-constraint-based reason to do so. So
306+
* ignore the param_source_rels restriction when this case applies.
307+
*/
308+
Relidsouterrelids=outer_path->parent->relids;
309+
Relidsinnerparams=PATH_REQ_OUTER(inner_path);
310+
311+
if (!(bms_overlap(innerparams,outerrelids)&&
312+
bms_nonempty_difference(innerparams,outerrelids)))
313+
{
314+
/* Waste no memory when we reject a path here */
315+
bms_free(required_outer);
316+
return;
317+
}
297318
}
298319

299320
/*

‎src/test/regress/expected/join.out

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2780,6 +2780,25 @@ where thousand = (q1 + q2);
27802780
-> Seq Scan on int4_tbl
27812781
(12 rows)
27822782

2783+
--
2784+
-- test ability to generate a suitable plan for a star-schema query
2785+
--
2786+
explain (costs off)
2787+
select * from
2788+
tenk1, int8_tbl a, int8_tbl b
2789+
where thousand = a.q1 and tenthous = b.q1 and a.q2 = 1 and b.q2 = 2;
2790+
QUERY PLAN
2791+
---------------------------------------------------------------------
2792+
Nested Loop
2793+
-> Seq Scan on int8_tbl b
2794+
Filter: (q2 = 2)
2795+
-> Nested Loop
2796+
-> Seq Scan on int8_tbl a
2797+
Filter: (q2 = 1)
2798+
-> Index Scan using tenk1_thous_tenthous on tenk1
2799+
Index Cond: ((thousand = a.q1) AND (tenthous = b.q1))
2800+
(8 rows)
2801+
27832802
--
27842803
-- test extraction of restriction OR clauses from join OR clause
27852804
-- (we used to only do this for indexable clauses)

‎src/test/regress/sql/join.sql

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -763,6 +763,15 @@ select * from
763763
int4(sin(0)) q2
764764
where thousand= (q1+ q2);
765765

766+
--
767+
-- test ability to generate a suitable plan for a star-schema query
768+
--
769+
770+
explain (costs off)
771+
select*from
772+
tenk1, int8_tbl a, int8_tbl b
773+
where thousand=a.q1and tenthous=b.q1anda.q2=1andb.q2=2;
774+
766775
--
767776
-- test extraction of restriction OR clauses from join OR clause
768777
-- (we used to only do this for indexable clauses)

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp