Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commit1b55878

Browse files
committed
Fix planning of star-schema-style queries.
Part of the intent of the parameterized-path mechanism was to handlestar-schema queries efficiently, but some overly-restrictive searchlimiting logic added in commite2fa76dprevented such cases from working as desired. Fix that and add aregression test about it. Per gripe from Marc Cousin.This is arguably a bug rather than a new feature, so back-patch to 9.2where parameterized paths were introduced.
1 parentabce8dc commit1b55878

File tree

4 files changed

+86
-17
lines changed

4 files changed

+86
-17
lines changed

‎src/backend/optimizer/README

Lines changed: 27 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -705,17 +705,37 @@ intermediate layers of joins, for example:
705705
-> Index Scan using C_Z_IDX on C
706706
Index Condition: C.Z = A.X
707707

708-
If all joins are plain inner joins then this is unnecessary, because
709-
it's always possible to reorder the joins so that a parameter is used
708+
If all joins are plain inner joins then this isusuallyunnecessary,
709+
becauseit's possible to reorder the joins so that a parameter is used
710710
immediately below the nestloop node that provides it. But in the
711-
presence of outer joins, join reordering may not be possible, and then
712-
this option can be critical. Before version 9.2, Postgres used ad-hoc
713-
methods for planning and executing such queries, and those methods could
714-
not handle passing parameters down through multiple join levels.
711+
presence of outer joins, such join reordering may not be possible.
712+
713+
Also, the bottom-level scan might require parameters from more than one
714+
other relation. In principle we could join the other relations first
715+
so that all the parameters are supplied from a single nestloop level.
716+
But if those other relations have no join clause in common (which is
717+
common in star-schema queries for instance), the planner won't consider
718+
joining them directly to each other. In such a case we need to be able
719+
to create a plan like
720+
721+
NestLoop
722+
-> Seq Scan on SmallTable1 A
723+
NestLoop
724+
-> Seq Scan on SmallTable2 B
725+
NestLoop
726+
-> Index Scan using XYIndex on LargeTable C
727+
Index Condition: C.X = A.AID and C.Y = B.BID
728+
729+
so we should be willing to pass down A.AID through a join even though
730+
there is no join order constraint forcing the plan to look like this.
731+
732+
Before version 9.2, Postgres used ad-hoc methods for planning and
733+
executing nestloop queries of this kind, and those methods could not
734+
handle passing parameters down through multiple join levels.
715735

716736
To plan such queries, we now use a notion of a "parameterized path",
717737
which is a path that makes use of a join clause to a relation that's not
718-
scanned by the path. In the examplejust above, we would construct a
738+
scanned by the path. In the exampletwo above, we would construct a
719739
path representing the possibility of doing this:
720740

721741
-> Index Scan using C_Z_IDX on C

‎src/backend/optimizer/path/joinpath.c

Lines changed: 31 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -117,13 +117,14 @@ add_paths_to_joinrel(PlannerInfo *root,
117117
/*
118118
* Decide whether it's sensible to generate parameterized paths for this
119119
* joinrel, and if so, which relations such paths should require. There
120-
* is no need to create a parameterized result path unless there is a join
121-
* order restriction that prevents joining one of our input rels directly
122-
* to the parameter source rel instead of joining to the other input rel.
123-
* This restriction reduces the number of parameterized paths we have to
124-
* deal with at higher join levels, without compromising the quality of
125-
* the resulting plan. We express the restriction as a Relids set that
126-
* must overlap the parameterization of any proposed join path.
120+
* is usually no need to create a parameterized result path unless there
121+
* is a join order restriction that prevents joining one of our input rels
122+
* directly to the parameter source rel instead of joining to the other
123+
* input rel. (But see exception in try_nestloop_path.) This restriction
124+
* reduces the number of parameterized paths we have to deal with at
125+
* higher join levels, without compromising the quality of the resulting
126+
* plan. We express the restriction as a Relids set that must overlap the
127+
* parameterization of any proposed join path.
127128
*/
128129
foreach(lc,root->join_info_list)
129130
{
@@ -291,9 +292,29 @@ try_nestloop_path(PlannerInfo *root,
291292
if (required_outer&&
292293
!bms_overlap(required_outer,param_source_rels))
293294
{
294-
/* Waste no memory when we reject a path here */
295-
bms_free(required_outer);
296-
return;
295+
/*
296+
* We override the param_source_rels heuristic to accept nestloop
297+
* paths in which the outer rel satisfies some but not all of the
298+
* inner path's parameterization. This is necessary to get good plans
299+
* for star-schema scenarios, in which a parameterized path for a
300+
* large table may require parameters from multiple small tables that
301+
* will not get joined directly to each other. We can handle that by
302+
* stacking nestloops that have the small tables on the outside; but
303+
* this breaks the rule the param_source_rels heuristic is based on,
304+
* namely that parameters should not be passed down across joins
305+
* unless there's a join-order-constraint-based reason to do so. So
306+
* ignore the param_source_rels restriction when this case applies.
307+
*/
308+
Relidsouterrelids=outer_path->parent->relids;
309+
Relidsinnerparams=PATH_REQ_OUTER(inner_path);
310+
311+
if (!(bms_overlap(innerparams,outerrelids)&&
312+
bms_nonempty_difference(innerparams,outerrelids)))
313+
{
314+
/* Waste no memory when we reject a path here */
315+
bms_free(required_outer);
316+
return;
317+
}
297318
}
298319

299320
/*

‎src/test/regress/expected/join.out

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2780,6 +2780,25 @@ where thousand = (q1 + q2);
27802780
-> Seq Scan on int4_tbl
27812781
(12 rows)
27822782

2783+
--
2784+
-- test ability to generate a suitable plan for a star-schema query
2785+
--
2786+
explain (costs off)
2787+
select * from
2788+
tenk1, int8_tbl a, int8_tbl b
2789+
where thousand = a.q1 and tenthous = b.q1 and a.q2 = 1 and b.q2 = 2;
2790+
QUERY PLAN
2791+
---------------------------------------------------------------------
2792+
Nested Loop
2793+
-> Seq Scan on int8_tbl b
2794+
Filter: (q2 = 2)
2795+
-> Nested Loop
2796+
-> Seq Scan on int8_tbl a
2797+
Filter: (q2 = 1)
2798+
-> Index Scan using tenk1_thous_tenthous on tenk1
2799+
Index Cond: ((thousand = a.q1) AND (tenthous = b.q1))
2800+
(8 rows)
2801+
27832802
--
27842803
-- test placement of movable quals in a parameterized join tree
27852804
--

‎src/test/regress/sql/join.sql

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -763,6 +763,15 @@ select * from
763763
int4(sin(0)) q2
764764
where thousand= (q1+ q2);
765765

766+
--
767+
-- test ability to generate a suitable plan for a star-schema query
768+
--
769+
770+
explain (costs off)
771+
select*from
772+
tenk1, int8_tbl a, int8_tbl b
773+
where thousand=a.q1and tenthous=b.q1anda.q2=1andb.q2=2;
774+
766775
--
767776
-- test placement of movable quals in a parameterized join tree
768777
--

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp