Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commitd70c147

Browse files
committed
Avoid crash in partitionwise join planning under GEQO.
While trying to plan a partitionwise join, we may be faced with caseswhere one or both input partitions for a particular segment of the joinhave been pruned away. In HEAD and v11, this is problematic becauseearlier processing didn't bother to make a pruned RelOptInfo fullyvalid. With an upcoming patch to make partition pruning more efficient,this'll be even more problematic because said RelOptInfo won't exist atall.The existing code attempts to deal with this by retroactively making theRelOptInfo fully valid, but that causes crashes under GEQO because joinplanning is done in a short-lived memory context. In v11 we couldprobably have fixed this by switching to the planner's main contextwhile fixing up the RelOptInfo, but that idea doesn't scale well to theupcoming patch. It would be better not to mess with the base-relationdata structures during join planning, anyway --- that's just a recipefor order-of-operations bugs.In many cases, though, we don't actually need the child RelOptInfo,because if the input is certainly empty then the join segment's resultis certainly empty, so we can skip making a join plan altogether. (Theexisting code ultimately arrives at the same conclusion, but only afterdoing a lot more work.) This approach works except when the pruned-awaypartition is on the nullable side of a LEFT, ANTI, or FULL join, and theother side isn't pruned. But in those cases the existing code leaves alot to be desired anyway --- the correct output is just the result ofthe unpruned side of the join, but we were emitting a useless outer joinagainst a dummy Result. Pending somebody writing code to handle thatmore nicely, let's just abandon the partitionwise-join optimization insuch cases.When the modified code skips making a join plan, it doesn't make ajoin RelOptInfo either; this requires some upper-level code tocope with nulls in part_rels[] arrays. We would have had to havethat anyway after the upcoming patch.Back-patch to v11 since the crash is demonstrable there.Discussion:https://postgr.es/m/8305.1553884377@sss.pgh.pa.us
1 parent26d4fda commitd70c147

File tree

7 files changed

+157
-163
lines changed

7 files changed

+157
-163
lines changed

‎src/backend/optimizer/path/allpaths.c

Lines changed: 8 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1166,11 +1166,11 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
11661166
* for partitioned child rels.
11671167
*
11681168
* Note: here we abuse the consider_partitionwise_join flag by setting
1169-
* it*even*for child rels that are not partitioned.In that case,
1170-
*we set it totell try_partitionwise_join() thatit doesn't need to
1171-
*generate their targetlists and EC entries as they have already been
1172-
*generated here, as opposed to the dummy child rels for which the
1173-
*flag is left set to false so that it will generate them.
1169+
* it for child rels that are notthemselvespartitioned.We do so to
1170+
* tell try_partitionwise_join() thatthe child rel is sufficiently
1171+
*valid to be used as a per-partition input, even if it later gets
1172+
*proven to be dummy. (It's not usable until we've set up the
1173+
*reltarget and EC entries, which we just did.)
11741174
*/
11751175
if (rel->consider_partitionwise_join)
11761176
childrel->consider_partitionwise_join= true;
@@ -3551,7 +3551,9 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
35513551
{
35523552
RelOptInfo*child_rel=part_rels[cnt_parts];
35533553

3554-
Assert(child_rel!=NULL);
3554+
/* If it's been pruned entirely, it's certainly dummy. */
3555+
if (child_rel==NULL)
3556+
continue;
35553557

35563558
/* Add partitionwise join paths for partitioned child-joins. */
35573559
generate_partitionwise_join_paths(root,child_rel);

‎src/backend/optimizer/path/joinrels.c

Lines changed: 56 additions & 31 deletions
Original file line numberDiff line numberDiff line change
@@ -43,8 +43,6 @@ static void try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1,
4343
RelOptInfo*rel2,RelOptInfo*joinrel,
4444
SpecialJoinInfo*parent_sjinfo,
4545
List*parent_restrictlist);
46-
staticvoidupdate_child_rel_info(PlannerInfo*root,
47-
RelOptInfo*rel,RelOptInfo*childrel);
4846
staticintmatch_expr_to_partition_keys(Expr*expr,RelOptInfo*rel,
4947
boolstrict_op);
5048

@@ -1401,6 +1399,10 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
14011399
{
14021400
RelOptInfo*child_rel1=rel1->part_rels[cnt_parts];
14031401
RelOptInfo*child_rel2=rel2->part_rels[cnt_parts];
1402+
boolrel1_empty= (child_rel1==NULL||
1403+
IS_DUMMY_REL(child_rel1));
1404+
boolrel2_empty= (child_rel2==NULL||
1405+
IS_DUMMY_REL(child_rel2));
14041406
SpecialJoinInfo*child_sjinfo;
14051407
List*child_restrictlist;
14061408
RelOptInfo*child_joinrel;
@@ -1409,24 +1411,69 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
14091411
intnappinfos;
14101412

14111413
/*
1412-
* If a child table has consider_partitionwise_join=false, it means
1414+
* Check for cases where we can prove that this segment of the join
1415+
* returns no rows, due to one or both inputs being empty (including
1416+
* inputs that have been pruned away entirely). If so just ignore it.
1417+
* These rules are equivalent to populate_joinrel_with_paths's rules
1418+
* for dummy input relations.
1419+
*/
1420+
switch (parent_sjinfo->jointype)
1421+
{
1422+
caseJOIN_INNER:
1423+
caseJOIN_SEMI:
1424+
if (rel1_empty||rel2_empty)
1425+
continue;/* ignore this join segment */
1426+
break;
1427+
caseJOIN_LEFT:
1428+
caseJOIN_ANTI:
1429+
if (rel1_empty)
1430+
continue;/* ignore this join segment */
1431+
break;
1432+
caseJOIN_FULL:
1433+
if (rel1_empty&&rel2_empty)
1434+
continue;/* ignore this join segment */
1435+
break;
1436+
default:
1437+
/* other values not expected here */
1438+
elog(ERROR,"unrecognized join type: %d",
1439+
(int)parent_sjinfo->jointype);
1440+
break;
1441+
}
1442+
1443+
/*
1444+
* If a child has been pruned entirely then we can't generate paths
1445+
* for it, so we have to reject partitionwise joining unless we were
1446+
* able to eliminate this partition above.
1447+
*/
1448+
if (child_rel1==NULL||child_rel2==NULL)
1449+
{
1450+
/*
1451+
* Mark the joinrel as unpartitioned so that later functions treat
1452+
* it correctly.
1453+
*/
1454+
joinrel->nparts=0;
1455+
return;
1456+
}
1457+
1458+
/*
1459+
* If a leaf relation has consider_partitionwise_join=false, it means
14131460
* that it's a dummy relation for which we skipped setting up tlist
1414-
* expressions and adding EC members in set_append_rel_size(), so do
1415-
*that now for use later.
1461+
* expressions and adding EC members in set_append_rel_size(), so
1462+
*again we have to fail here.
14161463
*/
14171464
if (rel1_is_simple&& !child_rel1->consider_partitionwise_join)
14181465
{
14191466
Assert(child_rel1->reloptkind==RELOPT_OTHER_MEMBER_REL);
14201467
Assert(IS_DUMMY_REL(child_rel1));
1421-
update_child_rel_info(root,rel1,child_rel1);
1422-
child_rel1->consider_partitionwise_join= true;
1468+
joinrel->nparts=0;
1469+
return;
14231470
}
14241471
if (rel2_is_simple&& !child_rel2->consider_partitionwise_join)
14251472
{
14261473
Assert(child_rel2->reloptkind==RELOPT_OTHER_MEMBER_REL);
14271474
Assert(IS_DUMMY_REL(child_rel2));
1428-
update_child_rel_info(root,rel2,child_rel2);
1429-
child_rel2->consider_partitionwise_join= true;
1475+
joinrel->nparts=0;
1476+
return;
14301477
}
14311478

14321479
/* We should never try to join two overlapping sets of rels. */
@@ -1470,28 +1517,6 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
14701517
}
14711518
}
14721519

1473-
/*
1474-
* Set up tlist expressions for the childrel, and add EC members referencing
1475-
* the childrel.
1476-
*/
1477-
staticvoid
1478-
update_child_rel_info(PlannerInfo*root,
1479-
RelOptInfo*rel,RelOptInfo*childrel)
1480-
{
1481-
AppendRelInfo*appinfo=root->append_rel_array[childrel->relid];
1482-
1483-
/* Make child tlist expressions */
1484-
childrel->reltarget->exprs= (List*)
1485-
adjust_appendrel_attrs(root,
1486-
(Node*)rel->reltarget->exprs,
1487-
1,&appinfo);
1488-
1489-
/* Make child entries in the EquivalenceClass as well */
1490-
if (rel->has_eclass_joins||has_useful_pathkeys(root,rel))
1491-
add_child_rel_equivalences(root,appinfo,rel,childrel);
1492-
childrel->has_eclass_joins=rel->has_eclass_joins;
1493-
}
1494-
14951520
/*
14961521
* Returns true if there exists an equi-join condition for each pair of
14971522
* partition keys from given relations being joined.

‎src/backend/optimizer/plan/planner.c

Lines changed: 7 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -6963,6 +6963,10 @@ apply_scanjoin_target_to_paths(PlannerInfo *root,
69636963
List*child_scanjoin_targets=NIL;
69646964
ListCell*lc;
69656965

6966+
/* Pruned or dummy children can be ignored. */
6967+
if (child_rel==NULL||IS_DUMMY_REL(child_rel))
6968+
continue;
6969+
69666970
/* Translate scan/join targets for this child. */
69676971
appinfos=find_appinfos_by_relids(root,child_rel->relids,
69686972
&nappinfos);
@@ -7063,8 +7067,9 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
70637067
RelOptInfo*child_grouped_rel;
70647068
RelOptInfo*child_partially_grouped_rel;
70657069

7066-
/* Input child rel must have a path */
7067-
Assert(child_input_rel->pathlist!=NIL);
7070+
/* Pruned or dummy children can be ignored. */
7071+
if (child_input_rel==NULL||IS_DUMMY_REL(child_input_rel))
7072+
continue;
70687073

70697074
/*
70707075
* Copy the given "extra" structure as is and then override the
@@ -7106,14 +7111,6 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
71067111
extra->target_parallel_safe,
71077112
child_extra.havingQual);
71087113

7109-
/* Ignore empty children. They contribute nothing. */
7110-
if (IS_DUMMY_REL(child_input_rel))
7111-
{
7112-
mark_dummy_rel(child_grouped_rel);
7113-
7114-
continue;
7115-
}
7116-
71177114
/* Create grouping paths for this child relation. */
71187115
create_ordinary_grouping_paths(root,child_input_rel,
71197116
child_grouped_rel,

‎src/test/regress/expected/partition_aggregate.out

Lines changed: 39 additions & 59 deletions
Original file line numberDiff line numberDiff line change
@@ -716,37 +716,33 @@ SELECT a.x, sum(b.x) FROM pagg_tab1 a FULL OUTER JOIN pagg_tab2 b ON a.x = b.y G
716716
| 500
717717
(16 rows)
718718

719-
-- LEFT JOIN, with dummy relation on right side,
719+
-- LEFT JOIN, with dummy relation on right side, ideally
720720
-- should produce full partitionwise aggregation plan as GROUP BY is on
721-
-- non-nullable columns
721+
-- non-nullable columns.
722+
-- But right now we are unable to do partitionwise join in this case.
722723
EXPLAIN (COSTS OFF)
723724
SELECT a.x, b.y, count(*) FROM (SELECT * FROM pagg_tab1 WHERE x < 20) a LEFT JOIN (SELECT * FROM pagg_tab2 WHERE y > 10) b ON a.x = b.y WHERE a.x > 5 or b.y < 20 GROUP BY a.x, b.y ORDER BY 1, 2;
724-
QUERY PLAN
725-
-----------------------------------------------------------------------------
725+
QUERY PLAN
726+
-----------------------------------------------------------------------
726727
Sort
727-
Sort Key: pagg_tab1_p1.x, y
728-
->Append
729-
-> HashAggregate
730-
Group Key: pagg_tab1_p1.x, y
731-
-> Hash Left Join
732-
Hash Cond: (pagg_tab1_p1.x= y)
733-
Filter: ((pagg_tab1_p1.x > 5) OR (y < 20))
728+
Sort Key: pagg_tab1_p1.x,pagg_tab2_p2.y
729+
->HashAggregate
730+
Group Key: pagg_tab1_p1.x, pagg_tab2_p2.y
731+
-> Hash Left Join
732+
Hash Cond: (pagg_tab1_p1.x = pagg_tab2_p2.y)
733+
Filter: ((pagg_tab1_p1.x> 5) OR (pagg_tab2_p2.y < 20))
734+
-> Append
734735
-> Seq Scan on pagg_tab1_p1
735736
Filter: (x < 20)
736-
-> Hash
737-
-> Result
738-
One-Time Filter: false
739-
-> HashAggregate
740-
Group Key: pagg_tab1_p2.x, pagg_tab2_p2.y
741-
-> Hash Left Join
742-
Hash Cond: (pagg_tab1_p2.x = pagg_tab2_p2.y)
743-
Filter: ((pagg_tab1_p2.x > 5) OR (pagg_tab2_p2.y < 20))
744737
-> Seq Scan on pagg_tab1_p2
745738
Filter: (x < 20)
746-
-> Hash
739+
-> Hash
740+
-> Append
747741
-> Seq Scan on pagg_tab2_p2
748742
Filter: (y > 10)
749-
(23 rows)
743+
-> Seq Scan on pagg_tab2_p3
744+
Filter: (y > 10)
745+
(18 rows)
750746

751747
SELECT a.x, b.y, count(*) FROM (SELECT * FROM pagg_tab1 WHERE x < 20) a LEFT JOIN (SELECT * FROM pagg_tab2 WHERE y > 10) b ON a.x = b.y WHERE a.x > 5 or b.y < 20 GROUP BY a.x, b.y ORDER BY 1, 2;
752748
x | y | count
@@ -760,49 +756,33 @@ SELECT a.x, b.y, count(*) FROM (SELECT * FROM pagg_tab1 WHERE x < 20) a LEFT JOI
760756
18 | 18 | 100
761757
(7 rows)
762758

763-
-- FULL JOIN, with dummy relations on both sides,
759+
-- FULL JOIN, with dummy relations on both sides, ideally
764760
-- should produce partial partitionwise aggregation plan as GROUP BY is on
765-
-- nullable columns
761+
-- nullable columns.
762+
-- But right now we are unable to do partitionwise join in this case.
766763
EXPLAIN (COSTS OFF)
767764
SELECT a.x, b.y, count(*) FROM (SELECT * FROM pagg_tab1 WHERE x < 20) a FULL JOIN (SELECT * FROM pagg_tab2 WHERE y > 10) b ON a.x = b.y WHERE a.x > 5 or b.y < 20 GROUP BY a.x, b.y ORDER BY 1, 2;
768-
QUERY PLAN
769-
-----------------------------------------------------------------------------------
770-
Finalize GroupAggregate
771-
Group Key: pagg_tab1_p1.x, y
772-
-> Sort
773-
Sort Key: pagg_tab1_p1.x, y
774-
-> Append
775-
-> Partial HashAggregate
776-
Group Key: pagg_tab1_p1.x, y
777-
-> Hash Full Join
778-
Hash Cond: (pagg_tab1_p1.x = y)
779-
Filter: ((pagg_tab1_p1.x > 5) OR (y < 20))
780-
-> Seq Scan on pagg_tab1_p1
781-
Filter: (x < 20)
782-
-> Hash
783-
-> Result
784-
One-Time Filter: false
785-
-> Partial HashAggregate
786-
Group Key: pagg_tab1_p2.x, pagg_tab2_p2.y
787-
-> Hash Full Join
788-
Hash Cond: (pagg_tab1_p2.x = pagg_tab2_p2.y)
789-
Filter: ((pagg_tab1_p2.x > 5) OR (pagg_tab2_p2.y < 20))
790-
-> Seq Scan on pagg_tab1_p2
791-
Filter: (x < 20)
792-
-> Hash
793-
-> Seq Scan on pagg_tab2_p2
794-
Filter: (y > 10)
795-
-> Partial HashAggregate
796-
Group Key: x, pagg_tab2_p3.y
797-
-> Hash Full Join
798-
Hash Cond: (pagg_tab2_p3.y = x)
799-
Filter: ((x > 5) OR (pagg_tab2_p3.y < 20))
765+
QUERY PLAN
766+
-----------------------------------------------------------------------
767+
Sort
768+
Sort Key: pagg_tab1_p1.x, pagg_tab2_p2.y
769+
-> HashAggregate
770+
Group Key: pagg_tab1_p1.x, pagg_tab2_p2.y
771+
-> Hash Full Join
772+
Hash Cond: (pagg_tab1_p1.x = pagg_tab2_p2.y)
773+
Filter: ((pagg_tab1_p1.x > 5) OR (pagg_tab2_p2.y < 20))
774+
-> Append
775+
-> Seq Scan on pagg_tab1_p1
776+
Filter: (x < 20)
777+
-> Seq Scan on pagg_tab1_p2
778+
Filter: (x < 20)
779+
-> Hash
780+
-> Append
781+
-> Seq Scan on pagg_tab2_p2
782+
Filter: (y > 10)
800783
-> Seq Scan on pagg_tab2_p3
801784
Filter: (y > 10)
802-
-> Hash
803-
-> Result
804-
One-Time Filter: false
805-
(35 rows)
785+
(18 rows)
806786

807787
SELECT a.x, b.y, count(*) FROM (SELECT * FROM pagg_tab1 WHERE x < 20) a FULL JOIN (SELECT * FROM pagg_tab2 WHERE y > 10) b ON a.x = b.y WHERE a.x > 5 or b.y < 20 GROUP BY a.x, b.y ORDER BY 1, 2;
808788
x | y | count

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp