Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commite7c02a5

Browse files
committed
Fix planner failures with overlapping mergejoin clauses in an outer join.
Given overlapping or partially redundant join clauses, for examplet1 JOIN t2 ON t1.a = t2.x AND t1.b = t2.xthe planner's EquivalenceClass machinery will ordinarily refactor theclauses as "t1.a = t1.b AND t1.a = t2.x", so that join processing doesn'tsee multiple references to the same EquivalenceClass in a list of joinequality clauses. However, if the join is outer, it's incorrect to derivea restriction clause on the outer side from the join conditions, so theclause refactoring does not happen and we end up with overlapping joinconditions. The code that attempted to deal with such cases had severalsubtle bugs, which could result in "left and right pathkeys do not match inmergejoin" or "outer pathkeys do not match mergeclauses" planner errors,if the selected join plan type was a mergejoin. (It does not appear thatany actually incorrect plan could have been emitted.)The core of the problem really was failure to recognize that the outer andinner relations' pathkeys have different relationships to the mergeclauselist. A join's mergeclause list is constructed by reference to the outerpathkeys, so it will always be ordered the same as the outer pathkeys, butthis cannot be presumed true for the inner pathkeys. If the inner sides ofthe mergeclauses contain multiple references to the same EquivalenceClass({t2.x} in the above example) then a simplistic rendering of the requiredinner sort order is like "ORDER BY t2.x, t2.x", but the pathkey machineryrecognizes that the second sort column is redundant and throws it away.The mergejoin planning code failed to account for that behavior properly.One error was to try to generate cut-down versions of the mergeclause listfrom cut-down versions of the inner pathkeys in the same way as the initialconstruction of the mergeclause list from the outer pathkeys was done; thiscould lead to choosing a mergeclause list that fails to match the outerpathkeys. The other problem was that the pathkey cross-checking code increate_mergejoin_plan treated the inner and outer pathkey listsidentically, whereas actually the expectations for them must be different.That led to false "pathkeys do not match" failures in some cases, and inprinciple could have led to failure to detect bogus plans in other cases,though there is no indication that such bogus plans could be generated.Reported by Alexander Kuzmenkov, who also reviewed this patch. This hasbeen broken for years (back to around 8.3 according to my testing), soback-patch to all supported branches.Discussion:https://postgr.es/m/5dad9160-4632-0e47-e120-8e2082000c01@postgrespro.ru
1 parent83fce67 commite7c02a5

File tree

6 files changed

+322
-124
lines changed

6 files changed

+322
-124
lines changed

‎src/backend/optimizer/path/joinpath.c

Lines changed: 14 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -739,10 +739,10 @@ sort_inner_and_outer(PlannerInfo *root,
739739
outerkeys=all_pathkeys;/* no work at first one... */
740740

741741
/* Sort the mergeclauses into the corresponding ordering */
742-
cur_mergeclauses=find_mergeclauses_for_pathkeys(root,
743-
outerkeys,
744-
true,
745-
extra->mergeclause_list);
742+
cur_mergeclauses=
743+
find_mergeclauses_for_outer_pathkeys(root,
744+
outerkeys,
745+
extra->mergeclause_list);
746746

747747
/* Should have used them all... */
748748
Assert(list_length(cur_mergeclauses)==list_length(extra->mergeclause_list));
@@ -987,10 +987,10 @@ match_unsorted_outer(PlannerInfo *root,
987987
continue;
988988

989989
/* Look for useful mergeclauses (if any) */
990-
mergeclauses=find_mergeclauses_for_pathkeys(root,
991-
outerpath->pathkeys,
992-
true,
993-
extra->mergeclause_list);
990+
mergeclauses=
991+
find_mergeclauses_for_outer_pathkeys(root,
992+
outerpath->pathkeys,
993+
extra->mergeclause_list);
994994

995995
/*
996996
* Done with this outer path if no chance for a mergejoin.
@@ -1110,10 +1110,9 @@ match_unsorted_outer(PlannerInfo *root,
11101110
if (sortkeycnt<num_sortkeys)
11111111
{
11121112
newclauses=
1113-
find_mergeclauses_for_pathkeys(root,
1114-
trialsortkeys,
1115-
false,
1116-
mergeclauses);
1113+
trim_mergeclauses_for_inner_pathkeys(root,
1114+
mergeclauses,
1115+
trialsortkeys);
11171116
Assert(newclauses!=NIL);
11181117
}
11191118
else
@@ -1152,10 +1151,9 @@ match_unsorted_outer(PlannerInfo *root,
11521151
if (sortkeycnt<num_sortkeys)
11531152
{
11541153
newclauses=
1155-
find_mergeclauses_for_pathkeys(root,
1156-
trialsortkeys,
1157-
false,
1158-
mergeclauses);
1154+
trim_mergeclauses_for_inner_pathkeys(root,
1155+
mergeclauses,
1156+
trialsortkeys);
11591157
Assert(newclauses!=NIL);
11601158
}
11611159
else

‎src/backend/optimizer/path/pathkeys.c

Lines changed: 121 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -941,29 +941,27 @@ update_mergeclause_eclasses(PlannerInfo *root, RestrictInfo *restrictinfo)
941941
}
942942

943943
/*
944-
*find_mergeclauses_for_pathkeys
945-
* This routine attempts to find aset of mergeclauses that can be
946-
* used with a specified ordering forone oftheinput relations.
944+
*find_mergeclauses_for_outer_pathkeys
945+
* This routine attempts to find alist of mergeclauses that can be
946+
* used with a specified ordering for thejoin's outer relation.
947947
* If successful, it returns a list of mergeclauses.
948948
*
949-
* 'pathkeys' is a pathkeys list showing the ordering of an input path.
950-
* 'outer_keys' is TRUE if these keys are for the outer input path,
951-
*FALSE if for inner.
949+
* 'pathkeys' is a pathkeys list showing the ordering of an outer-rel path.
952950
* 'restrictinfos' is a list of mergejoinable restriction clauses for the
953-
*join relation being formed.
951+
*join relation being formed, in no particular order.
954952
*
955953
* The restrictinfos must be marked (via outer_is_left) to show which side
956954
* of each clause is associated with the current outer path. (See
957955
* select_mergejoin_clauses())
958956
*
959957
* The result is NIL if no merge can be done, else a maximal list of
960958
* usable mergeclauses (represented as a list of their restrictinfo nodes).
959+
* The list is ordered to match the pathkeys, as required for execution.
961960
*/
962961
List*
963-
find_mergeclauses_for_pathkeys(PlannerInfo*root,
964-
List*pathkeys,
965-
boolouter_keys,
966-
List*restrictinfos)
962+
find_mergeclauses_for_outer_pathkeys(PlannerInfo*root,
963+
List*pathkeys,
964+
List*restrictinfos)
967965
{
968966
List*mergeclauses=NIL;
969967
ListCell*i;
@@ -1004,32 +1002,29 @@ find_mergeclauses_for_pathkeys(PlannerInfo *root,
10041002
*
10051003
* It's possible that multiple matching clauses might have different
10061004
* ECs on the other side, in which case the order we put them into our
1007-
* result makes a difference in the pathkeys required for theother
1008-
* inputpath. However this routine hasn't got any info about which
1005+
* result makes a difference in the pathkeys required for theinner
1006+
* inputrel. However this routine hasn't got any info about which
10091007
* order would be best, so we don't worry about that.
10101008
*
10111009
* It's also possible that the selected mergejoin clauses produce
1012-
* a noncanonical ordering of pathkeys for theother side, ie, we
1010+
* a noncanonical ordering of pathkeys for theinner side, ie, we
10131011
* might select clauses that reference b.v1, b.v2, b.v1 in that
10141012
* order. This is not harmful in itself, though it suggests that
1015-
* the clauses are partially redundant. Since it happens only with
1016-
* redundant query conditions, we don't bother to eliminate it.
1017-
* make_inner_pathkeys_for_merge() has to delete duplicates when
1018-
* it constructs the canonical pathkeys list, and we also have to
1019-
* deal with the case in create_mergejoin_plan().
1013+
* the clauses are partially redundant. Since the alternative is
1014+
* to omit mergejoin clauses and thereby possibly fail to generate a
1015+
* plan altogether, we live with it. make_inner_pathkeys_for_merge()
1016+
* has to delete duplicates when it constructs the inner pathkeys
1017+
* list, and we also have to deal with such cases specially in
1018+
* create_mergejoin_plan().
10201019
*----------
10211020
*/
10221021
foreach(j,restrictinfos)
10231022
{
10241023
RestrictInfo*rinfo= (RestrictInfo*)lfirst(j);
10251024
EquivalenceClass*clause_ec;
10261025

1027-
if (outer_keys)
1028-
clause_ec=rinfo->outer_is_left ?
1029-
rinfo->left_ec :rinfo->right_ec;
1030-
else
1031-
clause_ec=rinfo->outer_is_left ?
1032-
rinfo->right_ec :rinfo->left_ec;
1026+
clause_ec=rinfo->outer_is_left ?
1027+
rinfo->left_ec :rinfo->right_ec;
10331028
if (clause_ec==pathkey_ec)
10341029
matched_restrictinfos=lappend(matched_restrictinfos,rinfo);
10351030
}
@@ -1233,8 +1228,8 @@ select_outer_pathkeys_for_merge(PlannerInfo *root,
12331228
* must be applied to an inner path to make it usable with the
12341229
* given mergeclauses.
12351230
*
1236-
* 'mergeclauses' is a list of RestrictInfos for mergejoin clauses
1237-
*that will be used in a merge join.
1231+
* 'mergeclauses' is a list of RestrictInfos forthemergejoin clauses
1232+
*that will be used in a merge join, in order.
12381233
* 'outer_pathkeys' are the already-known canonical pathkeys for the outer
12391234
*side of the join.
12401235
*
@@ -1311,8 +1306,13 @@ make_inner_pathkeys_for_merge(PlannerInfo *root,
13111306
opathkey->pk_nulls_first);
13121307

13131308
/*
1314-
* Don't generate redundant pathkeys (can happen if multiple
1315-
* mergeclauses refer to same EC).
1309+
* Don't generate redundant pathkeys (which can happen if multiple
1310+
* mergeclauses refer to the same EC). Because we do this, the output
1311+
* pathkey list isn't necessarily ordered like the mergeclauses, which
1312+
* complicates life for create_mergejoin_plan(). But if we didn't,
1313+
* we'd have a noncanonical sort key list, which would be bad; for one
1314+
* reason, it certainly wouldn't match any available sort order for
1315+
* the input relation.
13161316
*/
13171317
if (!pathkey_is_redundant(pathkey,pathkeys))
13181318
pathkeys=lappend(pathkeys,pathkey);
@@ -1321,6 +1321,98 @@ make_inner_pathkeys_for_merge(PlannerInfo *root,
13211321
returnpathkeys;
13221322
}
13231323

1324+
/*
1325+
* trim_mergeclauses_for_inner_pathkeys
1326+
* This routine trims a list of mergeclauses to include just those that
1327+
* work with a specified ordering for the join's inner relation.
1328+
*
1329+
* 'mergeclauses' is a list of RestrictInfos for mergejoin clauses for the
1330+
*join relation being formed, in an order known to work for the
1331+
*currently-considered sort ordering of the join's outer rel.
1332+
* 'pathkeys' is a pathkeys list showing the ordering of an inner-rel path;
1333+
*it should be equal to, or a truncation of, the result of
1334+
*make_inner_pathkeys_for_merge for these mergeclauses.
1335+
*
1336+
* What we return will be a prefix of the given mergeclauses list.
1337+
*
1338+
* We need this logic because make_inner_pathkeys_for_merge's result isn't
1339+
* necessarily in the same order as the mergeclauses. That means that if we
1340+
* consider an inner-rel pathkey list that is a truncation of that result,
1341+
* we might need to drop mergeclauses even though they match a surviving inner
1342+
* pathkey. This happens when they are to the right of a mergeclause that
1343+
* matches a removed inner pathkey.
1344+
*
1345+
* The mergeclauses must be marked (via outer_is_left) to show which side
1346+
* of each clause is associated with the current outer path. (See
1347+
* select_mergejoin_clauses())
1348+
*/
1349+
List*
1350+
trim_mergeclauses_for_inner_pathkeys(PlannerInfo*root,
1351+
List*mergeclauses,
1352+
List*pathkeys)
1353+
{
1354+
List*new_mergeclauses=NIL;
1355+
PathKey*pathkey;
1356+
EquivalenceClass*pathkey_ec;
1357+
boolmatched_pathkey;
1358+
ListCell*lip;
1359+
ListCell*i;
1360+
1361+
/* No pathkeys => no mergeclauses (though we don't expect this case) */
1362+
if (pathkeys==NIL)
1363+
returnNIL;
1364+
/* Initialize to consider first pathkey */
1365+
lip=list_head(pathkeys);
1366+
pathkey= (PathKey*)lfirst(lip);
1367+
pathkey_ec=pathkey->pk_eclass;
1368+
lip=lnext(lip);
1369+
matched_pathkey= false;
1370+
1371+
/* Scan mergeclauses to see how many we can use */
1372+
foreach(i,mergeclauses)
1373+
{
1374+
RestrictInfo*rinfo= (RestrictInfo*)lfirst(i);
1375+
EquivalenceClass*clause_ec;
1376+
1377+
/* Assume we needn't do update_mergeclause_eclasses again here */
1378+
1379+
/* Check clause's inner-rel EC against current pathkey */
1380+
clause_ec=rinfo->outer_is_left ?
1381+
rinfo->right_ec :rinfo->left_ec;
1382+
1383+
/* If we don't have a match, attempt to advance to next pathkey */
1384+
if (clause_ec!=pathkey_ec)
1385+
{
1386+
/* If we had no clauses matching this inner pathkey, must stop */
1387+
if (!matched_pathkey)
1388+
break;
1389+
1390+
/* Advance to next inner pathkey, if any */
1391+
if (lip==NULL)
1392+
break;
1393+
pathkey= (PathKey*)lfirst(lip);
1394+
pathkey_ec=pathkey->pk_eclass;
1395+
lip=lnext(lip);
1396+
matched_pathkey= false;
1397+
}
1398+
1399+
/* If mergeclause matches current inner pathkey, we can use it */
1400+
if (clause_ec==pathkey_ec)
1401+
{
1402+
new_mergeclauses=lappend(new_mergeclauses,rinfo);
1403+
matched_pathkey= true;
1404+
}
1405+
else
1406+
{
1407+
/* Else, no hope of adding any more mergeclauses */
1408+
break;
1409+
}
1410+
}
1411+
1412+
returnnew_mergeclauses;
1413+
}
1414+
1415+
13241416
/****************************************************************************
13251417
*PATHKEY USEFULNESS CHECKS
13261418
*

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp