Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commite9e26b5

Browse files
committed
Invent "multibitmapsets", and use them to speed up antijoin detection.
Implement a data structure that is a List of Bitmapsets, which isessentially a 2-D boolean array except that the rows need not allbe the same width. Operations such as union and intersection aremeaningful for these, just as they are for Bitmapsets. Eventuallywe might build many of the same operations that we have written forBitmapsets, but for the first use-case we just need a few.That first use-case is for antijoin detection: reduce_outer_joinsneeds to find the set of Vars that are certain to be non-null in asuccessfully joined (not null-extended) left join row, and alsofind the set of Vars subject to higher-level IS NULL constraints,and intersect them. We had been doing this by making Lists ofthe Var nodes and then using list_intersect, which works but ispretty inefficient compared to a bitmapset-like intersection.Potentially it's O(N^2) if there are a lot of Vars involved,which fortunately there generally aren't; still it's not great.Moreover, that method requires the Vars of interest to be exactlyequal() in the join condition and the upper IS NULL condition,which is problematic for my WIP patch that labels Vars accordingto which outer joins have possibly nulled them.Discussion:https://postgr.es/m/892228.1668437838@sss.pgh.pa.usDiscussion:https://postgr.es/m/CAMbWs4-mvPPCJ1W6iK6dD5HiNwoJdi6mZp=-7mE8N9Sh+cd0tQ@mail.gmail.com
1 parent90e4f30 commite9e26b5

File tree

7 files changed

+238
-25
lines changed

7 files changed

+238
-25
lines changed

‎src/backend/nodes/Makefile

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -21,6 +21,7 @@ OBJS = \
2121
extensible.o\
2222
list.o\
2323
makefuncs.o\
24+
multibitmapset.o\
2425
nodeFuncs.o\
2526
nodes.o\
2627
outfuncs.o\

‎src/backend/nodes/README

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -58,6 +58,7 @@ FILES IN THIS DIRECTORY (src/backend/nodes/)
5858
Specialized manipulation functions:
5959
bitmapset.c- Bitmapset support
6060
list.c- generic list support
61+
multibitmapset.c - List-of-Bitmapset support
6162
params.c- Param support
6263
tidbitmap.c- TIDBitmap support
6364
value.c- support for value nodes

‎src/backend/nodes/meson.build

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3,6 +3,7 @@ backend_sources += files(
33
'extensible.c',
44
'list.c',
55
'makefuncs.c',
6+
'multibitmapset.c',
67
'nodeFuncs.c',
78
'nodes.c',
89
'params.c',

‎src/backend/nodes/multibitmapset.c

Lines changed: 162 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,162 @@
1+
/*-------------------------------------------------------------------------
2+
*
3+
* multibitmapset.c
4+
* Lists of Bitmapsets
5+
*
6+
* A multibitmapset is useful in situations where members of a set can
7+
* be identified by two small integers; for example, varno and varattno
8+
* of a group of Vars within a query. The implementation is a List of
9+
* Bitmapsets, so that the empty set can be represented by NIL. (But,
10+
* as with Bitmapsets, that's not the only allowed representation.)
11+
* The zero-based index of a List element is the first identifying value,
12+
* and the (also zero-based) index of a bit within that Bitmapset is
13+
* the second identifying value. There is no expectation that the
14+
* Bitmapsets should all be the same size.
15+
*
16+
* The available operations on multibitmapsets are intended to parallel
17+
* those on bitmapsets, for example union and intersection. So far only
18+
* a small fraction of that has been built out; we'll add more as needed.
19+
*
20+
*
21+
* Copyright (c) 2022, PostgreSQL Global Development Group
22+
*
23+
* IDENTIFICATION
24+
* src/backend/nodes/multibitmapset.c
25+
*
26+
*-------------------------------------------------------------------------
27+
*/
28+
#include"postgres.h"
29+
30+
#include"nodes/multibitmapset.h"
31+
32+
33+
/*
34+
* mbms_add_member
35+
*Add a new member to a multibitmapset.
36+
*
37+
* The new member is identified by "listidx", the zero-based index of the
38+
* List element it should go into, and "bitidx", which specifies the bit
39+
* number to be set therein.
40+
*
41+
* This is like bms_add_member, but for multibitmapsets.
42+
*/
43+
List*
44+
mbms_add_member(List*a,intlistidx,intbitidx)
45+
{
46+
Bitmapset*bms;
47+
ListCell*lc;
48+
49+
if (listidx<0||bitidx<0)
50+
elog(ERROR,"negative multibitmapset member index not allowed");
51+
/* Add empty elements as needed */
52+
while (list_length(a) <=listidx)
53+
a=lappend(a,NULL);
54+
/* Update the target element */
55+
lc=list_nth_cell(a,listidx);
56+
bms=lfirst_node(Bitmapset,lc);
57+
bms=bms_add_member(bms,bitidx);
58+
lfirst(lc)=bms;
59+
returna;
60+
}
61+
62+
/*
63+
* mbms_add_members
64+
*Add all members of set b to set a.
65+
*
66+
* This is a UNION operation, but the left input is modified in-place.
67+
*
68+
* This is like bms_add_members, but for multibitmapsets.
69+
*/
70+
List*
71+
mbms_add_members(List*a,constList*b)
72+
{
73+
ListCell*lca,
74+
*lcb;
75+
76+
/* Add empty elements to a, as needed */
77+
while (list_length(a)<list_length(b))
78+
a=lappend(a,NULL);
79+
/* forboth will stop at the end of the shorter list, which is fine */
80+
forboth(lca,a,lcb,b)
81+
{
82+
Bitmapset*bmsa=lfirst_node(Bitmapset,lca);
83+
constBitmapset*bmsb=lfirst_node(Bitmapset,lcb);
84+
85+
bmsa=bms_add_members(bmsa,bmsb);
86+
lfirst(lca)=bmsa;
87+
}
88+
returna;
89+
}
90+
91+
/*
92+
* mbms_int_members
93+
*Reduce set a to its intersection with set b.
94+
*
95+
* This is an INTERSECT operation, but the left input is modified in-place.
96+
*
97+
* This is like bms_int_members, but for multibitmapsets.
98+
*/
99+
List*
100+
mbms_int_members(List*a,constList*b)
101+
{
102+
ListCell*lca,
103+
*lcb;
104+
105+
/* Remove any elements of a that are no longer of use */
106+
a=list_truncate(a,list_length(b));
107+
/* forboth will stop at the end of the shorter list, which is fine */
108+
forboth(lca,a,lcb,b)
109+
{
110+
Bitmapset*bmsa=lfirst_node(Bitmapset,lca);
111+
constBitmapset*bmsb=lfirst_node(Bitmapset,lcb);
112+
113+
bmsa=bms_int_members(bmsa,bmsb);
114+
lfirst(lca)=bmsa;
115+
}
116+
returna;
117+
}
118+
119+
/*
120+
* mbms_is_member
121+
*Is listidx/bitidx a member of A?
122+
*
123+
* This is like bms_is_member, but for multibitmapsets.
124+
*/
125+
bool
126+
mbms_is_member(intlistidx,intbitidx,constList*a)
127+
{
128+
constBitmapset*bms;
129+
130+
/* XXX better to just return false for negative indexes? */
131+
if (listidx<0||bitidx<0)
132+
elog(ERROR,"negative multibitmapset member index not allowed");
133+
if (listidx >=list_length(a))
134+
return false;
135+
bms=list_nth_node(Bitmapset,a,listidx);
136+
returnbms_is_member(bitidx,bms);
137+
}
138+
139+
/*
140+
* mbms_overlap_sets
141+
*Identify the bitmapsets having common members in a and b.
142+
*
143+
* The result is a bitmapset of the list indexes of bitmapsets that overlap.
144+
*/
145+
Bitmapset*
146+
mbms_overlap_sets(constList*a,constList*b)
147+
{
148+
Bitmapset*result=NULL;
149+
ListCell*lca,
150+
*lcb;
151+
152+
/* forboth will stop at the end of the shorter list, which is fine */
153+
forboth(lca,a,lcb,b)
154+
{
155+
constBitmapset*bmsa=lfirst_node(Bitmapset,lca);
156+
constBitmapset*bmsb=lfirst_node(Bitmapset,lcb);
157+
158+
if (bms_overlap(bmsa,bmsb))
159+
result=bms_add_member(result,foreach_current_index(lca));
160+
}
161+
returnresult;
162+
}

‎src/backend/optimizer/prep/prepjointree.c

Lines changed: 9 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -28,6 +28,7 @@
2828
#include"catalog/pg_type.h"
2929
#include"funcapi.h"
3030
#include"nodes/makefuncs.h"
31+
#include"nodes/multibitmapset.h"
3132
#include"nodes/nodeFuncs.h"
3233
#include"optimizer/clauses.h"
3334
#include"optimizer/optimizer.h"
@@ -2769,7 +2770,7 @@ reduce_outer_joins_pass1(Node *jtnode)
27692770
*state: state data collected by phase 1 for this node
27702771
*root: toplevel planner state
27712772
*nonnullable_rels: set of base relids forced non-null by upper quals
2772-
*forced_null_vars:list of Vars forced null by upper quals
2773+
*forced_null_vars:multibitmapset of Vars forced null by upper quals
27732774
*/
27742775
staticvoid
27752776
reduce_outer_joins_pass2(Node*jtnode,
@@ -2799,8 +2800,8 @@ reduce_outer_joins_pass2(Node *jtnode,
27992800
pass_nonnullable_rels=bms_add_members(pass_nonnullable_rels,
28002801
nonnullable_rels);
28012802
pass_forced_null_vars=find_forced_null_vars(f->quals);
2802-
pass_forced_null_vars=list_concat(pass_forced_null_vars,
2803-
forced_null_vars);
2803+
pass_forced_null_vars=mbms_add_members(pass_forced_null_vars,
2804+
forced_null_vars);
28042805
/* And recurse --- but only into interesting subtrees */
28052806
Assert(list_length(f->fromlist)==list_length(state->sub_states));
28062807
forboth(l,f->fromlist,s,state->sub_states)
@@ -2897,7 +2898,7 @@ reduce_outer_joins_pass2(Node *jtnode,
28972898
if (jointype==JOIN_LEFT)
28982899
{
28992900
List*nonnullable_vars;
2900-
List*overlap;
2901+
Bitmapset*overlap;
29012902

29022903
/* Find Vars in j->quals that must be non-null in joined rows */
29032904
nonnullable_vars=find_nonnullable_vars(j->quals);
@@ -2907,11 +2908,8 @@ reduce_outer_joins_pass2(Node *jtnode,
29072908
* forced_null_vars overlap: we need to know if the overlap
29082909
* includes any RHS variables.
29092910
*/
2910-
overlap=list_intersection(nonnullable_vars,
2911-
forced_null_vars);
2912-
if (overlap!=NIL&&
2913-
bms_overlap(pull_varnos(root, (Node*)overlap),
2914-
right_state->relids))
2911+
overlap=mbms_overlap_sets(nonnullable_vars,forced_null_vars);
2912+
if (bms_overlap(overlap,right_state->relids))
29152913
jointype=JOIN_ANTI;
29162914
}
29172915

@@ -2964,8 +2962,8 @@ reduce_outer_joins_pass2(Node *jtnode,
29642962
/* OK to merge upper and local constraints */
29652963
local_nonnullable_rels=bms_add_members(local_nonnullable_rels,
29662964
nonnullable_rels);
2967-
local_forced_null_vars=list_concat(local_forced_null_vars,
2968-
forced_null_vars);
2965+
local_forced_null_vars=mbms_add_members(local_forced_null_vars,
2966+
forced_null_vars);
29692967
}
29702968
}
29712969
else

‎src/backend/optimizer/util/clauses.c

Lines changed: 25 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -31,6 +31,7 @@
3131
#include"funcapi.h"
3232
#include"miscadmin.h"
3333
#include"nodes/makefuncs.h"
34+
#include"nodes/multibitmapset.h"
3435
#include"nodes/nodeFuncs.h"
3536
#include"nodes/subscripting.h"
3637
#include"nodes/supportnodes.h"
@@ -1566,7 +1567,7 @@ find_nonnullable_rels_walker(Node *node, bool top_level)
15661567
* find_nonnullable_vars
15671568
*Determine which Vars are forced nonnullable by given clause.
15681569
*
1569-
* Returnsa list of all level-zero Vars that are referenced in the clause in
1570+
* Returnsthe set of all level-zero Vars that are referenced in the clause in
15701571
* such a way that the clause cannot possibly return TRUE if any of these Vars
15711572
* is NULL. (It is OK to err on the side of conservatism; hence the analysis
15721573
* here is simplistic.)
@@ -1578,8 +1579,9 @@ find_nonnullable_rels_walker(Node *node, bool top_level)
15781579
* the expression to have been AND/OR flattened and converted to implicit-AND
15791580
* format.
15801581
*
1581-
* The result is a palloc'd List, but we have not copied the member Var nodes.
1582-
* Also, we don't bother trying to eliminate duplicate entries.
1582+
* Attnos of the identified Vars are returned in a multibitmapset (a List of
1583+
* Bitmapsets). List indexes correspond to relids (varnos), while the per-rel
1584+
* Bitmapsets hold varattnos offset by FirstLowInvalidHeapAttributeNumber.
15831585
*
15841586
* top_level is true while scanning top-level AND/OR structure; here, showing
15851587
* the result is either FALSE or NULL is good enough. top_level is false when
@@ -1608,7 +1610,9 @@ find_nonnullable_vars_walker(Node *node, bool top_level)
16081610
Var*var= (Var*)node;
16091611

16101612
if (var->varlevelsup==0)
1611-
result=list_make1(var);
1613+
result=mbms_add_member(result,
1614+
var->varno,
1615+
var->varattno-FirstLowInvalidHeapAttributeNumber);
16121616
}
16131617
elseif (IsA(node,List))
16141618
{
@@ -1623,9 +1627,9 @@ find_nonnullable_vars_walker(Node *node, bool top_level)
16231627
*/
16241628
foreach(l, (List*)node)
16251629
{
1626-
result=list_concat(result,
1627-
find_nonnullable_vars_walker(lfirst(l),
1628-
top_level));
1630+
result=mbms_add_members(result,
1631+
find_nonnullable_vars_walker(lfirst(l),
1632+
top_level));
16291633
}
16301634
}
16311635
elseif (IsA(node,FuncExpr))
@@ -1657,7 +1661,12 @@ find_nonnullable_vars_walker(Node *node, bool top_level)
16571661
switch (expr->boolop)
16581662
{
16591663
caseAND_EXPR:
1660-
/* At top level we can just recurse (to the List case) */
1664+
1665+
/*
1666+
* At top level we can just recurse (to the List case), since
1667+
* the result should be the union of what we can prove in each
1668+
* arm.
1669+
*/
16611670
if (top_level)
16621671
{
16631672
result=find_nonnullable_vars_walker((Node*)expr->args,
@@ -1689,7 +1698,7 @@ find_nonnullable_vars_walker(Node *node, bool top_level)
16891698
if (result==NIL)/* first subresult? */
16901699
result=subresult;
16911700
else
1692-
result=list_intersection(result,subresult);
1701+
result=mbms_int_members(result,subresult);
16931702

16941703
/*
16951704
* If the intersection is empty, we can stop looking. This
@@ -1788,8 +1797,8 @@ find_nonnullable_vars_walker(Node *node, bool top_level)
17881797
* side of conservatism; hence the analysis here is simplistic. In fact,
17891798
* we only detect simple "var IS NULL" tests at the top level.)
17901799
*
1791-
*The result is a palloc'd List, butwehave not copiedthemember Var nodes.
1792-
*Also, we don't bother trying to eliminate duplicate entries.
1800+
*As with find_nonnullable_vars,wereturnthevarattnos of the identified
1801+
*Vars in a multibitmapset.
17931802
*/
17941803
List*
17951804
find_forced_null_vars(Node*node)
@@ -1804,7 +1813,9 @@ find_forced_null_vars(Node *node)
18041813
var=find_forced_null_var(node);
18051814
if (var)
18061815
{
1807-
result=list_make1(var);
1816+
result=mbms_add_member(result,
1817+
var->varno,
1818+
var->varattno-FirstLowInvalidHeapAttributeNumber);
18081819
}
18091820
/* Otherwise, handle AND-conditions */
18101821
elseif (IsA(node,List))
@@ -1815,8 +1826,8 @@ find_forced_null_vars(Node *node)
18151826
*/
18161827
foreach(l, (List*)node)
18171828
{
1818-
result=list_concat(result,
1819-
find_forced_null_vars(lfirst(l)));
1829+
result=mbms_add_members(result,
1830+
find_forced_null_vars((Node*)lfirst(l)));
18201831
}
18211832
}
18221833
elseif (IsA(node,BoolExpr))

‎src/include/nodes/multibitmapset.h

Lines changed: 39 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,39 @@
1+
/*-------------------------------------------------------------------------
2+
*
3+
* multibitmapset.h
4+
* Lists of Bitmapsets
5+
*
6+
* A multibitmapset is useful in situations where members of a set can
7+
* be identified by two small integers; for example, varno and varattno
8+
* of a group of Vars within a query. The implementation is a List of
9+
* Bitmapsets, so that the empty set can be represented by NIL. (But,
10+
* as with Bitmapsets, that's not the only allowed representation.)
11+
* The zero-based index of a List element is the first identifying value,
12+
* and the (also zero-based) index of a bit within that Bitmapset is
13+
* the second identifying value. There is no expectation that the
14+
* Bitmapsets should all be the same size.
15+
*
16+
* The available operations on multibitmapsets are intended to parallel
17+
* those on bitmapsets, for example union and intersection. So far only
18+
* a small fraction of that has been built out; we'll add more as needed.
19+
*
20+
*
21+
* Copyright (c) 2022, PostgreSQL Global Development Group
22+
*
23+
* src/include/nodes/multibitmapset.h
24+
*
25+
*-------------------------------------------------------------------------
26+
*/
27+
#ifndefMULTIBITMAPSET_H
28+
#defineMULTIBITMAPSET_H
29+
30+
#include"nodes/bitmapset.h"
31+
#include"nodes/pg_list.h"
32+
33+
externList*mbms_add_member(List*a,intlistidx,intbitidx);
34+
externList*mbms_add_members(List*a,constList*b);
35+
externList*mbms_int_members(List*a,constList*b);
36+
externboolmbms_is_member(intlistidx,intbitidx,constList*a);
37+
externBitmapset*mbms_overlap_sets(constList*a,constList*b);
38+
39+
#endif/* MULTIBITMAPSET_H */

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp