Commit f7da492

Fix array size allocation for HashAggregate hash keys.
When there were duplicate columns in the hash key list, the array sizes could be miscomputed, resulting in access off the end of the array. Adjust the computation to ensure the array is always large enough.

(I considered whether the duplicates could be removed in planning, but I can't rule out the possibility that duplicate columns might have different hash functions assigned. Simpler to just make sure it works at execution time regardless.)

Bug apparently introduced in fc4b3de as part of narrowing down the tuples stored in the hashtable. Reported by Colm McHugh of Salesforce, though I didn't use their patch. Backpatch back to version 10 where the bug was introduced.

Discussion: https://postgr.es/m/CAFeeJoKKu0u+A_A9R9316djW-YW3-+Gtgvy3ju655qRHR3jtdA@mail.gmail.com
1 parent a7b2fca · commit f7da492
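To make the sizing problem concrete, here is a minimal stand-alone sketch (hypothetical values, not PostgreSQL source: a plain bool array stands in for the colnos Bitmapset, and the column numbers, MAX_ATTR, and variable names are invented for illustration). It shows why an allocation derived only from the de-duplicated column set can be too small once each grpColIdx entry, duplicates included, gets its own slot, and why bms_num_members(colnos) + numCols is a safe upper bound:

/*
 * Stand-alone illustration (hypothetical values, not PostgreSQL source) of
 * the sizing bug: the input-column array receives one slot per grouping
 * column, duplicates included, plus one slot per other needed column, but
 * it used to be sized from a de-duplicated set of columns, which is too
 * small whenever grpColIdx contains duplicates.
 */
#include <stdio.h>
#include <stdbool.h>

#define MAX_ATTR 32                 /* arbitrary bound for this toy example */

int
main(void)
{
    int     grpColIdx[] = {4, 4};   /* duplicate grouping column, as produced
                                     * by "(hundred, thousand) IN (SELECT
                                     * twothousand, twothousand ...)" */
    int     numCols = 2;
    int     otherNeeded[] = {7};    /* a non-grouping column that is also needed */
    int     numOther = 1;
    bool    member[MAX_ATTR] = {false}; /* stands in for the colnos Bitmapset */
    int     distinctNeeded = 0;
    int     distinctUnion;
    int     oldAlloc;
    int     newAlloc;
    int     written;
    int     i;

    /* columns needed for other purposes: the initial contents of colnos */
    for (i = 0; i < numOther; i++)
        if (!member[otherNeeded[i]])
        {
            member[otherNeeded[i]] = true;
            distinctNeeded++;
        }

    /* new sizing: computed before the grouping columns are folded in */
    newAlloc = distinctNeeded + numCols;

    /* old sizing: size of the union, which silently de-duplicates grpColIdx */
    distinctUnion = distinctNeeded;
    for (i = 0; i < numCols; i++)
        if (!member[grpColIdx[i]])
        {
            member[grpColIdx[i]] = true;
            distinctUnion++;
        }
    oldAlloc = distinctUnion;

    /* entries actually stored: every grpColIdx slot plus the other columns */
    written = numCols + numOther;

    printf("written=%d oldAlloc=%d newAlloc=%d\n", written, oldAlloc, newAlloc);
    /* prints: written=3 oldAlloc=2 newAlloc=3, so the old size is one short */
    return 0;
}

Over-allocating by up to numCols slots is the trade-off the patch accepts instead of de-duplicating in the planner, since, as the commit message notes, duplicate columns are not guaranteed to use the same hash function.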

3 files changed: +47 -7 lines changed

src/backend/executor/nodeAgg.c

Lines changed: 22 additions & 7 deletions
@@ -1311,9 +1311,14 @@ build_hash_table(AggState *aggstate)
  * by themselves, and secondly ctids for row-marks.
  *
  * To eliminate duplicates, we build a bitmapset of the needed columns, and
- * then build an array of the columns included in the hashtable. Note that
- * the array is preserved over ExecReScanAgg, so we allocate it in the
- * per-query context (unlike the hash table itself).
+ * then build an array of the columns included in the hashtable. We might
+ * still have duplicates if the passed-in grpColIdx has them, which can happen
+ * in edge cases from semijoins/distinct; these can't always be removed,
+ * because it's not certain that the duplicate cols will be using the same
+ * hash function.
+ *
+ * Note that the array is preserved over ExecReScanAgg, so we allocate it in
+ * the per-query context (unlike the hash table itself).
  */
 static void
 find_hash_columns(AggState *aggstate)
@@ -1334,6 +1339,7 @@ find_hash_columns(AggState *aggstate)
         AttrNumber *grpColIdx = perhash->aggnode->grpColIdx;
         List       *hashTlist = NIL;
         TupleDesc   hashDesc;
+        int         maxCols;
         int         i;
 
         perhash->largestGrpColIdx = 0;
@@ -1358,15 +1364,24 @@ find_hash_columns(AggState *aggstate)
                 colnos = bms_del_member(colnos, attnum);
             }
         }
-        /* Add in all the grouping columns */
-        for (i = 0; i < perhash->numCols; i++)
-            colnos = bms_add_member(colnos, grpColIdx[i]);
+
+        /*
+         * Compute maximum number of input columns accounting for possible
+         * duplications in the grpColIdx array, which can happen in some edge
+         * cases where HashAggregate was generated as part of a semijoin or a
+         * DISTINCT.
+         */
+        maxCols = bms_num_members(colnos) + perhash->numCols;
 
         perhash->hashGrpColIdxInput =
-            palloc(bms_num_members(colnos) * sizeof(AttrNumber));
+            palloc(maxCols * sizeof(AttrNumber));
         perhash->hashGrpColIdxHash =
             palloc(perhash->numCols * sizeof(AttrNumber));
 
+        /* Add all the grouping columns to colnos */
+        for (i = 0; i < perhash->numCols; i++)
+            colnos = bms_add_member(colnos, grpColIdx[i]);
+
         /*
          * First build mapping for columns directly hashed. These are the
          * first, because they'll be accessed when computing hash values and

src/test/regress/expected/aggregates.out

Lines changed: 18 additions & 0 deletions
@@ -2140,3 +2140,21 @@ select v||'a', case when v||'a' = 'aa' then 1 else 0 end, count(*)
  ba | 0 | 1
 (2 rows)
 
+-- Make sure that generation of HashAggregate for uniqification purposes
+-- does not lead to array overflow due to unexpected duplicate hash keys
+-- see CAFeeJoKKu0u+A_A9R9316djW-YW3-+Gtgvy3ju655qRHR3jtdA@mail.gmail.com
+explain (costs off)
+  select 1 from tenk1
+  where (hundred, thousand) in (select twothousand, twothousand from onek);
+                          QUERY PLAN
+---------------------------------------------------------------
+ Hash Join
+   Hash Cond: (tenk1.hundred = onek.twothousand)
+   ->  Seq Scan on tenk1
+         Filter: (hundred = thousand)
+   ->  Hash
+         ->  HashAggregate
+               Group Key: onek.twothousand, onek.twothousand
+               ->  Seq Scan on onek
+(8 rows)
+

src/test/regress/sql/aggregates.sql

Lines changed: 7 additions & 0 deletions
@@ -944,3 +944,10 @@ select v||'a', case v||'a' when 'aa' then 1 else 0 end, count(*)
 select v||'a', case when v||'a' = 'aa' then 1 else 0 end, count(*)
   from unnest(array['a','b']) u(v)
   group by v||'a' order by 1;
+
+-- Make sure that generation of HashAggregate for uniqification purposes
+-- does not lead to array overflow due to unexpected duplicate hash keys
+-- see CAFeeJoKKu0u+A_A9R9316djW-YW3-+Gtgvy3ju655qRHR3jtdA@mail.gmail.com
+explain (costs off)
+  select 1 from tenk1
+  where (hundred, thousand) in (select twothousand, twothousand from onek);
