Commit 99efd8d

Fix array size allocation for HashAggregate hash keys.
When there were duplicate columns in the hash key list, the array
sizes could be miscomputed, resulting in access off the end of the
array. Adjust the computation to ensure the array is always large
enough.

(I considered whether the duplicates could be removed in planning, but
I can't rule out the possibility that duplicate columns might have
different hash functions assigned. Simpler to just make sure it works
at execution time regardless.)

Bug apparently introduced in fc4b3de as part of narrowing down the
tuples stored in the hashtable. Reported by Colm McHugh of Salesforce,
though I didn't use their patch. Backpatch back to version 10 where
the bug was introduced.

Discussion: https://postgr.es/m/CAFeeJoKKu0u+A_A9R9316djW-YW3-+Gtgvy3ju655qRHR3jtdA@mail.gmail.com
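
Why the old computation could under-allocate: find_hash_columns() writes one hashGrpColIdxInput entry per grpColIdx element, duplicates included, while the old code sized that array with bms_num_members() after the grouping columns had already been merged into the colnos bitmapset, where duplicates collapse into a single member. Below is a minimal standalone sketch of that arithmetic, not PostgreSQL code: a plain unsigned bitmask stands in for Bitmapset, and the duplicated column number (4) is invented for illustration.

#include <stdio.h>

/* counts set bits; stands in for bms_num_members() */
static int
popcount(unsigned mask)
{
    int     n = 0;

    for (; mask != 0; mask >>= 1)
        n += mask & 1;
    return n;
}

int
main(void)
{
    /*
     * A duplicated hash key, as the planner can emit for a semijoin such
     * as "(hundred, thousand) IN (SELECT twothousand, twothousand FROM
     * onek)"; the column number 4 is arbitrary.
     */
    int         grpColIdx[] = {4, 4};
    int         numCols = 2;
    unsigned    colnos = 0;     /* other needed input columns: none here */
    unsigned    merged = colnos;
    int         i;

    /* old sizing: merge grouping columns first, then count distinct bits */
    for (i = 0; i < numCols; i++)
        merged |= 1u << grpColIdx[i];
    printf("old allocation: %d slot(s)\n", popcount(merged));  /* 1 */

    /* but the fill loop writes one entry per grpColIdx element */
    printf("entries written: %d\n", numCols);                  /* 2: overflow */

    /* new sizing: count before the merge, then add numCols */
    printf("new allocation: %d slot(s)\n", popcount(colnos) + numCols); /* 2 */

    return 0;
}

In the patched code this is maxCols = bms_num_members(colnos) + perhash->numCols, computed before the grouping columns are added to colnos, so the allocation can only over-estimate, never run short.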
1 parent 2ccebcd commit 99efd8d

File tree

3 files changed: +47, -7 lines

src/backend/executor/nodeAgg.c

Lines changed: 22 additions & 7 deletions

@@ -1932,9 +1932,14 @@ build_hash_table(AggState *aggstate)
  * by themselves, and secondly ctids for row-marks.
  *
  * To eliminate duplicates, we build a bitmapset of the needed columns, and
- * then build an array of the columns included in the hashtable. Note that
- * the array is preserved over ExecReScanAgg, so we allocate it in the
- * per-query context (unlike the hash table itself).
+ * then build an array of the columns included in the hashtable. We might
+ * still have duplicates if the passed-in grpColIdx has them, which can happen
+ * in edge cases from semijoins/distinct; these can't always be removed,
+ * because it's not certain that the duplicate cols will be using the same
+ * hash function.
+ *
+ * Note that the array is preserved over ExecReScanAgg, so we allocate it in
+ * the per-query context (unlike the hash table itself).
  */
 static void
 find_hash_columns(AggState *aggstate)
@@ -1954,6 +1959,7 @@ find_hash_columns(AggState *aggstate)
 		AttrNumber *grpColIdx = perhash->aggnode->grpColIdx;
 		List	   *hashTlist = NIL;
 		TupleDesc	hashDesc;
+		int			maxCols;
 		int			i;
 
 		perhash->largestGrpColIdx = 0;
@@ -1978,15 +1984,24 @@ find_hash_columns(AggState *aggstate)
 				colnos = bms_del_member(colnos, attnum);
 			}
 		}
-		/* Add in all the grouping columns */
-		for (i = 0; i < perhash->numCols; i++)
-			colnos = bms_add_member(colnos, grpColIdx[i]);
+
+		/*
+		 * Compute maximum number of input columns accounting for possible
+		 * duplications in the grpColIdx array, which can happen in some edge
+		 * cases where HashAggregate was generated as part of a semijoin or a
+		 * DISTINCT.
+		 */
+		maxCols = bms_num_members(colnos) + perhash->numCols;
 
 		perhash->hashGrpColIdxInput =
-			palloc(bms_num_members(colnos) * sizeof(AttrNumber));
+			palloc(maxCols * sizeof(AttrNumber));
 		perhash->hashGrpColIdxHash =
 			palloc(perhash->numCols * sizeof(AttrNumber));
 
+		/* Add all the grouping columns to colnos */
+		for (i = 0; i < perhash->numCols; i++)
+			colnos = bms_add_member(colnos, grpColIdx[i]);
+
 		/*
 		 * First build mapping for columns directly hashed. These are the
 		 * first, because they'll be accessed when computing hash values and

src/test/regress/expected/aggregates.out

Lines changed: 18 additions & 0 deletions

@@ -2100,3 +2100,21 @@ select v||'a', case when v||'a' = 'aa' then 1 else 0 end, count(*)
  ba | 0 | 1
 (2 rows)
 
+-- Make sure that generation of HashAggregate for uniqification purposes
+-- does not lead to array overflow due to unexpected duplicate hash keys
+-- see CAFeeJoKKu0u+A_A9R9316djW-YW3-+Gtgvy3ju655qRHR3jtdA@mail.gmail.com
+explain (costs off)
+  select 1 from tenk1
+   where (hundred, thousand) in (select twothousand, twothousand from onek);
+                          QUERY PLAN
+--------------------------------------------------------------
+ Hash Join
+   Hash Cond: (tenk1.hundred = onek.twothousand)
+   ->  Seq Scan on tenk1
+         Filter: (hundred = thousand)
+   ->  Hash
+         ->  HashAggregate
+               Group Key: onek.twothousand, onek.twothousand
+               ->  Seq Scan on onek
+(8 rows)
+
src/test/regress/sql/aggregates.sql

Lines changed: 7 additions & 0 deletions

@@ -926,3 +926,10 @@ select v||'a', case v||'a' when 'aa' then 1 else 0 end, count(*)
 select v||'a', case when v||'a' = 'aa' then 1 else 0 end, count(*)
   from unnest(array['a','b']) u(v)
  group by v||'a' order by 1;
+
+-- Make sure that generation of HashAggregate for uniqification purposes
+-- does not lead to array overflow due to unexpected duplicate hash keys
+-- see CAFeeJoKKu0u+A_A9R9316djW-YW3-+Gtgvy3ju655qRHR3jtdA@mail.gmail.com
+explain (costs off)
+  select 1 from tenk1
+   where (hundred, thousand) in (select twothousand, twothousand from onek);

0 commit comments