Commit 730089d

Fix a low-probability crash in our qsort implementation.

It's standard for quicksort implementations, after having partitioned the
input into two subgroups, to recurse to process the smaller partition and
then handle the larger partition by iterating. This method guarantees
that no more than log2(N) levels of recursion can be needed. However,
Bentley and McIlroy argued that checking to see which partition is smaller
isn't worth the cycles, and so their code doesn't do that but just always
recurses on the left partition. In most cases that's fine; but with
worst-case input we might need O(N) levels of recursion, and that means
that qsort could be driven to stack overflow. Such an overflow seems to
be the only explanation for today's report from Yiqing Jin of a SIGSEGV
in med3_tuple while creating an index of a couple billion entries with a
very large maintenance_work_mem setting. Therefore, let's spend the few
additional cycles and lines of code needed to choose the smaller partition
for recursion.

Also, fix up the qsort code so that it properly uses size_t not int for
some intermediate values representing numbers of items. This would only
be a live risk when sorting more than INT_MAX bytes (in qsort/qsort_arg)
or tuples (in qsort_tuple), which I believe would never happen with any
caller in the current core code --- but perhaps it could happen with
call sites in third-party modules? In any case, this is trouble waiting
to happen, and the corrected code is probably if anything shorter and
faster than before, since it removes sign-extension steps that had to
happen when converting between int and size_t.

In passing, move a couple of CHECK_FOR_INTERRUPTS() calls so that it's
not necessary to preserve the value of "r" across them, and prettify
the output of gen_qsort_tuple.pl a little.

Back-patch to all supported branches. The odds of hitting this issue
are probably higher in 9.4 and up than before, due to the new ability
to allocate sort workspaces exceeding 1GB, but there's no good reason
to believe that it's impossible to crash older branches this way.

1 parent dc5075f, commit 730089d
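
The stack-depth argument in the commit message can be illustrated with a small standalone sketch. This is not the PostgreSQL code: it sorts plain ints with a simple last-element (Lomuto) partition, and the names `sketch_qsort`, `swap_int`, and `max_depth_seen` are hypothetical, introduced only for illustration. What it shares with the patch is the control flow: recurse on the smaller partition, then loop on the larger one.

```c
#include <stddef.h>

/* Deepest recursion level observed; instrumentation for this sketch only. */
static size_t max_depth_seen = 0;

static void
swap_int(int *x, int *y)
{
    int t = *x;

    *x = *y;
    *y = t;
}

/*
 * Sort a[0..n-1] in ascending order.  Only the smaller partition is
 * handled by a recursive call; the larger one is handled by looping,
 * so each recursive call receives less than half of its caller's input.
 */
static void
sketch_qsort(int *a, size_t n, size_t depth)
{
    if (depth > max_depth_seen)
        max_depth_seen = depth;

    while (n > 1)
    {
        /* Lomuto partition around the last element (an assumption of
         * this sketch; the real code uses median-of-three sampling). */
        int    pivot = a[n - 1];
        size_t store = 0;
        size_t i;
        size_t left;
        size_t right;

        for (i = 0; i + 1 < n; i++)
        {
            if (a[i] < pivot)
            {
                swap_int(&a[i], &a[store]);
                store++;
            }
        }
        swap_int(&a[store], &a[n - 1]);

        left = store;           /* elements before the pivot */
        right = n - store - 1;  /* elements after the pivot */

        if (left <= right)
        {
            /* Recurse on left partition, then iterate on right partition */
            sketch_qsort(a, left, depth + 1);
            a += store + 1;
            n = right;
        }
        else
        {
            /* Recurse on right partition, then iterate on left partition */
            sketch_qsort(a + store + 1, right, depth + 1);
            n = left;
        }
    }
}
```

Calling `sketch_qsort(data, n, 0)` on already-sorted input, which is the worst case for a last-element pivot, should leave `max_depth_seen` at or below log2(n), whereas always recursing on the left partition would need on the order of n nested calls for the same input.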

3 files changed: +156 -60 lines changed

‎src/backend/utils/sort/gen_qsort_tuple.pl

Lines changed: 60 additions & 26 deletions
@@ -14,11 +14,13 @@
 #
 # Modifications from vanilla NetBSD source:
 #   Add do ... while() macro fix
-#   Remove __inline, _DIAGASSERTs, __P
-#   Remove ill-considered "swap_cnt" switch to insertion sort,
-#   in favor of a simple check for presorted input.
-#   Instead of sorting arbitrary objects, we're always sorting SortTuples
-#   Add CHECK_FOR_INTERRUPTS()
+#   Remove __inline, _DIAGASSERTs, __P
+#   Remove ill-considered "swap_cnt" switch to insertion sort,
+#   in favor of a simple check for presorted input.
+#   Take care to recurse on the smaller partition, to bound stack usage.
+#
+#   Instead of sorting arbitrary objects, we're always sorting SortTuples.
+#   Add CHECK_FOR_INTERRUPTS().
 #
 # CAUTION: if you change this file, see also qsort.c and qsort_arg.c
 #
@@ -43,17 +45,20 @@
 $EXTRAPARAMS = ', ssup';
 $CMPPARAMS = ', ssup';
 print <<'EOM';
+
 #define cmp_ssup(a, b, ssup) \
     ApplySortComparator((a)->datum1, (a)->isnull1, \
                         (b)->datum1, (b)->isnull1, ssup)
+
 EOM
 emit_qsort_implementation();
 
 sub emit_qsort_boilerplate
 {
     print <<'EOM';
 /*
- * autogenerated by src/backend/utils/sort/gen_qsort_tuple.pl, do not edit
+ * autogenerated by src/backend/utils/sort/gen_qsort_tuple.pl, do not edit!
+ *
  * This file is included by tuplesort.c, rather than compiled separately.
  */
 
@@ -78,7 +83,7 @@ sub emit_qsort_boilerplate
  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
- * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
@@ -92,8 +97,16 @@ sub emit_qsort_boilerplate
  * Qsort routine based on J. L. Bentley and M. D. McIlroy,
  * "Engineering a sort function",
  * Software--Practice and Experience 23 (1993) 1249-1265.
+ *
  * We have modified their original by adding a check for already-sorted input,
  * which seems to be a win per discussions on pgsql-hackers around 2006-03-21.
+ *
+ * Also, we recurse on the smaller partition and iterate on the larger one,
+ * which ensures we cannot recurse more than log(N) levels (since the
+ * partition recursed to is surely no more than half of the input). Bentley
+ * and McIlroy explicitly rejected doing this on the grounds that it's "not
+ * worth the effort", but we have seen crashes in the field due to stack
+ * overrun, so that judgment seems wrong.
  */
 
 static void
@@ -114,7 +127,8 @@ sub emit_qsort_boilerplate
         *(b) = t; \
     } while (0);
 
-#define vecswap(a, b, n) if ((n) > 0) swapfunc((a), (b), (size_t)(n))
+#define vecswap(a, b, n) if ((n) > 0) swapfunc(a, b, n)
+
 EOM
 }
 
@@ -141,8 +155,9 @@ sub emit_qsort_implementation
                *pl,
                *pm,
                *pn;
-    int         d,
-                r,
+    size_t      d1,
+                d2;
+    int         r,
                 presorted;
 
 loop:
@@ -173,7 +188,8 @@ sub emit_qsort_implementation
         pn = a + (n - 1);
         if (n > 40)
         {
-            d = (n / 8);
+            size_t      d = (n / 8);
+
             pl = med3_$SUFFIX(pl, pl + d, pl + 2 * d$EXTRAPARAMS);
             pm = med3_$SUFFIX(pm - d, pm, pm + d$EXTRAPARAMS);
             pn = med3_$SUFFIX(pn - 2 * d, pn - d, pn$EXTRAPARAMS);
@@ -187,23 +203,23 @@ sub emit_qsort_implementation
     {
         while (pb <= pc && (r = cmp_$SUFFIX(pb, a$CMPPARAMS)) <= 0)
         {
-            CHECK_FOR_INTERRUPTS();
             if (r == 0)
             {
                 swap(pa, pb);
                 pa++;
             }
             pb++;
+            CHECK_FOR_INTERRUPTS();
         }
         while (pb <= pc && (r = cmp_$SUFFIX(pc, a$CMPPARAMS)) >= 0)
         {
-            CHECK_FOR_INTERRUPTS();
             if (r == 0)
             {
                 swap(pc, pd);
                 pd--;
             }
             pc--;
+            CHECK_FOR_INTERRUPTS();
         }
         if (pb > pc)
             break;
@@ -212,21 +228,39 @@ sub emit_qsort_implementation
         pc--;
     }
     pn = a + n;
-    r = Min(pa - a, pb - pa);
-    vecswap(a, pb - r, r);
-    r = Min(pd - pc, pn - pd - 1);
-    vecswap(pb, pn - r, r);
-    if ((r = pb - pa) > 1)
-        qsort_$SUFFIX(a, r$EXTRAPARAMS);
-    if ((r = pd - pc) > 1)
+    d1 = Min(pa - a, pb - pa);
+    vecswap(a, pb - d1, d1);
+    d1 = Min(pd - pc, pn - pd - 1);
+    vecswap(pb, pn - d1, d1);
+    d1 = pb - pa;
+    d2 = pd - pc;
+    if (d1 <= d2)
     {
-        /* Iterate rather than recurse to save stack space */
-        a = pn - r;
-        n = r;
-        goto loop;
+        /* Recurse on left partition, then iterate on right partition */
+        if (d1 > 1)
+            qsort_$SUFFIX(a, d1$EXTRAPARAMS);
+        if (d2 > 1)
+        {
+            /* Iterate rather than recurse to save stack space */
+            /* qsort_$SUFFIX(pn - d2, d2$EXTRAPARAMS); */
+            a = pn - d2;
+            n = d2;
+            goto loop;
+        }
+    }
+    else
+    {
+        /* Recurse on right partition, then iterate on left partition */
+        if (d2 > 1)
+            qsort_$SUFFIX(pn - d2, d2$EXTRAPARAMS);
+        if (d1 > 1)
+        {
+            /* Iterate rather than recurse to save stack space */
+            /* qsort_$SUFFIX(a, d1$EXTRAPARAMS); */
+            n = d1;
+            goto loop;
+        }
     }
-/*      qsort_$SUFFIX(pn - r, r$EXTRAPARAMS);*/
 }
-
 EOM
 }
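
The switch from int to size_t for the intermediate distances can be motivated with a small check. In qsort.c the median-of-three sampling distance is computed as (n / 8) * es, a byte offset, so it can exceed INT_MAX well before the element count does. The sketch below is an illustration only: `sample_offset` and `offset_fits_in_int` are hypothetical helper names, and it assumes a platform where int is 32 bits.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Byte distance between sample points for an array of n elements of
 * es bytes each, mirroring the expression "(n / 8) * es".
 */
static size_t
sample_offset(size_t n, size_t es)
{
    return (n / 8) * es;
}

/*
 * Report whether that distance could be stored in an int without loss.
 * When this returns false, keeping the value in an int variable would
 * yield an implementation-defined (typically wrong or negative) result.
 */
static bool
offset_fits_in_int(size_t n, size_t es)
{
    return sample_offset(n, es) <= (size_t) INT_MAX;
}
```

For example, with two billion 16-byte elements the distance is 4,000,000,000 bytes, which does not fit in a 32-bit int, so `offset_fits_in_int((size_t) 2000000000, 16)` is expected to return false on such a platform.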

‎src/port/qsort.c

Lines changed: 49 additions & 18 deletions
@@ -6,6 +6,7 @@
  *   Remove __inline, _DIAGASSERTs, __P
  *   Remove ill-considered "swap_cnt" switch to insertion sort,
  *   in favor of a simple check for presorted input.
+ *   Take care to recurse on the smaller partition, to bound stack usage.
  *
  * CAUTION: if you change this file, see also qsort_arg.c, gen_qsort_tuple.pl
  *
@@ -54,9 +55,18 @@ static void swapfunc(char *, char *, size_t, int);
  * Qsort routine based on J. L. Bentley and M. D. McIlroy,
  * "Engineering a sort function",
  * Software--Practice and Experience 23 (1993) 1249-1265.
+ *
  * We have modified their original by adding a check for already-sorted input,
  * which seems to be a win per discussions on pgsql-hackers around 2006-03-21.
+ *
+ * Also, we recurse on the smaller partition and iterate on the larger one,
+ * which ensures we cannot recurse more than log(N) levels (since the
+ * partition recursed to is surely no more than half of the input). Bentley
+ * and McIlroy explicitly rejected doing this on the grounds that it's "not
+ * worth the effort", but we have seen crashes in the field due to stack
+ * overrun, so that judgment seems wrong.
  */
+
 #define swapcode(TYPE, parmi, parmj, n) \
 do { \
     size_t i = (n) / sizeof (TYPE); \
@@ -89,7 +99,7 @@ swapfunc(char *a, char *b, size_t n, int swaptype)
     } else \
         swapfunc(a, b, es, swaptype)
 
-#define vecswap(a, b, n) if ((n) > 0) swapfunc((a), (b), (size_t)(n), swaptype)
+#define vecswap(a, b, n) if ((n) > 0) swapfunc(a, b, n, swaptype)
 
 static char *
 med3(char *a, char *b, char *c, int (*cmp) (const void *, const void *))
@@ -109,8 +119,9 @@ pg_qsort(void *a, size_t n, size_t es, int (*cmp) (const void *, const void *))
                *pl,
                *pm,
                *pn;
-    int         d,
-                r,
+    size_t      d1,
+                d2;
+    int         r,
                 swaptype,
                 presorted;
 
@@ -141,7 +152,8 @@ loop: SWAPINIT(a, es);
         pn = (char *) a + (n - 1) * es;
         if (n > 40)
         {
-            d = (n / 8) * es;
+            size_t      d = (n / 8) * es;
+
             pl = med3(pl, pl + d, pl + 2 * d, cmp);
             pm = med3(pm - d, pm, pm + d, cmp);
             pn = med3(pn - 2 * d, pn - d, pn, cmp);
@@ -178,27 +190,46 @@ loop: SWAPINIT(a, es);
         pc -= es;
     }
     pn = (char *) a + n * es;
-    r = Min(pa - (char *) a, pb - pa);
-    vecswap(a, pb - r, r);
-    r = Min(pd - pc, pn - pd - es);
-    vecswap(pb, pn - r, r);
-    if ((r = pb - pa) > es)
-        qsort(a, r / es, es, cmp);
-    if ((r = pd - pc) > es)
+    d1 = Min(pa - (char *) a, pb - pa);
+    vecswap(a, pb - d1, d1);
+    d1 = Min(pd - pc, pn - pd - es);
+    vecswap(pb, pn - d1, d1);
+    d1 = pb - pa;
+    d2 = pd - pc;
+    if (d1 <= d2)
     {
-        /* Iterate rather than recurse to save stack space */
-        a = pn - r;
-        n = r / es;
-        goto loop;
+        /* Recurse on left partition, then iterate on right partition */
+        if (d1 > es)
+            pg_qsort(a, d1 / es, es, cmp);
+        if (d2 > es)
+        {
+            /* Iterate rather than recurse to save stack space */
+            /* pg_qsort(pn - d2, d2 / es, es, cmp); */
+            a = pn - d2;
+            n = d2 / es;
+            goto loop;
+        }
+    }
+    else
+    {
+        /* Recurse on right partition, then iterate on left partition */
+        if (d2 > es)
+            pg_qsort(pn - d2, d2 / es, es, cmp);
+        if (d1 > es)
+        {
+            /* Iterate rather than recurse to save stack space */
+            /* pg_qsort(a, d1 / es, es, cmp); */
+            n = d1 / es;
+            goto loop;
+        }
     }
-/*      qsort(pn - r, r / es, es, cmp);*/
 }
 
 /*
- * qsort wrapper for strcmp.
+ * qsort comparator wrapper for strcmp.
  */
 int
 pg_qsort_strcmp(const void *a, const void *b)
 {
-    return strcmp(*(char *const *) a, *(char *const *) b);
+    return strcmp(*(const char *const *) a, *(const char *const *) b);
 }
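
As a usage illustration for a comparator of this shape, the sketch below sorts an array of C strings with the standard library's qsort. It does not call pg_qsort or pg_qsort_strcmp; `cmp_cstring` and `sort_names` are hypothetical names that only mirror the casting pattern in the patched function. Each array element is a `const char *`, so the comparator receives pointers to those elements and must dereference once before calling strcmp.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Comparator for an array of C strings.  qsort passes pointers to the
 * array elements, i.e. pointers to "const char *", hence the cast to
 * "const char *const *" before dereferencing.
 */
static int
cmp_cstring(const void *a, const void *b)
{
    return strcmp(*(const char *const *) a, *(const char *const *) b);
}

/* Sort n string pointers in place into ascending strcmp order. */
static void
sort_names(const char **names, size_t n)
{
    qsort(names, n, sizeof(names[0]), cmp_cstring);
}
```

Casting through `const char *const *` rather than `char *const *` avoids discarding the const qualifier of the pointed-to strings, which appears to be the reason for the cast change in the hunk above.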
