Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commit639f5ad

Browse files
committed
Further cleanup of gistsplit.c.
After further reflection I was unconvinced that the existing coding isguaranteed to return valid union datums in every code path for multi-columnindexes. Fix that by forcing a gistunionsubkey() call at the end of therecursion. Having done that, we can remove some clearly-redundant callselsewhere. This should be a little faster for multi-column indexes (sincethe previous coding would uselessly do such a call for each column whileunwinding the recursion), as well as much harder to break.Also, simplify the handling of cases where one side or the other of aprimary split contains only don't-care tuples. The previous coding used avery ugly hack in removeDontCares() that essentially forced one randomtuple to be treated as non-don't-care, providing a random initial choice ofseed datum for the secondary split. It seems unlikely that that methodwill give better-than-random splits. Instead, treat such a split asdegenerate and just let the next column determine the split, the same waythat we handle fully degenerate cases where the two sides produce identicalunion datums.
1 parent782365a commit639f5ad

File tree

1 file changed

+93
-64
lines changed

1 file changed

+93
-64
lines changed

‎src/backend/access/gist/gistsplit.c

Lines changed: 93 additions & 64 deletions
Original file line numberDiff line numberDiff line change
@@ -163,23 +163,16 @@ findDontCares(Relation r, GISTSTATE *giststate, GISTENTRY *valvec,
163163
* Remove tuples that are marked don't-cares from the tuple index array a[]
164164
* of length *len.This is applied separately to the spl_left and spl_right
165165
* arrays.
166-
*
167-
* Corner case: we do not wish to reduce the index array to zero length.
168-
* (If we did, then the union key for this side would be null, and having just
169-
* one of spl_ldatum_exists and spl_rdatum_exists be TRUE might confuse
170-
* user-defined PickSplit methods.) To avoid that, we'll forcibly redefine
171-
* one tuple as non-don't-care if necessary. Hence, we must be able to adjust
172-
* caller's NumDontCare count.
173166
*/
174167
staticvoid
175-
removeDontCares(OffsetNumber*a,int*len,bool*dontcare,int*NumDontCare)
168+
removeDontCares(OffsetNumber*a,int*len,constbool*dontcare)
176169
{
177170
intoriglen,
178-
curlen,
171+
newlen,
179172
i;
180173
OffsetNumber*curwpos;
181174

182-
origlen=curlen=*len;
175+
origlen=newlen=*len;
183176
curwpos=a;
184177
for (i=0;i<origlen;i++)
185178
{
@@ -191,18 +184,11 @@ removeDontCares(OffsetNumber *a, int *len, bool *dontcare, int *NumDontCare)
191184
*curwpos=ai;
192185
curwpos++;
193186
}
194-
elseif (curlen==1)
195-
{
196-
/* corner case: don't let array become empty */
197-
dontcare[ai]= FALSE;/* mark item as non-dont-care */
198-
*NumDontCare-=1;
199-
i--;/* reprocess item on next iteration */
200-
}
201187
else
202-
curlen--;
188+
newlen--;
203189
}
204190

205-
*len=curlen;
191+
*len=newlen;
206192
}
207193

208194
/*
@@ -305,8 +291,14 @@ supportSecondarySplit(Relation r, GISTSTATE *giststate, int attno,
305291
penalty2;
306292

307293
/*
308-
* there is only one previously defined union, so we just choose swap
309-
* or not by lowest penalty
294+
* There is only one previously defined union, so we just choose swap
295+
* or not by lowest penalty for that side.We can only get here if a
296+
* secondary split happened to have all NULLs in its column in the
297+
* tuples that the outer recursion level had assigned to one side.
298+
* (Note that the null checks in gistSplitByKey don't prevent the
299+
* case, because they'll only be checking tuples that were considered
300+
* don't-cares at the outer recursion level, not the tuples that went
301+
* into determining the passed-down left and right union keys.)
310302
*/
311303
penalty1=gistpenalty(giststate,attno,entry1, false,&entrySL, false);
312304
penalty2=gistpenalty(giststate,attno,entry1, false,&entrySR, false);
@@ -404,13 +396,19 @@ genericPickSplit(GISTSTATE *giststate, GistEntryVector *entryvec, GIST_SPLITVEC
404396
* two vectors.
405397
*
406398
* Returns FALSE if split is complete (there are no more index columns, or
407-
* there is no need to consider them).Note that in this case the union
408-
* keys for all columns must be computed here.
409-
* Returns TRUE and v->spl_dontcare = NULL if left and right unions of attno
410-
* column are the same, so we should split on next column instead.
399+
* there is no need to consider them because split is optimal already).
400+
*
401+
* Returns TRUE and v->spl_dontcare = NULL if the picksplit result is
402+
* degenerate (all tuples seem to be don't-cares), so we should just
403+
* disregard this column and split on the next column(s) instead.
404+
*
411405
* Returns TRUE and v->spl_dontcare != NULL if there are don't-care tuples
412406
* that could be relocated based on the next column(s). The don't-care
413407
* tuples have been removed from the split and must be reinserted by caller.
408+
* There is at least one non-don't-care tuple on each side of the split,
409+
* and union keys for all columns are updated to include just those tuples.
410+
*
411+
* A TRUE result implies there is at least one more index column.
414412
*/
415413
staticbool
416414
gistUserPicksplit(Relationr,GistEntryVector*entryvec,intattno,GistSplitVector*v,
@@ -481,49 +479,57 @@ gistUserPicksplit(Relation r, GistEntryVector *entryvec, int attno, GistSplitVec
481479

482480
/*
483481
* If index columns remain, then consider whether we can improve the split
484-
* by using them. Even if we can't, we must compute union keys for those
485-
* columns before we can return FALSE.
482+
* by using them.
486483
*/
487484
v->spl_dontcare=NULL;
488485

489486
if (attno+1<giststate->tupdesc->natts)
490487
{
491488
intNumDontCare;
492489

490+
/*
491+
* Make a quick check to see if left and right union keys are equal;
492+
* if so, the split is certainly degenerate, so tell caller to
493+
* re-split with the next column.
494+
*/
493495
if (gistKeyIsEQ(giststate,attno,sv->spl_ldatum,sv->spl_rdatum))
494-
{
495-
/*
496-
* Left and right union keys are equal, so we can get better split
497-
* by considering next column.
498-
*/
499496
return true;
500-
}
501497

502498
/*
503-
* Locate don't-care tuples, if any
499+
* Locate don't-care tuples, if any. If there are none, the split is
500+
* optimal, so just fall out and return false.
504501
*/
505502
v->spl_dontcare= (bool*)palloc0(sizeof(bool)* (entryvec->n+1));
506503

507504
NumDontCare=findDontCares(r,giststate,entryvec->vector,v,attno);
508505

509-
if (NumDontCare==0)
506+
if (NumDontCare>0)
510507
{
511508
/*
512-
* There are no don't-cares, so just compute the union keys for
513-
* remaining columns and we're done.
509+
* Remove don't-cares from spl_left[] and spl_right[].
514510
*/
515-
gistunionsubkey(giststate,itup,v);
516-
}
517-
else
518-
{
511+
removeDontCares(sv->spl_left,&sv->spl_nleft,v->spl_dontcare);
512+
removeDontCares(sv->spl_right,&sv->spl_nright,v->spl_dontcare);
513+
519514
/*
520-
* Remove don't-cares from spl_left[] and spl_right[]. NOTE: this
521-
* could reduce NumDontCare to zero.
515+
* If all tuples on either side were don't-cares, the split is
516+
* degenerate, and we're best off to ignore it and split on the
517+
* next column. (We used to try to press on with a secondary
518+
* split by forcing a random tuple on each side to be treated as
519+
* non-don't-care, but it seems unlikely that that technique
520+
* really gives a better result. Note that we don't want to try a
521+
* secondary split with empty left or right primary split sides,
522+
* because then there is no union key on that side for the
523+
* PickSplit function to try to expand, so it can have no good
524+
* figure of merit for what it's doing. Also note that this check
525+
* ensures we can't produce a bogus one-side-only split in the
526+
* NumDontCare == 1 special case below.)
522527
*/
523-
removeDontCares(sv->spl_left,&sv->spl_nleft,
524-
v->spl_dontcare,&NumDontCare);
525-
removeDontCares(sv->spl_right,&sv->spl_nright,
526-
v->spl_dontcare,&NumDontCare);
528+
if (sv->spl_nleft==0||sv->spl_nright==0)
529+
{
530+
v->spl_dontcare=NULL;
531+
return true;
532+
}
527533

528534
/*
529535
* Recompute union keys, considering only non-don't-care tuples.
@@ -539,7 +545,9 @@ gistUserPicksplit(Relation r, GistEntryVector *entryvec, int attno, GistSplitVec
539545
/*
540546
* If there's only one don't-care tuple then we can't do a
541547
* PickSplit on it, so just choose whether to send it left or
542-
* right by comparing penalties.
548+
* right by comparing penalties. We needed the
549+
* gistunionsubkey step anyway so that we have appropriate
550+
* union keys for figuring the penalties.
543551
*/
544552
OffsetNumbertoMove;
545553

@@ -554,13 +562,14 @@ gistUserPicksplit(Relation r, GistEntryVector *entryvec, int attno, GistSplitVec
554562
/* ... and assign it to cheaper side */
555563
placeOne(r,giststate,v,itup[toMove-1],toMove,attno+1);
556564

557-
/* recompute the union keys including this tuple */
558-
v->spl_dontcare=NULL;
559-
gistunionsubkey(giststate,itup,v);
565+
/*
566+
* At this point the union keys are wrong, but we don't care
567+
* because we're done splitting. The outermost recursion
568+
* level of gistSplitByKey will fix things before returning.
569+
*/
560570
}
561-
elseif (NumDontCare>1)
571+
else
562572
return true;
563-
/* else NumDontCare is now zero; handle same as above */
564573
}
565574
}
566575

@@ -706,7 +715,7 @@ gistSplitByKey(Relation r, Page page, IndexTuple *itup, int len,
706715
*/
707716
v->spl_risnull[attno]=v->spl_lisnull[attno]= TRUE;
708717

709-
if (attno+1<r->rd_att->natts)
718+
if (attno+1<giststate->tupdesc->natts)
710719
gistSplitByKey(r,page,itup,len,giststate,v,attno+1);
711720
else
712721
gistSplitHalf(&v->splitVector,len);
@@ -731,28 +740,31 @@ gistSplitByKey(Relation r, Page page, IndexTuple *itup, int len,
731740
else
732741
v->splitVector.spl_left[v->splitVector.spl_nleft++]=i;
733742

734-
/* Must compute union keys for this and any following columns */
735-
v->spl_dontcare=NULL;
736-
gistunionsubkey(giststate,itup,v);
743+
/* Compute union keys, unless outer recursion level will handle it */
744+
if (attno==0&&giststate->tupdesc->natts==1)
745+
{
746+
v->spl_dontcare=NULL;
747+
gistunionsubkey(giststate,itup,v);
748+
}
737749
}
738750
else
739751
{
740752
/*
741-
*all keys are not-null, so apply user-defined PickSplit method
753+
*All keys are not-null, so apply user-defined PickSplit method
742754
*/
743755
if (gistUserPicksplit(r,entryvec,attno,v,itup,len,giststate))
744756
{
745757
/*
746758
* Splitting on attno column is not optimal, so consider
747759
* redistributing don't-care tuples according to the next column
748760
*/
749-
Assert(attno+1<r->rd_att->natts);
761+
Assert(attno+1<giststate->tupdesc->natts);
750762

751763
if (v->spl_dontcare==NULL)
752764
{
753765
/*
754-
*Simple case: left and right keys for attno column are
755-
*equal, so just split according to the next column.
766+
*This split was actually degenerate, so ignore it altogether
767+
*and just split according to the next column.
756768
*/
757769
gistSplitByKey(r,page,itup,len,giststate,v,attno+1);
758770
}
@@ -799,10 +811,27 @@ gistSplitByKey(Relation r, Page page, IndexTuple *itup, int len,
799811
backupSplit.spl_right[backupSplit.spl_nright++]=map[v->splitVector.spl_right[i]-1];
800812

801813
v->splitVector=backupSplit;
802-
803-
/* recompute left and right union datums */
804-
gistunionsubkey(giststate,itup,v);
805814
}
806815
}
807816
}
817+
818+
/*
819+
* If we're handling a multicolumn index, at the end of the recursion
820+
* recompute the left and right union datums for all index columns. This
821+
* makes sure we hand back correct union datums in all corner cases,
822+
* including when we haven't processed all columns to start with, or when
823+
* a secondary split moved "don't care" tuples from one side to the other
824+
* (we really shouldn't assume that that didn't change the union datums).
825+
*
826+
* Note: when we're in an internal recursion (attno > 0), we do not worry
827+
* about whether the union datums we return with are sensible, since
828+
* calling levels won't care. Also, in a single-column index, we expect
829+
* that PickSplit (or the special cases above) produced correct union
830+
* datums.
831+
*/
832+
if (attno==0&&giststate->tupdesc->natts>1)
833+
{
834+
v->spl_dontcare=NULL;
835+
gistunionsubkey(giststate,itup,v);
836+
}
808837
}

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp