Commit c8ba697


Fix logic bug in gistchoose and gistRelocateBuildBuffersOnSplit.

Every time the best-tuple-found-so-far changes, we need to reset all the penalty values in which_grow[] to the penalties for the new best tuple. The old code failed to do this, resulting in inferior index quality.

The original patch from Alexander Korotkov was just two lines; I took the liberty of fleshing that out by adding a bunch of comments that I hope will make this logic easier for others to understand than it was for me.

1 parent d1a4db8, commit c8ba697
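To make the effect of the missing reset concrete, here is a minimal, self-contained sketch. It is not the PostgreSQL code: `choose_candidate`, `NATTS`, `reset_on_new_best`, and `demo_penalties` are hypothetical names, the penalties are made-up numbers standing in for `gistpenalty()` results, and the `sum_grow` early exit is left out for brevity. The flag switches between the old condition (reset only for the first candidate) and the fixed one (reset whenever a new best is found).

```c
#include <assert.h>

#define NATTS 2   /* hypothetical: two index columns */

/*
 * Simplified stand-in for the selection loop. penalties[i][j] plays the role
 * of the penalty for candidate i and column j. When reset_on_new_best is 0,
 * the next column's slot is invalidated only for the first candidate, which
 * mimics the old behavior.
 */
static int
choose_candidate(const float penalties[][NATTS], int ncandidates,
                 int reset_on_new_best)
{
    float which_grow[NATTS];
    int   which = -1;
    int   i, j;

    assert(ncandidates > 0);
    which_grow[0] = -1.0f;

    for (i = 0; i < ncandidates; i++)
    {
        for (j = 0; j < NATTS; j++)
        {
            float usize = penalties[i][j];

            if (which_grow[j] < 0 || usize < which_grow[j])
            {
                /* New best so far: record it, invalidate the next column. */
                which = i;
                which_grow[j] = usize;
                if (j < NATTS - 1 && (reset_on_new_best || i == 0))
                    which_grow[j + 1] = -1.0f;
            }
            else if (which_grow[j] == usize)
                continue;       /* tie: compare the next column */
            else
                break;          /* worse: move on to the next candidate */
        }
    }
    return which;
}

/* Illustrative penalties: candidate 2 is the lexicographic minimum. */
static const float demo_penalties[3][NATTS] = {
    {5.0f, 1.0f},
    {3.0f, 9.0f},
    {3.0f, 4.0f},
};
```

With `demo_penalties`, `choose_candidate(demo_penalties, 3, 0)` returns 1, because the stale column-1 penalty of candidate 0 is still in `which_grow[1]` when candidate 2 is compared. `choose_candidate(demo_penalties, 3, 1)` returns 2, the candidate with the lowest penalty on the first column that is not tied.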

2 files changed, +80 -8 lines changed

src/backend/access/gist
- gistbuildbuffers.c
- gistutil.c

`src/backend/access/gist/gistbuildbuffers.c`

Lines changed: 37 additions & 6 deletions

Original file line number	Diff line number	Diff line change
`@@ -625,8 +625,13 @@ gistRelocateBuildBuffersOnSplit(GISTBuildBuffers gfbb, GISTSTATE giststate,`
`625`	`625`	`}`
`626`	`626`
`627`	`627`	`/*`
`628`		`- * Loop through all index tuples on the buffer on the splitted page,`
`629`		`- * moving them to buffers on the new pages.`
	`628`	`+ * Loop through all index tuples on the buffer on the page being split,`
	`629`	`+ * moving them to buffers on the new pages. We try to move each tuple`
	`630`	`+ * to the page that will result in the lowest penalty for the leading column`
	`631`	`+ * or, in the case of a tie, the lowest penalty for the earliest column`
	`632`	`+ * that is not tied.`
	`633`	`+ *`
	`634`	`+ * The guts of this loop are very similar to gistchoose().`
`630`	`635`	`*/`
`631`	`636`	`while (gistPopItupFromNodeBuffer(gfbb, &oldBuf, &itup))`
`632`	`637`	`{`
`@@ -637,14 +642,18 @@ gistRelocateBuildBuffersOnSplit(GISTBuildBuffers gfbb, GISTSTATE giststate,`
`637`	`642`	`IndexTuple newtup;`
`638`	`643`	`RelocationBufferInfo *targetBufferInfo;`
`639`	`644`
`640`		`-/*`
`641`		`- * Choose which page this tuple should go to.`
`642`		`- */`
`643`	`645`	`gistDeCompressAtt(giststate, r,`
`644`	`646`	`itup, NULL, (OffsetNumber) 0, entry, isnull);`
`645`	`647`
`646`	`648`	`which = -1;`
`647`	`649`	`*which_grow = -1.0f;`
	`650`	`+`
	`651`	`+/*`
	`652`	`+ * Loop over possible target pages. We'll exit early if we find an index key that`
	`653`	`+ * can accommodate the new key with no penalty on any column. sum_grow is used to`
	`654`	`+ * track this condition. It doesn't need to be exactly accurate, just >0 whenever`
	`655`	`+ * we want the loop to continue and equal to 0 when we want it to terminate.`
	`656`	`+ */`
`648`	`657`	`sum_grow = 1.0f;`
`649`	`658`
`650`	`659`	`for (i = 0; i < splitPagesCount && sum_grow; i++)`
`@@ -653,6 +662,8 @@ gistRelocateBuildBuffersOnSplit(GISTBuildBuffers gfbb, GISTSTATE giststate,`
`653`	`662`	`RelocationBufferInfo *splitPageInfo = &relocationBuffersInfos[i];`
`654`	`663`
`655`	`664`	`sum_grow = 0.0f;`
	`665`	`+`
	`666`	`+/* Loop over index attributes. */`
`656`	`667`	`for (j = 0; j < r->rd_att->natts; j++)`
`657`	`668`	`{`
`658`	`669`	`float usize;`
`@@ -664,16 +675,36 @@ gistRelocateBuildBuffersOnSplit(GISTBuildBuffers gfbb, GISTSTATE giststate,`
`664`	`675`
`665`	`676`	`if (which_grow[j] < 0 \|\| usize < which_grow[j])`
`666`	`677`	`{`
	`678`	`+/*`
	`679`	`+ * We get here in two cases. First, we may have just discovered that the`
	`680`	`+ * current tuple is the best one we've seen so far; that is, for the first`
	`681`	`+ * column for which the penalty is not equal to the best tuple seen so far,`
	`682`	`+ * this one has a lower penalty than the previously-seen one. But, when`
	`683`	`+ * a new best tuple is found, we must record the best penalty value for`
	`684`	`+ * all the remaining columns. We'll end up here for each remaining index`
	`685`	`+ * column in that case, too.`
	`686`	`+ */`
`667`	`687`	`which = i;`
`668`	`688`	`which_grow[j] = usize;`
`669`		`-if (j < r->rd_att->natts - 1 && i == 0)`
	`689`	`+if (j < r->rd_att->natts - 1)`
`670`	`690`	`which_grow[j + 1] = -1;`
`671`	`691`	`sum_grow += which_grow[j];`
`672`	`692`	`}`
`673`	`693`	`else if (which_grow[j] == usize)`
	`694`	`+{`
	`695`	`+/*`
	`696`	`+ * The current tuple is exactly as good for this column as the best tuple`
	`697`	`+ * seen so far. The next iteration of this loop will compare the next`
	`698`	`+ * column.`
	`699`	`+ */`
`674`	`700`	`sum_grow += usize;`
	`701`	`+}`
`675`	`702`	`else`
`676`	`703`	`{`
	`704`	`+/*`
	`705`	`+ * The current tuple is worse for this column than the best tuple seen so`
	`706`	`+ * far. Skip the remaining columns and move on to the next tuple, if any.`
	`707`	`+ */`
`677`	`708`	`sum_grow = 1;`
`678`	`709`	`break;`
`679`	`710`	`}`

`src/backend/access/gist/gistutil.c`

Lines changed: 43 additions & 2 deletions

Original file line number	Diff line number	Diff line change
`@@ -363,7 +363,12 @@ gistgetadjusted(Relation r, IndexTuple oldtup, IndexTuple addtup, GISTSTATE *gis`
`363`	`363`	`}`
`364`	`364`
`365`	`365`	`/*`
`366`		`- * find entry with lowest penalty`
	`366`	`+ * Search a page for the entry with lowest penalty.`
	`367`	`+ *`
	`368`	`+ * The index may have multiple columns, and there's a penalty value for each column.`
	`369`	`+ * The penalty associated with a column which appears earlier in the index definition is`
	`370`	`+ * strictly more important than the penalty of a column which appears later in the index`
	`371`	`+ * definition.`
`367`	`372`	`*/`
`368`	`373`	`OffsetNumber`
`369`	`374`	`gistchoose(Relation r, Page p, IndexTuple it, /* it has compressed entry */`
`@@ -389,12 +394,28 @@ gistchoose(Relation r, Page p, IndexTuple it,/* it has compressed entry */`
`389`	`394`	`Assert(maxoff >= FirstOffsetNumber);`
`390`	`395`	`Assert(!GistPageIsLeaf(p));`
`391`	`396`
	`397`	`+/*`
	`398`	`+ * Loop over tuples on page.`
	`399`	`+ *`
	`400`	`+ * We'll exit early if we find an index key that can accommodate the new key with no`
	`401`	`+ * penalty on any column. sum_grow is used to track this condition. Normally, it is the`
	`402`	`+ * sum of the penalties we've seen for this column so far, which is not a very useful`
	`403`	`+ * quantity in general because the penalties for each column are only considered`
	`404`	`+ * independently, but all we really care about is whether or not it's greater than zero.`
	`405`	`+ * Since penalties can't be negative, the sum of the penalties will be greater than`
	`406`	`+ * zero if and only if at least one penalty was greater than zero. To make things just`
	`407`	`+ * a bit more complicated, we arbitrarily set sum_grow to 1.0 whenever we want to force`
	`408`	`+ * at least one more iteration of this outer loop. Any non-zero value would serve`
	`409`	`+ * just as well.`
	`410`	`+ */`
`392`	`411`	`for (i = FirstOffsetNumber; i <= maxoff && sum_grow; i = OffsetNumberNext(i))`
`393`	`412`	`{`
`394`	`413`	`int j;`
`395`	`414`	`IndexTuple itup = (IndexTuple) PageGetItem(p, PageGetItemId(p, i));`
`396`	`415`
`397`	`416`	`sum_grow = 0;`
	`417`	`+`
	`418`	`+/* Loop over indexed attributes. */`
`398`	`419`	`for (j = 0; j < r->rd_att->natts; j++)`
`399`	`420`	`{`
`400`	`421`	`Datum datum;`
`@@ -409,16 +430,36 @@ gistchoose(Relation r, Page p, IndexTuple it,/* it has compressed entry */`
`409`	`430`
`410`	`431`	`if (which_grow[j] < 0 \|\| usize < which_grow[j])`
`411`	`432`	`{`
	`433`	`+/*`
	`434`	`+ * We get here in two cases. First, we may have just discovered that the`
	`435`	`+ * current tuple is the best one we've seen so far; that is, for the first`
	`436`	`+ * column for which the penalty is not equal to the best tuple seen so far,`
	`437`	`+ * this one has a lower penalty than the previously-seen one. But, when`
	`438`	`+ * a new best tuple is found, we must record the best penalty value for`
	`439`	`+ * all the remaining columns. We'll end up here for each remaining index`
	`440`	`+ * column in that case, too.`
	`441`	`+ */`
`412`	`442`	`which = i;`
`413`	`443`	`which_grow[j] = usize;`
`414`		`-if (j < r->rd_att->natts - 1 && i == FirstOffsetNumber)`
	`444`	`+if (j < r->rd_att->natts - 1)`
`415`	`445`	`which_grow[j + 1] = -1;`
`416`	`446`	`sum_grow += which_grow[j];`
`417`	`447`	`}`
`418`	`448`	`else if (which_grow[j] == usize)`
	`449`	`+{`
	`450`	`+/*`
	`451`	`+ * The current tuple is exactly as good for this column as the best tuple`
	`452`	`+ * seen so far. The next iteration of this loop will compare the next`
	`453`	`+ * column.`
	`454`	`+ */`
`419`	`455`	`sum_grow += usize;`
	`456`	`+}`
`420`	`457`	`else`
`421`	`458`	`{`
	`459`	`+/*`
	`460`	`+ * The current tuple is worse for this column than the best tuple seen so`
	`461`	`+ * far. Skip the remaining columns and move on to the next tuple, if any.`
	`462`	`+ */`
`422`	`463`	`sum_grow = 1;`
`423`	`464`	`break;`
`424`	`465`	`}`
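
The role of `sum_grow` described in the comments above can be sketched in isolation. The block below is an illustration under assumptions, not the committed code: `choose_with_early_exit`, `SKETCH_NATTS`, `examined`, and `early_exit_demo` are hypothetical names, and the penalty table replaces the real `gistpenalty()` calls. It follows the fixed loop structure and counts how many candidates the outer loop visits before stopping.

```c
#include <assert.h>

#define SKETCH_NATTS 2  /* hypothetical: two index columns */

/*
 * Returns the index of the chosen candidate and stores in *examined how many
 * candidates the outer loop visited before terminating.
 */
static int
choose_with_early_exit(const float penalties[][SKETCH_NATTS], int ncandidates,
                       int *examined)
{
    float which_grow[SKETCH_NATTS];
    float sum_grow = 1.0f;      /* non-zero so the loop runs at least once */
    int   which = -1;
    int   i, j;

    assert(ncandidates > 0);
    which_grow[0] = -1.0f;
    *examined = 0;

    for (i = 0; i < ncandidates && sum_grow != 0.0f; i++)
    {
        (*examined)++;
        sum_grow = 0.0f;

        for (j = 0; j < SKETCH_NATTS; j++)
        {
            float usize = penalties[i][j];

            if (which_grow[j] < 0 || usize < which_grow[j])
            {
                which = i;
                which_grow[j] = usize;
                if (j < SKETCH_NATTS - 1)
                    which_grow[j + 1] = -1.0f;
                sum_grow += which_grow[j];
            }
            else if (which_grow[j] == usize)
                sum_grow += usize;
            else
            {
                /* Worse candidate: force another outer iteration. */
                sum_grow = 1.0f;
                break;
            }
        }
    }
    return which;
}

/* Illustrative penalties: candidate 1 costs nothing on either column. */
static const float early_exit_demo[4][SKETCH_NATTS] = {
    {2.0f, 0.0f},
    {0.0f, 0.0f},
    {0.0f, 0.0f},
    {1.0f, 5.0f},
};
```

With `early_exit_demo`, the function returns 1 and sets `*examined` to 2: candidate 1 has zero penalty on every column, so `sum_grow` ends that iteration at 0 and the remaining candidates are never visited.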
