Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commitd4fe61b

Browse files
committed
Fix an additional set of problems in GIN's handling of lossy page pointers.
Although the key-combining code claimed to work correctly if its inputcontained both lossy and exact pointers for a single page in a single TIDstream, in fact this did not work, and could not work without prettyfundamental redesign. Modify keyGetItem so that it will not return such astream, by handling lossy-pointer cases a bit more explicitly than we didbefore.Per followup investigation of a gripe from Artur Dabrowski.An example of a query that failed given his data set isselect count(*) from search_tab where(to_tsvector('german', keywords ) @@ to_tsquery('german', 'ee:* | dd:*')) and(to_tsvector('german', keywords ) @@ to_tsquery('german', 'aa:*'));Back-patch to 8.4 where the lossy pointer code was introduced.
1 parent0454f13 commitd4fe61b

File tree

2 files changed

+137
-66
lines changed

2 files changed

+137
-66
lines changed

‎src/backend/access/gin/ginget.c

Lines changed: 126 additions & 64 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
* Portions Copyright (c) 1994, Regents of the University of California
99
*
1010
* IDENTIFICATION
11-
*$PostgreSQL: pgsql/src/backend/access/gin/ginget.c,v 1.31 2010/07/31 00:30:54 tgl Exp $
11+
*$PostgreSQL: pgsql/src/backend/access/gin/ginget.c,v 1.32 2010/08/01 19:16:39 tgl Exp $
1212
*-------------------------------------------------------------------------
1313
*/
1414

@@ -43,13 +43,12 @@ findItemInPage(Page page, ItemPointer item, OffsetNumber *off)
4343
intres;
4444

4545
if (GinPageGetOpaque(page)->flags&GIN_DELETED)
46-
/* page was deleted by concurrentvacuum */
46+
/* page was deleted by concurrent vacuum */
4747
return false;
4848

4949
/*
5050
* scan page to find equal or first greater value
5151
*/
52-
5352
for (*off=FirstOffsetNumber;*off <=maxoff; (*off)++)
5453
{
5554
res=compareItemPointers(item, (ItemPointer)GinDataPageGetItem(page,*off));
@@ -527,12 +526,40 @@ entryGetNextItem(Relation index, GinScanEntry entry)
527526
}
528527
}
529528

529+
/* convenience function for invoking a key's consistentFn */
530+
staticinlinebool
531+
callConsistentFn(GinState*ginstate,GinScanKeykey)
532+
{
533+
/*
534+
* Initialize recheckCurItem in case the consistentFn doesn't know it
535+
* should set it. The safe assumption in that case is to force recheck.
536+
*/
537+
key->recheckCurItem= true;
538+
539+
returnDatumGetBool(FunctionCall6(&ginstate->consistentFn[key->attnum-1],
540+
PointerGetDatum(key->entryRes),
541+
UInt16GetDatum(key->strategy),
542+
key->query,
543+
UInt32GetDatum(key->nentries),
544+
PointerGetDatum(key->extra_data),
545+
PointerGetDatum(&key->recheckCurItem)));
546+
}
547+
530548
#definegin_rand() (((double) random()) / ((double) MAX_RANDOM_VALUE))
531549
#definedropItem(e) ( gin_rand() > ((double)GinFuzzySearchLimit)/((double)((e)->predictNumberResult)) )
532550

533551
/*
534552
* Sets entry->curItem to next heap item pointer for one entry of one scan key,
535553
* or sets entry->isFinished to TRUE if there are no more.
554+
*
555+
* Item pointers must be returned in ascending order.
556+
*
557+
* Note: this can return a "lossy page" item pointer, indicating that the
558+
* entry potentially matches all items on that heap page. However, it is
559+
* not allowed to return both a lossy page pointer and exact (regular)
560+
* item pointers for the same page. (Doing so would break the key-combination
561+
* logic in keyGetItem and scanGetItem; see comment in scanGetItem.) In the
562+
* current implementation this is guaranteed by the behavior of tidbitmaps.
536563
*/
537564
staticvoid
538565
entryGetItem(Relationindex,GinScanEntryentry)
@@ -625,15 +652,20 @@ entryGetItem(Relation index, GinScanEntry entry)
625652
* item pointer (including the case where the item pointer is a lossy page
626653
* pointer).
627654
*
628-
* Note: lossy page could be returned after single items from the same page.
629-
* This is OK since the results will just be used to build a bitmap; we'll
630-
* set a bitmap entry more than once, but never actually return a row twice.
655+
* Item pointers must be returned in ascending order.
656+
*
657+
* Note: this can return a "lossy page" item pointer, indicating that the
658+
* key potentially matches all items on that heap page. However, it is
659+
* not allowed to return both a lossy page pointer and exact (regular)
660+
* item pointers for the same page. (Doing so would break the key-combination
661+
* logic in scanGetItem.)
631662
*/
632663
staticvoid
633664
keyGetItem(Relationindex,GinState*ginstate,MemoryContexttempCtx,
634665
GinScanKeykey,ItemPointeradvancePast)
635666
{
636667
ItemPointerDatamyAdvancePast=*advancePast;
668+
ItemPointerDatacurPageLossy;
637669
uint32i;
638670
uint32lossyEntry;
639671
boolhaveLossyEntry;
@@ -691,87 +723,112 @@ keyGetItem(Relation index, GinState *ginstate, MemoryContext tempCtx,
691723
myAdvancePast=key->curItem;
692724

693725
/*
694-
* Prepare entryRes array to be passed to consistentFn.
695-
*
696-
* If key->nentries == 1 then the consistentFn should always succeed,
697-
* but we must call it anyway to find out the recheck status.
698-
*
699726
* Lossy-page entries pose a problem, since we don't know the correct
700727
* entryRes state to pass to the consistentFn, and we also don't know
701728
* what its combining logic will be (could be AND, OR, or even NOT).
702-
* Our strategy for a single lossy-page entry is to try the
703-
* consistentFn both ways and return a hit if it accepts either one
704-
* (forcing the hit to be marked lossy so it will be rechecked).
729+
* If the logic is OR then the consistentFn might succeed for all
730+
* items in the lossy page even when none of the other entries match.
731+
*
732+
* If we have a single lossy-page entry then we check to see if the
733+
* consistentFn will succeed with only that entry TRUE. If so,
734+
* we return a lossy-page pointer to indicate that the whole heap
735+
* page must be checked. (On the next call, we'll advance past all
736+
* regular and lossy entries for this page before resuming search,
737+
* thus ensuring that we never return both regular and lossy pointers
738+
* for the same page.)
705739
*
706740
* This idea could be generalized to more than one lossy-page entry,
707741
* but ideally lossy-page entries should be infrequent so it would
708-
* seldom be the case that we have more than one. If we do find more
709-
* than one, we just punt and return the item as lossy.
742+
* seldom be the case that we have more than one at once. So it
743+
* doesn't seem worth the extra complexity to optimize that case.
744+
* If we do find more than one, we just punt and return a lossy-page
745+
* pointer always.
710746
*
711747
* Note that only lossy-page entries pointing to the current item's
712-
* page should trigger this processing.
748+
* page should trigger this processing; we might have future lossy
749+
* pages in the entry array, but they aren't relevant yet.
713750
*/
751+
ItemPointerSetLossyPage(&curPageLossy,
752+
GinItemPointerGetBlockNumber(&key->curItem));
753+
714754
lossyEntry=0;
715755
haveLossyEntry= false;
716756
for (i=0;i<key->nentries;i++)
717757
{
718758
entry=key->scanEntry+i;
719-
720-
if (entry->isFinished)
721-
key->entryRes[i]= FALSE;
722-
elseif (ItemPointerIsLossyPage(&entry->curItem)&&
723-
GinItemPointerGetBlockNumber(&entry->curItem)==
724-
GinItemPointerGetBlockNumber(&key->curItem))
759+
if (entry->isFinished== FALSE&&
760+
compareItemPointers(&entry->curItem,&curPageLossy)==0)
725761
{
726762
if (haveLossyEntry)
727763
{
728-
/* Too many lossy entries, punt */
764+
/* Multiple lossy entries, punt */
765+
key->curItem=curPageLossy;
729766
key->recheckCurItem= true;
730767
return;
731768
}
732769
lossyEntry=i;
733770
haveLossyEntry= true;
734-
/* initially assume TRUE */
735-
key->entryRes[i]= TRUE;
736771
}
737-
elseif (compareItemPointers(&entry->curItem,&key->curItem)==0)
738-
key->entryRes[i]= TRUE;
739-
else
740-
key->entryRes[i]= FALSE;
741772
}
742773

774+
/* prepare for calling consistentFn in temp context */
743775
oldCtx=MemoryContextSwitchTo(tempCtx);
744776

777+
if (haveLossyEntry)
778+
{
779+
/* Single lossy-page entry, so see if whole page matches */
780+
memset(key->entryRes, FALSE,key->nentries);
781+
key->entryRes[lossyEntry]= TRUE;
782+
783+
if (callConsistentFn(ginstate,key))
784+
{
785+
/* Yes, so clean up ... */
786+
MemoryContextSwitchTo(oldCtx);
787+
MemoryContextReset(tempCtx);
788+
789+
/* and return lossy pointer for whole page */
790+
key->curItem=curPageLossy;
791+
key->recheckCurItem= true;
792+
return;
793+
}
794+
}
795+
745796
/*
746-
* Initialize recheckCurItem in case the consistentFn doesn't know it
747-
* should set it. The safe assumption in that case is to force
748-
* recheck.
797+
* At this point we know that we don't need to return a lossy
798+
* whole-page pointer, but we might have matches for individual exact
799+
* item pointers, possibly in combination with a lossy pointer. Our
800+
* strategy if there's a lossy pointer is to try the consistentFn both
801+
* ways and return a hit if it accepts either one (forcing the hit to
802+
* be marked lossy so it will be rechecked).
803+
*
804+
* Prepare entryRes array to be passed to consistentFn.
805+
*
806+
* (If key->nentries == 1 then the consistentFn should always succeed,
807+
* but we must call it anyway to find out the recheck status.)
749808
*/
750-
key->recheckCurItem= true;
809+
for (i=0;i<key->nentries;i++)
810+
{
811+
entry=key->scanEntry+i;
812+
if (entry->isFinished== FALSE&&
813+
compareItemPointers(&entry->curItem,&key->curItem)==0)
814+
key->entryRes[i]= TRUE;
815+
else
816+
key->entryRes[i]= FALSE;
817+
}
818+
if (haveLossyEntry)
819+
key->entryRes[lossyEntry]= TRUE;
751820

752-
res=DatumGetBool(FunctionCall6(&ginstate->consistentFn[key->attnum-1],
753-
PointerGetDatum(key->entryRes),
754-
UInt16GetDatum(key->strategy),
755-
key->query,
756-
UInt32GetDatum(key->nentries),
757-
PointerGetDatum(key->extra_data),
758-
PointerGetDatum(&key->recheckCurItem)));
821+
res=callConsistentFn(ginstate,key);
759822

760823
if (!res&&haveLossyEntry)
761824
{
762825
/* try the other way for the lossy item */
763826
key->entryRes[lossyEntry]= FALSE;
764-
key->recheckCurItem= true;
765827

766-
res=DatumGetBool(FunctionCall6(&ginstate->consistentFn[key->attnum-1],
767-
PointerGetDatum(key->entryRes),
768-
UInt16GetDatum(key->strategy),
769-
key->query,
770-
UInt32GetDatum(key->nentries),
771-
PointerGetDatum(key->extra_data),
772-
PointerGetDatum(&key->recheckCurItem)));
828+
res=callConsistentFn(ginstate,key);
773829
}
774830

831+
/* clean up after consistentFn calls */
775832
MemoryContextSwitchTo(oldCtx);
776833
MemoryContextReset(tempCtx);
777834

@@ -1108,7 +1165,6 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
11081165
GinScanOpaqueso= (GinScanOpaque)scan->opaque;
11091166
MemoryContextoldCtx;
11101167
boolrecheck,
1111-
keyrecheck,
11121168
match;
11131169
inti;
11141170
pendingPositionpos;
@@ -1152,8 +1208,8 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
11521208
continue;
11531209

11541210
/*
1155-
* Matching of entries of one row is finished, so check rowby
1156-
* consistentfunction.
1211+
* Matching of entries of one row is finished, so check rowusing
1212+
* consistentfunctions.
11571213
*/
11581214
oldCtx=MemoryContextSwitchTo(so->tempCtx);
11591215
recheck= false;
@@ -1163,21 +1219,12 @@ scanPendingInsert(IndexScanDesc scan, TIDBitmap *tbm, int64 *ntids)
11631219
{
11641220
GinScanKeykey=so->keys+i;
11651221

1166-
keyrecheck= true;
1167-
1168-
if (!DatumGetBool(FunctionCall6(&so->ginstate.consistentFn[key->attnum-1],
1169-
PointerGetDatum(key->entryRes),
1170-
UInt16GetDatum(key->strategy),
1171-
key->query,
1172-
UInt32GetDatum(key->nentries),
1173-
PointerGetDatum(key->extra_data),
1174-
PointerGetDatum(&keyrecheck))))
1222+
if (!callConsistentFn(&so->ginstate,key))
11751223
{
11761224
match= false;
11771225
break;
11781226
}
1179-
1180-
recheck |=keyrecheck;
1227+
recheck |=key->recheckCurItem;
11811228
}
11821229

11831230
MemoryContextSwitchTo(oldCtx);
@@ -1248,11 +1295,26 @@ scanGetItem(IndexScanDesc scan, ItemPointer advancePast,
12481295

12491296
Assert(!ItemPointerIsMax(item));
12501297

1251-
/*
1298+
/*----------
12521299
* Now *item contains first ItemPointer after previous result.
12531300
*
12541301
* The item is a valid hit only if all the keys returned either
12551302
* that exact TID, or a lossy reference to the same page.
1303+
*
1304+
* This logic works only if a keyGetItem stream can never contain both
1305+
* exact and lossy pointers for the same page. Else we could have a
1306+
* case like
1307+
*
1308+
*stream 1stream 2
1309+
*......
1310+
*42/642/7
1311+
*50/142/0xffff
1312+
*......
1313+
*
1314+
* We would conclude that 42/6 is not a match and advance stream 1,
1315+
* thus never detecting the match to the lossy pointer in stream 2.
1316+
* (keyGetItem has a similar problem versus entryGetItem.)
1317+
*----------
12561318
*/
12571319
match= true;
12581320
for (i=0;i<so->nkeys;i++)

‎src/include/access/gin.h

Lines changed: 11 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@
44
*
55
*Copyright (c) 2006-2010, PostgreSQL Global Development Group
66
*
7-
*$PostgreSQL: pgsql/src/include/access/gin.h,v 1.40 2010/08/0102:12:42 tgl Exp $
7+
*$PostgreSQL: pgsql/src/include/access/gin.h,v 1.41 2010/08/0119:16:39 tgl Exp $
88
*--------------------------------------------------------------------------
99
*/
1010
#ifndefGIN_H
@@ -107,13 +107,22 @@ typedef struct GinMetaPageData
107107
* We use our own ItemPointerGet(BlockNumber|GetOffsetNumber)
108108
* to avoid Asserts, since sometimes the ip_posid isn't "valid"
109109
*/
110-
111110
#defineGinItemPointerGetBlockNumber(pointer) \
112111
BlockIdGetBlockNumber(&(pointer)->ip_blkid)
113112

114113
#defineGinItemPointerGetOffsetNumber(pointer) \
115114
((pointer)->ip_posid)
116115

116+
/*
117+
* Special-case item pointer values needed by the GIN search logic.
118+
*MIN: sorts less than any valid item pointer
119+
*MAX: sorts greater than any valid item pointer
120+
*LOSSY PAGE: indicates a whole heap page, sorts after normal item
121+
*pointers for that page
122+
* Note that these are all distinguishable from an "invalid" item pointer
123+
* (which is InvalidBlockNumber/0) as well as from all normal item
124+
* pointers (which have item numbers in the range 1..MaxHeapTuplesPerPage).
125+
*/
117126
#defineItemPointerSetMin(p) \
118127
ItemPointerSet((p), (BlockNumber)0, (OffsetNumber)0)
119128
#defineItemPointerIsMin(p) \

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp