Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commit35fcb1b

Browse files
committed
Allow GiST distance function to return merely a lower-bound.
The distance function can now set *recheck = false, like index quals. Theexecutor will then re-check the ORDER BY expressions, and use a queue toreorder the results on the fly.This makes it possible to do kNN-searches on polygons and circles, whichdon't store the exact value in the index, but just a bounding box.Alexander Korotkov and me
1 parentecd222e commit35fcb1b

File tree

19 files changed

+699
-40
lines changed

19 files changed

+699
-40
lines changed

‎doc/src/sgml/gist.sgml

Lines changed: 27 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -105,6 +105,7 @@
105105
<literal>~=</>
106106
</entry>
107107
<entry>
108+
<literal>&lt;-&gt;</>
108109
</entry>
109110
</row>
110111
<row>
@@ -163,6 +164,7 @@
163164
<literal>~=</>
164165
</entry>
165166
<entry>
167+
<literal>&lt;-&gt;</>
166168
</entry>
167169
</row>
168170
<row>
@@ -206,6 +208,12 @@
206208
</tgroup>
207209
</table>
208210

211+
<para>
212+
Currently, ordering by the distance operator <literal>&lt;-&gt;</>
213+
is supported only with <literal>point</> by the operator classes
214+
of the geometric types.
215+
</para>
216+
209217
<para>
210218
For historical reasons, the <literal>inet_ops</> operator class is
211219
not the default class for types <type>inet</> and <type>cidr</>.
@@ -780,6 +788,7 @@ my_distance(PG_FUNCTION_ARGS)
780788
data_type *query = PG_GETARG_DATA_TYPE_P(1);
781789
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
782790
/* Oid subtype = PG_GETARG_OID(3); */
791+
/* bool *recheck = (bool *) PG_GETARG_POINTER(4); */
783792
data_type *key = DatumGetDataType(entry-&gt;key);
784793
double retval;
785794

@@ -792,14 +801,24 @@ my_distance(PG_FUNCTION_ARGS)
792801
</programlisting>
793802

794803
The arguments to the <function>distance</> function are identical to
795-
the arguments of the <function>consistent</> function, except that no
796-
recheck flag is used. The distance to a leaf index entry must always
797-
be determined exactly, since there is no way to re-order the tuples
798-
once they are returned. Some approximation is allowed when determining
799-
the distance to an internal tree node, so long as the result is never
800-
greater than any child's actual distance. Thus, for example, distance
801-
to a bounding box is usually sufficient in geometric applications. The
802-
result value can be any finite <type>float8</> value. (Infinity and
804+
the arguments of the <function>consistent</> function.
805+
</para>
806+
807+
<para>
808+
Some approximation is allowed when determining the distance, as long as
809+
the result is never greater than the entry's actual distance. Thus, for
810+
example, distance to a bounding box is usually sufficient in geometric
811+
applications. For an internal tree node, the distance returned must not
812+
be greater than the distance to any of the child nodes. If the returned
813+
distance is not accurate, the function must set *recheck to false. (This
814+
is not necessary for internal tree nodes; for them, the calculation is
815+
always assumed to be inaccurate). The executor will calculate the
816+
accurate distance after fetching the tuple from the heap, and reorder
817+
the tuples if necessary.
818+
</para>
819+
820+
<para>
821+
The result value can be any finite <type>float8</> value. (Infinity and
803822
minus infinity are used internally to handle cases such as nulls, so it
804823
is not recommended that <function>distance</> functions return these
805824
values.)

‎src/backend/access/gist/gistget.c

Lines changed: 21 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -30,10 +30,10 @@
3030
* The index tuple might represent either a heap tuple or a lower index page,
3131
* depending on whether the containing page is a leaf page or not.
3232
*
33-
* On success return for a heap tuple, *recheck_p is set to indicate
34-
*whetherrecheck is needed. We recheck if any of the consistent()functions
35-
* request it. recheck is not interesting when examining a non-leaf entry,
36-
* since we must visit the lower index page if there's any doubt.
33+
* On success return for a heap tuple, *recheck_p is set to indicate whether
34+
* recheck is needed. We recheck if any of the consistent()or distance()
35+
*functionsrequest it. recheck is not interesting when examining a non-leaf
36+
*entry,since we must visit the lower index page if there's any doubt.
3737
*
3838
* If we are doing an ordered scan, so->distances[] is filled with distance
3939
* data from the distance() functions before returning success.
@@ -176,6 +176,7 @@ gistindex_keytest(IndexScanDesc scan,
176176
else
177177
{
178178
Datumdist;
179+
boolrecheck;
179180
GISTENTRYde;
180181

181182
gistdentryinit(giststate,key->sk_attno-1,&de,
@@ -192,16 +193,21 @@ gistindex_keytest(IndexScanDesc scan,
192193
* always be zero, but might as well pass it for possible future
193194
* use.)
194195
*
195-
* Note that Distance functions don't get a recheck argument. We
196-
* can't tolerate lossy distance calculations on leaf tuples;
197-
* there is no opportunity to re-sort the tuples afterwards.
196+
* Distance functions get a recheck argument as well. In this
197+
* case the returned distance is the lower bound of distance
198+
* and needs to be rechecked. We return single recheck flag
199+
* which means that both quals and distances are to be
200+
* rechecked.
198201
*/
199-
dist=FunctionCall4Coll(&key->sk_func,
202+
dist=FunctionCall5Coll(&key->sk_func,
200203
key->sk_collation,
201204
PointerGetDatum(&de),
202205
key->sk_argument,
203206
Int32GetDatum(key->sk_strategy),
204-
ObjectIdGetDatum(key->sk_subtype));
207+
ObjectIdGetDatum(key->sk_subtype),
208+
PointerGetDatum(&recheck));
209+
210+
*recheck_p |=recheck;
205211

206212
*distance_p=DatumGetFloat8(dist);
207213
}
@@ -434,6 +440,7 @@ getNextNearest(IndexScanDesc scan)
434440
{
435441
GISTScanOpaqueso= (GISTScanOpaque)scan->opaque;
436442
boolres= false;
443+
inti;
437444

438445
if (scan->xs_itup)
439446
{
@@ -454,6 +461,11 @@ getNextNearest(IndexScanDesc scan)
454461
/* found a heap item at currently minimal distance */
455462
scan->xs_ctup.t_self=item->data.heap.heapPtr;
456463
scan->xs_recheck=item->data.heap.recheck;
464+
for (i=0;i<scan->numberOfOrderBys;i++)
465+
{
466+
scan->xs_orderbyvals[i]=Float8GetDatum(item->distances[i]);
467+
scan->xs_orderbynulls[i]= false;
468+
}
457469

458470
/* in an index-only scan, also return the reconstructed tuple. */
459471
if (scan->xs_want_itup)

‎src/backend/access/gist/gistproc.c

Lines changed: 37 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1478,3 +1478,40 @@ gist_point_distance(PG_FUNCTION_ARGS)
14781478

14791479
PG_RETURN_FLOAT8(distance);
14801480
}
1481+
1482+
/*
1483+
* The inexact GiST distance method for geometric types that store bounding
1484+
* boxes.
1485+
*
1486+
* Compute lossy distance from point to index entries. The result is inexact
1487+
* because index entries are bounding boxes, not the exact shapes of the
1488+
* indexed geometric types. We use distance from point to MBR of index entry.
1489+
* This is a lower bound estimate of distance from point to indexed geometric
1490+
* type.
1491+
*/
1492+
Datum
1493+
gist_bbox_distance(PG_FUNCTION_ARGS)
1494+
{
1495+
GISTENTRY*entry= (GISTENTRY*)PG_GETARG_POINTER(0);
1496+
StrategyNumberstrategy= (StrategyNumber)PG_GETARG_UINT16(2);
1497+
bool*recheck= (bool*)PG_GETARG_POINTER(4);
1498+
doubledistance;
1499+
StrategyNumberstrategyGroup=strategy /GeoStrategyNumberOffset;
1500+
1501+
/* Bounding box distance is always inexact. */
1502+
*recheck= true;
1503+
1504+
switch (strategyGroup)
1505+
{
1506+
casePointStrategyNumberGroup:
1507+
distance=computeDistance(false,
1508+
DatumGetBoxP(entry->key),
1509+
PG_GETARG_POINT_P(1));
1510+
break;
1511+
default:
1512+
elog(ERROR,"unknown strategy number: %d",strategy);
1513+
distance=0.0;/* keep compiler quiet */
1514+
}
1515+
1516+
PG_RETURN_FLOAT8(distance);
1517+
}

‎src/backend/access/gist/gistscan.c

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -85,6 +85,11 @@ gistbeginscan(PG_FUNCTION_ARGS)
8585
/* workspaces with size dependent on numberOfOrderBys: */
8686
so->distances=palloc(sizeof(double)*scan->numberOfOrderBys);
8787
so->qual_ok= true;/* in case there are zero keys */
88+
if (scan->numberOfOrderBys>0)
89+
{
90+
scan->xs_orderbyvals=palloc(sizeof(Datum)*scan->numberOfOrderBys);
91+
scan->xs_orderbynulls=palloc(sizeof(bool)*scan->numberOfOrderBys);
92+
}
8893

8994
scan->opaque=so;
9095

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp