Commit 854823f

Add optional compression method to SP-GiST
This patch allows the column type and the type of the value stored in SP-GiST leaf tuples to differ. The main application is to transform a complex column type into a simpler indexed type, or to truncate overly long values; the transformation may be lossy. A simple example: polygons are converted to their bounding boxes; an opclass that does this follows.

Authors: me, Heikki Linnakangas, Alexander Korotkov, Nikita Glukhov
Reviewed-By: all authors + Darafei Praliaskouski
Discussions:
https://www.postgresql.org/message-id/5447B3FF.2080406@sigaev.ru
https://www.postgresql.org/message-id/flat/54907069.1030506@sigaev.ru#54907069.1030506@sigaev.ru
1 parent 9373baa, commit 854823f
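
The bounding-box example from the commit message can be sketched in plain C. This is an illustration only, not the opclass the message says will follow: `Point`, `Polygon`, `Box`, `polygon_compress`, and `box_overlap` are hypothetical names, and real PostgreSQL code would operate on `Datum` values through the function manager rather than on bare structs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the column type and the leaf type. */
typedef struct { double x, y; } Point;
typedef struct { size_t npts; const Point *pts; } Polygon;
typedef struct { Point low, high; } Box;

/* Lossy "compress": keep only the polygon's bounding box. */
static Box
polygon_compress(const Polygon *poly)
{
    Box    box;
    size_t i;

    assert(poly != NULL && poly->npts > 0);
    box.low = box.high = poly->pts[0];
    for (i = 1; i < poly->npts; i++)
    {
        const Point *p = &poly->pts[i];

        if (p->x < box.low.x)  box.low.x = p->x;
        if (p->y < box.low.y)  box.low.y = p->y;
        if (p->x > box.high.x) box.high.x = p->x;
        if (p->y > box.high.y) box.high.y = p->y;
    }
    return box;
}

/* Box overlap test; a true result only means the polygons MAY overlap. */
static bool
box_overlap(const Box *a, const Box *b)
{
    return a->low.x <= b->high.x && b->low.x <= a->high.x &&
           a->low.y <= b->high.y && b->low.y <= a->high.y;
}
```

For the triangle (0,0), (4,1), (2,5) the stored box is (0,0)-(4,5). A query box (3,4)-(6,6) overlaps that stored box although it does not touch the triangle itself, which is the kind of false positive a lossy leaf representation produces.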

7 files changed, with 182 additions and 37 deletions


‎doc/src/sgml/spgist.sgml

Lines changed: 72 additions & 20 deletions
Original file line number | Diff line number | Diff line change
@@ -240,20 +240,22 @@
240240

241241
<para>
242242
There are five user-defined methods that an index operator class for
243-
<acronym>SP-GiST</acronym> must provide. All five follow the convention
244-
of accepting two <type>internal</type> arguments, the first of which is a
245-
pointer to a C struct containing input values for the support method,
246-
while the second argument is a pointer to a C struct where output values
247-
must be placed. Four of the methods just return <type>void</type>, since
248-
all their results appear in the output struct; but
243+
<acronym>SP-GiST</acronym> must provide, and one is optional. All five
244+
mandatory methods follow the convention of accepting two <type>internal</type>
245+
arguments, the first of which is a pointer to a C struct containing input
246+
values for the support method, while the second argument is a pointer to a
247+
C struct where output values must be placed. Four of the mandatory methods just
248+
return <type>void</type>, since all their results appear in the output struct; but
249249
<function>leaf_consistent</function> additionally returns a <type>boolean</type> result.
250250
The methods must not modify any fields of their input structs. In all
251251
cases, the output struct is initialized to zeroes before calling the
252-
user-defined method.
252+
user-defined method. The optional sixth method, <function>compress</function>,
253+
accepts the datum to be indexed as its only argument and returns a value suitable
254+
for physical storage in a leaf tuple.
253255
</para>
254256

255257
<para>
256-
The five user-defined methods are:
258+
The five mandatory user-defined methods are:
257259
</para>
258260

259261
<variablelist>
@@ -283,6 +285,7 @@ typedef struct spgConfigOut
283285
{
284286
Oid prefixType; /* Data type of inner-tuple prefixes */
285287
Oid labelType; /* Data type of inner-tuple node labels */
288+
Oid leafType; /* Data type of leaf-tuple values */
286289
bool canReturnData; /* Opclass can reconstruct original data */
287290
bool longValuesOK; /* Opclass can cope with values &gt; 1 page */
288291
} spgConfigOut;
@@ -305,6 +308,22 @@ typedef struct spgConfigOut
305308
class is capable of segmenting long values by repeated suffixing
306309
(see <xref linkend="spgist-limits"/>).
307310
</para>
311+
312+
<para>
313+
<structfield>leafType</structfield> is typically the same as
314+
<structfield>attType</structfield>. For reasons of backward
315+
compatibility, the <function>config</function> method can
316+
leave <structfield>leafType</structfield> uninitialized; that has
317+
the same effect as setting <structfield>leafType</structfield> equal
318+
to <structfield>attType</structfield>. When <structfield>attType</structfield>
319+
and <structfield>leafType</structfield> are different, the optional
320+
method <function>compress</function> must be provided.
321+
The <function>compress</function> method is responsible
322+
for transforming datums to be indexed from <structfield>attType</structfield>
323+
to <structfield>leafType</structfield>.
324+
Note: both consistent functions receive <structfield>scankeys</structfield>
325+
unchanged, without transformation by <function>compress</function>.
326+
</para>
308327
</listitem>
309328
</varlistentry>
310329

@@ -380,10 +399,16 @@ typedef struct spgChooseOut
380399
} spgChooseOut;
381400
</programlisting>
382401

383-
<structfield>datum</structfield> is the original datum that was to be inserted
384-
into the index.
385-
<structfield>leafDatum</structfield> is initially the same as
386-
<structfield>datum</structfield>, but can change at lower levels of the tree
402+
<structfield>datum</structfield> is the original datum of
403+
<structname>spgConfigIn</structname>.<structfield>attType</structfield>
404+
type that was to be inserted into the index.
405+
<structfield>leafDatum</structfield> is a value of
406+
<structname>spgConfigOut</structname>.<structfield>leafType</structfield>
407+
type, which is initially the result of the
408+
<function>compress</function> method applied to <structfield>datum</structfield>
409+
when a <function>compress</function> method is provided, or the same value as
410+
<structfield>datum</structfield> otherwise.
411+
<structfield>leafDatum</structfield> can change at lower levels of the tree
387412
if the <function>choose</function> or <function>picksplit</function>
388413
methods change it. When the insertion search reaches a leaf page,
389414
the current value of <structfield>leafDatum</structfield> is what will be stored
@@ -418,7 +443,7 @@ typedef struct spgChooseOut
418443
Set <structfield>levelAdd</structfield> to the increment in
419444
<structfield>level</structfield> caused by descending through that node,
420445
or leave it as zero if the operator class does not use levels.
421-
Set <structfield>restDatum</structfield> to equal <structfield>datum</structfield>
446+
Set <structfield>restDatum</structfield> to equal <structfield>leafDatum</structfield>
422447
if the operator class does not modify datums from one level to the
423448
next, or otherwise set it to the modified value to be used as
424449
<structfield>leafDatum</structfield> at the next level.
@@ -509,7 +534,9 @@ typedef struct spgPickSplitOut
509534
</programlisting>
510535

511536
<structfield>nTuples</structfield> is the number of leaf tuples provided.
512-
<structfield>datums</structfield> is an array of their datum values.
537+
<structfield>datums</structfield> is an array of their datum values of
538+
<structname>spgConfigOut</structname>.<structfield>leafType</structfield>
539+
type.
513540
<structfield>level</structfield> is the current level that all the leaf tuples
514541
share, which will become the level of the new inner tuple.
515542
</para>
@@ -624,7 +651,8 @@ typedef struct spgInnerConsistentOut
624651
<structfield>reconstructedValue</structfield> is the value reconstructed for the
625652
parent tuple; it is <literal>(Datum) 0</literal> at the root level or if the
626653
<function>inner_consistent</function> function did not provide a value at the
627-
parent level.
654+
parent level. <structfield>reconstructedValue</structfield> is always of
655+
<structname>spgConfigOut</structname>.<structfield>leafType</structfield> type.
628656
<structfield>traversalValue</structfield> is a pointer to any traverse data
629657
passed down from the previous call of <function>inner_consistent</function>
630658
on the parent index tuple, or NULL at the root level.
@@ -659,6 +687,7 @@ typedef struct spgInnerConsistentOut
659687
necessarily so, so an array is used.)
660688
If value reconstruction is needed, set
661689
<structfield>reconstructedValues</structfield> to an array of the values
690+
of <structname>spgConfigOut</structname>.<structfield>leafType</structfield> type
662691
reconstructed for each child node to be visited; otherwise, leave
663692
<structfield>reconstructedValues</structfield> as NULL.
664693
If it is desired to pass down additional out-of-band information
@@ -730,7 +759,8 @@ typedef struct spgLeafConsistentOut
730759
<structfield>reconstructedValue</structfield> is the value reconstructed for the
731760
parent tuple; it is <literal>(Datum) 0</literal> at the root level or if the
732761
<function>inner_consistent</function> function did not provide a value at the
733-
parent level.
762+
parent level. <structfield>reconstructedValue</structfield> is always of
763+
<structname>spgConfigOut</structname>.<structfield>leafType</structfield> type.
734764
<structfield>traversalValue</structfield> is a pointer to any traverse data
735765
passed down from the previous call of <function>inner_consistent</function>
736766
on the parent index tuple, or NULL at the root level.
@@ -739,16 +769,18 @@ typedef struct spgLeafConsistentOut
739769
<structfield>returnData</structfield> is <literal>true</literal> if reconstructed data is
740770
required for this query; this will only be so if the
741771
<function>config</function> function asserted <structfield>canReturnData</structfield>.
742-
<structfield>leafDatum</structfield> is the key value stored in the current
743-
leaf tuple.
772+
<structfield>leafDatum</structfield> is the key value of
773+
<structname>spgConfigOut</structname>.<structfield>leafType</structfield>
774+
type stored in the current leaf tuple.
744775
</para>
745776

746777
<para>
747778
The function must return <literal>true</literal> if the leaf tuple matches the
748779
query, or <literal>false</literal> if not. In the <literal>true</literal> case,
749780
if <structfield>returnData</structfield> is <literal>true</literal> then
750-
<structfield>leafValue</structfield> must be set to the value originally supplied
751-
to be indexed for this leaf tuple. Also,
781+
<structfield>leafValue</structfield> must be set to the value of
782+
<structname>spgConfigIn</structname>.<structfield>attType</structfield> type
783+
originally supplied to be indexed for this leaf tuple. Also,
752784
<structfield>recheck</structfield> may be set to <literal>true</literal> if the match
753785
is uncertain and so the operator(s) must be re-applied to the actual
754786
heap tuple to verify the match.
@@ -757,6 +789,26 @@ typedef struct spgLeafConsistentOut
757789
</varlistentry>
758790
</variablelist>
759791

792+
<para>
793+
The optional user-defined method is:
794+
</para>
795+
796+
<variablelist>
797+
<varlistentry>
798+
<term><function>Datum compress(Datum in)</function></term>
799+
<listitem>
800+
<para>
801+
Converts the data item into a format suitable for physical storage in
802+
a leaf tuple of an index page. It accepts a
803+
<structname>spgConfigIn</structname>.<structfield>attType</structfield>
804+
value and returns a
805+
<structname>spgConfigOut</structname>.<structfield>leafType</structfield>
806+
value. The output value should not be toasted.
807+
</para>
808+
</listitem>
809+
</varlistentry>
810+
</variablelist>
811+
760812
<para>
761813
All the SP-GiST support methods are normally called in a short-lived
762814
memory context; that is, <varname>CurrentMemoryContext</varname> will be reset
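
The documentation changes above state that the consistent functions receive scankeys unchanged, that is, in `attType` form rather than `leafType` form. A minimal sketch of what this means for the "truncating too long value" use case follows. It is self-contained C with hypothetical names (`LEAF_MAX`, `LeafPrefix`, `prefix_compress`, `prefix_leaf_equal`); it does not use the real `spgLeafConsistentIn`/`spgLeafConsistentOut` structs, and only models the role of the `recheck` flag with a plain `bool`.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define LEAF_MAX 8              /* hypothetical cap on stored bytes */

/* Hypothetical leaf type: a fixed-size, possibly truncated prefix. */
typedef struct
{
    size_t len;                 /* bytes kept, at most LEAF_MAX */
    char   data[LEAF_MAX];
} LeafPrefix;

/* "compress": column value (C string) -> truncated leaf value. */
static LeafPrefix
prefix_compress(const char *value)
{
    LeafPrefix leaf;
    size_t     n;

    assert(value != NULL);
    n = strlen(value);
    leaf.len = n < LEAF_MAX ? n : LEAF_MAX;
    memset(leaf.data, 0, sizeof(leaf.data));
    memcpy(leaf.data, value, leaf.len);
    return leaf;
}

/*
 * Leaf-level equality test.  The query was never passed through
 * prefix_compress, so it is cut down to LEAF_MAX bytes here before
 * comparing.  *recheck is set when the stored prefix is full length,
 * because the original value may then have been longer.
 */
static bool
prefix_leaf_equal(const LeafPrefix *leaf, const char *query, bool *recheck)
{
    size_t qlen = strlen(query);
    size_t cmplen = qlen < LEAF_MAX ? qlen : LEAF_MAX;

    *recheck = false;
    if (leaf->len != cmplen || memcmp(leaf->data, query, cmplen) != 0)
        return false;
    *recheck = (leaf->len == LEAF_MAX);
    return true;
}
```

A short value such as "abc" matches exactly with no recheck, while a value that filled all `LEAF_MAX` bytes can only report a possible match and leaves the final decision to a comparison against the heap tuple.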

‎src/backend/access/spgist/spgdoinsert.c

Lines changed: 30 additions & 7 deletions
@@ -1906,14 +1906,37 @@ spgdoinsert(Relation index, SpGistState *state,
19061906
procinfo = index_getprocinfo(index, 1, SPGIST_CHOOSE_PROC);
19071907

19081908
/*
1909-
* Since we don't use index_form_tuple in this AM, we have to make sure
1909+
* Prepare the leaf datum to insert.
1910+
*
1911+
* If an optional "compress" method is provided, then call it to form
1912+
* the leaf datum from the input datum. Otherwise store the input datum as
1913+
* is. Since we don't use index_form_tuple in this AM, we have to make sure
19101914
* value to be inserted is not toasted; FormIndexDatum doesn't guarantee
1911-
* that.
1915+
* that. But we assume the "compress" method to return an untoasted value.
19121916
*/
1913-
if (!isnull && state->attType.attlen == -1)
1914-
datum = PointerGetDatum(PG_DETOAST_DATUM(datum));
1917+
if (!isnull)
1918+
{
1919+
if (OidIsValid(index_getprocid(index, 1, SPGIST_COMPRESS_PROC)))
1920+
{
1921+
FmgrInfo *compressProcinfo = NULL;
1922+
1923+
compressProcinfo = index_getprocinfo(index, 1, SPGIST_COMPRESS_PROC);
1924+
leafDatum = FunctionCall1Coll(compressProcinfo,
1925+
index->rd_indcollation[0],
1926+
datum);
1927+
}
1928+
else
1929+
{
1930+
Assert(state->attLeafType.type == state->attType.type);
19151931

1916-
leafDatum = datum;
1932+
if (state->attType.attlen == -1)
1933+
leafDatum = PointerGetDatum(PG_DETOAST_DATUM(datum));
1934+
else
1935+
leafDatum = datum;
1936+
}
1937+
}
1938+
else
1939+
leafDatum = (Datum) 0;
19171940

19181941
/*
19191942
* Compute space needed for a leaf tuple containing the given datum.
@@ -1923,7 +1946,7 @@ spgdoinsert(Relation index, SpGistState *state,
19231946
*/
19241947
if (!isnull)
19251948
leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
1926-
SpGistGetTypeSize(&state->attType, leafDatum);
1949+
SpGistGetTypeSize(&state->attLeafType, leafDatum);
19271950
else
19281951
leafSize = SGDTSIZE + sizeof(ItemIdData);
19291952

@@ -2138,7 +2161,7 @@ spgdoinsert(Relation index, SpGistState *state,
21382161
{
21392162
leafDatum = out.result.matchNode.restDatum;
21402163
leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
2141-
SpGistGetTypeSize(&state->attType, leafDatum);
2164+
SpGistGetTypeSize(&state->attLeafType, leafDatum);
21422165
}
21432166

21442167
/*
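
The branch structure that this hunk adds to `spgdoinsert()` can be modeled outside the backend as follows. This is a sketch under stated assumptions, not PostgreSQL code: `Datum` is a local stand-in typedef, the `DatumFn` callback type and `prepare_leaf_datum` are hypothetical, `is_varlena` models the `attType.attlen == -1` test, and `detoast` models `PG_DETOAST_DATUM`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Datum;                /* stand-in for PostgreSQL's Datum */
typedef Datum (*DatumFn) (Datum in);    /* hypothetical callback type */

/*
 * Choose the value that will be stored in the leaf tuple:
 *   NULL input         -> (Datum) 0
 *   compress provided  -> compress(datum), assumed to be untoasted
 *   no compress        -> detoast variable-length values, else pass through
 */
static Datum
prepare_leaf_datum(Datum datum, bool isnull,
                   DatumFn compress,    /* NULL when the opclass has none */
                   bool is_varlena,     /* models attType.attlen == -1 */
                   DatumFn detoast)     /* models PG_DETOAST_DATUM */
{
    if (isnull)
        return (Datum) 0;
    if (compress != NULL)
        return compress(datum);
    if (is_varlena)
    {
        assert(detoast != NULL);
        return detoast(datum);
    }
    return datum;
}

/* Toy callbacks, used only to make each branch observable. */
static Datum demo_compress(Datum in) { return in / 2; }
static Datum demo_detoast(Datum in)  { return in + 1; }
```

Note that the compress branch never calls the detoast callback, which matches the comment in the patch: the `compress` method is trusted to return an untoasted value.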

‎src/backend/access/spgist/spgscan.c

Lines changed: 3 additions & 3 deletions
@@ -40,7 +40,7 @@ typedef struct ScanStackEntry
4040
static void
4141
freeScanStackEntry(SpGistScanOpaque so, ScanStackEntry *stackEntry)
4242
{
43-
if (!so->state.attType.attbyval &&
43+
if (!so->state.attLeafType.attbyval &&
4444
DatumGetPointer(stackEntry->reconstructedValue) != NULL)
4545
pfree(DatumGetPointer(stackEntry->reconstructedValue));
4646
if (stackEntry->traversalValue)
@@ -527,8 +527,8 @@ spgWalk(Relation index, SpGistScanOpaque so, bool scanWholeIndex,
527527
if (out.reconstructedValues)
528528
newEntry->reconstructedValue =
529529
datumCopy(out.reconstructedValues[i],
530-
so->state.attType.attbyval,
531-
so->state.attType.attlen);
530+
so->state.attLeafType.attbyval,
531+
so->state.attLeafType.attlen);
532532
else
533533
newEntry->reconstructedValue = (Datum) 0;
534534

‎src/backend/access/spgist/spgutils.c

Lines changed: 19 additions & 2 deletions
@@ -125,6 +125,22 @@ spgGetCache(Relation index)
125125

126126
/* Get the information we need about each relevant datatype */
127127
fillTypeDesc(&cache->attType, atttype);
128+
129+
if (OidIsValid(cache->config.leafType) &&
130+
cache->config.leafType != atttype)
131+
{
132+
if (!OidIsValid(index_getprocid(index, 1, SPGIST_COMPRESS_PROC)))
133+
ereport(ERROR,
134+
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
135+
errmsg("compress method must be defined when leaf type is different from input type")));
136+
137+
fillTypeDesc(&cache->attLeafType, cache->config.leafType);
138+
}
139+
else
140+
{
141+
cache->attLeafType = cache->attType;
142+
}
143+
128144
fillTypeDesc(&cache->attPrefixType, cache->config.prefixType);
129145
fillTypeDesc(&cache->attLabelType, cache->config.labelType);
130146

@@ -164,6 +180,7 @@ initSpGistState(SpGistState *state, Relation index)
164180

165181
state->config = cache->config;
166182
state->attType = cache->attType;
183+
state->attLeafType = cache->attLeafType;
167184
state->attPrefixType = cache->attPrefixType;
168185
state->attLabelType = cache->attLabelType;
169186

@@ -618,7 +635,7 @@ spgFormLeafTuple(SpGistState *state, ItemPointer heapPtr,
618635
/* compute space needed (note result is already maxaligned) */
619636
size = SGLTHDRSZ;
620637
if (!isnull)
621-
size += SpGistGetTypeSize(&state->attType, datum);
638+
size += SpGistGetTypeSize(&state->attLeafType, datum);
622639

623640
/*
624641
* Ensure that we can replace the tuple with a dead tuple later. This
@@ -634,7 +651,7 @@ spgFormLeafTuple(SpGistState *state, ItemPointer heapPtr,
634651
tup->nextOffset = InvalidOffsetNumber;
635652
tup->heapPtr = *heapPtr;
636653
if (!isnull)
637-
memcpyDatum(SGLTDATAPTR(tup), &state->attType, datum);
654+
memcpyDatum(SGLTDATAPTR(tup), &state->attLeafType, datum);
638655

639656
return tup;
640657
}
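
The leaf-type rule that the `spgGetCache()` hunk implements can be restated as a small standalone function. This sketch assumes local stand-ins for `Oid`, `InvalidOid`, and `OidIsValid` (defined here the same way PostgreSQL defines them); `resolve_leaf_type` and its boolean return are hypothetical, replacing the `ereport(ERROR, ...)` call with a `false` result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;               /* stand-in for PostgreSQL's Oid */
#define InvalidOid      ((Oid) 0)
#define OidIsValid(oid) ((oid) != InvalidOid)

/*
 * Work out the effective leaf type.
 *
 * A zeroed (uninitialized) config leaf type, or one equal to the input
 * type, means "leaf type is the input type".  A distinct leaf type is
 * accepted only when a compress method exists; otherwise false is
 * returned, which corresponds to the error raised by the patch.
 */
static bool
resolve_leaf_type(Oid att_type, Oid config_leaf_type,
                  bool have_compress, Oid *leaf_type)
{
    assert(leaf_type != NULL);

    if (OidIsValid(config_leaf_type) && config_leaf_type != att_type)
    {
        if (!have_compress)
            return false;
        *leaf_type = config_leaf_type;
    }
    else
        *leaf_type = att_type;

    return true;
}
```

Because the output struct is zeroed before `config` is called, an operator class written before this patch leaves `leafType` as `InvalidOid` and falls into the second branch, which is how backward compatibility is preserved.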
