85
85
</entry>
86
86
</row>
87
87
<row>
88
- <entry><function>substring_similarity (text, text)</function><indexterm><primary>substring_similarity </primary></indexterm></entry>
88
+ <entry><function>subword_similarity (text, text)</function><indexterm><primary>subword_similarity </primary></indexterm></entry>
89
89
<entry><type>real</type></entry>
90
90
<entry>
91
91
Returns a number that indicates how similar the first string
92
- to the most similarsubstring of the second string. The range of
92
+ to the most similarword of the second string. The range of
93
93
the result is zero (indicating that the two strings are completely
94
94
dissimilar) to one (indicating that the first string is identical
95
- tosubstring of the secondsubstring ).
95
+ toone of theword of the secondstring ).
96
96
</entry>
97
97
</row>
98
98
<row>
145
145
<entry>
146
146
Returns <literal>true</> if its arguments have a similarity that is
147
147
greater than the current similarity threshold set by
148
- <varname>pg_trgm.limit </>.
148
+ <varname>pg_trgm.sml_limit </>.
149
149
</entry>
150
150
</row>
151
151
<row>
152
- <entry><type>text</> <literal><% </literal> <type>text</></entry>
152
+ <entry><type>text</> <literal>%> </literal> <type>text</></entry>
153
153
<entry><type>boolean</type></entry>
154
154
<entry>
155
- Returns <literal>true</> if its arguments have asubstring similarity
156
- that is greater than the currentsubstring similarity threshold set by
157
- <varname>pg_trgm.substring_limit </> parameter.
155
+ Returns <literal>true</> if its arguments have asubword similarity
156
+ that is greater than the currentsubword similarity threshold set by
157
+ <varname>pg_trgm.subword_limit </> parameter.
158
158
</entry>
159
159
</row>
160
160
<row>
165
165
one minus the <function>similarity()</> value.
166
166
</entry>
167
167
</row>
168
+ <row>
169
+ <entry><type>text</> <literal><->></literal> <type>text</></entry>
170
+ <entry><type>real</type></entry>
171
+ <entry>
172
+ Returns the <quote>distance</> between the arguments, that is
173
+ one minus the <function>subword_similarity()</> value.
174
+ </entry>
175
+ </row>
168
176
</tbody>
169
177
</tgroup>
170
178
</table>
174
182
<title>GUC Parameters</title>
175
183
176
184
<variablelist>
177
- <varlistentry id="guc-pgtrgm-limit" xreflabel="pg_trgm.limit ">
185
+ <varlistentry id="guc-pgtrgm-sml- limit" xreflabel="pg_trgm.sml_limit ">
178
186
<term>
179
- <varname>pg_trgm.limit </> (<type>real</type>)
187
+ <varname>pg_trgm.sml_limit </> (<type>real</type>)
180
188
<indexterm>
181
- <primary><varname>pg_trgm.limit </> configuration parameter</primary>
189
+ <primary><varname>pg_trgm.sml_limit </> configuration parameter</primary>
182
190
</indexterm>
183
191
</term>
184
192
<listitem>
189
197
</listitem>
190
198
</varlistentry>
191
199
192
- <varlistentry id="guc-pgtrgm-substring -limit" xreflabel="pg_trgm.substring_limit ">
200
+ <varlistentry id="guc-pgtrgm-subword -limit" xreflabel="pg_trgm.subword_limit ">
193
201
<term>
194
- <varname>pg_trgm.substring_limit </> (<type>real</type>)
202
+ <varname>pg_trgm.subword_limit </> (<type>real</type>)
195
203
<indexterm>
196
- <primary><varname>pg_trgm.substring_limit </> configuration parameter</primary>
204
+ <primary><varname>pg_trgm.subword_limit </> configuration parameter</primary>
197
205
</indexterm>
198
206
</term>
199
207
<listitem>
200
208
<para>
201
- Sets the currentsubstring similarity threshold that is used by
202
- the <literal><% </> operator. The threshold must be between
209
+ Sets the currentsubword similarity threshold that is used by
210
+ the <literal>%> </> operator. The threshold must be between
203
211
0 and 1 (default is 0.6).
204
212
</para>
205
213
</listitem>
@@ -262,20 +270,31 @@ SELECT t, t <-> '<replaceable>word</>' AS dist
262
270
</para>
263
271
264
272
<para>
265
- Also you can use an index on the <structfield>t</> column forsubstring
273
+ Also you can use an index on the <structfield>t</> column forsubword
266
274
similarity. For example:
267
275
<programlisting>
268
- SELECT t,substring_similarity ('<replaceable>word</>', t) AS sml
276
+ SELECT t,subword_similarity ('<replaceable>word</>', t) AS sml
269
277
FROM test_trgm
270
- WHERE '<replaceable>word</>' <% t
278
+ WHEREt %> '<replaceable>word</>'
271
279
ORDER BY sml DESC, t;
272
280
</programlisting>
273
- This will return all values in the text column that have asubstring
281
+ This will return all values in the text column that have aword
274
282
which sufficiently similar to <replaceable>word</>, sorted from best
275
283
match to worst. The index will be used to make this a fast operation
276
284
even over very large data sets.
277
285
</para>
278
286
287
+ <para>
288
+ A variant of the above query is
289
+ <programlisting>
290
+ SELECT t, t <->> '<replaceable>word</>' AS dist
291
+ FROM test_trgm
292
+ ORDER BY dist LIMIT 10;
293
+ </programlisting>
294
+ This can be implemented quite efficiently by GiST indexes, but not
295
+ by GIN indexes.
296
+ </para>
297
+
279
298
<para>
280
299
Beginning in <productname>PostgreSQL</> 9.1, these index types also support
281
300
index searches for <literal>LIKE</> and <literal>ILIKE</>, for example
@@ -396,4 +415,4 @@ CREATE INDEX words_idx ON words USING GIN (word gin_trgm_ops);
396
415
</para>
397
416
</sect2>
398
417
399
- </sect1>
418
+ </sect1>