@@ -263,9 +263,10 @@ SELECT 'fat & cow'::tsquery @@ 'a fat cat sat on a mat and ate a fat rat'::t
As the above example suggests, a <type>tsquery</type> is not just raw
text, any more than a <type>tsvector</type> is. A <type>tsquery</type>
contains search terms, which must be already-normalized lexemes, and
- may combine multiple terms using AND, OR, and NOT operators.
+ may combine multiple terms using AND, OR, NOT and FOLLOWED BY operators.
(For details see <xref linkend="datatype-textsearch">.) There are
- functions <function>to_tsquery</> and <function>plainto_tsquery</>
+ functions <function>to_tsquery</>, <function>plainto_tsquery</>
+ and <function>phraseto_tsquery</>
that are helpful in converting user-written text into a proper
<type>tsquery</type>, for example by normalizing words appearing in
the text. Similarly, <function>to_tsvector</> is used to parse and
@@ -293,6 +294,35 @@ SELECT 'fat cats ate fat rats'::tsvector @@ to_tsquery('fat & rat');
already normalized, so <literal>rats</> does not match <literal>rat</>.
</para>

+ <para>
+ Phrase search is made possible with the help of the <literal><-></>
+ (FOLLOWED BY) operator, which enforces lexeme order. This allows you
+ to discard strings not containing the desired phrase, for example:
+
+ <programlisting>
+ SELECT q @@ to_tsquery('fatal <-> error')
+ FROM unnest(array[to_tsvector('fatal error'),
+ to_tsvector('error is not fatal')]) AS q;
+ ?column?
+ ----------
+ t
+ f
+ </programlisting>
+
+ A more generic version of the FOLLOWED BY operator takes the form
+ <literal><N></>, where N stands for the greatest allowed distance
+ between the specified lexemes. The <literal>phraseto_tsquery</>
+ function makes use of this behavior in order to construct a
+ <literal>tsquery</> capable of matching the provided phrase:
+
+ <programlisting>
+ SELECT phraseto_tsquery('cat ate some rats');
+ phraseto_tsquery
+ -------------------------------
+ ( 'cat' <-> 'ate' ) <2> 'rat'
+ </programlisting>
+ </para>
+
<para>
The <literal>@@</literal> operator also
supports <type>text</type> input, allowing explicit conversion of a text
@@ -732,7 +762,7 @@ to_tsquery(<optional> <replaceable class="PARAMETER">config</replaceable> <type>
<replaceable>querytext</replaceable>, which must consist of single tokens
separated by the Boolean operators <literal>&</literal> (AND),
<literal>|</literal> (OR), <literal>!</literal> (NOT), and also the
- <literal>?</literal> (FOLLOWED BY) phrase search operator. These operators
+ <literal><-></literal> (FOLLOWED BY) phrase search operator. These operators
can be grouped using parentheses. In other words, the input to
<function>to_tsquery</function> must already follow the general rules for
<type>tsquery</> input, as described in <xref
@@ -842,7 +872,7 @@ phraseto_tsquery(<optional> <replaceable class="PARAMETER">config</replaceable>
<para>
<function>phraseto_tsquery</> behaves much like
<function>plainto_tsquery</>, with the exception
- that it utilizes the <literal>?</literal> (FOLLOWED BY) phrase search
+ that it utilizes the <literal><-></literal> (FOLLOWED BY) phrase search
operator instead of the <literal>&</literal> (AND) Boolean operator.
This is particularly useful when searching for exact lexeme sequences,
since the phrase search operator helps to maintain lexeme order.
@@ -853,9 +883,9 @@ phraseto_tsquery(<optional> <replaceable class="PARAMETER">config</replaceable>

<screen>
SELECT phraseto_tsquery('english', 'The Fat Rats');
- phraseto_tsquery
+ phraseto_tsquery
------------------
- 'fat' ? 'rat'
+ 'fat' <-> 'rat'
</screen>

Just like the <function>plainto_tsquery</>, the
@@ -865,9 +895,20 @@ SELECT phraseto_tsquery('english', 'The Fat Rats');

<screen>
SELECT phraseto_tsquery('english', 'The Fat & Rats:C');
- phraseto_tsquery
- -------------------------
- ( 'fat' ? 'rat' ) ? 'c'
+ phraseto_tsquery
+ -----------------------------
+ ( 'fat' <-> 'rat' ) <-> 'c'
+ </screen>
+
+ It is possible to specify the configuration to be used to parse the document.
+ For example, we could create a new one using the hunspell dictionary
+ (namely 'eng_hunspell') in order to match phrases with different word forms:
+
+ <screen>
+ SELECT phraseto_tsquery('eng_hunspell', 'developer of the building which collapsed');
+ phraseto_tsquery
+ --------------------------------------------------------------------------------------------
+ ( 'developer' <3> 'building' ) <2> 'collapse' | ( 'developer' <3> 'build' ) <2> 'collapse'
</screen>
</para>

@@ -1430,18 +1471,18 @@ FROM (SELECT id, body, q, ts_rank_cd(ti, q) AS rank
<varlistentry>

<term>
- <literal><type>tsquery</> ?? <type>tsquery</></literal>
+ <literal><type>tsquery</> <-> <type>tsquery</></literal>
</term>

<listitem>
<para>
Returns the phrase-concatenation of the two given queries.

<screen>
- SELECT to_tsquery('fat') ?? to_tsquery('cat | rat');
- ?column?
- -------------------------------
- 'fat' ? 'cat' | 'fat' ? 'rat'
+ SELECT to_tsquery('fat') <-> to_tsquery('cat | rat');
+ ?column?
+ -----------------------------------
+ 'fat' <-> 'cat' | 'fat' <-> 'rat'
</screen>
</para>
</listitem>
@@ -1461,13 +1502,13 @@ SELECT to_tsquery('fat') ?? to_tsquery('cat | rat');
<listitem>
<para>
Returns the distanced phrase-concatenation of the two given queries.
- This function lies in the implementation of the <literal>??</> operator.
+ This function lies in the implementation of the <literal><-></> operator.

<screen>
SELECT tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'), 10);
- tsquery_phrase
- -------------------
- 'fat' ?[10] 'cat'
+ tsquery_phrase
+ ------------------
+ 'fat' <10> 'cat'
</screen>
</para>
</listitem>
@@ -1487,10 +1528,10 @@ SELECT tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'), 10);
<listitem>
<para>
<function>setweight</> returns a copy of the input query in which every
- position has been labeled with the given <replaceable>weight</>, either
- <literal>A</literal>, <literal>B</literal>, <literal>C</literal>, or
- <literal>D</literal>. These labels are retained when queries are
- concatenated, allowing words from different parts of a document
+ position has been labeled with the given <replaceable>weight</>(s), either
+ <literal>A</literal>, <literal>B</literal>, <literal>C</literal>,
+ <literal>D</literal> or their combination. These labels are retained when
+ queries are concatenated, allowing words from different parts of a document
to be weighted differently by ranking functions.
</para>
