Commit 68af452

tglsfdc authored and dmpgpro committed
Adjust text search documentation for recent commits.
Fix some now-obsolete statements that were overlooked in commits 6734a1c, 3dbbd0f, 028350f. Document the behavior of <0>. Also do a little bit of rearranging and copy-editing for clarity.

Conflicts:
doc/src/sgml/datatype.sgml
doc/src/sgml/textsearch.sgml

1 parent 7cb279a commit 68af452

2 files changed, +73 -43 lines changed

doc/src/sgml/datatype.sgml

Lines changed: 51 additions & 32 deletions
@@ -3861,12 +3861,12 @@ SELECT 'a:1A fat:2B,4C cat:5D'::tsvector;

 <para>
 It is important to understand that the
-<type>tsvector</type> type itself does not perform any normalization;
-it assumes the words it is given are normalized appropriately
-for the application. For example,
+<type>tsvector</type> type itself does not perform any word
+normalization; it assumes the words it is given are normalized
+appropriately for the application. For example,

 <programlisting>
-select 'The Fat Rats'::tsvector;
+SELECT 'The Fat Rats'::tsvector;
 tsvector
 --------------------
 'Fat' 'Rats' 'The'
@@ -3899,11 +3899,26 @@ SELECT to_tsvector('english', 'The Fat Rats');

 <para>
 A <type>tsquery</type> value stores lexemes that are to be
-searched for, and combines them honoring the Boolean operators
-<literal>&amp;</literal> (AND), <literal>|</literal> (OR),
-<literal>!</> (NOT) and <literal>&lt;-&gt;</> (FOLLOWED BY) phrase search
-operator. Parentheses can be used to enforce grouping
-of the operators:
+searched for, and can combine them using the Boolean operators
+<literal>&amp;</literal> (AND), <literal>|</literal> (OR), and
+<literal>!</> (NOT), as well as the phrase search operator
+<literal>&lt;-&gt;</> (FOLLOWED BY). There is also a variant
+<literal>&lt;<replaceable>N</>&gt;</literal> of the FOLLOWED BY
+operator, where <replaceable>N</> is an integer constant that
+specifies the distance between the two lexemes being searched
+for. <literal>&lt;-&gt;</> is equivalent to <literal>&lt;1&gt;</>.
+</para>
+
+<para>
+Parentheses can be used to enforce grouping of these operators.
+In the absence of parentheses, <literal>!</> (NOT) binds most tightly,
+<literal>&lt;-&gt;</literal> (FOLLOWED BY) next most tightly, then
+<literal>&amp;</literal> (AND), with <literal>|</literal> (OR) binding
+the least tightly.
+</para>
+
+<para>
+Here are some examples:

 <programlisting>
 SELECT 'fat &amp; rat'::tsquery;
@@ -3920,17 +3935,21 @@ SELECT 'fat &amp; rat &amp; ! cat'::tsquery;
 tsquery
 ------------------------
 'fat' &amp; 'rat' &amp; !'cat'
+
+SELECT '(fat | rat) &lt;-&gt; cat'::tsquery;
+tsquery
+-----------------------------------
+'fat' &lt;-&gt; 'cat' | 'rat' &lt;-&gt; 'cat'
 </programlisting>

-In the absence of parentheses, <literal>!</> (NOT) binds most tightly,
-and <literal>&amp;</literal> (AND) and <literal>&lt;-&gt;</literal> (FOLLOWED BY)
-both bind more tightly than <literal>|</literal> (OR).
+The last example demonstrates that <type>tsquery</type> sometimes
+rearranges nested operators into a logically equivalent formulation.
 </para>

 <para>
 Optionally, lexemes in a <type>tsquery</type> can be labeled with
 one or more weight letters, which restricts them to match only
-<type>tsvector</> lexemes with matching weights:
+<type>tsvector</> lexemes with one of those weights:

 <programlisting>
 SELECT 'fat:ab &amp; cat'::tsquery;
@@ -3950,25 +3969,7 @@ SELECT 'super:*'::tsquery;
 'super':*
 </programlisting>
 This query will match any word in a <type>tsvector</> that begins
-with <quote>super</>. Note that prefixes are first processed by
-text search configurations, which means this comparison returns
-true:
-<programlisting>
-SELECT to_tsvector( 'postgraduate' ) @@ to_tsquery( 'postgres:*' );
-?column?
------------
-t
-(1 row)
-</programlisting>
-because <literal>postgres</> gets stemmed to <literal>postgr</>:
-<programlisting>
-SELECT to_tsquery('postgres:*');
-to_tsquery
-------------
-'postgr':*
-(1 row)
-</programlisting>
-which then matches <literal>postgraduate</>.
+with <quote>super</>.
 </para>

 <para>
@@ -3984,6 +3985,24 @@ SELECT to_tsquery('Fat:ab &amp; Cats');
 ------------------
 'fat':AB &amp; 'cat'
 </programlisting>
+
+Note that <function>to_tsquery</> will process prefixes in the same way
+as other words, which means this comparison returns true:
+
+<programlisting>
+SELECT to_tsvector( 'postgraduate' ) @@ to_tsquery( 'postgres:*' );
+?column?
+----------
+t
+</programlisting>
+because <literal>postgres</> gets stemmed to <literal>postgr</>:
+<programlisting>
+SELECT to_tsvector( 'postgraduate' ), to_tsquery( 'postgres:*' );
+to_tsvector | to_tsquery
+---------------+------------
+'postgradu':1 | 'postgr':*
+</programlisting>
+which will match the stemmed form of <literal>postgraduate</>.
 </para>

 </sect2>
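
The binding order documented in this file's diff (NOT tightest, then FOLLOWED BY, then AND, with OR loosest) can be illustrated with a small precedence-climbing sketch. This is a toy model, not PostgreSQL's parser: the names `tokenize`, `parse`, and `group` are hypothetical, weights and prefix markers such as `:ab` or `:*` are not handled, and left-associativity of the binary operators is an assumption made for the illustration.

```python
import re

# Binding strength per the documented precedence: higher binds tighter.
# "<->" stands for the whole FOLLOWED BY family, including <N>.
BINDING = {"|": 1, "&": 2, "<->": 3}

TOKEN_RE = re.compile(r"\s*(<->|<\d+>|[&|!()]|\w+)")


def tokenize(text):
    """Split a simplified tsquery string into operator and word tokens."""
    tokens, pos = [], 0
    text = text.strip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def binding_of(token):
    """Return the binding strength of a binary operator token, or 0."""
    if re.fullmatch(r"<\d+>", token):
        return BINDING["<->"]
    return BINDING.get(token, 0)


def parse_operand(tokens):
    """Parse a word, a parenthesized group, or NOT applied to an operand."""
    if not tokens:
        raise ValueError("unexpected end of query")
    token = tokens.pop(0)
    if token == "!":
        # NOT binds most tightly, so it applies only to the next operand.
        return f"(!{parse_operand(tokens)})"
    if token == "(":
        inner = parse(tokens)
        if not tokens or tokens.pop(0) != ")":
            raise ValueError("missing closing parenthesis")
        return inner
    if token == ")" or binding_of(token):
        raise ValueError(f"unexpected token {token!r}")
    return token


def parse(tokens, min_binding=1):
    """Precedence climbing: return a fully parenthesized query string."""
    left = parse_operand(tokens)
    while tokens and binding_of(tokens[0]) >= min_binding:
        op = tokens.pop(0)
        # Assumed left-associative: the right side takes only tighter operators.
        right = parse(tokens, binding_of(op) + 1)
        left = f"({left} {op} {right})"
    return left


def group(query):
    """Show how a query groups when no parentheses are written."""
    tokens = tokenize(query)
    result = parse(tokens)
    if tokens:
        raise ValueError(f"unexpected token {tokens[0]!r}")
    return result


print(group("fat & rat | cat <-> dog"))  # ((fat & rat) | (cat <-> dog))
print(group("! fat <-> cat"))            # ((!fat) <-> cat)
```

The sketch only shows grouping. It does not reproduce the rewriting that `tsquery` applies to nested operators, such as the `(fat | rat) <-> cat` example in the diff.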

doc/src/sgml/textsearch.sgml

Lines changed: 22 additions & 11 deletions
@@ -351,8 +351,7 @@ text @@ text
 match. Similarly, the <literal>|</literal> (OR) operator specifies that
 at least one of its arguments must appear, while the <literal>!</> (NOT)
 operator specifies that its argument must <emphasis>not</> appear in
-order to have a match. Parentheses can be used to control nesting of
-these operators.
+order to have a match.
 </para>

 <para>
@@ -375,10 +374,10 @@ SELECT to_tsvector('error is not fatal') @@ to_tsquery('fatal &lt;-&gt; error');

 There is a more general version of the FOLLOWED BY operator having the
 form <literal>&lt;<replaceable>N</>&gt;</literal>,
-where <replaceable>N</> is an integer standing for the exact distance
-allowed between the matching lexemes. <literal>&lt;1&gt;</literal> is
+where <replaceable>N</> is an integer standing for the difference between
+the positions of the matching lexemes. <literal>&lt;1&gt;</literal> is
 the same as <literal>&lt;-&gt;</>, while <literal>&lt;2&gt;</literal>
-allows one other lexeme to appear between the matches, and so
+allows exactly one other lexeme to appear between the matches, and so
 on. The <literal>phraseto_tsquery</> function makes use of this
 operator to construct a <literal>tsquery</> that can match a multi-word
 phrase when some of the words are stop words. For example:
@@ -395,9 +394,17 @@ SELECT phraseto_tsquery('the cats ate the rats');
 'cat' &lt;-&gt; 'ate' &lt;2&gt; 'rat'
 </programlisting>
 </para>
+
+<para>
+A special case that's sometimes useful is that <literal>&lt;0&gt;</literal>
+can be used to require that two patterns match the same word.
+</para>
+
 <para>
-The precedence of tsquery operators is as follows: <literal>|</literal>, <literal>&amp;</literal>,
-<literal>&lt;-&gt;</literal>, <literal>!</literal>.
+Parentheses can be used to control nesting of the <type>tsquery</>
+operators. Without parentheses, <literal>|</literal> binds least tightly,
+then <literal>&amp;</literal>, then <literal>&lt;-&gt;</literal>,
+and <literal>!</literal> most tightly.
 </para>
 </sect2>

@@ -1455,10 +1462,14 @@ FROM (SELECT id, body, q, ts_rank_cd(ti, q) AS rank

 <listitem>
 <para>
-Returns a vector which lists the same lexemes as the given vector, but
-which lacks any position or weight information. While the returned
-vector is much less useful than an unstripped vector for relevance
-ranking, it will usually be much smaller.
+Returns a vector that lists the same lexemes as the given vector, but
+lacks any position or weight information. The result is usually much
+smaller than an unstripped vector, but it is also less useful.
+Relevance ranking does not work as well on stripped vectors as
+unstripped ones. Also,
+the <literal>&lt;-&gt;</> (FOLLOWED BY) <type>tsquery</> operator
+will never match stripped input, since it cannot determine the
+distance between lexeme occurrences.
 </para>
 </listitem>
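
The position-difference wording for `<N>`, the `<0>` special case, and the note about stripped vectors can all be read off one small model. The sketch below is an illustration under stated assumptions, not PostgreSQL's implementation: `followed_by` is a hypothetical helper, it covers only two plain lexeme operands, and the position lists are made-up values of the kind a `tsvector` would store.

```python
def followed_by(left_positions, right_positions, distance=1):
    """Model of <N>: some right position minus some left position equals N."""
    right = set(right_positions)
    return any(pos + distance in right for pos in left_positions)


# Hypothetical positions, as in a tsvector like 'ate':3 'cat':2 'rat':5.
positions = {"cat": [2], "ate": [3], "rat": [5]}

print(followed_by(positions["cat"], positions["ate"]))     # <-> is <1>: True
print(followed_by(positions["ate"], positions["rat"], 2))  # <2>: True
print(followed_by(positions["ate"], positions["rat"], 1))  # <1>: False

# <0>: both patterns have to match at the same position, i.e. the same word.
print(followed_by([4], [4], 0))                            # True

# A stripped vector carries no positions, so no difference can be computed
# and the model never reports a match.
print(followed_by([], [], 1))                              # False
```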
