
Commit 7627f64

Doc: improve documentation about ts_headline() function.

Now that I've had my nose in that code, I thought the docs about it left something to be desired.

1 parent 91be1d1, commit 7627f64

1 file changed, 57 additions and 47 deletions

doc/src/sgml/textsearch.sgml
@@ -1301,64 +1301,75 @@ ts_headline(<optional> <replaceable class="parameter">config</replaceable> <type
 <itemizedlist spacing="compact" mark="bullet">
 <listitem>
 <para>
-<literal>StartSel</literal>, <literal>StopSel</literal>: the strings with
-which to delimit query words appearing in the document, to distinguish
-them from other excerpted words. You must double-quote these strings
-if they contain spaces or commas.
+<literal>MaxWords</literal>, <literal>MinWords</literal> (integers):
+these numbers determine the longest and shortest headlines to output.
+The default values are 35 and 15.
 </para>
 </listitem>
 <listitem>
 <para>
-<literal>MaxWords</literal>, <literal>MinWords</literal>: these numbers
-determine the longest and shortest headlines to output.
+<literal>ShortWord</literal> (integer): words of this length or less
+will be dropped at the start and end of a headline, unless they are
+query terms. The default value of three eliminates common English
+articles.
 </para>
 </listitem>
 <listitem>
 <para>
-<literal>ShortWord</literal>: words of this length or less will be
-dropped at the start and end of a headline. The default
-value of three eliminates common English articles.
+<literal>HighlightAll</literal> (boolean): if
+<literal>true</literal> the whole document will be used as the
+headline, ignoring the preceding three parameters. The default
+is <literal>false</literal>.
 </para>
 </listitem>
 <listitem>
 <para>
-<literal>HighlightAll</literal>: Boolean flag; if
-<literal>true</literal> the whole document will be used as the
-headline, ignoring the preceding three parameters.
+<literal>MaxFragments</literal> (integer): maximum number of text
+fragments to display. The default value of zero selects a
+non-fragment-based headline generation method. A value greater
+than zero selects fragment-based headline generation (see below).
 </para>
 </listitem>
 <listitem>
 <para>
-<literal>MaxFragments</literal>: maximum number of text excerpts
-or fragments to display. The default value of zero selects a
-non-fragment-oriented headline generation method. A value greater than
-zero selects fragment-based headline generation. This method
-finds text fragments with as many query words as possible and
-stretches those fragments around the query words. As a result
-query words are close to the middle of each fragment and have words on
-each side. Each fragment will be of at most <literal>MaxWords</literal> and
-words of length <literal>ShortWord</literal> or less are dropped at the start
-and end of each fragment. If not all query words are found in the
-document, then a single fragment of the first <literal>MinWords</literal>
-in the document will be displayed.
+<literal>StartSel</literal>, <literal>StopSel</literal> (strings):
+the strings with which to delimit query words appearing in the
+document, to distinguish them from other excerpted words. The
+default values are <quote><literal>&lt;b&gt;</literal></quote> and
+<quote><literal>&lt;/b&gt;</literal></quote>, which can be suitable
+for HTML output.
 </para>
 </listitem>
 <listitem>
 <para>
-<literal>FragmentDelimiter</literal>: When more than one fragment is
-displayed, the fragments will be separated by this string.
+<literal>FragmentDelimiter</literal> (string): When more than one
+fragment is displayed, the fragments will be separated by this string.
+The default is <quote><literal> ... </literal></quote>.
 </para>
 </listitem>
 </itemizedlist>
 
 These option names are recognized case-insensitively.
-Any unspecified options receive these defaults:
+You must double-quote string values if they contain spaces or commas.
+</para>
 
-<programlisting>
-StartSel=&lt;b&gt;, StopSel=&lt;/b&gt;,
-MaxWords=35, MinWords=15, ShortWord=3, HighlightAll=FALSE,
-MaxFragments=0, FragmentDelimiter=" ... "
-</programlisting>
+<para>
+In non-fragment-based headline
+generation, <function>ts_headline</function> locates matches for the
+given <replaceable class="parameter">query</replaceable> and chooses a
+single one to display, preferring matches that have more query words
+within the allowed headline length.
+In fragment-based headline generation, <function>ts_headline</function>
+locates the query matches and splits each match
+into <quote>fragments</quote> of no more than <literal>MaxWords</literal>
+words each, preferring fragments with more query words, and when
+possible <quote>stretching</quote> fragments to include surrounding
+words. The fragment-based mode is thus more useful when the query
+matches span large sections of the document, or when it's desirable to
+display multiple matches.
+In either mode, if no query matches can be identified, then a single
+fragment of the first <literal>MinWords</literal> words in the document
+will be displayed.
 </para>
 
 <para>
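
Note on the hunk above: the option-string rules it documents (names matched case-insensitively, string values double-quoted when they contain spaces or commas, and the listed defaults) can be illustrated with a small Python sketch. This is not PostgreSQL's parser. The names `parse_headline_options` and `_split_top_level`, the lowercase result keys, and the restriction of booleans to `true`/`false` are assumptions made for illustration; only the option names and default values are taken from the documentation text in the diff.

```python
# Defaults as stated in the revised documentation text.
DEFAULTS = {
    "maxwords": 35,
    "minwords": 15,
    "shortword": 3,
    "highlightall": False,
    "maxfragments": 0,
    "startsel": "<b>",
    "stopsel": "</b>",
    "fragmentdelimiter": " ... ",
}


def _split_top_level(text):
    """Split on commas that are not inside double quotes (hypothetical helper)."""
    parts, current, in_quotes = [], [], False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == "," and not in_quotes:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_headline_options(options):
    """Return a dict of headline options, filling unspecified ones with defaults."""
    result = dict(DEFAULTS)
    for item in _split_top_level(options):
        if not item.strip():
            continue
        name, sep, raw = item.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in option: {item!r}")
        key = name.strip().lower()  # option names are case-insensitive
        if key not in DEFAULTS:
            raise ValueError(f"unknown option: {name.strip()!r}")
        raw = raw.strip()
        # A double-quoted value keeps its inner spaces and commas.
        if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
            value = raw[1:-1]
        else:
            value = raw
        default = DEFAULTS[key]
        if isinstance(default, bool):  # check bool before int: bool is an int subclass
            if value.lower() not in ("true", "false"):  # assumption: only these spellings
                raise ValueError(f"invalid boolean for {key}: {value!r}")
            result[key] = value.lower() == "true"
        elif isinstance(default, int):
            result[key] = int(value)
        else:
            result[key] = value
    return result


if __name__ == "__main__":
    print(parse_headline_options(
        "MaxFragments=10, MaxWords=7, MinWords=3, StartSel=<<, StopSel=>>"
    ))
```

The option string in the usage line is the one used in the second example of this commit; every option it omits falls back to its documented default.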
@@ -1370,25 +1381,24 @@ SELECT ts_headline('english',
 is to find all documents containing given query terms
 and return them in order of their similarity to the
 query.',
-to_tsquery('query &amp; similarity'));
-ts_headline
+to_tsquery('english', 'query &amp; similarity'));
+ts_headline
 ------------------------------------------------------------
-containing given &lt;b&gt;query&lt;/b&gt; terms
-and return them in order of their &lt;b&gt;similarity&lt;/b&gt; to the
+containing given &lt;b&gt;query&lt;/b&gt; terms +
+and return them in order of their &lt;b&gt;similarity&lt;/b&gt; to the+
 &lt;b&gt;query&lt;/b&gt;.
 
 SELECT ts_headline('english',
-'The most common type of search
-is to find all documents containing given query terms
-and return them in order of their similarity to the
-query.',
-to_tsquery('query &amp; similarity'),
-'StartSel = &lt;, StopSel = &gt;');
-ts_headline
--------------------------------------------------------
-containing given &lt;query&gt; terms
-and return them in order of their &lt;similarity&gt; to the
-&lt;query&gt;.
+'Search terms may occur
+many times in a document,
+requiring ranking of the search matches to decide which
+occurrences to display in the result.',
+to_tsquery('english', 'search &amp; term'),
+'MaxFragments=10, MaxWords=7, MinWords=3, StartSel=&lt;&lt;, StopSel=&gt;&gt;');
+ts_headline
+------------------------------------------------------------
+&lt;&lt;Search&gt;&gt; &lt;&lt;terms&gt;&gt; may occur +
+many times ... ranking of the &lt;&lt;search&gt;&gt; matches to decide
 </screen>
 </para>
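
Note on the second example: the way `StartSel`, `StopSel` and `FragmentDelimiter` shape the final string can be sketched in Python as below. This only illustrates the rendering step once fragments have already been chosen; it does not reproduce how `ts_headline` selects or stretches fragments. The function name `render_headline` and the crude `normalize` rule (lowercase, strip punctuation, drop a trailing "s") are assumptions standing in for the `english` configuration's real stemming.

```python
def render_headline(fragments, query_terms,
                    start_sel="<b>", stop_sel="</b>", delimiter=" ... "):
    """Wrap query words in each fragment and join the fragments.

    fragments   -- list of already-selected text fragments (strings)
    query_terms -- iterable of query words
    """
    def normalize(word):
        # Crude stand-in for stemming (an assumption, not PostgreSQL's rules).
        w = word.lower().strip(".,;:!?")
        return w[:-1] if w.endswith("s") and len(w) > 3 else w

    terms = {normalize(t) for t in query_terms}
    rendered = []
    for fragment in fragments:
        words = [
            # The whole token is wrapped, including any attached punctuation.
            f"{start_sel}{w}{stop_sel}" if normalize(w) in terms else w
            for w in fragment.split()
        ]
        rendered.append(" ".join(words))
    return delimiter.join(rendered)


if __name__ == "__main__":
    print(render_headline(
        ["Search terms may occur many times",
         "ranking of the search matches to decide"],
        ["search", "term"],
        start_sel="<<",
        stop_sel=">>",
    ))
```

With the two fragments taken from the example output in the diff, the sketch marks "Search", "terms" and "search" and joins the fragments with the default delimiter. The line break inside the first fragment, shown by the `+` marker in the example output, is not modeled.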
