Commit 2971180

Doc: improve documentation about ts_headline() function.
Now that I've had my nose in that code, I thought the docs about it left something to be desired.
1 parent 5f7247b, commit 2971180

1 file changed: doc/src/sgml/textsearch.sgml (58 additions, 47 deletions)
@@ -1221,63 +1221,75 @@ ts_headline(<optional> <replaceable class="PARAMETER">config</replaceable> <type
 <itemizedlist spacing="compact" mark="bullet">
 <listitem>
 <para>
-<literal>StartSel</>, <literal>StopSel</literal>: the strings with
-which to delimit query words appearing in the document, to distinguish
-them from other excerpted words. You must double-quote these strings
-if they contain spaces or commas.
+<literal>MaxWords</literal>, <literal>MinWords</literal> (integers):
+these numbers determine the longest and shortest headlines to output.
+The default values are 35 and 15.
 </para>
 </listitem>
 <listitem>
 <para>
-<literal>MaxWords</>, <literal>MinWords</literal>: these numbers
-determine the longest and shortest headlines to output.
+<literal>ShortWord</literal> (integer): words of this length or less
+will be dropped at the start and end of a headline, unless they are
+query terms. The default value of three eliminates common English
+articles.
 </para>
 </listitem>
 <listitem>
 <para>
-<literal>ShortWord</literal>: words of this length or less will be
-dropped at the start and end of a headline. The default
-value of three eliminates common English articles.
+<literal>HighlightAll</literal> (boolean): if
+<literal>true</literal> the whole document will be used as the
+headline, ignoring the preceding three parameters. The default
+is <literal>false</literal>.
 </para>
 </listitem>
 <listitem>
 <para>
-<literal>HighlightAll</literal>: Boolean flag; if
-<literal>true</literal> the whole document will be used as the
-headline, ignoring the preceding three parameters.
+<literal>MaxFragments</literal> (integer): maximum number of text
+fragments to display. The default value of zero selects a
+non-fragment-based headline generation method. A value greater
+than zero selects fragment-based headline generation (see below).
 </para>
 </listitem>
 <listitem>
 <para>
-<literal>MaxFragments</literal>: maximum number of text excerpts
-or fragments to display. The default value of zero selects a
-non-fragment-oriented headline generation method. A value greater than
-zero selects fragment-based headline generation. This method
-finds text fragments with as many query words as possible and
-stretches those fragments around the query words. As a result
-query words are close to the middle of each fragment and have words on
-each side. Each fragment will be of at most <literal>MaxWords</> and
-words of length <literal>ShortWord</> or less are dropped at the start
-and end of each fragment. If not all query words are found in the
-document, then a single fragment of the first <literal>MinWords</>
-in the document will be displayed.
+<literal>StartSel</literal>, <literal>StopSel</literal> (strings):
+the strings with which to delimit query words appearing in the
+document, to distinguish them from other excerpted words. The
+default values are <quote><literal>&lt;b&gt;</literal></quote> and
+<quote><literal>&lt;/b&gt;</literal></quote>, which can be suitable
+for HTML output.
 </para>
 </listitem>
 <listitem>
 <para>
-<literal>FragmentDelimiter</literal>: When more than one fragment is
-displayed, the fragments will be separated by this string.
+<literal>FragmentDelimiter</literal> (string): When more than one
+fragment is displayed, the fragments will be separated by this string.
+The default is <quote><literal> ... </literal></quote>.
 </para>
 </listitem>
 </itemizedlist>

-Any unspecified options receive these defaults:
+These option names are recognized case-insensitively.
+You must double-quote string values if they contain spaces or commas.
+</para>

-<programlisting>
-StartSel=&lt;b&gt;, StopSel=&lt;/b&gt;,
-MaxWords=35, MinWords=15, ShortWord=3, HighlightAll=FALSE,
-MaxFragments=0, FragmentDelimiter=" ... "
-</programlisting>
+<para>
+In non-fragment-based headline
+generation, <function>ts_headline</function> locates matches for the
+given <replaceable class="parameter">query</replaceable> and chooses a
+single one to display, preferring matches that have more query words
+within the allowed headline length.
+In fragment-based headline generation, <function>ts_headline</function>
+locates the query matches and splits each match
+into <quote>fragments</quote> of no more than <literal>MaxWords</literal>
+words each, preferring fragments with more query words, and when
+possible <quote>stretching</quote> fragments to include surrounding
+words. The fragment-based mode is thus more useful when the query
+matches span large sections of the document, or when it's desirable to
+display multiple matches.
+In either mode, if no query matches can be identified, then a single
+fragment of the first <literal>MinWords</literal> words in the document
+will be displayed.
 </para>

 <para>
@@ -1289,25 +1301,24 @@ SELECT ts_headline('english',
 is to find all documents containing given query terms
 and return them in order of their similarity to the
 query.',
-to_tsquery('query &amp; similarity'));
-ts_headline
+to_tsquery('english', 'query &amp; similarity'));
+ts_headline
 ------------------------------------------------------------
-containing given &lt;b&gt;query&lt;/b&gt; terms
-and return them in order of their &lt;b&gt;similarity&lt;/b&gt; to the
+containing given &lt;b&gt;query&lt;/b&gt; terms +
+and return them in order of their &lt;b&gt;similarity&lt;/b&gt; to the+
 &lt;b&gt;query&lt;/b&gt;.

 SELECT ts_headline('english',
-'The most common type of search
-is to find all documents containing given query terms
-and return them in order of their similarity to the
-query.',
-to_tsquery('query &amp; similarity'),
-'StartSel = &lt;, StopSel = &gt;');
-ts_headline
--------------------------------------------------------
-containing given &lt;query&gt; terms
-and return them in order of their &lt;similarity&gt; to the
-&lt;query&gt;.
+'Search terms may occur
+many times in a document,
+requiring ranking of the search matches to decide which
+occurrences to display in the result.',
+to_tsquery('english', 'search &amp; term'),
+'MaxFragments=10, MaxWords=7, MinWords=3, StartSel=&lt;&lt;, StopSel=&gt;&gt;');
+ts_headline
+------------------------------------------------------------
+&lt;&lt;Search&gt;&gt; &lt;&lt;terms&gt;&gt; may occur +
+many times ... ranking of the &lt;&lt;search&gt;&gt; matches to decide
 </screen>
 </para>

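The option-string rules this change documents (option names recognized case-insensitively, string values double-quoted when they contain spaces or commas) can be tried with queries along the following lines. This is a sketch and not part of the commit: it assumes a PostgreSQL server where the `english` text search configuration is available, and the delimiter strings and option values are arbitrary choices made for illustration. Output is not shown because it depends on the server's headline selection.

```sql
-- Illustrative sketch only; the option values below are not taken from the commit.
-- Option names are deliberately written in mixed case, and the string values
-- that contain spaces are double-quoted.
SELECT ts_headline('english',
  'Search terms may occur many times in a document, requiring ranking of the search matches to decide which occurrences to display in the result.',
  to_tsquery('english', 'search & term'),
  'maxfragments=2, MAXWORDS=7, minwords=3, StartSel="[ ", StopSel=" ]", FragmentDelimiter=" | "');

-- The same document and query with MaxFragments left at its default of zero,
-- which selects the non-fragment-based method described in the revised text.
SELECT ts_headline('english',
  'Search terms may occur many times in a document, requiring ranking of the search matches to decide which occurrences to display in the result.',
  to_tsquery('english', 'search & term'),
  'MaxWords=7, MinWords=3');
```

Comparing the two results side by side is one way to see the difference between the fragment-based and non-fragment-based modes on the same input.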
