Commit bf028fa

Add description of new features
1 parent 7e63445 commit bf028fa

3 files changed, +503 -90 lines changed

contrib/tsearch2/docs/tsearch-V2-intro.html

Lines changed: 3 additions & 3 deletions
@@ -427,9 +427,9 @@ <h3>INDEXING FIELDS IN A TABLE</h3>
 <p>We need to create the index on the column idxFTI. Keep in mind
 that the database will update the index when some action is taken.
 In this case we _need_ the index (The whole point of Full Text
-INDEXINGi ;-)), so don't worry about any indexing overhead. We will
-create an index based on the gist function. GiST is an index
-structure for Generalized Search Tree.</p>
+INDEXING ;-)), so don't worry about any indexing overhead. We will
+create an index based on the gist or gin function. GiST is an index
+structure for Generalized Search Tree, GIN is a inverted index (see <a href="tsearch2-ref.html#indexes">The tsearch2 Reference: Indexes</a>).</p>
 <pre>
 CREATE INDEX idxFTI_idx ON tblMessages USING gist(idxFTI);
 VACUUM FULL ANALYZE;

contrib/tsearch2/docs/tsearch2-guide.html

Lines changed: 40 additions & 12 deletions
@@ -1,24 +1,20 @@
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
 <html>
 <head>
-<link type="text/css" rel="stylesheet" href="/~megera/postgres/gist/tsearch/tsearch.css">
 <title>tsearch2 guide</title>
 </head>
 <body>
 <h1 align=center>The tsearch2 Guide</h1>
 
 <p align=center>
 Brandon Craig Rhodes<br>30 June 2003
+<br>Updated to 8.2 release by Oleg Bartunov, October 2006</br>
 <p>
 This Guide introduces the reader to the PostgreSQL tsearch2 module,
 version&nbsp;2.
 More formal descriptions of the module's types and functions
 are provided in the <a href="tsearch2-ref.html">tsearch2 Reference</a>,
 which is a companion to this document.
-You can retrieve a beta copy of the tsearch2 module from the
-<a href="http://www.sai.msu.su/~megera/postgres/gist/">GiST for PostgreSQL</a>
-page &mdash; look under the section entitled <i>Development History</i>
-for the current version.
 <p>
 First we will examine the <tt>tsvector</tt> and <tt>tsquery</tt> types
 and how they are used to search documents;
@@ -32,15 +28,40 @@ <h1 align=center>The tsearch2 Guide</h1>
 <hr>
 <h2>Table of Contents</h2>
 <blockquote>
+<a href="#intro">Introduction to FTS with tsearch2</a><br>
 <a href="#vectors_queries">Vectors and Queries</a><br>
 <a href="#simple_search">A Simple Search Engine</a><br>
 <a href="#weights">Ranking and Position Weights</a><br>
 <a href="#casting">Casting Vectors and Queries</a><br>
 <a href="#parsing_lexing">Parsing and Lexing</a><br>
+<a href="#ref">Additional information</a>
 </blockquote>
 
 <hr>
 
+
+<h2><a name="intro">Introduction to FTS with tsearch2</a></h2>
+The purpose of FTS is to
+find <b>documents</b>, which satisfy <b>query</b> and optionally return
+them in some <b>order</b>.
+Most common case: Find documents containing all query terms and return them in order
+of their similarity to the query. Document in database can be
+any text attribute, or combination of text attributes from one or many tables
+(using joins).
+Text search operators existed for years, in PostgreSQL they are
+<tt><b>~,~*, LIKE, ILIKE</b></tt>, but they lack linguistic support,
+tends to be slow and have no relevance ranking. The idea behind tsearch2 is
+is rather simple - preprocess document at index time to save time at search stage.
+Preprocessing includes
+<ul>
+<li>document parsing onto words
+<li>linguistic - normalize words to obtain lexemes
+<li>store document in optimized for searching way
+</ul>
+Tsearch2, in a nutshell, provides FTS operator (contains) for two new data types,
+which represent document and query - <tt>tsquery @@ tsvector</tt>.
+
+<P>
 <h2><a name=vectors_queries>Vectors and Queries</a></h2>
 
 <blockquote>
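
Editor's note on the hunk above: the three preprocessing steps it describes (parse into words, normalize to lexemes, store in a search-friendly form) can be sketched in a few lines of Python. This is only an illustration of the idea, not tsearch2's implementation. The stopword list, the crude plural-stripping rule, and the function names `normalize`, `to_vector`, `to_query` and `matches` are all assumptions made for this sketch.

```python
import re

# Hypothetical stopword list; real tsearch2 dictionaries are configurable.
STOPWORDS = {"the", "a", "an", "of", "and", "is", "in", "to"}


def normalize(word):
    """Crude stand-in for a dictionary: lowercase and strip a plural 's'."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        word = word[:-1]
    return word


def to_vector(text):
    """Parse text into words, normalize to lexemes, record 1-based positions."""
    vector = {}
    for pos, word in enumerate(re.findall(r"[A-Za-z0-9]+", text), start=1):
        lexeme = normalize(word)
        if lexeme in STOPWORDS:
            continue  # stopwords are dropped but still consume a position
        vector.setdefault(lexeme, []).append(pos)
    return vector


def to_query(text):
    """Normalize query terms the same way; the result is an AND of lexemes."""
    terms = {normalize(w) for w in re.findall(r"[A-Za-z0-9]+", text)}
    return terms - STOPWORDS


def matches(vector, query):
    """Analogue of the contains operator: every query lexeme must be present."""
    return bool(query) and all(lexeme in vector for lexeme in query)


doc = to_vector("The cats sat in the gardens")
print(doc)
print(matches(doc, to_query("cat garden")))
```

Because the document and the query go through the same normalization, a query for "cat" finds a document that only contains "cats", which is the linguistic support that `LIKE` and the regular-expression operators lack.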
@@ -79,6 +100,8 @@ <h2><a name=vectors_queries>Vectors and Queries</a></h2>
 on the <tt>tsvector</tt> column of a table,
 which implements a form of the Berkeley
 <a href="http://gist.cs.berkeley.edu/"><i>Generalized Search Tree</i></a>.
+Since PostgreSQL 8.2 tsearch2 supports <a href="http://www.sigaev.ru/gin/">Gin</a> index,
+which is an inverted index, commonly used in search engines. It adds scalability to tsearch2.
 </ul>
 Once your documents are indexed,
 performing a search involves:
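
Editor's note on the hunk above: an inverted index maps each lexeme to the documents that contain it, so an all-terms query becomes an intersection of posting sets instead of a scan over every document. The Python sketch below shows that general idea under simplifying assumptions. It is not how GIN is implemented inside PostgreSQL, and the names `build_inverted_index` and `search_all` as well as the sample documents are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lexeme to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, lexemes in docs.items():
        for lexeme in lexemes:
            index[lexeme].add(doc_id)
    return index


def search_all(index, terms):
    """Return ids of documents containing every term (AND semantics)."""
    postings = [index.get(term, set()) for term in terms]
    if not postings:
        return set()
    return set.intersection(*postings)


# Hypothetical documents, already reduced to lexemes.
docs = {
    1: ["fat", "cat"],
    2: ["fat", "rat"],
    3: ["cat", "mat"],
}
index = build_inverted_index(docs)
print(search_all(index, ["fat", "cat"]))
```

The work done per query depends on the posting sets of the query terms rather than on the total number of documents, which is one common reason inverted indexes scale well for search.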
@@ -251,7 +274,7 @@ <h2><a name=vectors_queries>Vectors and Queries</a></h2>
 
 <pre>
 =# <b>SELECT to_tsquery('the')</b>
-NOTICE: Query contains only stopword(s) or doesn't contain lexeme(s), ignored
+NOTICE: Query contains only stopword(s) or doesn't contain lexem(s), ignored
 to_tsquery
 ------------
 
@@ -483,8 +506,8 @@ <h2><a name=weights>Ranking and Position Weights</a></h2>
 and has the feature that you can assign different weights
 to words from different sections of your document.
 The <tt>rank_cd()</tt> uses a recent technique for weighting results
-but does not allow different weight to be given
-to different sections of your document.
+and also allows different weight to be given
+to different sections of your document (since 8.2).
 <p>
 Both ranking functions allow you to specify,
 as an optional last argument,
@@ -511,9 +534,6 @@ <h2><a name=weights>Ranking and Position Weights</a></h2>
 see the <a href="tsearch2-ref.html#ranking">section on ranking</a>
 in the Reference.
 <p>
-The <tt>rank()</tt> function offers more flexibility
-because it pays attention to the <i>weights</i>
-with which you have labelled lexeme positions.
 Currently tsearch2 supports four different weight labels:
 <tt>'D'</tt>, the default weight;
 and <tt>'A'</tt>, <tt>'B'</tt>, and <tt>'C'</tt>.
@@ -730,7 +750,7 @@ <h2><a name=casting>Casting Vectors and Queries</a></h2>
 are important <i>both</i> to PostgreSQL when it is interpreting a string,
 <i>and</i> to the <tt>tsvector</tt> conversion function.
 You may want to review section
-<a href="http://www.postgresql.org/docs/view.php?version=7.3&idoc=0&file=sql-syntax.html#SQL-SYNTAX-STRINGS">1.1.2.1,
+<a href="http://www.postgresql.org/docs/current/static/sql-syntax.html#SQL-SYNTAX-STRINGS">
 &ldquo;String Constants&rdquo;</a>
 in the PostgreSQL documentation before proceeding.
 <p>
@@ -1051,6 +1071,14 @@ <h2><a name=parsing_lexing>Parsing and Lexing</a></h2>
 with the difference that the query parser recognizes as special
 the boolean operators that separate query words.
 
+
+<h2><a name="ref">Additional information</a></h2>
+More information about tsearch2 is available from
+<a href="http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2">tsearch2</a> page.
+Also, it's worth to check
+<a href="http://www.sai.msu.su/~megera/wiki/Tsearch2">tsearch2 wiki</a> pages.
+
+
 </body>
 </html>
 