12.2. Tables and Indexes

The examples in the previous section illustrated full text matching using simple constant strings. This section shows how to search table data, optionally using indexes.

12.2.1. Searching a Table

It is possible to do a full text search without an index. A simple query to print the title of each row that contains the word friend in its body field is:

SELECT title
FROM pgweb
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'friend');

This will also find related words such as friends and friendly, since all these are reduced to the same normalized lexeme.
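To try the query above, a small test table helps. The following is a minimal sketch: the column types and the sample rows are assumptions made for illustration, since this section does not show how pgweb is defined.

```sql
-- Hypothetical definition of pgweb; only the column names come from this section.
CREATE TABLE pgweb (
    title         text,
    body          text,
    last_mod_date timestamptz DEFAULT now()
);

-- Illustrative sample rows (not from the original documentation).
INSERT INTO pgweb (title, body) VALUES
    ('Greeting', 'A friend in need is a friend indeed'),
    ('Club',     'All of my friends joined'),
    ('Weather',  'It rained all day');

-- Inspect how the english configuration normalizes the related words.
SELECT to_tsvector('english', 'friend friends friendly');

-- The same search as above, run against the sample rows.
SELECT title
FROM pgweb
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'friend');
```

With these rows, the search should match the first two titles and skip the third, because the third body contains no word that normalizes to the lexeme for friend.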

The query above specifies that the english configuration is to be used to parse and normalize the strings. Alternatively, we could omit the configuration parameters:

SELECT title
FROM pgweb
WHERE to_tsvector(body) @@ to_tsquery('friend');

This query will use the configuration set by default_text_search_config.
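As a quick check of which configuration the one-argument forms will pick up, the setting can be inspected and changed per session. This is a sketch that assumes the built-in english configuration is installed, which is the case in a standard installation.

```sql
-- Show the configuration currently in effect for this session.
SHOW default_text_search_config;

-- Change it for the current session only.
SET default_text_search_config = 'pg_catalog.english';

-- With the setting above, the one-argument and two-argument forms
-- produce the same tsvector, so this comparison is expected to be true.
SELECT to_tsvector('friends') = to_tsvector('english', 'friends');
```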

A more complex example is to select the ten most recent documents that contain create and table in the title or body:

SELECT title
FROM pgweb
WHERE to_tsvector(title || ' ' || body) @@ to_tsquery('create & table')
ORDER BY last_mod_date DESC
LIMIT 10;

For clarity we omitted the coalesce function calls which would be needed to find rows that contain NULL in one of the two fields. Without them, concatenating a NULL title or body yields NULL, so such a row can never match.
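For reference, a version of the query with the coalesce calls written out might look like the following sketch. It replaces a NULL title or body with an empty string so that the other field is still searched.

```sql
-- Same search as above, but tolerant of NULL in either column.
SELECT title
FROM pgweb
WHERE to_tsvector(coalesce(title, '') || ' ' || coalesce(body, ''))
      @@ to_tsquery('create & table')
ORDER BY last_mod_date DESC
LIMIT 10;
```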

Although these queries will work without an index, most applications will find this approach too slow, except perhaps for occasional ad-hoc searches. Practical use of text searching usually requires creating an index.

12.2.2. Creating Indexes

We can create a GIN index (Section 12.9) to speed up text searches:

CREATE INDEX pgweb_idx ON pgweb USING GIN (to_tsvector('english', body));

Notice that the two-argument version of to_tsvector is used. Only text search functions that specify a configuration name can be used in expression indexes (Section 11.7). This is because the index contents must be unaffected by default_text_search_config. If they were affected, the index contents might be inconsistent because different entries could contain tsvectors that were created with different text search configurations, and there would be no way to guess which was which. It would be impossible to dump and restore such an index correctly.

Because the two-argument version of to_tsvector was used in the index above, only a query that uses the two-argument version of to_tsvector with the same configuration name will use that index. That is, WHERE to_tsvector('english', body) @@ 'a & b' can use the index, but WHERE to_tsvector(body) @@ 'a & b' cannot. This ensures that an index will be used only with the same configuration used to create the index entries.
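One way to see this difference is to compare query plans with EXPLAIN. This is only a sketch: the plan actually chosen depends on table size and statistics, and on a very small table the planner may prefer a sequential scan even when the index is usable.

```sql
-- Matches the index expression, so the planner is able to consider pgweb_idx.
EXPLAIN
SELECT title
FROM pgweb
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'friend');

-- Does not match the index expression, so pgweb_idx cannot be used here.
EXPLAIN
SELECT title
FROM pgweb
WHERE to_tsvector(body) @@ to_tsquery('friend');
```

When experimenting on a small test table, temporarily running SET enable_seqscan = off; in the session can make it easier to see whether the index is eligible at all.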

It is possible to set up more complex expression indexes wherein the configuration name is specified by another column, e.g.:

CREATE INDEX pgweb_idx ON pgweb USING GIN (to_tsvector(config_name, body));

where config_name is a column in the pgweb table. This allows mixed configurations in the same index while recording which configuration was used for each index entry. This would be useful, for example, if the document collection contained documents in different languages. Again, queries that are meant to use the index must be phrased to match, e.g., WHERE to_tsvector(config_name, body) @@ 'a & b'.
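A possible way to set this up is sketched below. The section does not say how config_name is declared; this sketch assumes it has type regconfig, which is the type the first argument of the two-argument to_tsvector expects, and that existing rows are in English. The index name pgweb_multilang_idx is hypothetical.

```sql
-- Assumption: store the per-row configuration as regconfig, defaulting to english.
ALTER TABLE pgweb
    ADD COLUMN config_name regconfig NOT NULL DEFAULT 'english';

-- Expression index in which each entry is built with its row's own configuration.
-- (pgweb_multilang_idx is a hypothetical name.)
CREATE INDEX pgweb_multilang_idx ON pgweb
    USING GIN (to_tsvector(config_name, body));

-- The WHERE clause repeats the indexed expression exactly so the index can be considered.
SELECT title
FROM pgweb
WHERE to_tsvector(config_name, body) @@ to_tsquery('english', 'friend');
```

The query side here still normalizes the search term with a single fixed configuration. An application with documents in several languages would need to decide which configuration to use for the tsquery.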

Indexes can even concatenate columns:

CREATE INDEX pgweb_idx ON pgweb USING GIN (to_tsvector('english', title || ' ' || body));

Another approach is to create a separate tsvector column to hold the output of to_tsvector. To keep this column automatically up to date with its source data, use a stored generated column. This example is a concatenation of title and body, using coalesce to ensure that one field will still be indexed when the other is NULL:

ALTER TABLE pgweb
    ADD COLUMN textsearchable_index_col tsvector
               GENERATED ALWAYS AS (to_tsvector('english', coalesce(title, '') || ' ' || coalesce(body, ''))) STORED;

Then we create a GIN index to speed up the search:

CREATE INDEX textsearch_idx ON pgweb USING GIN (textsearchable_index_col);

Now we are ready to perform a fast full text search:

SELECT title
FROM pgweb
WHERE textsearchable_index_col @@ to_tsquery('create & table')
ORDER BY last_mod_date DESC
LIMIT 10;

One advantage of the separate-column approach over an expression index is that it is not necessary to explicitly specify the text search configuration in queries in order to make use of the index. As shown in the example above, the query can depend on default_text_search_config. Another advantage is that searches will be faster, since it will not be necessary to redo the to_tsvector calls to verify index matches. (This is more important when using a GiST index than a GIN index; see Section 12.9.) The expression-index approach is simpler to set up, however, and it requires less disk space since the tsvector representation is not stored explicitly.
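To get a rough sense of the disk space trade-off on a particular data set, the standard size functions can be queried. This sketch assumes the textsearchable_index_col column and the textsearch_idx index from the examples above already exist. The numbers depend entirely on the data.

```sql
-- Approximate space taken by the stored tsvector values themselves.
SELECT pg_size_pretty(sum(pg_column_size(textsearchable_index_col)))
FROM pgweb;

-- Size of the table (including TOAST data, excluding indexes).
SELECT pg_size_pretty(pg_table_size('pgweb'));

-- Size of the GIN index built on the generated column.
SELECT pg_size_pretty(pg_relation_size('textsearch_idx'));
```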

