F.49. pg_tsparser — an extension for text search
pg_tsparser is a Postgres Pro extension for text search. This extension modifies the default text parsing strategy for words that include:
underscores
numbers and letters separated by the hyphen character
In addition to the separate word parts returned by default, pg_tsparser also returns the whole word.
F.49.1. Installation and Setup
pg_tsparser is included in the Postgres Pro distribution. To enable pg_tsparser once Postgres Pro is installed, create the pg_tsparser extension in each database where you plan to use it:
CREATE EXTENSION pg_tsparser;
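To confirm that the extension was created in the current database, you can query the system catalogs. The following is a minimal sketch, assuming the parser is registered under the name tsparser, as in the configuration example in this section:

```sql
-- Check that the extension is installed in the current database.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'pg_tsparser';

-- List the text search parsers available in the current database.
-- Assumption: the parser provided by the extension appears here as "tsparser".
SELECT prsname
FROM pg_ts_parser
ORDER BY prsname;
```

If the first query returns no rows, the extension has not been created in the database you are connected to.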
Once pg_tsparser is enabled, you can create your own text search configuration that uses it. Along with the pg_tsparser parser, such a configuration can use any available dictionary.
For example, you can create the english_ts configuration for the English language as follows:
CREATE TEXT SEARCH CONFIGURATION english_ts (
    PARSER = tsparser
);

COMMENT ON TEXT SEARCH CONFIGURATION english_ts IS 'text search configuration for english language';

ALTER TEXT SEARCH CONFIGURATION english_ts
    ADD MAPPING FOR email, file, float, host, hword_numpart, int,
    numhword, numword, sfloat, uint, url, url_path, version
    WITH simple;

ALTER TEXT SEARCH CONFIGURATION english_ts
    ADD MAPPING FOR asciiword, asciihword, hword_asciipart,
    word, hword, hword_part
    WITH english_stem;
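To see how a configuration such as english_ts classifies its input, you can run the standard ts_debug and ts_token_type functions against it. This is only a sketch: the sample string 'pg_trgm' is an arbitrary choice, and the token types actually reported depend on the parser, so no output is shown here.

```sql
-- Show each token, the token type assigned by the parser,
-- the dictionary that recognized it, and the resulting lexemes.
SELECT alias, token, dictionary, lexemes
FROM ts_debug('english_ts', 'pg_trgm');

-- List the token types that the parser can emit.
SELECT tokid, alias, description
FROM ts_token_type('tsparser');
```

A token type that has no mapping in the configuration is reported by ts_debug with an empty dictionary and is not indexed, which makes this a quick way to check the ADD MAPPING statements.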
F.49.2. Examples
The following examples illustrate the difference in search results returned by pg_tsparser and the default parser:
SELECT to_tsvector('english', 'pg_trgm') as def_parser,
       to_tsvector('english_ts', 'pg_trgm') as new_parser;
   def_parser    |         new_parser
-----------------+-----------------------------
 'pg':1 'trgm':2 | 'pg':2 'pg_trgm':1 'trgm':3
(1 row)

SELECT to_tsvector('english', '123-abc') as def_parser,
       to_tsvector('english_ts', '123-abc') as new_parser;
   def_parser    |         new_parser
-----------------+-----------------------------
 '123':1 'abc':2 | '123':2 '123-abc':1 'abc':3
(1 row)

SELECT to_tsvector('english', 'rel-3.2-A') as def_parser,
       to_tsvector('english_ts', 'rel-3.2-A') as new_parser;
    def_parser    |          new_parser
------------------+-------------------------------
 '-3.2':2 'rel':1 | '3.2':3 'rel':2 'rel-3.2-a':1
(1 row)
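Because the whole word is kept as a lexeme, a query for that exact lexeme can match a document parsed with pg_tsparser. The following is a minimal sketch, assuming the english_ts configuration from the setup section exists. Given the tsvector values shown in the examples, the first column should be false and the second true:

```sql
-- The tsquery is produced by a direct cast of a quoted lexeme,
-- so it is not re-parsed by either text search configuration.
SELECT to_tsvector('english', 'pg_trgm') @@ $$'pg_trgm'$$::tsquery AS def_parser_match,
       to_tsvector('english_ts', 'pg_trgm') @@ $$'pg_trgm'$$::tsquery AS new_parser_match;
```

The direct cast is used here only to isolate the effect of the parser on the document side; in application code, queries are usually built with to_tsquery or a similar function and the same configuration as the document.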
F.49.3. Authors
Postgres Professional, Moscow, Russia