
Commit 2a43603

Arthur Zakirov committed
Do not broke words with numbers and letters, divided by hyphen
1 parent 399cc98, commit 2a43603

4 files changed: +38 -22 lines changed

‎README.md

Lines changed: 21 additions & 21 deletions
@@ -3,9 +3,27 @@
 ## Introduction
 
 The **pg_tsparser** module is the modified default text search parser from
-PostgreSQL 9.6.
-The difference between **tsparser** and **default** parsers is that **tsparser**
-gives also unbroken words by underscore character.
+PostgreSQL 9.6. The differences are:
+* **tsparser** gives unbroken words by underscore character
+* **tsparser** gives unbroken words with numbers and letters by hyphen character
+
+For example:
+
+```sql
+SELECT to_tsvector('english', 'pg_trgm') as def_parser,
+       to_tsvector('english_ts', 'pg_trgm') as new_parser;
+   def_parser    |         new_parser
+-----------------+-----------------------------
+ 'pg':1 'trgm':2 | 'pg':2 'pg_trgm':1 'trgm':3
+(1 row)
+
+SELECT to_tsvector('english', '123-abc') as def_parser,
+       to_tsvector('english_ts', '123-abc') as new_parser;
+   def_parser    |         new_parser
+-----------------+-----------------------------
+ '123':1 'abc':2 | '123':2 '123-abc':1 'abc':3
+(1 row)
+```
 
 ## License
 
@@ -40,21 +58,3 @@ ALTER TEXT SEARCH CONFIGURATION english_ts
 word, hword, hword_part
 WITH english_stem;
 ```
-
-## Examples
-
-Example of difference between **tsparser** and **default**:
-
-```sql
-SELECT to_tsvector('english_ts', 'pg_trgm');
-         to_tsvector
------------------------------
- 'pg':2 'pg_trgm':1 'trgm':3
-(1 row)
-
-SELECT to_tsvector('english', 'pg_trgm');
-   to_tsvector
------------------
- 'pg':1 'trgm':2
-(1 row)
-```

‎expected/pg_tsparser.out

Lines changed: 12 additions & 0 deletions
@@ -193,3 +193,15 @@ SELECT to_tsvector('english_ts', 'pg_trgm');
  'pg':2 'pg_trgm':1 'trgm':3
 (1 row)
 
+SELECT to_tsvector('english_ts', '12_abc');
+        to_tsvector
+---------------------------
+ '12':2 '12_abc':1 'abc':3
+(1 row)
+
+SELECT to_tsvector('english_ts', '12-abc');
+        to_tsvector
+---------------------------
+ '12':2 '12-abc':1 'abc':3
+(1 row)
+

‎sql/pg_tsparser.sql

Lines changed: 2 additions & 0 deletions
@@ -22,3 +22,5 @@ ALTER TEXT SEARCH CONFIGURATION english_ts
 WITH english_stem;
 
 SELECT to_tsvector('english_ts', 'pg_trgm');
+SELECT to_tsvector('english_ts', '12_abc');
+SELECT to_tsvector('english_ts', '12-abc');

‎tsparser.c

Lines changed: 3 additions & 1 deletion
@@ -1132,6 +1132,8 @@ static const TParserStateActionItem actionTPS_InUnsignedInt[] = {
 	{p_iseqC, '-', A_PUSH, TPS_InHostFirstAN, 0, NULL},
 	{p_iseqC, '_', A_PUSH, TPS_InHostFirstAN, 0, NULL},
 	{p_iseqC, '@', A_PUSH, TPS_InEmail, 0, NULL},
+	{p_iseqC, '-', A_PUSH, TPS_InHyphenNumWordFirst, 0, NULL},
+	{p_iseqC, '_', A_PUSH, TPS_InHyphenNumWordFirst, 0, NULL},
 	{p_isasclet, 0, A_PUSH, TPS_InHost, 0, NULL},
 	{p_isalpha, 0, A_NEXT, TPS_InNumWord, 0, NULL},
 	{p_isspecial, 0, A_NEXT, TPS_InNumWord, 0, NULL},
@@ -1658,7 +1660,7 @@ static const TParserStateActionItem actionTPS_InParseHyphen[] = {
 	{p_isEOF, 0, A_RERUN, TPS_Base, 0, NULL},
 	{p_isasclet, 0, A_NEXT, TPS_InHyphenAsciiWordPart, 0, NULL},
 	{p_isalpha, 0, A_NEXT, TPS_InHyphenWordPart, 0, NULL},
-	{p_isdigit, 0, A_PUSH, TPS_InHyphenUnsignedInt, 0, NULL},
+	{p_isdigit, 0, A_PUSH, TPS_InHyphenNumWordPart, 0, NULL},
 	{p_iseqC, '-', A_PUSH, TPS_InParseHyphenHyphen, 0, NULL},
 	{p_iseqC, '_', A_PUSH, TPS_InParseHyphenHyphen, 0, NULL},
 	{NULL, 0, A_RERUN, TPS_Base, 0, NULL}
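
The regression tests in this commit check only the final `tsvector`. To see which token types the changed state tables assign to a digit-led hyphenated word, the tokens can be inspected with PostgreSQL's built-in `ts_debug` function. The queries below are a minimal sketch, not part of the commit. They assume the `english_ts` configuration from the README has already been created. Output is omitted because it depends on the installed build.

```sql
-- Assumption: english_ts exists and uses the modified parser (see README).
-- Lists each token with its token-type alias and the resulting lexemes.
SELECT alias, token, lexemes
FROM ts_debug('english_ts', '12-abc');

-- The same input through the built-in configuration, for comparison.
SELECT alias, token, lexemes
FROM ts_debug('english', '12-abc');
```

Comparing the two result sets should show whether the whole word `12-abc` is emitted as a single token alongside its parts, as the expected output for `to_tsvector` suggests.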
