4
4
5
5
##Introduction
6
6
7
- The** rum** module provides access method to work with RUM index. It is based
8
- on the GIN access methods code.
7
+ The** rum** module provides access method to work with` RUM ` index. It is based
8
+ on the` GIN ` access methods code.
9
+
10
+ ` GIN ` index allows to perform fast full text search using` tsvector ` and
11
+ ` tsquery ` types. But full text search with GIN index has several problems:
12
+
13
+ - Slow ranking. It is need position information about lexems to ranking.` GIN `
14
+ index doesn't store positions of lexems. So after index scan we need additional
15
+ heap scan to retreive lexems positions.
16
+ - There isn't phrase search with` GIN ` index. This problem relates with previous
17
+ problem. It is need position information to perform phrase search.
18
+ - Slow ordering by timestamp.` GIN ` index can't store some related information
19
+ in index with lexemes. So it is necessary to perform additional heap scan.
20
+
21
+ ` RUM ` solves this problems by storing additional information in posting tree.
22
+ For example, positional information of lexemes or timestamps. You can get an
23
+ idea of` RUM ` by the following picture:
24
+
25
+ ![ How RUM stores additional information] ( img/gin_rum.png )
26
+
27
+ Drawback of` RUM ` is that it has slower build and insert time than` GIN ` .
28
+ It is because we need to store additional information besides keys and because
29
+ ` RUM ` uses generic WAL.
9
30
10
31
##License
11
32
@@ -16,7 +37,7 @@ This module available under the same license as
16
37
17
38
Before build and install** rum** you should ensure following:
18
39
19
- * PostgreSQL version is 9.6.
40
+ * PostgreSQL version is 9.6+ .
20
41
21
42
Typical installation procedure may look like this:
22
43
@@ -27,18 +48,27 @@ Typical installation procedure may look like this:
27
48
$ make USE_PGXS=1 installcheck
28
49
$ psql DB -c "CREATE EXTENSION rum;"
29
50
30
- ##New access method and operator class
31
-
32
- The** rum** module provides the access method** rum** and the operator class
33
- ** rum_tsvector_ops** .
51
+ ##Common operators and functions
34
52
35
- The module providesnew operators.
53
+ ** rum ** module providesnext operators.
36
54
37
55
| Operator | Returns | Description
38
56
| -------------------- | ------- | ----------------------------------------------
39
57
| tsvector< ; => ; tsquery | float4 | Returns distance between tsvector and tsquery.
58
+ | timestamp< ; => ; timestamp | float8 | Returns distance between two timestamps.
59
+ | timestamp< ; =| ; timestamp | float8 | Returns distance only for left timestamps.
60
+ | timestamp| ; => ; timestamp | float8 | Returns distance only for right timestamps.
61
+
62
+ ##Operator classes
40
63
41
- ##Examples
64
+ ** rum** provides next operator classes
65
+
66
+ ###rum_tsvector_ops
67
+
68
+ For type:` tsvector `
69
+
70
+ This operator class stores` tsvector ` lexemes with positional information. Supports
71
+ ordering by` <=> ` operator and prefix search. There is the example.
42
72
43
73
Let us assume we have the table:
44
74
@@ -70,9 +100,9 @@ And we can execute the following queries:
70
100
71
101
``` sql
72
102
= # SELECT t, a <=> to_tsquery('english', 'beautiful | place') AS rank
73
- FROM test_rum
74
- WHERE a @@ to_tsquery(' english' ,' beautiful | place' )
75
- ORDER BY a<=> to_tsquery(' english' ,' beautiful | place' );
103
+ FROM test_rum
104
+ WHERE a @@ to_tsquery(' english' ,' beautiful | place' )
105
+ ORDER BY a<=> to_tsquery(' english' ,' beautiful | place' );
76
106
t | rank
77
107
-- -------------------------------+-----------
78
108
The situation is most beautiful |0 .0303964
@@ -81,16 +111,128 @@ And we can execute the following queries:
81
111
(3 rows)
82
112
83
113
= # SELECT t, a <=> to_tsquery('english', 'place | situation') AS rank
84
- FROM test_rum
85
- WHERE a @@ to_tsquery(' english' ,' place | situation' )
86
- ORDER BY a<=> to_tsquery(' english' ,' place | situation' );
114
+ FROM test_rum
115
+ WHERE a @@ to_tsquery(' english' ,' place | situation' )
116
+ ORDER BY a<=> to_tsquery(' english' ,' place | situation' );
87
117
t | rank
88
118
-- -------------------------------+-----------
89
119
The situation is most beautiful |0 .0303964
90
120
It lookslike a beautiful place |0 .0303964
91
121
(2 rows)
92
122
```
93
123
124
+ ###rum_tsvector_hash_ops
125
+
126
+ For type:` tsvector `
127
+
128
+ This operator class stores hash of` tsvector ` lexemes with positional information.
129
+ Supports ordering by` <=> ` operator. But** doesn't** support prefix search.
130
+
131
+ ###rum_timestamp_ops
132
+
133
+ For type:` timestamp `
134
+
135
+ Operator class provides fast search and ordering by timestamp fields. Supports
136
+ ordering by` <=> ` ,` <=| ` and` |=> ` operators. Can be used
137
+ with` rum_tsvector_timestamp_ops ` operator class.
138
+
139
+ ###rum_timestamptz_ops
140
+
141
+ For type:` timestamptz `
142
+
143
+ Operator class provides fast search and ordering by timestamptz fields. Supports
144
+ ordering by` <=> ` ,` <=| ` and` |=> ` operators. Can be used
145
+ with` rum_tsvector_timestamptz_ops ` operator class.
146
+
147
+ ###rum_tsvector_timestamp_ops
148
+
149
+ For type:` tsvector `
150
+
151
+ This operator class stores` tsvector ` lexems with timestamp field. There is the example:
152
+
153
+ Let us assume we have the table:
154
+ ``` sql
155
+ CREATE TABLE tsts (idint , t tsvector, dtimestamp );
156
+
157
+ \copy tstsfrom ' rum/data/tsts.data'
158
+
159
+ CREATE INDEX tsts_idx ON tsts USING rum (t rum_tsvector_timestamp_ops, d)
160
+ WITH (attach= ' d' , to= ' t' );
161
+ ```
162
+
163
+ Now we can execute the following queries:
164
+ ``` sql
165
+ EXPLAIN (costs off)
166
+ SELECT id, d, d<=> ' 2016-05-16 14:21:25' FROM tstsWHERE t @@' wr&qh' ORDER BY d<=> ' 2016-05-16 14:21:25' LIMIT 5 ;
167
+ QUERY PLAN
168
+ -- ---------------------------------------------------------------------------------
169
+ Limit
170
+ - > Index Scan using tsts_idxon tsts
171
+ Index Cond: (t @@' ' ' wr' ' &' ' qh' ' ' ::tsquery)
172
+ Order By : (d<=> ' Mon May 16 14:21:25 2016' ::timestamp without time zone )
173
+ (4 rows)
174
+
175
+ SELECT id, d, d<=> ' 2016-05-16 14:21:25' FROM tstsWHERE t @@' wr&qh' ORDER BY d<=> ' 2016-05-16 14:21:25' LIMIT 5 ;
176
+ id | d | ?column?
177
+ -- ---+---------------------------------+---------------
178
+ 355 | Mon May16 14 :21 :22 .326724 2016 |2 .673276
179
+ 354 | Mon May16 13 :21 :22 .326724 2016 |3602 .673276
180
+ 371 | Tue May17 06 :21 :22 .326724 2016 |57597 .326724
181
+ 406 | Wed May18 17 :21 :22 .326724 2016 |183597 .326724
182
+ 415 | Thu May19 02 :21 :22 .326724 2016 |215997 .326724
183
+ (5 rows)
184
+ ```
185
+
186
+ ###rum_tsvector_timestamptz_ops
187
+
188
+ For type:` tsvector `
189
+
190
+ See comments for` rum_tsvector_timestamp_ops ` operator class.
191
+
192
+ ###rum_tsvector_hash_timestamp_ops
193
+
194
+ For type:` tsvector `
195
+
196
+ This operator class stores hash of` tsvector ` lexems with timestamp field.
197
+ ** Doesn't** support prefix search.
198
+
199
+ ###rum_tsvector_hash_timestamptz_ops
200
+
201
+ For type:` tsvector `
202
+
203
+ This operator class stores hash of` tsvector ` lexems with timestamptz field.
204
+ ** Doesn't** support prefix search.
205
+
206
+ ###rum_tsquery_ops
207
+
208
+ For type:` tsquery `
209
+
210
+ Stores branches of query tree in additional information. For example we have the table:
211
+ ``` sql
212
+ CREATE TABLE query (q tsquery, tagtext );
213
+
214
+ INSERT INTO queryVALUES (' supernova & star' ,' sn' ),
215
+ (' black' ,' color' ),
216
+ (' big & bang & black & hole' ,' bang' ),
217
+ (' spiral & galaxy' ,' shape' ),
218
+ (' black & hole' ,' color' );
219
+
220
+ CREATE INDEX query_idx ON query USING rum(q);
221
+ ```
222
+
223
+ Now we can execute the following fast query:
224
+ ``` sql
225
+ SELECT * FROM query
226
+ WHERE to_tsvector(' black holes never exists before we think about them' ) @@ q;
227
+ q | tag
228
+ -- ----------------+-------
229
+ ' black' | color
230
+ ' black' &' hole' | color
231
+ (2 rows)
232
+
233
+
234
+ ```
235
+
94
236
##Authors
95
237
96
238
Alexander Korotkov <a.korotkov@postgrespro.ru > Postgres Professional Ltd., Russia