|
1 | | -<!-- $PostgreSQL: pgsql/doc/src/sgml/gin.sgml,v 2.3 2006/09/14 21:15:07 tgl Exp $ --> |
| 1 | +<!-- $PostgreSQL: pgsql/doc/src/sgml/gin.sgml,v 2.4 2006/09/18 12:11:36 teodor Exp $ --> |
2 | 2 |
|
3 | 3 | <chapter id="GIN"> |
4 | 4 | <title>GIN Indexes</title> |
|
14 | 14 | <para> |
15 | 15 | <acronym>GIN</acronym> stands for Generalized Inverted Index. It is |
16 | 16 | an index structure storing a set of (key, posting list) pairs, where |
17 | | - 'posting list' is a set of rows in which the key occurs. The |
| 17 | + 'posting list' is a set of rows in which the key occurs. Each |
18 | 18 | row may contain many keys. |
19 | 19 | </para> |
20 | 20 |
|
|
104 | 104 | <listitem> |
105 | 105 | <para> |
106 | 106 | Returns an array of keys of the query to be executed. n contains |
107 | | - strategy number of operation (see <xref linkend="xindex-strategies">). |
| 107 | + the strategy number of the operation |
| 108 | + (see <xref linkend="xindex-strategies">). |
108 | 109 | Depending on n, query may be different type. |
109 | 110 | </para> |
110 | 111 | </listitem> |
|
114 | 115 | <term>bool consistent( bool check[], StrategyNumber n, Datum query)</term> |
115 | 116 | <listitem> |
116 | 117 | <para> |
117 | | - Returns TRUE if indexed value satisfies query qualifier with strategy n |
118 | | - (or may satisfy in case of RECHECK mark in operator class). |
119 | | - Each element of the check array is TRUE if indexed value has a |
| 118 | + Returns TRUE if the indexed value satisfies the query qualifier with |
| 119 | + strategy n (or may satisfy in case of RECHECK mark in operator class). |
| 120 | + Each element of the check array is TRUE if the indexed value has a |
120 | 121 | corresponding key in the query: if (check[i] == TRUE ) the i-th key of |
121 | 122 | the query is present in the indexed value. |
122 | 123 | </para> |
|
135 | 136 | <term>Create vs insert</term> |
136 | 137 | <listitem> |
137 | 138 | <para> |
138 | | - In most cases, insertion into <acronym>GIN</acronym> index is slow because |
139 | | - many GIN keys may be inserted for each table row. So, when loading data |
140 | | - in bulk it may be useful to drop index and recreate it |
141 | | - after the data is loaded in the table. |
| 139 | + In most cases, insertion into <acronym>GIN</acronym> index is slow |
| 140 | + due to the likelihood of many keys being inserted for each value. |
| 141 | + So, for bulk insertions into a table it is advisable to to drop the GIN |
| 142 | + index and recreate it after finishing bulk insertion. |
142 | 143 | </para> |
143 | 144 | </listitem> |
144 | 145 | </varlistentry> |
|
147 | 148 | <term>gin_fuzzy_search_limit</term> |
148 | 149 | <listitem> |
149 | 150 | <para> |
150 | | - The primary goal of development <acronym>GIN</acronym> indices was |
| 151 | + The primary goal of developing <acronym>GIN</acronym> indices was |
151 | 152 | support for highly scalable, full-text search in |
152 | 153 | <productname>PostgreSQL</productname> and there are often situations when |
153 | 154 | a full-text search returns a very large set of results. Since reading |
|
158 | 159 | <para> |
159 | 160 | Such queries usually contain very frequent words, so the results are not |
160 | 161 | very helpful. To facilitate execution of such queries |
161 | | - <acronym>GIN</acronym> has a configurablesoft upper limit of the size |
| 162 | + <acronym>GIN</acronym> has a configurable soft upper limit of the size |
162 | 163 | of the returned set, determined by the |
163 | 164 | <varname>gin_fuzzy_search_limit</varname> GUC variable. It is set to 0 by |
164 | 165 | default (no limit). |
|
182 | 183 | <title>Limitations</title> |
183 | 184 |
|
184 | 185 | <para> |
185 | | - <acronym>GIN</acronym> doesn't support full scan of index due to it's |
186 | | - extremely inefficiency: because of a lot of keys per value, |
| 186 | + <acronym>GIN</acronym> doesn't support full index scans due to their |
| 187 | + extremely inefficiency: because there are often many keys per value, |
187 | 188 | each heap pointer will returned several times. |
188 | 189 | </para> |
189 | 190 |
|
190 | 191 | <para> |
191 | | - When extractQuery returns zero number of keys, <acronym>GIN</acronym> will |
192 | | - emit a error: for different opclass and strategy semantic meaning of void |
193 | | - query may be different (for example, any array contains void array, |
194 | | - but they aren't overlapped with void one), and <acronym>GIN</acronym> can't |
| 192 | + When extractQuery returns zero keys, <acronym>GIN</acronym> will emit a |
| 193 | + error: for different opclasses and strategies the semantic meaning of a void |
| 194 | + query may be different (for example, any array contains the void array, |
| 195 | + but they don't overlap the void array), and <acronym>GIN</acronym> can't |
195 | 196 | suggest reasonable answer. |
196 | 197 | </para> |
197 | 198 |
|
|