1- <!-- $Header: /cvsroot/pgsql/doc/src/sgml/indices.sgml,v 1.45 2003/09/30 03: 22:33 tgl Exp $ -->
1+ <!-- $Header: /cvsroot/pgsql/doc/src/sgml/indices.sgml,v 1.46 2003/11/06 22:21:47 tgl Exp $ -->
22
33<chapter id="indexes">
44 <title id="indexes-title">Indexes</title>
@@ -77,7 +77,7 @@ CREATE INDEX test1_id_index ON test1 (id);
7777 than a sequential table scan. But you may have to run the
7878 <command>ANALYZE</command> command regularly to update
7979 statistics to allow the query planner to make educated decisions.
80- Also read <xref linkend="performance-tips"> for information about
80+ See <xref linkend="performance-tips"> for information about
8181 how to find out whether an index is used and when and why the
8282 planner may choose <emphasis>not</emphasis> to use an index.
8383 </para>
@@ -106,8 +106,8 @@ CREATE INDEX test1_id_index ON test1 (id);
106106
107107 <para>
108108 <productname>PostgreSQL</productname> provides several index types:
109- B-tree, R-tree, GiST, and Hash. Each index typeis more appropriate for
110- a particular query type because of the algorithm it uses .
109+ B-tree, R-tree, GiST, and Hash. Each index typeuses a different
110+ algorithm that is best suited to different types of queries .
111111 <indexterm>
112112 <primary>index</primary>
113113 <secondary>B-tree</secondary>
@@ -116,9 +116,10 @@ CREATE INDEX test1_id_index ON test1 (id);
116116 <primary>B-tree</primary>
117117 <see>index</see>
118118 </indexterm>
119- By
120- default, the <command>CREATE INDEX</command> command will create a
121- B-tree index, which fits the most common situations. In
119+ By default, the <command>CREATE INDEX</command> command will create a
120+ B-tree index, which fits the most common situations. B-trees can
121+ handle equality and range queries on data that can be sorted into
122+ some ordering. In
122123 particular, the <productname>PostgreSQL</productname> query planner
123124 will consider using a B-tree index whenever an indexed column is
124125 involved in a comparison using one of these operators:
@@ -154,7 +155,7 @@ CREATE INDEX test1_id_index ON test1 (id);
154155 <primary>R-tree</primary>
155156 <see>index</see>
156157 </indexterm>
157- R-tree indexes areespecially suited for spatial data. To create
158+ R-tree indexes are suited for queries on spatial data. To create
158159 an R-tree index, use a command of the form
159160<synopsis>
160161CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> USING RTREE (<replaceable>column</replaceable>);
@@ -185,6 +186,7 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
185186 <primary>hash</primary>
186187 <see>index</see>
187188 </indexterm>
189+ Hash indexes can only handle simple equality comparisons.
188190 The query planner will consider using a hash index whenever an
189191 indexed column is involved in a comparison using the
190192 <literal>=</literal> operator. The following command is used to
@@ -195,19 +197,18 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
195197 <note>
196198 <para>
197199 Testing has shown <productname>PostgreSQL</productname>'s hash
198- indexes to be similar or slower than B-tree indexes, and the
199- index size and build time for hash indexes is much worse. Hash
200- indexes also suffer poor performance under high concurrency. For
200+ indexes to perform no better than B-tree indexes, and the
201+ index size and build time for hash indexes is much worse. For
201202 these reasons, hash index use is presently discouraged.
202203 </para>
203204 </note>
204205 </para>
205206
206207 <para>
207- The B-tree index is an implementation of Lehman-Yao
208+ The B-tree indexmethod is an implementation of Lehman-Yao
208209 high-concurrency B-trees. The R-tree index method implements
209210 standard R-trees using Guttman's quadratic split algorithm. The
210- hash index is an implementation of Litwin's linear hashing. We
211+ hash indexmethod is an implementation of Litwin's linear hashing. We
211212 mention the algorithms used solely to indicate that all of these
212213 index methods are fully dynamic and do not have to be optimized
213214 periodically (as is the case with, for example, static hash methods).
@@ -233,7 +234,7 @@ CREATE TABLE test2 (
233234 name varchar
234235);
235236</programlisting>
236- (Say , you keep your <filename class="directory">/dev</filename>
237+ (say , you keep your <filename class="directory">/dev</filename>
237238 directory in a database...) and you frequently make queries like
238239<programlisting>
239240SELECT name FROM test2 WHERE major = <replaceable>constant</replaceable> AND minor = <replaceable>constant</replaceable>;
@@ -263,8 +264,8 @@ CREATE INDEX test2_mm_idx ON test2 (major, minor);
263264 <literal>a</literal> and <literal>b</literal>, or in queries
264265 involving only <literal>a</literal>, but not in other combinations.
265266 (In a query involving <literal>a</literal> and <literal>c</literal>
266- the plannermight choose to use the index for
267- <literal>a</literal> only and treat <literal>c</literal> like an
267+ the plannercould choose to use the index for
268+ <literal>a</literal>, while treating <literal>c</literal> like an
268269 ordinary unindexed column.) Of course, each column must be used with
269270 operators appropriate to the index type; clauses that involve other
270271 operators will not be considered.
@@ -310,16 +311,16 @@ CREATE UNIQUE INDEX <replaceable>name</replaceable> ON <replaceable>table</repla
310311 <para>
311312 When an index is declared unique, multiple table rows with equal
312313 indexed values will not be allowed. Null values are not considered
313- equal.
314+ equal. A multicolumn unique index will only reject cases where all
315+ of the indexed columns are equal in two rows.
314316 </para>
315317
316318 <para>
317- <productname>PostgreSQL</productname> automatically creates unique
318- indexes when a table is declared with a unique constraint or a
319- primary key, on the columns that make up the primary key or unique
320- columns (a multicolumn index, if appropriate), to enforce that
321- constraint. A unique index can be added to a table at any later
322- time, to add a unique constraint.
319+ <productname>PostgreSQL</productname> automatically creates a unique
320+ index when a unique constraint or a primary key is defined for a table.
321+ The index covers the columns that make up the primary key or unique
322+ columns (a multicolumn index, if appropriate), and is the mechanism
323+ that enforces the constraint.
323324 </para>
324325
325326 <note>
@@ -328,6 +329,9 @@ CREATE UNIQUE INDEX <replaceable>name</replaceable> ON <replaceable>table</repla
328329 <literal>ALTER TABLE ... ADD CONSTRAINT</literal>. The use of
329330 indexes to enforce unique constraints could be considered an
330331 implementation detail that should not be accessed directly.
332+ One should, however, be aware that there's no need to manually
333+ create indexes on unique columns; doing so would just duplicate
334+ the automatically-created index.
331335 </para>
332336 </note>
333337 </sect1>
@@ -362,6 +366,14 @@ CREATE INDEX test1_lower_col1_idx ON test1 (lower(col1));
362366</programlisting>
363367 </para>
364368
369+ <para>
370+ If we were to declare this index <literal>UNIQUE</>, it would prevent
371+ creation of rows whose <literal>col1</> values differ only in case,
372+ as well as rows whose <literal>col1</> values are actually identical.
373+ Thus, indexes on expressions can be used to enforce constraints that
374+ are not definable as simple unique constraints.
375+ </para>
376+
365377 <para>
366378 As another example, if one often does queries like this:
367379<programlisting>
@@ -409,7 +421,7 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
409421 In practice the default operator class for the column's data type is
410422 usually sufficient. The main point of having operator classes is
411423 that for some data types, there could be more than one meaningful
412- ordering . For example, we might want to sort a complex-number data
424+ index behavior . For example, we might want to sort a complex-number data
413425 type either by absolute value or by real part. We could do this by
414426 defining two operator classes for the data type and then selecting
415427 the proper class when making an index.
@@ -419,20 +431,6 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
419431 There are also some built-in operator classes besides the default ones:
420432
421433 <itemizedlist>
422- <listitem>
423- <para>
424- The operator classes <literal>box_ops</literal> and
425- <literal>bigbox_ops</literal> both support R-tree indexes on the
426- <type>box</type> data type. The difference between them is
427- that <literal>bigbox_ops</literal> scales box coordinates down,
428- to avoid floating-point exceptions from doing multiplication,
429- addition, and subtraction on very large floating-point
430- coordinates. If the field on which your rectangles lie is about
431- 20 000 square units or larger, you should use
432- <literal>bigbox_ops</literal>.
433- </para>
434- </listitem>
435-
436434 <listitem>
437435 <para>
438436 The operator classes <literal>text_pattern_ops</literal>,
@@ -644,7 +642,8 @@ SELECT * FROM orders WHERE order_nr = 3501;
644642 create, it would probably be too slow to be of any real use.)
645643 The system can recognize simple inequality implications, for example
646644 <quote>x < 1</quote> implies <quote>x < 2</quote>; otherwise
647- the predicate condition must exactly match the query's <literal>WHERE</> condition
645+ the predicate condition must exactly match part of the query's
646+ <literal>WHERE</> condition
648647 or the index will not be recognized to be usable.
649648 </para>
650649
@@ -723,7 +722,8 @@ CREATE UNIQUE INDEX tests_success_constraint ON tests (subject, target)
723722 maintenance and tuning, it is still important to check
724723 which indexes are actually used by the real-life query workload.
725724 Examining index usage for an individual query is done with the
726- <command>EXPLAIN</> command; its application for this purpose is
725+ <xref linkend="sql-explain" endterm="sql-explain-title">
726+ command; its application for this purpose is
727727 illustrated in <xref linkend="using-explain">.
728728 It is also possible to gather overall statistics about index usage
729729 in a running server, as described in <xref linkend="monitoring-stats">.
@@ -740,7 +740,8 @@ CREATE UNIQUE INDEX tests_success_constraint ON tests (subject, target)
740740 <itemizedlist>
741741 <listitem>
742742 <para>
743- Always run <command>ANALYZE</command> first. This command
743+ Always run <xref linkend="sql-analyze" endterm="sql-analyze-title">
744+ first. This command
744745 collects statistics about the distribution of the values in the
745746 table. This information is required to guess the number of rows
746747 returned by a query, which is needed by the planner to assign
@@ -813,8 +814,8 @@ CREATE UNIQUE INDEX tests_success_constraint ON tests (subject, target)
813814 run-time parameters (described in <xref linkend="runtime-config">).
814815 An inaccurate selectivity estimate is due to
815816 insufficient statistics. It may be possible to help this by
816- tuning the statistics-gathering parameters (see <command>ALTER
817- TABLE</command> reference ).
817+ tuning the statistics-gathering parameters (see
818+ <xref linkend="sql-altertable" endterm="sql-altertable-title"> ).
818819 </para>
819820
820821 <para>