11<!--
2- $Header: /cvsroot/pgsql/doc/src/sgml/perform.sgml,v 1.9 2001/09/13 15:55:23 petere Exp $
2+ $Header: /cvsroot/pgsql/doc/src/sgml/perform.sgml,v 1.10 2001/09/18 01:59:05 tgl Exp $
33-->
44
55 <chapter id="performance-tips">
@@ -122,7 +122,7 @@ select * from pg_class where relname = 'tenk1';
122122regression=# explain select * from tenk1 where unique1 < 1000;
123123NOTICE: QUERY PLAN:
124124
125- Seq Scan on tenk1 (cost=0.00..358.00 rows=1003 width=148)
125+ Seq Scan on tenk1 (cost=0.00..358.00 rows=1007 width=148)
126126 </programlisting>
127127
128128 The estimate of output rows has gone down because of the WHERE clause.
@@ -147,7 +147,7 @@ Seq Scan on tenk1 (cost=0.00..358.00 rows=1003 width=148)
147147regression=# explain select * from tenk1 where unique1 < 50;
148148NOTICE: QUERY PLAN:
149149
150- Index Scan using tenk1_unique1 on tenk1 (cost=0.00..173.32 rows=47 width=148)
150+ Index Scan using tenk1_unique1 on tenk1 (cost=0.00..181.09 rows=49 width=148)
151151 </programlisting>
152152
153153 and you will see that if we make the WHERE condition selective
@@ -166,7 +166,7 @@ regression=# explain select * from tenk1 where unique1 < 50 and
166166regression-# stringu1 = 'xxx';
167167NOTICE: QUERY PLAN:
168168
169- Index Scan using tenk1_unique1 on tenk1 (cost=0.00..173.44 rows=1 width=148)
169+ Index Scan using tenk1_unique1 on tenk1 (cost=0.00..181.22 rows=1 width=148)
170170 </programlisting>
171171
172172 The added clause <literal>stringu1 = 'xxx'</literal> reduces the output-rows estimate,
@@ -181,11 +181,11 @@ regression=# explain select * from tenk1 t1, tenk2 t2 where t1.unique1 < 50
181181regression-# and t1.unique2 = t2.unique2;
182182NOTICE: QUERY PLAN:
183183
184- Nested Loop (cost=0.00..269.11 rows=47 width=296)
184+ Nested Loop (cost=0.00..330.41 rows=49 width=296)
185185 -> Index Scan using tenk1_unique1 on tenk1 t1
186- (cost=0.00..173.32 rows=47 width=148)
186+ (cost=0.00..181.09 rows=49 width=148)
187187 -> Index Scan using tenk2_unique2 on tenk2 t2
188- (cost=0.00..2 .01 rows=1 width=148)
188+ (cost=0.00..3 .01 rows=1 width=148)
189189 </programlisting>
190190 </para>
191191
@@ -202,7 +202,7 @@ Nested Loop (cost=0.00..269.11 rows=47 width=296)
202202 same inner-scan plan and costs that we'd get from, say, <literal>explain select
203203 * from tenk2 where unique2 = 42</literal>. The costs of the loop node are then set
204204 on the basis of the cost of the outer scan, plus one repetition of the
205- inner scan for each outer tuple (47 *2 .01, here), plus a little CPU
205+ inner scan for each outer tuple (49 *3 .01, here), plus a little CPU
206206 time for join processing.
207207 </para>
208208
@@ -229,12 +229,12 @@ regression=# explain select * from tenk1 t1, tenk2 t2 where t1.unique1 < 50
229229regression-# and t1.unique2 = t2.unique2;
230230NOTICE: QUERY PLAN:
231231
232- Hash Join (cost=173.44..557.03 rows=47 width=296)
232+ Hash Join (cost=181.22..564.83 rows=49 width=296)
233233 -> Seq Scan on tenk2 t2
234234 (cost=0.00..333.00 rows=10000 width=148)
235- -> Hash (cost=173.32..173.32 rows=47 width=148)
235+ -> Hash (cost=181.09..181.09 rows=49 width=148)
236236 -> Index Scan using tenk1_unique1 on tenk1 t1
237- (cost=0.00..173.32 rows=47 width=148)
237+ (cost=0.00..181.09 rows=49 width=148)
238238 </programlisting>
239239
240240 This plan proposes to extract the 50 interesting rows of <classname>tenk1</classname>
@@ -245,10 +245,61 @@ Hash Join (cost=173.44..557.03 rows=47 width=296)
245245 cost for the hash join, since we won't get any tuples out until we can
246246 start reading <classname>tenk2</classname>. The total time estimate for the join also
247247 includes a hefty charge for CPU time to probe the hash table
248- 10000 times. Note, however, that we are NOT charging 10000 times173.32 ;
248+ 10000 times. Note, however, that we are NOT charging 10000 times181.09 ;
249249 the hash table setup is only done once in this plan type.
250250 </para>
251251
252+ <para>
253+ It is possible to check on the accuracy of the planner's estimated costs
254+ by using EXPLAIN ANALYZE. This command actually executes the query,
255+ and then displays the true runtime accumulated within each plan node
256+ along with the same estimated costs that a plain EXPLAIN shows.
257+ For example, we might get a result like this:
258+
259+ <programlisting>
260+ regression=# explain analyze select * from tenk1 t1, tenk2 t2 where t1.unique1 < 50
261+ regression-# and t1.unique2 = t2.unique2;
262+ NOTICE: QUERY PLAN:
263+
264+ Nested Loop (cost=0.00..330.41 rows=49 width=296) (actual time=1.31..28.90 rows=50 loops=1)
265+ -> Index Scan using tenk1_unique1 on tenk1 t1
266+ (cost=0.00..181.09 rows=49 width=148) (actual time=0.69..8.84 rows=50 loops=1)
267+ -> Index Scan using tenk2_unique2 on tenk2 t2
268+ (cost=0.00..3.01 rows=1 width=148) (actual time=0.28..0.31 rows=1 loops=50)
269+ Total runtime: 30.67 msec
270+ </programlisting>
271+
272+ Note that the <quote>actual time</quote> values are in milliseconds of
273+ real time, whereas the <quote>cost</quote> estimates are expressed in
274+ arbitrary units of disk fetches; so they are unlikely to match up.
275+ The thing to pay attention to is the ratios.
276+ </para>
277+
278+ <para>
279+ In some query plans, it is possible for a subplan node to be executed more
280+ than once. For example, the inner indexscan is executed once per outer
281+ tuple in the above nested-loop plan. In such cases, the
282+ <quote>loops</quote> value reports the
283+ total number of executions of the node, and the actual time and rows
284+ values shown are averages per-execution. This is done to make the numbers
285+ comparable with the way that the cost estimates are shown. Multiply by
286+ the <quote>loops</quote> value to get the total time actually spent in
287+ the node.
288+ </para>
289+
290+ <para>
291+ The <quote>total runtime</quote> shown by EXPLAIN ANALYZE includes
292+ executor startup and shutdown time, as well as time spent processing
293+ the result tuples. It does not include parsing, rewriting, or planning
294+ time. For a SELECT query, the total runtime will normally be just a
295+ little larger than the total time reported for the top-level plan node.
296+ For INSERT, UPDATE, and DELETE queries, the total runtime may be
297+ considerably larger, because it includes the time spent processing the
298+ output tuples. In these queries, the time for the top plan node
299+ essentially is the time spent computing the new tuples and/or locating
300+ the old ones, but it doesn't include the time spent making the changes.
301+ </para>
302+
252303 <para>
253304 It is worth noting that EXPLAIN results should not be extrapolated
254305 to situations other than the one you are actually testing; for example,