Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Commit2bb6b98

Browse files
committed
Add a section about the planner's statistics, including a description
of the new pg_stats view.
1 parentdaf72bb commit2bb6b98

File tree

2 files changed

+199
-3
lines changed

2 files changed

+199
-3
lines changed

‎doc/src/sgml/perform.sgml

Lines changed: 195 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
<!--
2-
$Header: /cvsroot/pgsql/doc/src/sgml/perform.sgml,v 1.12 2001/10/12 23:32:34 momjian Exp $
2+
$Header: /cvsroot/pgsql/doc/src/sgml/perform.sgml,v 1.13 2001/10/16 01:13:44 tgl Exp $
33
-->
44

55
<chapter id="performance-tips">
@@ -315,6 +315,200 @@ Total runtime: 30.67 msec
315315
</para>
316316
</sect1>
317317

318+
<sect1 id="planner-stats">
319+
<title>Statistics used by the Planner</title>
320+
321+
<para>
322+
As we saw in the previous section, the query planner needs to estimate
323+
the number of rows retrieved by a query in order to make good choices
324+
of query plans. This section provides a quick look at the statistics
325+
that the system uses for these estimates.
326+
</para>
327+
328+
<para>
329+
One component of the statistics is the total number of entries in each
330+
table and index, as well as the number of disk blocks occupied by each
331+
table and index. This information is kept in
332+
<structname>pg_class</structname>'s <structfield>reltuples</structfield>
333+
and <structfield>relpages</structfield> columns. We can look at it
334+
with queries similar to this one:
335+
336+
<screen>
337+
regression=# select relname, relkind, reltuples, relpages from pg_class
338+
regression-# where relname like 'tenk1%';
339+
relname | relkind | reltuples | relpages
340+
---------------+---------+-----------+----------
341+
tenk1 | r | 10000 | 233
342+
tenk1_hundred | i | 10000 | 30
343+
tenk1_unique1 | i | 10000 | 30
344+
tenk1_unique2 | i | 10000 | 30
345+
(4 rows)
346+
</screen>
347+
348+
Here we can see that <structname>tenk1</structname> contains 10000
349+
rows, as do its indexes, but the indexes are (unsurprisingly) much
350+
smaller than the table.
351+
</para>
352+
353+
<para>
354+
For efficiency reasons, <structfield>reltuples</structfield>
355+
and <structfield>relpages</structfield> are not updated on-the-fly,
356+
and so they usually contain only approximate values (which is good
357+
enough for the planner's purposes). They are initialized with dummy
358+
values (presently 1000 and 10 respectively) when a table is created.
359+
They are updated by certain commands, presently <command>VACUUM</>,
360+
<command>ANALYZE</>, and <command>CREATE INDEX</>. A stand-alone
361+
<command>ANALYZE</>, that is one not part of <command>VACUUM</>,
362+
generates an approximate <structfield>reltuples</structfield> value
363+
since it does not read every row of the table.
364+
</para>
365+
366+
<para>
367+
Most queries retrieve only a fraction of the rows in a table, due
368+
to having WHERE clauses that restrict the rows to be examined.
369+
The planner thus needs to make an estimate of the
370+
<firstterm>selectivity</> of WHERE clauses, that is, the fraction of
371+
rows that match each clause of the WHERE condition. The information
372+
used for this task is stored in the <structname>pg_statistic</structname>
373+
system catalog. Entries in <structname>pg_statistic</structname> are
374+
updated by <command>ANALYZE</> and <command>VACUUM ANALYZE</> commands,
375+
and are always approximate even when freshly updated.
376+
</para>
377+
378+
<para>
379+
Rather than look at <structname>pg_statistic</structname> directly,
380+
it's better to look at its view <structname>pg_stats</structname>
381+
when examining the statistics manually. <structname>pg_stats</structname>
382+
is designed to be more easily readable. Furthermore,
383+
<structname>pg_stats</structname> is readable by all, whereas
384+
<structname>pg_statistic</structname> is only readable by the superuser.
385+
(This prevents unprivileged users from learning something about
386+
the contents of other people's tables from the statistics. The
387+
<structname>pg_stats</structname> view is restricted to show only
388+
rows about tables that the current user can read.)
389+
For example, we might do:
390+
391+
<screen>
392+
regression=# select attname, n_distinct, most_common_vals from pg_stats where tablename = 'road';
393+
attname | n_distinct | most_common_vals
394+
---------+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
395+
name | -0.467008 | {"I- 580 Ramp","I- 880 Ramp","Sp Railroad ","I- 580 ","I- 680 Ramp","I- 80 Ramp","14th St ","5th St ","Mission Blvd","I- 880 "}
396+
thepath | 20 | {"[(-122.089,37.71),(-122.0886,37.711)]"}
397+
(2 rows)
398+
regression=#
399+
</screen>
400+
401+
As of <productname>Postgres</productname> 7.2 the following columns exist
402+
in <structname>pg_stats</structname>:
403+
</para>
404+
405+
<table>
406+
<title>pg_stats Columns</title>
407+
408+
<tgroup cols=3>
409+
<thead>
410+
<row>
411+
<entry>Name</entry>
412+
<entry>Type</entry>
413+
<entry>Description</entry>
414+
</row>
415+
</thead>
416+
417+
<tbody>
418+
<row>
419+
<entry>tablename</entry>
420+
<entry><type>name</type></entry>
421+
<entry>Name of table containing column</entry>
422+
</row>
423+
424+
<row>
425+
<entry>attname</entry>
426+
<entry><type>name</type></entry>
427+
<entry>Column described by this row</entry>
428+
</row>
429+
430+
<row>
431+
<entry>null_frac</entry>
432+
<entry><type>real</type></entry>
433+
<entry>Fraction of column's entries that are NULL</entry>
434+
</row>
435+
436+
<row>
437+
<entry>avg_width</entry>
438+
<entry><type>integer</type></entry>
439+
<entry>Average width in bytes of column's entries</entry>
440+
</row>
441+
442+
<row>
443+
<entry>n_distinct</entry>
444+
<entry><type>real</type></entry>
445+
<entry>If greater than zero, the estimated number of distinct values
446+
in the column. If less than zero, the negative of the number of
447+
distinct values divided by the number of rows. (The negated form
448+
is used when ANALYZE believes that the number of distinct values
449+
is likely to increase as the table grows; the positive form is used
450+
when the column seems to have a fixed number of possible values.)
451+
For example, -1 indicates a unique column in which the number of
452+
distinct values is the same as the number of rows.
453+
</entry>
454+
</row>
455+
456+
<row>
457+
<entry>most_common_vals</entry>
458+
<entry><type>text[]</type></entry>
459+
<entry>A list of the most common values in the column. (Omitted if
460+
no values seem to be more common than any others.)</entry>
461+
</row>
462+
463+
<row>
464+
<entry>most_common_freqs</entry>
465+
<entry><type>real[]</type></entry>
466+
<entry>A list of the frequencies of the most common values,
467+
ie, number of occurrences of each divided by total number of rows.
468+
</entry>
469+
</row>
470+
471+
<row>
472+
<entry>histogram_bounds</entry>
473+
<entry><type>text[]</type></entry>
474+
<entry>A list of values that divide the column's values into
475+
groups of approximately equal population. The
476+
<structfield>most_common_vals</>, if present, are omitted from the
477+
histogram calculation. (Omitted if column datatype does not have a
478+
<literal>&lt;</> operator, or if the <structfield>most_common_vals</>
479+
list accounts for the entire population.)
480+
</entry>
481+
</row>
482+
483+
<row>
484+
<entry>correlation</entry>
485+
<entry><type>real</type></entry>
486+
<entry>Statistical correlation between physical row ordering and
487+
logical ordering of the column values. This ranges from -1 to +1.
488+
When the value is near -1 or +1, an indexscan on the column will
489+
be estimated to be cheaper than when it is near zero, due to reduction
490+
of random access to the disk. (Omitted if column datatype does
491+
not have a <literal>&lt;</> operator.)
492+
</entry>
493+
</row>
494+
</tbody>
495+
</tgroup>
496+
</table>
497+
498+
<para>
499+
The maximum number of entries in the <structfield>most_common_vals</>
500+
and <structfield>histogram_bounds</> arrays can be set on a
501+
column-by-column basis using the <command>ALTER TABLE SET STATISTICS</>
502+
command. The default limit is presently 10 entries. Raising the limit
503+
may allow more accurate planner estimates to be made, particularly for
504+
columns with irregular data distributions, at the price of consuming
505+
more space in <structname>pg_statistic</structname> and slightly more
506+
time to compute the estimates. Conversely, a lower limit may be
507+
appropriate for columns with simple data distributions.
508+
</para>
509+
510+
</sect1>
511+
318512
<sect1 id="explicit-joins">
319513
<title>Controlling the Planner with Explicit JOINs</title>
320514

‎doc/src/sgml/ref/analyze.sgml

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
<!--
2-
$Header: /cvsroot/pgsql/doc/src/sgml/ref/analyze.sgml,v 1.2 2001/09/03 12:57:49 petere Exp $
2+
$Header: /cvsroot/pgsql/doc/src/sgml/ref/analyze.sgml,v 1.3 2001/10/16 01:13:44 tgl Exp $
33
Postgres documentation
44
-->
55

@@ -152,7 +152,9 @@ ANALYZE [ VERBOSE ] [ <replaceable class="PARAMETER">table</replaceable> [ (<rep
152152
distribution in each column. One or both of these may be omitted if
153153
<command>ANALYZE</command> deems them uninteresting (for example, in
154154
a unique-key column, there are no common values) or if the column
155-
datatype does not support the appropriate operators.
155+
datatype does not support the appropriate operators. There is more
156+
information about the statistics in the <citetitle>User's
157+
Guide</citetitle>.
156158
</para>
157159

158160
<para>

0 commit comments

Comments
 (0)

[8]ページ先頭

©2009-2025 Movatter.jp