Movatterモバイル変換


[0]ホーム

URL:



Facebook
Postgres Pro
Facebook
Downloads
14.2. Statistics Used by the Planner
Prev UpChapter 14. Performance TipsHome Next

14.2. Statistics Used by the Planner

As we saw in the previous section, the query planner needs to estimate the number of rows retrieved by a query in order to make good choices of query plans. This section provides a quick look at the statistics that the system uses for these estimates.

One component of the statistics is the total number of entries in each table and index, as well as the number of disk blocks occupied by each table and index. This information is kept in the tablepg_class, in the columnsreltuples andrelpages. We can look at it with queries similar to this one:

SELECT relname, relkind, reltuples, relpagesFROM pg_classWHERE relname LIKE 'tenk1%';       relname        | relkind | reltuples | relpages----------------------+---------+-----------+---------- tenk1                | r       |     10000 |      358 tenk1_hundred        | i       |     10000 |       30 tenk1_thous_tenthous | i       |     10000 |       30 tenk1_unique1        | i       |     10000 |       30 tenk1_unique2        | i       |     10000 |       30(5 rows)

Here we can see thattenk1 contains 10000 rows, as do its indexes, but the indexes are (unsurprisingly) much smaller than the table.

For efficiency reasons,reltuples andrelpages are not updated on-the-fly, and so they usually contain somewhat out-of-date values. They are updated byVACUUM,ANALYZE, and a few DDL commands such asCREATE INDEX. AVACUUM orANALYZE operation that does not scan the entire table (which is commonly the case) will incrementally update thereltuples count on the basis of the part of the table it did scan, resulting in an approximate value. In any case, the planner will scale the values it finds inpg_class to match the current physical table size, thus obtaining a closer approximation.

Most queries retrieve only a fraction of the rows in a table, due toWHERE clauses that restrict the rows to be examined. The planner thus needs to make an estimate of theselectivity ofWHERE clauses, that is, the fraction of rows that match each condition in theWHERE clause. The information used for this task is stored in thepg_statistic system catalog. Entries inpg_statistic are updated by theANALYZE andVACUUM ANALYZE commands, and are always approximate even when freshly updated.

Rather than look atpg_statistic directly, it's better to look at its viewpg_stats when examining the statistics manually.pg_stats is designed to be more easily readable. Furthermore,pg_stats is readable by all, whereaspg_statistic is only readable by a superuser. (This prevents unprivileged users from learning something about the contents of other people's tables from the statistics. Thepg_stats view is restricted to show only rows about tables that the current user can read.) For example, we might do:

SELECT attname, inherited, n_distinct,       array_to_string(most_common_vals, E'\n') as most_common_valsFROM pg_statsWHERE tablename = 'road'; attname | inherited | n_distinct |          most_common_vals---------+-----------+------------+------------------------------------ name    | f         |  -0.363388 | I- 580                        Ramp+         |           |            | I- 880                        Ramp+         |           |            | Sp Railroad                       +         |           |            | I- 580                            +         |           |            | I- 680                        Ramp name    | t         |  -0.284859 | I- 880                        Ramp+         |           |            | I- 580                        Ramp+         |           |            | I- 680                        Ramp+         |           |            | I- 580                            +         |           |            | State Hwy 13                  Ramp(2 rows)

Note that two rows are displayed for the same column, one corresponding to the complete inheritance hierarchy starting at theroad table (inherited=t), and another one including only theroad table itself (inherited=f).

The amount of information stored inpg_statistic byANALYZE, in particular the maximum number of entries in themost_common_vals andhistogram_bounds arrays for each column, can be set on a column-by-column basis using theALTER TABLE SET STATISTICS command, or globally by setting thedefault_statistics_target configuration variable. The default limit is presently 100 entries. Raising the limit might allow more accurate planner estimates to be made, particularly for columns with irregular data distributions, at the price of consuming more space inpg_statistic and slightly more time to compute the estimates. Conversely, a lower limit might be sufficient for columns with simple data distributions.

Further details about the planner's use of statistics can be found inChapter 67.


Prev Up Next
14.1. UsingEXPLAIN Home 14.3. Controlling the Planner with ExplicitJOIN Clauses
epubpdf
Go to PostgreSQL 9.6
By continuing to browse this website, you agree to the use of cookies. Go toPrivacy Policy.

[8]ページ先頭

©2009-2025 Movatter.jp