11<!--
2- $PostgreSQL: pgsql/doc/src/sgml/maintenance.sgml,v 1.48 2005/09/23 02:01:34 momjian Exp $
2+ $PostgreSQL: pgsql/doc/src/sgml/maintenance.sgml,v 1.49 2005/10/21 19:39:08 tgl Exp $
33-->
44
55<chapter id="maintenance">
@@ -474,9 +474,9 @@ HINT: Stop the postmaster and use a standalone backend to VACUUM in "mydb".
474474 tuples. These checks use the row-level statistics collection facility;
475475 therefore, the autovacuum daemon cannot be used unless <xref
476476 linkend="guc-stats-start-collector"> and <xref
477- linkend="guc-stats-row-level"> are set <literal>true</literal>. Also, it's
478- important to allow a slot for the autovacuum process when choosing the
479- value of <xref linkend="guc-superuser-reserved-connections">.
477+ linkend="guc-stats-row-level"> are setto <literal>true</literal>. Also,
478+ it's important to allow a slot for the autovacuum process when choosing
479+ the value of <xref linkend="guc-superuser-reserved-connections">.
480480 </para>
481481
482482 <para>
@@ -487,75 +487,91 @@ HINT: Stop the postmaster and use a standalone backend to VACUUM in "mydb".
487487 database-wide <command>VACUUM</command> call, or <command>VACUUM
488488 FREEZE</command> if it's a template database, and then terminates. If
489489 no database fulfills this criterion, the one that was least recently
490- processed by autovacuum itself is chosen. In this mode, each table in
491- the database is checked for new and obsolete tuples, according to the
492- applicable autovacuum parameters. If a <link linkend="catalog-pg-autovacuum">
493- <structname>pg_autovacuum</structname></link> tuple is found for this
494- table, these settings are applied; otherwise the global values in
495- <filename>postgresql.conf</filename> are used. See <xref linkend="runtime-config-autovacuum">
496- for more details on the global settings.
490+ processed by autovacuum is chosen. In this case each table in
491+ the selected database is checked, and individual <command>VACUUM</command>
492+ or <command>ANALYZE</command> commands are issued as needed.
497493 </para>
498494
499495 <para>
500- For each table, two conditions are used to determine which operation to
501- apply. If the number of obsolete tuples since the last
496+ For each table, two conditions are used to determine which operation(s)
497+ to apply. If the number of obsolete tuples since the last
502498 <command>VACUUM</command> exceeds the <quote>vacuum threshold</quote>, the
503- table is vacuumed and analyzed . The vacuum threshold is defined as:
499+ table is vacuumed. The vacuum threshold is defined as:
504500<programlisting>
505501vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuples
506502</programlisting>
507503 where the vacuum base threshold is
508- <structname>pg_autovacuum</structname>.<structfield>vac_base_thresh</structfield >,
504+ <xref linkend="guc-autovacuum-vacuum-threshold" >,
509505 the vacuum scale factor is
510- <structname>pg_autovacuum</structname>.<structfield>vac_scale_factor</structfield>
506+ <xref linkend="guc-autovacuum-vacuum-scale-factor">,
511507 and the number of tuples is
512508 <structname>pg_class</structname>.<structfield>reltuples</structfield>.
513- The number of obsolete tuples istaken from the statistics
514- collector, which is a semi-accurate count updated by each
509+ The number of obsolete tuples isobtained from the statistics
510+ collector; it is a semi-accurate count updated by each
515511 <command>UPDATE</command> and <command>DELETE</command> operation. (It
516512 is only semi-accurate because some information may be lost under heavy
517- load.) For analyze, a similar condition is used: the threshold, calculated
518- by an equivalent equation to that above, is compared to the number of
519- new tuples, that is, those created by the <command>INSERT</command> and
520- <command>COPY</command> commands.
513+ load.) For analyze, a similar condition is used: the threshold, defined as
514+ <programlisting>
515+ analyze threshold = analyze base threshold + analyze scale factor * number of tuples
516+ </programlisting>
517+ is compared to the total number of tuples inserted, updated, or deleted
518+ since the last <command>ANALYZE</command>.
521519 </para>
522520
523521 <para>
524- Note that if any of the values in <structname>pg_autovacuum</structname>
525- are set to a negative number, or if a tuple is not present at all in
526- <structname>pg_autovacuum</structname> for any particular table, the
527- equivalent value from <filename>postgresql.conf</filename> is used.
522+ The default thresholds and scale factors are taken from
523+ <filename>postgresql.conf</filename>, but it is possible to override them
524+ on a table-by-table basis by making entries in the system catalog
525+ <link
526+ linkend="catalog-pg-autovacuum"><structname>pg_autovacuum</></link>.
527+ If a <structname>pg_autovacuum</structname> row exists for a particular
528+ table, the settings it specifies are applied; otherwise the global
529+ settings are used. See <xref linkend="runtime-config-autovacuum"> for
530+ more details on the global settings.
528531 </para>
529532
530533 <para>
531534 Besides the base threshold values and scale factors, there are three
532- parameters that can be set for each table in <structname>pg_autovacuum</structname>.
533- The first parameter, <structname>pg_autovacuum</>.<structfield>enabled</>,
534- can be used to instruct the autovacuum daemon to skip any particular table
535- by setting it to <literal>false</literal>.
536- The other two, the vacuum cost delay
535+ more parameters that can be set for each table in
536+ <structname>pg_autovacuum</structname>.
537+ The first, <structname>pg_autovacuum</>.<structfield>enabled</>,
538+ can be set to <literal>false</literal> to instruct the autovacuum daemon
539+ to skip that particular table entirely. In this case
540+ autovacuum will only touch the table when it vacuums the entire database
541+ to prevent transaction ID wraparound.
542+ The other two parameters, the vacuum cost delay
537543 (<structname>pg_autovacuum</structname>.<structfield>vac_cost_delay</structfield>)
538544 and the vacuum cost limit
539545 (<structname>pg_autovacuum</structname>.<structfield>vac_cost_limit</structfield>),
540546 are used to set table-specific values for the
541547 <xref linkend="runtime-config-resource-vacuum-cost" endterm="runtime-config-resource-vacuum-cost-title">
542- feature. The above note about negative values also applies here, but
543- also note that if the <filename>postgresql.conf</filename> variables
544- <varname>autovacuum_vacuum_cost_limit</varname> and
545- <varname>autovacuum_vacuum_cost_delay</varname> are also set to negative
546- values, the global <varname>vacuum_cost_limit</varname> and
547- <varname>vacuum_cost_delay</varname> values will be used instead.
548+ feature.
548549 </para>
549550
550- <note>
551+ <para>
552+ If any of the values in <structname>pg_autovacuum</structname>
553+ are set to a negative number, or if a row is not present at all in
554+ <structname>pg_autovacuum</structname> for any particular table, the
555+ corresponding values from <filename>postgresql.conf</filename> are used.
556+ </para>
557+
558+ <para>
559+ There is not currently any support for making
560+ <structname>pg_autovacuum</structname> entries, except by doing
561+ manual <command>INSERT</>s into the catalog. This feature will be
562+ improved in future releases, and it is likely that the catalog
563+ definition will change.
564+ </para>
565+
566+ <caution>
551567 <para>
552568 The contents of the <structname>pg_autovacuum</structname> system
553569 catalog are currently not saved in database dumps created by
554570 the tools <command>pg_dump</command> and <command>pg_dumpall</command>.
555- If youneed to preserve them across a dump/reload cycle, make sure you
571+ If youwant to preserve them across a dump/reload cycle, make sure you
556572 dump the catalog manually.
557573 </para>
558- </note >
574+ </caution >
559575
560576 </sect2>
561577 </sect1>
@@ -571,8 +587,42 @@ vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuple
571587 <para>
572588 In some situations it is worthwhile to rebuild indexes periodically
573589 with the <command>REINDEX</> command.
574- However, <productname>PostgreSQL</> 7.4 has substantially reduced the need
575- for this activity compared to earlier releases.
590+ </para>
591+
592+ <para>
593+ In <productname>PostgreSQL</> releases before 7.4, periodic reindexing
594+ was frequently necessary to avoid <quote>index bloat</>, due to lack of
595+ internal space reclamation in btree indexes. Any situation in which the
596+ range of index keys changed over time — for example, an index on
597+ timestamps in a table where old entries are eventually deleted —
598+ would result in bloat, because index pages for no-longer-needed portions
599+ of the key range were not reclaimed for re-use. Over time, the index size
600+ could become indefinitely much larger than the amount of useful data in it.
601+ </para>
602+
603+ <para>
604+ In <productname>PostgreSQL</> 7.4 and later, index pages that have become
605+ completely empty are reclaimed for re-use. There is still a possibility
606+ for inefficient use of space: if all but a few index keys on a page have
607+ been deleted, the page remains allocated. So a usage pattern in which all
608+ but a few keys in each range are eventually deleted will see poor use of
609+ space. The potential for bloat is not indefinite — at worst there
610+ will be one key per page — but it may still be worthwhile to schedule
611+ periodic reindexing for indexes that have such usage patterns.
612+ </para>
613+
614+ <para>
615+ The potential for bloat in non-btree indexes has not been well
616+ characterized. It is a good idea to keep an eye on the index's physical
617+ size when using any non-btree index type.
618+ </para>
619+
620+ <para>
621+ Also, for btree indexes a freshly-constructed index is somewhat faster to
622+ access than one that has been updated many times, because logically
623+ adjacent pages are usually also physically adjacent in a newly built index.
624+ (This consideration does not currently apply to non-btree indexes.) It
625+ might be worthwhile to reindex periodically just to improve access speed.
576626 </para>
577627 </sect1>
578628