@@ -2,74 +2,105 @@ pg_autovacuum README
22--------------------
33
44pg_autovacuum is a libpq client program that monitors all the
5- databases associated with apostgresql server. It uses thestats
5+ databases associated with aPostgreSQL server. It uses thestatistics
66collector to monitor insert, update and delete activity.
77
8- When a table exceeds its insert or delete threshold (more detail
9- on thresholds below) then that table will be vacuumed or analyzed.
8+ When a table exceeds a insert or delete threshold (for more detail on
9+ thresholds, see "Vacuum and Analyze" below) then that table will be
10+ vacuumed and/or analyzed.
1011
11- This allows postgresql to keep the fsm and table statistics up to
12- date, and eliminates the need to schedule periodic vacuums.
12+ This allows PostgreSQL to keep the FSM (Free Space Map) and table
13+ statistics up to date, and eliminates the need to schedule periodic
14+ vacuums.
1315
1416The primary benefit of pg_autovacuum is that the FSM and table
15- statistic information are updated as needed. When a table is actively
16- changing, pg_autovacuum will perform the necessary vacuums and
17- analyzes, whereas if a table remains static, no cycles will be wasted
18- performing unnecessary vacuums/analyzes.
17+ statistic information are updated more nearly as frequently as needed.
18+ When a table is actively changing, pg_autovacuum will perform the
19+ VACUUMs and ANALYZEs that such a table needs, whereas if a table
20+ remains static, no cycles will be wasted performing this
21+ unnecessarily.
1922
2023A secondary benefit of pg_autovacuum is that it ensures that a
21- database wide vacuum is performed prior toxid wraparound. This is an
24+ database wide vacuum is performed prior toXID wraparound. This is an
2225important, if rare, problem, as failing to do so can result in major
23- data loss.
24-
26+ data loss. (See the section in the _Administrator's Guide_ entitled
27+ "Preventing transaction ID wraparound failures" for more details.)
2528
2629KNOWN ISSUES:
2730-------------
28- pg_autovacuum has been tested under Redhat Linux (by me) and Solaris (by
29- Christopher B. Browne) and all known bugs have been resolved. Please report
30- any problems to the hackers list.
31-
32- pg_autovacuum does not get started automatically by either the postmaster or
33- by pg_ctl. Along the sames lines, when the postmaster exits no one tells
34- pg_autovacuum. The result is that at the start of the next loop,
35- pg_autovacuum fails to connect to the server and exits. Any time it fails
36- to connect pg_autovacuum exits.
37-
38- pg_autovacuum requires that the stats system be enabled and reporting row
39- level stats. The overhead of the stats system has been shown to be
40- significant under certain workloads. For instance a tight loop of queries
41- performing "select 1" was nearly 30% slower with stats enabled. However,
42- in practice with more realistic workloads, the stats system overhead is
43- usually nominal.
4431
32+ pg_autovacuum has been tested under Redhat Linux (by me) and Debian
33+ GNU/Linux and Solaris (by Christopher B. Browne) and all known bugs
34+ have been resolved. Please report any problems to the hackers list.
35+
36+ pg_autovacuum requires that the statistics system be enabled and
37+ reporting row level stats. The overhead of the stats system has been
38+ shown to be significant costly under certain workloads. For instance,
39+ a tight loop of queries performing "select 1" was found to run nearly
40+ 30% slower when stats were enabled. However, in practice, with more
41+ realistic workloads, the stats system overhead is usually nominal.
42+
43+ pg_autovacuum does not get started automatically by either the
44+ postmaster or by pg_ctl. Similarly, when the postmaster exits, no one
45+ tells pg_autovacuum. The result of that is that at the start of the
46+ next loop, pg_autovacuum will fail to connect to the server and
47+ exit(). Any time it fails to connect pg_autovacuum exit()s.
48+
49+ While pg_autovacuum can manage vacuums for as many databases as you
50+ may have tied to a particular PostgreSQL postmaster, it can only
51+ connect to a single PostgreSQL postmaster. Thus, if you have multiple
52+ postmasters on a particular host, you will need multiple pg_autovacuum
53+ instances, and they have no way, at present, to coordinate between one
54+ another to ensure that they do not concurrently vacuum big tables.
55+
56+ TODO:
57+ -----
58+
59+ At present, there are no sample scripts to automatically start up
60+ pg_autovacuum along with the database. It would be desirable to have
61+ a SysV script to start up pg_autovacuum after PostgreSQL has been
62+ started.
63+
64+ Some users have expressed interest in making pg_autovacuum more
65+ configurable so that certain tables known to be inactive could be
66+ excluded from being vacuumed. It would probably make sense to
67+ introduce this sort of functionality by providing arguments to specify
68+ the database and schema in which to find a configuration table.
4569
4670INSTALL:
4771--------
4872
49- As of postgresql v7.4 pg_autovacuum is included in the main source tree
50- under contrib. Therefore you just make && make install (similar to most other
51- contrib modules) and it will be installed for you.
73+ As of postgresql v7.4 pg_autovacuum is included in the main source
74+ tree under contrib. Therefore you merely need to "make && make
75+ install" (similar to most other contrib modules) and it will be
76+ installed for you.
5277
53- If you are using an earlier version ofpostgresql just uncompress the tar.gz
54- into the contrib directory and modify the contrib/Makefile to include the pg_autovacuum
55- directory. pg_autovacuum will then be made as part of the standard
56- postgresql install.
78+ If you are using an earlier version ofPostgreSQL, uncompress the
79+ tar.gz file into the contrib directory and modify the contrib/Makefile
80+ to include the pg_autovacuum directory. pg_autovacuum will then be
81+ built as part of the standard postgresql install.
5782
58- make sure that thefolowing are set in postgresql.conf
83+ make sure that thefollowing are set in postgresql.conf:
5984
6085 stats_start_collector = true
6186 stats_row_level = true
6287
63- start up the postmaster, then execute the pg_autovacuum executable.
88+ Start up the postmaster, then execute the pg_autovacuum executable.
89+
90+ If you have a script that automatically starts up the PostgreSQL
91+ instance, you might add in, after that, something similar to the
92+ following:
6493
94+ sleep 10 # To give the database some time to start up
95+ $PGBINS/pg_autovacuum -D -s $SBASE -S $SSCALE ... [other arguments]
6596
6697Command line arguments:
6798-----------------------
6899
69100pg_autovacuum has the following optional arguments:
70101
71102-d debug: 0 silent, 1 basic info, 2 more debug info, etc...
72- -Ddameonize : Detach from tty and run in background.
103+ -Ddaemonize : Detach from tty and run in background.
73104-s sleep base value: see "Sleeping" below.
74105-S sleep scaling factor: see "Sleeping" below.
75106-v vacuum base threshold: see Vacuum and Analyze.
@@ -80,32 +111,33 @@ pg_autovacuum has the following optional arguments:
80111-U username: Username pg_autovacuum will use to connect with, if not
81112 specified the current username is used.
82113-P password: Password pg_autovacuum will use to connect with.
83- -H host: host name or IP to connecttoo .
114+ -H host: host name or IP to connectto .
84115-p port: port used for connection.
85116-h help: list of command line options.
86117
87- All arguments have default values defined in pg_autovacuum.h. At the
88- time of writing they are:
118+ Numerous arguments have default values defined in pg_autovacuum.h. At
119+ the time of writing they are:
89120
90121-d 1
91122-v 1000
92123-V 2
93- -a 500 (half of -vis not specified)
94- -A 1 (half of -vis not specified)
124+ -a 500 (half of -vif not specified)
125+ -A 1 (half of -vif not specified)
95126-s 300 (5 minutes)
96127-S 2
97128
98129
99130Vacuum and Analyze:
100131-------------------
101132
102- pg_autovacuum performs either a vacuum analyze or just analyze depending
103- on the quantity and type of table activity (insert, update, or delete):
133+ pg_autovacuum performs either a VACUUM ANALYZE or just ANALYZE
134+ depending on the mixture of table activity (insert, update, or
135+ delete):
104136
105137- If the number of (inserts + updates + deletes) > AnalyzeThreshold, then
106138 only an analyze is performed.
107139
108- - If the number of (deletes + updates ) > VacuumThreshold, then a
140+ - If the number of (deletes + updates) > VacuumThreshold, then a
109141 vacuum analyze is performed.
110142
111143deleteThreshold is equal to:
@@ -115,42 +147,45 @@ insertThreshold is equal to:
115147 analyze_base_value + (analyze_scaling_factor * "number of tuples in the table")
116148
117149The AnalyzeThreshold defaults to half of the VacuumThreshold since it
118- represents a much less expensive operation (approx 5%-10% of vacuum), and
119- running it more often should not substantially degrade system performance.
150+ represents a much less expensive operation (approx 5%-10% of vacuum),
151+ and running ANALYZE more often should not substantially degrade system
152+ performance.
120153
121154Sleeping:
122155---------
123156
124157pg_autovacuum sleeps for a while after it is done checking all the
125158databases. It does this in order to limit the amount of system
126- resources it consumes. Thisalso allows the system administrator to
159+ resources it consumes. This allows the system administrator to
127160configure pg_autovacuum to be more or less aggressive.
128161
129162Reducing the sleep time will cause pg_autovacuum to respond more
130163quickly to changes, whether they be database addition/removal, table
131164addition/removal, or just normal table activity.
132165
133- On the other hand, settingpg_autovaccum to sleep valuesto agressivly
134- (for too shorta period of time) can have a negative effect on server
135- performance.If a table gets vacuumed 5 times during the course of a
136- largeupdate , this is likely to takemuch longer than if the table was
137- vacuumedonly once, at the end.
166+ On the other hand, settingpg_autovacuum to sleep valuestoo
167+ aggressively (to too shortperiods of time) can have a negative effect
168+ on server performance.For instance, if a table gets vacuumed 5 times
169+ during the course of a largeset of updates , this is likely to takea
170+ lot more work than if the table was vacuumedjust once, at the end.
138171
139172The total time it sleeps is equal to:
140173
141174 base_sleep_value + sleep_scaling_factor * "duration of the previous
142175 loop"
143176
144177Note that timing measurements are made in seconds; specifying
145- "pg_vacuum -s 1" means pg_autovacuum could poll the database upto 60 times
146- minute. In a system with large tables where vacuums may run for several
147- minutes, longer times between vacuums are likely to be appropriate.
178+ "pg_vacuum -s 1" means pg_autovacuum could poll the database up to 60
179+ times minute. In a system with large tables where vacuums may run for
180+ several minutes, rather longer times between vacuums are likely to be
181+ appropriate.
148182
149183What pg_autovacuum monitors:
150184----------------------------
151185
152- pg_autovacuum dynamically generates a list of all databases and tables that
153- exist on the server. It will dynamically add and remove databases and
154- tables that are removed from the database server while pg_autovacuum is
155- running. Overhead is fairly small per object. For example: 10 databases
156- with 10 tables each appears to less than 10k of memory on my Linux box.
186+ pg_autovacuum dynamically generates a list of all databases and tables
187+ that exist on the server. It will dynamically add and remove
188+ databases and tables that are removed from the database server while
189+ pg_autovacuum is running. Overhead is fairly small per object. For
190+ example: 10 databases with 10 tables each appears to less than 10k of
191+ memory on my Linux box.