NotificationsYou must be signed in to change notification settings
Fork5
Star27

Commit10c70b8

committed

Add note about space usage of 'manual' approach to clustering, per

suggestion from Sergey Koposov. Also some other minor editing.

1 parent6fada49 commit10c70b8Copy full SHA for 10c70b8

File tree

1 file changed

+20

-17

lines changed

doc/src/sgml/ref
- cluster.sgml

1 file changed

+20

-17

lines changed

`‎doc/src/sgml/ref/cluster.sgml‎`

Lines changed: 20 additions & 17 deletions

Original file line number	Diff line number	Diff line change
`@@ -1,5 +1,5 @@`
`1`	`1`	`<!--`
`2`		`-$PostgreSQL: pgsql/doc/src/sgml/ref/cluster.sgml,v 1.37 2006/10/31 01:52:31 neilc Exp $`
	`2`	`+$PostgreSQL: pgsql/doc/src/sgml/ref/cluster.sgml,v 1.38 2006/11/04 19:03:51 tgl Exp $`
`3`	`3`	`PostgreSQL documentation`
`4`	`4`	`-->`
`5`	`5`
`@@ -108,8 +108,8 @@ CLUSTER`
`108`	`108`	`If you are requesting a range of indexed values from a table, or a`
`109`	`109`	`single indexed value that has multiple rows that match,`
`110`	`110`	`<command>CLUSTER</command> will help because once the index identifies the`
`111`		`-heap page for the first row that matches, all other rows`
`112`		`- that match are probably already on the sameheap page,`
	`111`	`+table page for the first row that matches, all other rows`
	`112`	`+ that match are probably already on the sametable page,`
`113`	`113`	`and so you save disk accesses and speed up the query.`
`114`	`114`	`</para>`
`115`	`115`
`@@ -137,30 +137,33 @@ CLUSTER`
`137`	`137`
`138`	`138`	`<para>`
`139`	`139`	`There is another way to cluster data. The`
`140`		`- <command>CLUSTER</command> command reorders the original tableusing`
`141`		`-the ordering of the index you specify. This can be slow`
`142`		`- on large tables because the rows are fetched from theheap`
`143`		`- in index order, and if theheaptable isunordered, the`
	`140`	`+ <command>CLUSTER</command> command reorders the original tableby`
	`141`	`+scanning it using the index you specify. This can be slow`
	`142`	`+ on large tables because the rows are fetched from thetable`
	`143`	`+ in index order, and if the table isdisordered, the`
`144`	`144`	`entries are on random pages, so there is one disk page`
`145`		`- retrieved for every row moved. (<productname>PostgreSQL</productname> has a cache,`
`146`		`- but the majority of a big table will not fit in the cache.)`
	`145`	`+ retrieved for every row moved. (<productname>PostgreSQL</productname> has`
	`146`	`+a cache,but the majority of a big table will not fit in the cache.)`
`147`	`147`	`The other way to cluster a table is to use`
`148`	`148`
`149`	`149`	`<programlisting>`
`150`	`150`	`CREATE TABLE <replaceable class="parameter">newtable</replaceable> AS`
`151`		`- SELECT<replaceable class="parameter">columnlist</replaceable> FROM <replaceable class="parameter">table</replaceable> ORDER BY <replaceable class="parameter">columnlist</replaceable>;`
	`151`	`+ SELECT* FROM <replaceable class="parameter">table</replaceable> ORDER BY <replaceable class="parameter">columnlist</replaceable>;`
`152`	`152`	`</programlisting>`
`153`	`153`
`154`		`- which uses the <productname>PostgreSQL</productname> sorting code in`
`155`		`-the <literal>ORDER BY</literal> clausetocreate the desired order; this is usually much`
`156`		`- faster than an index scan for`
`157`		`-unordered data. You then drop the old table, use`
	`154`	`+ which uses the <productname>PostgreSQL</productname> sorting code`
	`155`	`+ toproduce the desired order;`
	`156`	`+this is usually muchfaster than an index scan for disordered data.`
	`157`	`+Then you drop the old table, use`
`158`	`158`	`<command>ALTER TABLE ... RENAME</command>`
`159`		`- to rename <replaceable class="parameter">newtable</replaceable> to the old name, and`
`160`		`- recreate the table's indexes. However, this approach does not preserve`
	`159`	`+ to rename <replaceable class="parameter">newtable</replaceable> to the`
	`160`	`+ old name, and recreate the table's indexes.`
	`161`	`+ The big disadvantage of this approach is that it does not preserve`
`161`	`162`	`OIDs, constraints, foreign key relationships, granted privileges, and`
`162`	`163`	`other ancillary properties of the table — all such items must be`
`163`		`- manually recreated.`
	`164`	`+ manually recreated. Another disadvantage is that this way requires a sort`
	`165`	`+ temporary file about the same size as the table itself, so peak disk usage`
	`166`	`+ is about three times the table size instead of twice the table size.`
`164`	`167`	`</para>`
`165`	`168`	`</refsect1>`
`166`	`169`

0 commit comments

Comments

(0)

Movatterモバイル変換

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Commit10c70b8

File tree

1 file changed

1 file changed

`‎doc/src/sgml/ref/cluster.sgml‎`

0 commit comments