@@ -6,6 +6,8 @@ All work was done by Teodor Sigaev (teodor@stack.net) and Oleg Bartunov
66
77CHANGES:
88
9+ August 29, 2002
10+ Space usage and using CLUSTER command documented
911August 22, 2002
1012Fix works with 'bad' queries
1113August 13, 2002
@@ -286,8 +288,8 @@ is strongly depends on many factors (query, collection, dictionaries
286288and hardware).
287289
288290Collection is available for download from
289- http://www.sai.msu.su/~megera/postgres/gist/tsearch/
290- as mw_titles.gz ( about 3Mb).
291+ http://www.sai.msu.su/~megera/postgres/gist/tsearch/mw_titles.gz
292+ (377905 titles from postgresql mailing lists, about 3Mb).
291293
2922940. install contrib/tsearch module
2932951. createdb test
@@ -353,3 +355,61 @@ using gist indices (morph)
353355
354356There are no visible difference between these 2 cases but your
355357mileage may vary.
358+
359+
360+ NOTES:
361+
362+ 1. The size of txtidx column should be lesser than size of corresponding column.
363+ Below some real numbers from test database (link above).
364+
365+ a) After loading data
366+
367+ -rw------- 1 postgres users 23191552 Aug 29 14:08 53016937
368+ -rw------- 1 postgres users 81059840 Aug 29 14:08 52639027
369+
370+ Table titles (52639027) occupies 80Mb, index on txtidx column (53016937)
371+ occupies 22Mb. Use contrib/oid2name to get mappings from oid to names.
372+ After doing
373+
374+ test=# select title into titles_tmp from titles;
375+ SELECT
376+
377+ I got size of table 'titles' without txtidx field
378+
379+ -rw------- 1 postgres users 30105600 Aug 29 14:14 53016938
380+
381+ So, txtidx column itself occupies about 50Mb.
382+
383+ b) after running 'vacuum full analyze' I got:
384+
385+ -rw------- 1 postgres users 30105600 Aug 29 14:26 53016938
386+ -rw------- 1 postgres users 36880384 Aug 29 14:26 53016937
387+ -rw------- 1 postgres users 51494912 Aug 29 14:26 52639027
388+
389+ 53016938 = titles_tmp
390+
391+ So, actual size of 'txtidx' field is 20 Mb ! "quod erat demonstrandum"
392+
393+ 2. CLUSTER command is highly recommended if you need fast searching.
394+ For example:
395+
396+ test=# cluster t_idx on titles;
397+
398+ BUT ! In 7.2 CLUSTER command forgets about other indices and permissions,
399+ so you need be carefull and rebuild these indices and restore permissions
400+ after clustering. Also, clustering isn't dynamic, so you'd need to
401+ use CLUSTER from time to time. In 7.3 CLUSTER command should works
402+ fine.
403+
404+ after clustering:
405+
406+ -rw------- 1 postgres users 23404544 Aug 29 14:59 53394850
407+ -rw------- 1 postgres users 30105600 Aug 29 14:26 53016938
408+ -rw------- 1 postgres users 50995200 Aug 29 14:45 53394845
409+ pg@zen:/usr/local/pgsql/data/base/52638986$ oid2name -d test
410+ All tables from database "test":
411+ ---------------------------------
412+ 53394850 = t_idx
413+ 53394845 = titles
414+ 53016938 = titles_tmp
415+