1+ <!-- doc/src/sgml/shared-ispell.sgml -->
2+
3+ <sect1 id="shared-ispell" xreflabel="shared_ispell">
4+ <title>shared_ispell</title>
5+
6+ <indexterm zone="shared-ispell">
7+ <primary>shared_ispell</primary>
8+ </indexterm>
9+
10+ <para>
11+ The <filename>shared_ispell</filename> module provides a shared ispell
12+ dictionary, i.e. a dictionary that is stored in a shared memory segment. With the
13+ traditional ispell implementation, each session initializes and stores the
14+ dictionary on its own, which wastes a lot of CPU time and memory.
15+ </para>
16+
17+ <para>
18+ This extension allocates an area in the shared memory segment (you have to
19+ choose the size in advance) and then loads the dictionary into it when it is
20+ used for the first time.
21+ </para>
22+
23+ <sect2>
24+ <title>Functions</title>
25+
26+ <para>
27+ The functions provided by the <filename>shared_ispell</filename> module
28+ are shown in <xref linkend="shared-ispell-func-table">.
29+ </para>
30+
31+ <table id="shared-ispell-func-table">
32+ <title><filename>shared_ispell</filename> Functions</title>
33+ <tgroup cols="3">
34+ <thead>
35+ <row>
36+ <entry>Function</entry>
37+ <entry>Returns</entry>
38+ <entry>Description</entry>
39+ </row>
40+ </thead>
41+
42+ <tbody>
43+ <row>
44+ <entry><function>shared_ispell_reset()</function><indexterm><primary>shared_ispell_reset</primary></indexterm></entry>
45+ <entry><type>void</type></entry>
46+ <entry>
47+ Resets the dictionaries (e.g. so that you can reload the updated files
48+ from disk). The sessions that already use the dictionaries will be forced
49+ to reinitialize them.
50+ </entry>
51+ </row>
52+ <row>
53+ <entry><function>shared_ispell_mem_used()</function><indexterm><primary>shared_ispell_mem_used</primary></indexterm></entry>
54+ <entry><type>int</type></entry>
55+ <entry>
56+ Returns the amount of memory in the shared segment, in bytes, that is used
57+ by the loaded shared dictionaries.
58+ </entry>
59+ </row>
60+ <row>
61+ <entry><function>shared_ispell_mem_available()</function><indexterm><primary>shared_ispell_mem_available</primary></indexterm></entry>
62+ <entry><type>int</type></entry>
63+ <entry>
64+ Returns the amount of memory still available in the shared segment.
65+ </entry>
66+ </row>
67+ <row>
68+ <entry><function>shared_ispell_dicts()</function><indexterm><primary>shared_ispell_dicts</primary></indexterm></entry>
69+ <entry><type>setof(dict_name varchar, affix_name varchar, words int, affixes int, bytes int)</type></entry>
70+ <entry>
71+ Returns a list of dictionaries loaded in the shared segment.
72+ </entry>
73+ </row>
74+ <row>
75+ <entry><function>shared_ispell_stoplists()</function><indexterm><primary>shared_ispell_stoplists</primary></indexterm></entry>
76+ <entry><type>setof(stop_name varchar, words int, bytes int)</type></entry>
77+ <entry>
78+ Returns a list of stop word lists loaded in the shared segment.
79+ </entry>
80+ </row>
81+ </tbody>
82+ </tgroup>
83+ </table>
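+
+ <para>
+ As a rough illustration, the functions listed above can be called like any
+ other SQL function. The sketch below assumes that the module's SQL objects are
+ already installed in the current database and that the library has been
+ preloaded; the output depends on which dictionaries have been loaded, so none
+ is shown:
+
+ <programlisting>
+ -- memory used by the loaded dictionaries, and memory still free
+ SELECT shared_ispell_mem_used(), shared_ispell_mem_available();
+
+ -- dictionaries and stop word lists currently held in the shared segment
+ SELECT * FROM shared_ispell_dicts();
+ SELECT * FROM shared_ispell_stoplists();
+ </programlisting>
+ </para>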
84+ </sect2>
85+
86+ <sect2>
87+ <title>GUC Parameters</title>
88+
89+ <variablelist>
90+ <varlistentry id="guc-shared-ispell-max-size" xreflabel="shared_ispell.max_size">
91+ <term>
92+ <varname>shared_ispell.max_size</> (<type>int</type>)
93+ <indexterm>
94+ <primary><varname>shared_ispell.max_size</> configuration parameter</primary>
95+ </indexterm>
96+ </term>
97+ <listitem>
98+ <para>
99+ Defines the maximum size of the shared segment. This is a hard limit: the
100+ shared segment is not extensible, so you need to set it large enough that all
101+ the dictionaries fit into it, but without wasting much memory.
102+ </para>
103+ </listitem>
104+ </varlistentry>
105+ </variablelist>
106+ </sect2>
107+
108+ <sect2>
109+ <title>Using the dictionary</title>
110+
111+ <para>
112+ The module needs to allocate space in the shared memory segment, so add the
113+ following to the configuration file (or update the current values):
114+
115+ <programlisting>
116+ # libraries to load
117+ shared_preload_libraries = 'shared_ispell'
118+
119+ # config of the shared memory
120+ shared_ispell.max_size = 32MB
121+ </programlisting>
122+ </para>
123+
124+ <para>
125+ To find out how much memory you actually need, start with a large value (e.g.
126+ 200MB) and load all the dictionaries you want to use. Then call the
127+ <function>shared_ispell_mem_used()</function> function to see how much
128+ memory was actually used, and set the <varname>shared_ispell.max_size</varname>
129+ parameter accordingly.
130+ </para>
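+
+ <para>
+ A minimal sketch of this measurement follows. It assumes a dictionary named
+ <literal>english_shared</literal> has already been defined on the
+ <literal>shared_ispell</literal> template; the name is only an example. The
+ call to <function>ts_lexize</function> is there only to make sure the
+ dictionary has been used at least once, and therefore loaded, before the
+ memory is measured:
+
+ <programlisting>
+ -- use the dictionary once so that it is loaded into the shared segment
+ SELECT ts_lexize('english_shared', 'abilities');
+
+ -- memory consumed so far, raw and in a human-readable form
+ SELECT shared_ispell_mem_used() AS used_bytes,
+        pg_size_pretty(shared_ispell_mem_used()::bigint) AS used_pretty;
+ </programlisting>
+ </para>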
131+
132+ <para>
133+ Don't set it exactly to that value; leave some free space so that you can
134+ reload the dictionaries without changing the <varname>shared_ispell.max_size</varname>
135+ limit (which requires a restart of the database). Something like 512kB should be just fine.
136+ </para>
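+
+ <para>
+ Reloading updated dictionary files might then look like the sketch below. It
+ assumes, based on the function descriptions, that a dictionary is loaded again
+ the next time it is used after a reset; <literal>english_shared</literal> is
+ again only an example name:
+
+ <programlisting>
+ -- drop the dictionaries currently held in the shared segment
+ SELECT shared_ispell_reset();
+
+ -- use the dictionary again so that the updated files are read from disk
+ SELECT ts_lexize('english_shared', 'abilities');
+
+ -- check that the reloaded dictionaries still fit with some room to spare
+ SELECT shared_ispell_mem_used(), shared_ispell_mem_available();
+ </programlisting>
+ </para>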
137+
138+ <para>
139+ The extension defines a <literal>shared_ispell</literal> template that you
140+ can use to define custom dictionaries, for example:
141+
142+ <programlisting>
143+ CREATE TEXT SEARCH DICTIONARY english_shared (
144+ TEMPLATE = shared_ispell,
145+ DictFile = en_us,
146+ AffFile = en_us,
147+ StopWords = english
148+ );
149+
150+ CREATE TEXT SEARCH CONFIGURATION public.english_shared
151+ ( COPY = pg_catalog.simple );
152+
153+ ALTER TEXT SEARCH CONFIGURATION english_shared
154+ ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
155+ word, hword, hword_part
156+ WITH english_shared, english_stem;
157+ </programlisting>
158+ </para>
159+
160+ <para>
161+ You can then test the created configuration:
162+
163+ <programlisting>
164+ SELECT * FROM ts_debug('english_shared', 'abilities');
165+ alias | description | token | dictionaries | dictionary | lexemes
166+ -----------+-----------------+-----------+-------------------------------+----------------+-----------
167+ asciiword | Word, all ASCII | abilities | {english_shared,english_stem} | english_shared | {ability}
168+ (1 row)
169+ </programlisting>
170+ </para>
171+
172+ <para>
173+ Alternatively, you can update an existing text search configuration. For example,
174+ if you have a <literal>public.english</literal> configuration, you can change its
175+ mappings to use the dictionary built on the <literal>shared_ispell</literal> template:
176+
177+ <programlisting>
178+ ALTER TEXT SEARCH CONFIGURATION public.english
179+ ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
180+ word, hword, hword_part
181+ WITH english_shared, english_stem;
182+ </programlisting>
183+ </para>
184+
185+ </sect2>
186+
187+ <sect2>
188+ <title>Author</title>
189+
190+ <para>
191+ Tomas Vondra <email>tomas.vondra@2ndquadrant.com</email>, Prague, Czech Republic
192+ </para>
193+ </sect2>
194+
195+ </sect1>