Commit 1073123

Update docs to explain that 7.1 locks down LC_COLLATE and LC_CTYPE at
initdb time. A few copy-editing cleanups, too.

1 parent 671f798 commit 1073123

`doc/src/sgml/charset.sgml`

Lines changed: 52 additions & 45 deletions

Original file line number	Diff line number	Diff line change
`@@ -1,4 +1,4 @@`
`1`		`-<!-- $Header: /cvsroot/pgsql/doc/src/sgml/charset.sgml,v 2.5 2000/12/22 21:51:57 petere Exp $ -->`
	`1`	`+<!-- $Header: /cvsroot/pgsql/doc/src/sgml/charset.sgml,v 2.6 2001/01/19 04:47:50 tgl Exp $ -->`
`2`	`2`
`3`	`3`	`<chapter id="charset">`
`4`	`4`	`<title>Localization</>`
`@@ -54,15 +54,15 @@`
`54`	`54`	`cultural preferences regarding alphabets, sorting, number`
`55`	`55`	`formatting, etc. <productname>PostgreSQL</> uses the standard ISO`
`56`	`56`	`C and POSIX-like locale facilities provided by the server operating`
`57`		`- system. For additional information refer the documentation of your`
	`57`	`+ system. For additional information refer to the documentation of your`
`58`	`58`	`system.`
`59`	`59`	`</para>`
`60`	`60`
`61`	`61`	`<sect2>`
`62`	`62`	`<title>Overview</>`
`63`	`63`
`64`	`64`	`<para>`
`65`		`- Locale support is not build into <productname>PostgreSQL</> by`
	`65`	`+ Locale support is not built into <productname>PostgreSQL</> by`
`66`	`66`	`default; to enable it, supply the <option>--enable-locale</> option`
`67`	`67`	`to the <filename>configure</> script:`
`68`	`68`	`<informalexample>`
`@@ -95,7 +95,7 @@ export LANG=sv_SE`
`95`	`95`
`96`	`96`	`<para>`
`97`	`97`	`Occasionally it is useful to mix rules from several locales, e.g.,`
`98`		`- use U.S. rules but Spanish messages. To do that a set of`
	`98`	`+ use U.S. collation rules but Spanish messages. To do that a set of`
`99`	`99`	`environment variables exist that override the default of`
`100`	`100`	`<envar>LANG</> for a particular category:`
`101`	`101`
`@@ -141,14 +141,23 @@ export LANG=sv_SE`
`141`	`141`	`</para>`
`142`	`142`
`143`	`143`	`<para>`
`144`		`- Once you have chosen a set of localization rules this way you must`
`145`		`- keep them fixed for any particular database cluster. That means`
`146`		`- that the locales that were active when you ran <filename>initdb</>`
`147`		`- must be kept the same when you start the postmaster. Otherwise,`
`148`		`- the changed sort order can corrupt indexes or make your data`
`149`		`- disappear mysteriously. It is currently not possible to change the`
`150`		`- locales after database initialization or to use more than one set`
`151`		`- of locales for a given database cluster.`
	`144`	`+ Note that the locale behavior is determined by the environment`
	`145`	`+ variables seen by the server, not by the environment of any client.`
	`146`	`+ Therefore, be careful to set these variables before starting the`
	`147`	`+ postmaster.`
	`148`	`+ </para>`
	`149`	`+`
	`150`	`+ <para>`
	`151`	`+ The <envar>LC_COLLATE</> and <envar>LC_CTYPE</> variables affect the`
	`152`	`+ sort order of indexes. Therefore, these values must be kept fixed`
	`153`	`+ for any particular database cluster, or indexes on text columns will`
	`154`	`+ become corrupt. <productname>Postgres</productname> enforces this`
	`155`	`+ by recording the values of <envar>LC_COLLATE</> and <envar>LC_CTYPE</>`
	`156`	`+ that are seen by <command>initdb</>. The server automatically adopts`
	`157`	`+ those two values when it is started; only the other <envar>LC_</>`
	`158`	`+ categories can be set from the environment at server startup.`
	`159`	`+ In short, only one collation order can be used in a database cluster,`
	`160`	`+ and it is chosen at <command>initdb</> time.`
`152`	`161`	`</para>`
`153`	`162`	`</sect2>`
`154`	`163`
`@@ -183,7 +192,10 @@ export LANG=sv_SE`
`183`	`192`	`<para>`
`184`	`193`	`The only severe drawback of using the locale support in`
`185`	`194`	`<productname>PostgreSQL</> is its speed. So use locale only if you`
`186`		`- actually need it.`
	`195`	`+ actually need it. It should be noted in particular that selecting`
	`196`	`+ a non-C locale disables index optimizations for <literal>LIKE</> and`
	`197`	`+ <literal>~</> operators, which can make a huge difference in the`
	`198`	`+ speed of searches that use those operators.`
`187`	`199`	`</para>`
`188`	`200`	`</sect2>`
`189`	`201`
`@@ -261,7 +273,7 @@ perl: warning: Falling back to the standard locale ("C").`
`261`	`273`
`262`	`274`	`<para>`
`263`	`275`	`<acronym>MB</acronym> also fixes some problems concerning 8-bit single byte`
`264`		`- character sets including ISO8859. (I would not say all of problems`
	`276`	`+ character sets including ISO8859. (I would not say all problems`
`265`	`277`	`have been fixed. I just confirmed that the regression test ran fine`
`266`	`278`	`and a few French characters could be used with the patch. Please let`
`267`	`279`	`me know if you find any problem while using 8-bit characters.)`
`@@ -271,7 +283,7 @@ perl: warning: Falling back to the standard locale ("C").`
`271`	`283`	`<title>Enabling MB</title>`
`272`	`284`
`273`	`285`	`<para>`
`274`		`- Run configure with a multibyte option:`
	`286`	`+ Run configure with the multibyte option:`
`275`	`287`
`276`	`288`	`<programlisting>`
`277`	`289`	`% ./configure --enable-multibyte[=<replaceable>encoding_system</replaceable>]`
`@@ -383,11 +395,11 @@ perl: warning: Falling back to the standard locale ("C").`
`383`	`395`	`% initdb -E EUC_JP`
`384`	`396`	`</programlisting>`
`385`	`397`
`386`		`- sets the default encoding to EUC_JP(Extended Unix Code for Japanese).`
	`398`	`+ sets the default encoding to EUC_JP(Extended Unix Code for Japanese).`
`387`	`399`	`Note that you can use "--encoding" instead of "-E" if you prefer`
`388`	`400`	`to type longer option strings.`
`389`	`401`	`If no -E or --encoding option is given, the encoding`
`390`		`- specified at the compile time is used.`
	`402`	`+ specified at configure time is used.`
`391`	`403`	`</para>`
`392`	`404`
`393`	`405`	`<para>`
`@@ -397,8 +409,8 @@ perl: warning: Falling back to the standard locale ("C").`
`397`	`409`	`% createdb -E EUC_KR korean`
`398`	`410`	`</programlisting>`
`399`	`411`
`400`		`- will create a database named "korean" with EUC_KR encoding. The`
`401`		`-another way to accomplish this is to use a SQL command:`
	`412`	`+ will create a database named "korean" with EUC_KR encoding.`
	`413`	`+Another way to accomplish this is to use a SQL command:`
`402`	`414`
`403`	`415`	`<programlisting>`
`404`	`416`	`CREATE DATABASE korean WITH ENCODING = 'EUC_KR';`
`@@ -527,20 +539,11 @@ char *pg_encoding_to_char(int <replaceable>encoding_id</replaceable>)`
`527`	`539`	`</para>`
`528`	`540`	`</listitem>`
`529`	`541`
`530`		`- <listitem>`
`531`		`- <para>`
`532`		`-Using <envar>PGCLIENTENCODING</envar>.`
`533`		`-`
`534`		`-If an environment variable <envar>PGCLIENTENCODING</envar> is defined in the`
`535`		`-frontend, an automatic encoding translation is done by the backend.`
`536`		`- </para>`
`537`		`- </listitem>`
`538`		`-`
`539`	`542`	`<listitem>`
`540`	`543`	`<para>`
`541`	`544`	`Using <command>SET CLIENT_ENCODING TO</command>.`
`542`	`545`
`543`		`-Setting the frontend side encoding can be done a SQL command:`
	`546`	`+Setting the frontend side encoding can be done by this SQL command:`
`544`	`547`
`545`	`548`	`<programlisting>`
`546`	`549`	`SET CLIENT_ENCODING TO 'encoding';`
`@@ -552,7 +555,7 @@ SET CLIENT_ENCODING TO 'encoding';`
`552`	`555`	`SET NAMES 'encoding';`
`553`	`556`	`</programlisting>`
`554`	`557`
`555`		`-To query the current the frontend encoding:`
	`558`	`+To query the current frontend encoding:`
`556`	`559`
`557`	`560`	`<programlisting>`
`558`	`561`	`SHOW CLIENT_ENCODING;`
`@@ -565,6 +568,17 @@ RESET CLIENT_ENCODING;`
`565`	`568`	`</programlisting>`
`566`	`569`	`</para>`
`567`	`570`	`</listitem>`
	`571`	`+`
	`572`	`+ <listitem>`
	`573`	`+ <para>`
	`574`	`+Using <envar>PGCLIENTENCODING</envar>.`
	`575`	`+`
	`576`	`+If environment variable <envar>PGCLIENTENCODING</envar> is defined`
	`577`	`+in the client's environment, that client encoding is automatically`
	`578`	`+selected when a backend connection is made. (This can subsequently`
	`579`	`+be overridden using any of the other methods mentioned above.)`
	`580`	`+ </para>`
	`581`	`+ </listitem>`
`568`	`582`	`</itemizedlist>`
`569`	`583`	`</para>`
`570`	`584`	`</sect2>`
`@@ -588,7 +602,7 @@ RESET CLIENT_ENCODING;`
`588`	`602`	`<para>`
`589`	`603`	`Suppose you choose EUC_JP for the backend, LATIN1 for the frontend,`
`590`	`604`	`then some Japanese characters could not be translated into LATIN1. In`
`591`		`- this case, a letter cannot be represented in the LATIN1 character set,`
	`605`	`+ this case, a letter that cannot be represented in the LATIN1 character set`
`592`	`606`	`would be transformed as:`
`593`	`607`
`594`	`608`	`<programlisting>`
`@@ -601,7 +615,7 @@ RESET CLIENT_ENCODING;`
`601`	`615`	`<title>References</title>`
`602`	`616`
`603`	`617`	`<para>`
`604`		`- These are good sources to start learning various kind of encoding`
	`618`	`+ These are good sources to start learning about various kinds of encoding`
`605`	`619`	`systems.`
`606`	`620`
`607`	`621`	`<itemizedlist>`
`@@ -724,8 +738,7 @@ Mar 1, 1998 PL1 released`
`724`	`738`	`<para>`
`725`	`739`	`<!--`
`726`	`740`	`[Here is a good documentation explaining how to use WIN1250 on`
`727`		`-Windows/ODBC from Pavel Behal. Please note that Installation step 1)`
`728`		`-is not necceary in 6.5.1 - Tatsuo]`
	`741`	`+Windows/ODBC from Pavel Behal]`
`729`	`742`
`730`	`743`	`Version: 0.91 for PgSQL 6.5`
`731`	`744`	`Author: Pavel Behal`
`@@ -815,20 +828,14 @@ Sorry for my Eglish and C code, I'm not native :-)`
`815`	`828`	`<title>WIN1250 on Windows/ODBC</title>`
`816`	`829`	`<step>`
`817`	`830`	`<para>`
`818`		`- Change the three relevant files in the source directories.`
`819`		`- </para>`
`820`		`- </step>`
`821`		`-`
`822`		`- <step>`
`823`		`- <para>`
`824`		`- Compile <productname>Postgres</productname> with local enabled`
	`831`	`+ Compile <productname>Postgres</productname> with locale enabled`
`825`	`832`	`and the multibyte encoding set to <literal>LATIN2</literal>.`
`826`	`833`	`</para>`
`827`	`834`	`</step>`
`828`	`835`
`829`	`836`	`<step>`
`830`	`837`	`<para>`
`831`		`- Set up your instalation. Do not forget to create locale`
	`838`	`+ Set up your installation. Do not forget to create locale`
`832`	`839`	`variables in your profile (environment). For example (this may`
`833`	`840`	`not be correct for <emphasis>your</emphasis> environment):`
`834`	`841`
`@@ -936,16 +943,16 @@ HostCharset <replaceable>host_spec</> <replaceable>host_charset</>`
`936`	`943`	`<para>`
`937`	`944`	`The <filename>charset.conf</> file is always processed up to the`
`938`	`945`	`end, so you can easily specify exceptions from the previous`
`939`		`- rules. In the src/data you will find charset.conf example and a few`
`940`		`- recoding tables.`
	`946`	`+ rules. In the <filename>src/data/</> directory you will find an`
	`947`	`+example <filename>charset.conf</> and a few recoding tables.`
`941`	`948`	`</para>`
`942`	`949`
`943`	`950`	`<para>`
`944`	`951`	`As this solution is based on the client's IP address and character`
`945`	`952`	`set mapping there are obviously some restrictions as well. You`
`946`	`953`	`cannot use different encodings on the same host at the same`
`947`	`954`	`time. It is also inconvenient when you boot your client hosts into`
`948`		`-more operating systems. Nevertheless, when these restrictions are`
	`955`	`+multiple operating systems. Nevertheless, when these restrictions are`
`949`	`956`	`not limiting and you do not need multi-byte characters than it is a`
`950`	`957`	`simple and effective solution.`
`951`	`958`	`</para>`
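
The new text in this commit says that changing `LC_COLLATE` after `initdb` corrupts indexes on text columns. The sketch below is only an illustration of why, not PostgreSQL code: a sorted Python list stands in for an index, and two hypothetical key functions (`collate_c` and `collate_ci`, both invented here) stand in for two collation orders. Looking up values with a different ordering than the one the list was built under makes some entries unfindable, even though they are still present.

```python
import bisect

def collate_c(s):
    # Stand-in for the "C" locale: plain code-point order (assumption for illustration).
    return s

def collate_ci(s):
    # Stand-in for a language-aware locale: case-insensitive order,
    # with ties broken by code point (assumption for illustration).
    return (s.lower(), s)

def lookup(index, key_fn, word):
    # Binary search, which is only valid if `index` is sorted under `key_fn`.
    keys = [key_fn(w) for w in index]
    i = bisect.bisect_left(keys, key_fn(word))
    return i < len(index) and index[i] == word

words = ["apple", "Banana", "cherry", "Zebra"]

# Build the "index" under the C ordering: uppercase letters sort first.
index = sorted(words, key=collate_c)

# Searching with the same ordering finds every entry.
found_same = [w for w in words if lookup(index, collate_c, w)]

# Searching with a different ordering silently misses entries.
found_other = [w for w in words if lookup(index, collate_ci, w)]

print(index)
print(found_same)
print(found_other)
```

Fixing the ordering at build time, as the commit describes for `initdb`, avoids this mismatch by construction.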
