locale --- Internationalization services

Source code: Lib/locale.py


The locale module opens access to the POSIX locale database and functionality. The POSIX locale mechanism allows programmers to deal with certain cultural issues in an application, without requiring the programmer to know all the specifics of each country where the software is executed.

The locale module is implemented on top of the _locale module, which in turn uses an ANSI C locale implementation if available.

The locale module defines the following exception and functions:

exception locale.Error

Exception raised when the locale passed to setlocale() is not recognized.

locale.setlocale(category, locale=None)

If locale is given and not None, setlocale() modifies the locale setting for the category. The available categories are listed in the data description below. locale may be a string, or an iterable of two strings (language code and encoding). If it's an iterable, it's converted to a locale name using the locale aliasing engine. An empty string specifies the user's default settings. If the modification of the locale fails, the exception Error is raised. If successful, the new locale setting is returned.

If locale is omitted or None, the current setting for category is returned.

setlocale() is not thread-safe on most systems. Applications typically start with a call of

    import locale
    locale.setlocale(locale.LC_ALL, '')

This sets the locale for all categories to the user's default setting (typically specified in the LANG environment variable). If the locale is not changed thereafter, using multithreading should not cause problems.
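The startup call can fail when the environment names a locale that the system does not provide. The following is a minimal sketch of a guarded startup call; `init_locale` is a hypothetical helper name, not part of the locale module.

```python
import locale


def init_locale():
    """Switch to the user's default locale, keeping the current one on failure.

    `init_locale` is a hypothetical helper, not part of the locale module.
    """
    try:
        # An empty string selects the user's default settings.
        return locale.setlocale(locale.LC_ALL, '')
    except locale.Error:
        # The requested locale is not recognized on this system;
        # report the setting that is still in effect.
        return locale.setlocale(locale.LC_ALL)


current = init_locale()
print(current)
```

Calling this once, early, and before any threads are started matches the thread-safety advice above.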

locale.localeconv()

Returns the database of the local conventions as a dictionary. This dictionary has the following strings as keys:

| Category | Key | Meaning |
|---|---|---|
| LC_NUMERIC | 'decimal_point' | Decimal point character. |
| | 'grouping' | Sequence of numbers specifying at which relative positions the 'thousands_sep' is expected. If the sequence is terminated with CHAR_MAX, no further grouping is performed. If the sequence terminates with a 0, the last group size is repeatedly used. |
| | 'thousands_sep' | Character used between groups. |
| LC_MONETARY | 'int_curr_symbol' | International currency symbol. |
| | 'currency_symbol' | Local currency symbol. |
| | 'p_cs_precedes/n_cs_precedes' | Whether the currency symbol precedes the value (for positive and negative values, respectively). |
| | 'p_sep_by_space/n_sep_by_space' | Whether the currency symbol is separated from the value by a space (for positive and negative values, respectively). |
| | 'mon_decimal_point' | Decimal point used for monetary values. |
| | 'frac_digits' | Number of fractional digits used in local formatting of monetary values. |
| | 'int_frac_digits' | Number of fractional digits used in international formatting of monetary values. |
| | 'mon_thousands_sep' | Group separator used for monetary values. |
| | 'mon_grouping' | Equivalent to 'grouping', used for monetary values. |
| | 'positive_sign' | Symbol used to annotate a positive monetary value. |
| | 'negative_sign' | Symbol used to annotate a negative monetary value. |
| | 'p_sign_posn/n_sign_posn' | The position of the sign (for positive and negative values, respectively), see below. |

All numeric values can be set to CHAR_MAX to indicate that there is no value specified in this locale.

The possible values for 'p_sign_posn' and 'n_sign_posn' are given below.

| Value | Explanation |
|---|---|
| 0 | Currency and value are surrounded by parentheses. |
| 1 | The sign should precede the value and currency symbol. |
| 2 | The sign should follow the value and currency symbol. |
| 3 | The sign should immediately precede the value. |
| 4 | The sign should immediately follow the value. |
| CHAR_MAX | Nothing is specified in this locale. |

The function temporarily sets the LC_CTYPE locale to the LC_NUMERIC locale or the LC_MONETARY locale if locales are different and numeric or monetary strings are non-ASCII. This temporary change affects other threads.

Changed in version 3.7: The function now temporarily sets the LC_CTYPE locale to the LC_NUMERIC locale in some cases.
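As an illustration of reading the localeconv() dictionary, here is a small sketch that uses the portable 'C' locale so the values are predictable. The helper `describe` is hypothetical; it only maps CHAR_MAX to None to make "not specified" explicit.

```python
import locale

# Use the portable 'C' locale so the values below are predictable.
locale.setlocale(locale.LC_ALL, 'C')
conv = locale.localeconv()

print(repr(conv['decimal_point']))  # '.' in the 'C' locale
print(repr(conv['thousands_sep']))  # '' in the 'C' locale


def describe(value):
    """Return None for CHAR_MAX ("not specified"); hypothetical helper."""
    return None if value == locale.CHAR_MAX else value


# The 'C' locale specifies no monetary conventions, so this prints None.
print(describe(conv['frac_digits']))
```

Under any other locale the same keys are present, but their values depend on the platform's locale database.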

locale.nl_langinfo(option)

Return some locale-specific information as a string. This function is not available on all systems, and the set of possible options might also vary across platforms. The possible argument values are numbers, for which symbolic constants are available in the locale module.

The nl_langinfo() function accepts one of the following keys. Most descriptions are taken from the corresponding description in the GNU C library.

locale.CODESET

Get a string with the name of the character encoding used in the selected locale.

locale.D_T_FMT

Get a string that can be used as a format string for time.strftime() to represent date and time in a locale-specific way.

locale.D_FMT

Get a string that can be used as a format string for time.strftime() to represent a date in a locale-specific way.

locale.T_FMT

Get a string that can be used as a format string for time.strftime() to represent a time in a locale-specific way.

locale.T_FMT_AMPM

Get a format string for time.strftime() to represent time in the am/pm format.

locale.DAY_1
locale.DAY_2
locale.DAY_3
locale.DAY_4
locale.DAY_5
locale.DAY_6
locale.DAY_7

Get the name of the n-th day of the week.

Note

This follows the US convention of DAY_1 being Sunday, not the international convention (ISO 8601) that Monday is the first day of the week.

locale.ABDAY_1
locale.ABDAY_2
locale.ABDAY_3
locale.ABDAY_4
locale.ABDAY_5
locale.ABDAY_6
locale.ABDAY_7

Get the abbreviated name of the n-th day of the week.

locale.MON_1
locale.MON_2
locale.MON_3
locale.MON_4
locale.MON_5
locale.MON_6
locale.MON_7
locale.MON_8
locale.MON_9
locale.MON_10
locale.MON_11
locale.MON_12

Get the name of the n-th month.

locale.ABMON_1
locale.ABMON_2
locale.ABMON_3
locale.ABMON_4
locale.ABMON_5
locale.ABMON_6
locale.ABMON_7
locale.ABMON_8
locale.ABMON_9
locale.ABMON_10
locale.ABMON_11
locale.ABMON_12

Get the abbreviated name of the n-th month.

locale.RADIXCHAR

Get the radix character (decimal dot, decimal comma, etc.).

locale.THOUSEP

Get the separator character for thousands (groups of three digits).

locale.YESEXPR

Get a regular expression that can be used with the regex function to recognize a positive response to a yes/no question.

locale.NOEXPR

Get a regular expression that can be used with the regex(3) function to recognize a negative response to a yes/no question.

Note

The regular expressions for YESEXPR and NOEXPR use syntax suitable for the regex function from the C library, which might differ from the syntax used in re.

locale.CRNCYSTR

Get the currency symbol, preceded by "-" if the symbol should appear before the value, "+" if the symbol should appear after the value, or "." if the symbol should replace the radix character.

locale.ERA

Get a string which describes how years are counted and displayed for each era in a locale.

Most locales do not define this value. An example of a locale which does define this value is the Japanese one. In Japan, the traditional representation of dates includes the name of the era corresponding to the then-emperor's reign.

Normally it should not be necessary to use this value directly. Specifying the E modifier in their format strings causes the time.strftime() function to use this information. The format of the returned string is specified in The Open Group Base Specifications Issue 8, paragraph 7.3.5.2 LC_TIME C-Language Access.

locale.ERA_D_T_FMT

Get a format string for time.strftime() to represent date and time in a locale-specific era-based way.

locale.ERA_D_FMT

Get a format string for time.strftime() to represent a date in a locale-specific era-based way.

locale.ERA_T_FMT

Get a format string for time.strftime() to represent a time in a locale-specific era-based way.

locale.ALT_DIGITS

Get a string consisting of up to 100 semicolon-separated symbols used to represent the values 0 to 99 in a locale-specific way. In most locales this is an empty string.
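Because nl_langinfo() is not available on all systems, portable code may want to check for it first. The sketch below queries two of the keys listed above under the 'C' locale and falls back to None where the function is missing; the variable names are illustrative only.

```python
import locale

# The 'C' locale keeps the answers independent of the user's environment.
locale.setlocale(locale.LC_ALL, 'C')

if hasattr(locale, 'nl_langinfo'):
    # DAY_1 follows the US convention, so this is the name for Sunday.
    first_day = locale.nl_langinfo(locale.DAY_1)
    # A format string suitable for time.strftime().
    date_format = locale.nl_langinfo(locale.D_FMT)
else:
    # nl_langinfo() is not provided on this platform.
    first_day = date_format = None

print(first_day, date_format)
```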

locale.getdefaultlocale([envvars])

Tries to determine the default locale settings and returns them as a tuple of the form (language code, encoding).

According to POSIX, a program which has not called setlocale(LC_ALL, '') runs using the portable 'C' locale. Calling setlocale(LC_ALL, '') lets it use the default locale as defined by the LANG variable. Since we do not want to interfere with the current locale setting we thus emulate the behavior in the way described above.

To maintain compatibility with other platforms, not only the LANG variable is tested, but a list of variables given as envvars parameter. The first found to be defined will be used. envvars defaults to the search path used in GNU gettext; it must always contain the variable name 'LANG'. The GNU gettext search path contains 'LC_ALL', 'LC_CTYPE', 'LANG' and 'LANGUAGE', in that order.

Except for the code 'C', the language code corresponds to RFC 1766. language code and encoding may be None if their values cannot be determined.

Deprecated since version 3.11, will be removed in version 3.15.

locale.getlocale(category=LC_CTYPE)

Returns the current setting for the given locale category as a sequence containing language code, encoding. category may be one of the LC_* values except LC_ALL. It defaults to LC_CTYPE.

Except for the code 'C', the language code corresponds to RFC 1766. language code and encoding may be None if their values cannot be determined.
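A short sketch of the return shape of getlocale(): with a category set to the 'C' locale, both elements come back as None. The category LC_NUMERIC is chosen here only to avoid touching the encoding-related LC_CTYPE setting.

```python
import locale

# Put one category into the 'C' locale, then query it.
locale.setlocale(locale.LC_NUMERIC, 'C')
language, encoding = locale.getlocale(locale.LC_NUMERIC)

# For the 'C' locale neither value can be determined.
print(language, encoding)
```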

locale.getpreferredencoding(do_setlocale=True)

Return the locale encoding used for text data, according to user preferences. User preferences are expressed differently on different systems, and might not be available programmatically on some systems, so this function only returns a guess.

On some systems, it is necessary to invoke setlocale() to obtain the user preferences, so this function is not thread-safe. If invoking setlocale is not necessary or desired, do_setlocale should be set to False.

On Android or if the Python UTF-8 Mode is enabled, always return 'utf-8'; the locale encoding and the do_setlocale argument are ignored.

The Python preinitialization configures the LC_CTYPE locale. See also the filesystem encoding and error handler.

Changed in version 3.7: The function now always returns "utf-8" on Android or if the Python UTF-8 Mode is enabled.

locale.getencoding()

Get the current locale encoding:

  • On Android and VxWorks, return "utf-8".

  • On Unix, return the encoding of the current LC_CTYPE locale. Return "utf-8" if nl_langinfo(CODESET) returns an empty string: for example, if the current LC_CTYPE locale is not supported.

  • On Windows, return the ANSI code page.

The Python preinitialization configures the LC_CTYPE locale. See also the filesystem encoding and error handler.

This function is similar to getpreferredencoding(False) except this function ignores the Python UTF-8 Mode.

Added in version 3.11.
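The two encoding queries can be compared side by side. This is a sketch only: the hasattr() check is there because getencoding() exists only on Python 3.11 and later, and the fallback to the other function is an assumption made for illustration.

```python
import locale

# Does not call setlocale(), so it is safe to use from any thread.
preferred = locale.getpreferredencoding(False)

# getencoding() ignores the Python UTF-8 Mode; fall back on older Pythons.
if hasattr(locale, 'getencoding'):
    current = locale.getencoding()
else:
    current = preferred

print(preferred, current)
```

The two values can differ when the Python UTF-8 Mode is enabled, since only getpreferredencoding() takes that mode into account.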

locale.normalize(localename)

Returns a normalized locale code for the given locale name. The returned locale code is formatted for use with setlocale(). If normalization fails, the original name is returned unchanged.

If the given encoding is not known, the function defaults to the default encoding for the locale code just like setlocale().
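A quick way to see what normalize() does is to print a few inputs next to their results. The names below are arbitrary examples; 'no_such_locale_xyz' is a made-up name used to show the unchanged fallback.

```python
import locale

# Print each input next to its normalized form.
for name in ('en_US', 'de_DE.utf8', 'no_such_locale_xyz'):
    print(name, '->', locale.normalize(name))

# A name the alias table does not know is returned unchanged.
unchanged = locale.normalize('no_such_locale_xyz')
```

normalize() only rewrites the name; it does not check whether the resulting locale is installed on the system.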

locale.strcoll(string1, string2)

Compares two strings according to the current LC_COLLATE setting. Like any other compare function, it returns a negative value, a positive value, or 0, depending on whether string1 collates before or after string2 or is equal to it.

locale.strxfrm(string)

Transforms a string to one that can be used in locale-aware comparisons. For example, strxfrm(s1) < strxfrm(s2) is equivalent to strcoll(s1, s2) < 0. This function can be used when the same string is compared repeatedly, e.g. when collating a sequence of strings.
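The equivalence between strxfrm() and strcoll() can be sketched with a sort. The example pins LC_COLLATE to the 'C' locale so the order is predictable; the word list is arbitrary.

```python
import functools
import locale

# Pin the collation rules so the result does not depend on the environment.
locale.setlocale(locale.LC_COLLATE, 'C')
words = ['pear', 'Apple', 'banana', 'apple']

# strxfrm() is called once per item, which suits sorting.
by_key = sorted(words, key=locale.strxfrm)

# strcoll() is a two-argument comparison, so it needs cmp_to_key().
by_cmp = sorted(words, key=functools.cmp_to_key(locale.strcoll))

print(by_key)
print(by_cmp)
```

Both calls produce the same order. With a language-specific LC_COLLATE setting the order may differ from the plain code point order that sorted(words) gives.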

locale.format_string(format, val, grouping=False, monetary=False)

Formats a number val according to the current LC_NUMERIC setting. The format follows the conventions of the % operator. For floating-point values, the decimal point is modified if appropriate. If grouping is True, also takes the grouping into account.

If monetary is true, the conversion uses monetary thousands separator and grouping strings.

Processes formatting specifiers as in format % val, but takes the current locale settings into account.

Changed in version 3.7: The monetary keyword parameter was added.
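A minimal sketch of format_string(), run under the 'C' locale so that the output is the same everywhere. The 'C' locale defines no grouping, so grouping=True has no visible effect here; under another locale, separators would be inserted according to its conventions.

```python
import locale

locale.setlocale(locale.LC_ALL, 'C')

# A single value, formatted like '%.2f' % 1234.5.
plain = locale.format_string('%.2f', 1234.5, grouping=True)
print(plain)  # 1234.50

# Several values are passed as a tuple, just as with the % operator.
label = locale.format_string('%s: %d items', ('total', 1200), grouping=True)
print(label)  # total: 1200 items
```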

locale.currency(val, symbol=True, grouping=False, international=False)

Formats a number val according to the current LC_MONETARY settings.

The returned string includes the currency symbol if symbol is true, which is the default. If grouping is True (which is not the default), grouping is done with the value. If international is True (which is not the default), the international currency symbol is used.

Note

This function will not work with the 'C' locale, so you have to set a locale via setlocale() first.
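One way to cope with the 'C' locale restriction is to check whether the locale specifies monetary conventions before calling currency(). The helper `format_money` below is hypothetical, and the plain two-decimal fallback is an assumption made for this sketch.

```python
import locale


def format_money(amount):
    """Format amount with currency() when the locale allows it.

    `format_money` is a hypothetical helper; the fallback format is an
    arbitrary choice for this sketch.
    """
    conv = locale.localeconv()
    if conv['frac_digits'] == locale.CHAR_MAX:
        # No monetary conventions are specified (e.g. the 'C' locale).
        return '%.2f' % amount
    return locale.currency(amount, grouping=True)


locale.setlocale(locale.LC_ALL, 'C')
print(format_money(1234.5))  # falls back to 1234.50 under the 'C' locale
```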

locale.str(float)

Formats a floating-point number using the same format as the built-in function str(float), but takes the decimal point into account.

locale.delocalize(string)

Converts a string into a normalized number string, following the LC_NUMERIC settings.

Added in version 3.5.

locale.localize(string, grouping=False, monetary=False)

Converts a normalized number string into a formatted string following the LC_NUMERIC settings.

Added in version 3.10.

locale.atof(string, func=float)

Converts a string to a number, following the LC_NUMERIC settings, by calling func on the result of calling delocalize() on string.

locale.atoi(string)

Converts a string to an integer, following the LC_NUMERIC conventions.
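The parsing functions can be sketched together. The example runs under the 'C' locale, where the input already uses "." as the decimal point; passing Decimal as func is just one possible choice of converter.

```python
import locale
from decimal import Decimal

locale.setlocale(locale.LC_NUMERIC, 'C')

# delocalize() yields a string that float() and similar functions accept.
normalized = locale.delocalize('1234.5')

# atof() calls func (float by default) on the delocalized string.
as_float = locale.atof('1234.5')
as_decimal = locale.atof('1234.5', Decimal)

as_int = locale.atoi('42')

print(normalized, as_float, as_decimal, as_int)
```

Under a locale that uses a decimal comma, the input strings would be written with "," instead, and delocalize() would turn that into ".".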

locale.LC_CTYPE

Locale category for the character type functions. Most importantly, this category defines the text encoding, i.e. how bytes are interpreted as Unicode codepoints. See PEP 538 and PEP 540 for how this variable might be automatically coerced to C.UTF-8 to avoid issues created by invalid settings in containers or incompatible settings passed over remote SSH connections.

Python doesn't internally use locale-dependent character transformation functions from ctype.h. Instead, an internal pyctype.h provides locale-independent equivalents like Py_TOLOWER.

locale.LC_COLLATE

Locale category for sorting strings. The functions strcoll() and strxfrm() of the locale module are affected.

locale.LC_TIME

Locale category for the formatting of time. The function time.strftime() follows these conventions.

locale.LC_MONETARY

Locale category for formatting of monetary values. The available options are available from the localeconv() function.

locale.LC_MESSAGES

Locale category for message display. Python currently does not support application specific locale-aware messages. Messages displayed by the operating system, like those returned by os.strerror() might be affected by this category.

This value may not be available on operating systems not conforming to the POSIX standard, most notably Windows.

locale.LC_NUMERIC

Locale category for formatting numbers. The functions format_string(), atoi(), atof() and str() of the locale module are affected by that category. All other numeric formatting operations are not affected.

locale.LC_ALL

Combination of all locale settings. If this flag is used when the locale is changed, setting the locale for all categories is attempted. If that fails for any category, no category is changed at all. When the locale is retrieved using this flag, a string indicating the setting for all categories is returned. This string can be later used to restore the settings.
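The save-and-restore use of the LC_ALL string can be sketched as a context manager. `temporary_locale` is a hypothetical helper, not part of the locale module, and because the locale is a program-wide setting, other threads see the temporary value while the block runs.

```python
import contextlib
import locale


@contextlib.contextmanager
def temporary_locale(name, category=locale.LC_ALL):
    """Switch the locale for the duration of a with block (hypothetical helper)."""
    saved = locale.setlocale(category)  # query only, nothing changes
    try:
        yield locale.setlocale(category, name)
    finally:
        # The saved string is used only to restore the earlier setting.
        locale.setlocale(category, saved)


before = locale.setlocale(locale.LC_ALL)
with temporary_locale('C') as active:
    inside = locale.setlocale(locale.LC_NUMERIC)
after = locale.setlocale(locale.LC_ALL)

print(active, inside, before == after)
```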

locale.CHAR_MAX

This is a symbolic constant used for different values returned by localeconv().

Example:

    >>> import locale
    >>> loc = locale.getlocale()  # get current locale
    # use German locale; name might vary with platform
    >>> locale.setlocale(locale.LC_ALL, 'de_DE')
    >>> locale.strcoll('f\xe4n', 'foo')  # compare a string containing an umlaut
    >>> locale.setlocale(locale.LC_ALL, '')   # use user's preferred locale
    >>> locale.setlocale(locale.LC_ALL, 'C')  # use default (C) locale
    >>> locale.setlocale(locale.LC_ALL, loc)  # restore saved locale

Background, details, hints, tips and caveats

The C standard defines the locale as a program-wide property that may be relatively expensive to change. On top of that, some implementations are broken in such a way that frequent locale changes may cause core dumps. This makes the locale somewhat painful to use correctly.

Initially, when a program is started, the locale is the C locale, no matter what the user's preferred locale is. There is one exception: the LC_CTYPE category is changed at startup to set the current locale encoding to the user's preferred locale encoding. The program must explicitly say that it wants the user's preferred locale settings for other categories by calling setlocale(LC_ALL, '').

It is generally a bad idea to call setlocale() in some library routine, since as a side effect it affects the entire program. Saving and restoring it is almost as bad: it is expensive and affects other threads that happen to run before the settings have been restored.

If, when coding a module for general use, you need a locale independent version of an operation that is affected by the locale (such as certain formats used with time.strftime()), you will have to find a way to do it without using the standard library routine. Even better is convincing yourself that using locale settings is okay. Only as a last resort should you document that your module is not compatible with non-C locale settings.

The only way to perform numeric operations according to the locale is to use the special functions defined by this module: atof(), atoi(), format_string(), str().

There is no way to perform case conversions and character classifications according to the locale. For (Unicode) text strings these are done according to the character value only, while for byte strings, the conversions and classifications are done according to the ASCII value of the byte, and bytes whose high bit is set (i.e., non-ASCII bytes) are never converted or considered part of a character class such as letter or whitespace.
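The difference between text strings and byte strings described above can be seen in a few lines. The sample strings are arbitrary; no locale call is involved because these operations do not consult the locale.

```python
# Text strings follow the Unicode character values.
text_upper = 'straße'.upper()
print(text_upper)  # STRASSE

# Byte strings only convert ASCII letters; 0xE9 has its high bit set.
raw = b'caf\xe9'
raw_upper = raw.upper()
print(raw_upper)  # b'CAF\xe9'

# A non-ASCII byte is never classified as a letter.
high_is_alpha = b'\xe9'.isalpha()
print(high_is_alpha)  # False
```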

For extension writers and programs that embed Python

Extension modules should never call setlocale(), except to find out what the current locale is. But since the return value can only be used portably to restore it, that is not very useful (except perhaps to find out whether or not the locale is C).

When Python code uses the locale module to change the locale, this also affects the embedding application. If the embedding application doesn't want this to happen, it should remove the _locale extension module (which does all the work) from the table of built-in modules in the config.c file, and make sure that the _locale module is not accessible as a shared library.

Access to message catalogs

locale.gettext(msg)
locale.dgettext(domain, msg)
locale.dcgettext(domain, msg, category)
locale.textdomain(domain)
locale.bindtextdomain(domain, dir)
locale.bind_textdomain_codeset(domain, codeset)

The locale module exposes the C library's gettext interface on systems that provide this interface. It consists of the functions gettext(), dgettext(), dcgettext(), textdomain(), bindtextdomain(), and bind_textdomain_codeset(). These are similar to the same functions in the gettext module, but use the C library's binary format for message catalogs, and the C library's search algorithms for locating message catalogs.

Python applications should normally find no need to invoke these functions, and should use gettext instead. A known exception to this rule are applications that link with additional C libraries which internally invoke C functions gettext or dcgettext. For these applications, it may be necessary to bind the text domain, so that the libraries can properly locate their message catalogs.