gettext --- Multilingual internationalization services

Source code: Lib/gettext.py


The gettext module provides internationalization (I18N) and localization (L10N) services for your Python modules and applications. It supports both the GNU gettext message catalog API and a higher level, class-based API that may be more appropriate for Python files. The interface described below allows you to write your module and application messages in one natural language, and provide a catalog of translated messages for running under different natural languages.

Some hints on localizing your Python modules and applications are also given.

GNU gettext API

The gettext module defines the following API, which is very similar to the GNU gettext API. If you use this API you will affect the translation of your entire application globally. Often this is what you want if your application is monolingual, with the choice of language dependent on the locale of your user. If you are localizing a Python module, or if your application needs to switch languages on the fly, you probably want to use the class-based API instead.

gettext.bindtextdomain(domain, localedir=None)

Bind the domain to the locale directory localedir. More concretely, gettext will look for binary .mo files for the given domain using the path (on Unix): localedir/language/LC_MESSAGES/domain.mo, where language is searched for in the environment variables LANGUAGE, LC_ALL, LC_MESSAGES, and LANG respectively.

If localedir is omitted or None, then the current binding for domain is returned. [1]

gettext.textdomain(domain=None)

Change or query the current global domain. If domain is None, then the current global domain is returned, otherwise the global domain is set to domain, which is returned.
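The query and set forms of these two functions can be tried without any catalog on disk. The following is a minimal sketch; the domain name demoapp and the directory /tmp/demo-locale are placeholders chosen for illustration and do not need to exist.

```python
import gettext

# Placeholder values for illustration only.
DOMAIN = "demoapp"
LOCALEDIR = "/tmp/demo-locale"

# Setting a binding returns the directory that is now bound to the domain.
bound = gettext.bindtextdomain(DOMAIN, LOCALEDIR)

# Omitting localedir queries the current binding without changing it.
queried = gettext.bindtextdomain(DOMAIN)

# textdomain() with no argument queries the global domain;
# with an argument it sets the global domain and returns it.
previous_domain = gettext.textdomain()
current_domain = gettext.textdomain(DOMAIN)

print(bound, queried, current_domain)

# Restore the earlier global domain so the rest of the program is unaffected.
gettext.textdomain(previous_domain)
```

Neither call touches the file system; the binding is only consulted later, when a message is looked up.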

gettext.gettext(message)

Return the localized translation of message, based on the current global domain, language, and locale directory. This function is usually aliased as _() in the local namespace (see examples below).

gettext.dgettext(domain, message)

Like gettext(), but look the message up in the specified domain.

gettext.ngettext(singular, plural, n)

Like gettext(), but consider plural forms. If a translation is found, apply the plural formula to n, and return the resulting message (some languages have more than two plural forms). If no translation is found, return singular if n is 1; return plural otherwise.

The plural formula is taken from the catalog header. It is a C or Python expression that has a free variable n; the expression evaluates to the index of the plural in the catalog. See the GNU gettext documentation for the precise syntax to be used in .po files and the formulas for a variety of languages.
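To make the role of the formula concrete, the sketch below hand-translates two commonly cited Plural-Forms expressions into Python functions. gettext reads the real expression from the catalog header; the function names here are hypothetical, and the Polish rule is reproduced for illustration only, so check it against the GNU gettext documentation before relying on it.

```python
def english_plural_index(n):
    """Index for a two-form language: plural=(n != 1)."""
    return int(n != 1)


def polish_plural_index(n):
    """Index for a three-form rule, assumed here to be:
    n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2
    """
    if n == 1:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2


# The index selects one of the translated strings stored for the message.
for count in (1, 2, 5, 12, 22):
    print(count, english_plural_index(count), polish_plural_index(count))
```

The two-form rule distinguishes only "one" from "not one", while the three-form rule sends 2, 3, 4, 22, 23 and so on to a different translation than 5 or 12.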

gettext.dngettext(domain, singular, plural, n)

Like ngettext(), but look the message up in the specified domain.

gettext.pgettext(context, message)
gettext.dpgettext(domain, context, message)
gettext.npgettext(context, singular, plural, n)
gettext.dnpgettext(domain, context, singular, plural, n)

Similar to the corresponding functions without the p in the prefix (that is, gettext(), dgettext(), ngettext(), dngettext()), but the translation is restricted to the given message context.

Added in version 3.8.

Note that GNU gettext also defines a dcgettext() method, but this was deemed not useful and so it is currently unimplemented.

Here's an example of typical usage for this API:

    import gettext
    gettext.bindtextdomain('myapplication', '/path/to/my/language/directory')
    gettext.textdomain('myapplication')
    _ = gettext.gettext
    # ...
    print(_('This is a translatable string.'))

Class-based API

The class-based API of the gettext module gives you more flexibility and greater convenience than the GNU gettext API. It is the recommended way of localizing your Python applications and modules. gettext defines a GNUTranslations class which implements the parsing of GNU .mo format files, and has methods for returning strings. Instances of this class can also install themselves in the built-in namespace as the function _().

gettext.find(domain, localedir=None, languages=None, all=False)

This function implements the standard .mo file search algorithm. It takes a domain, identical to what textdomain() takes. Optional localedir is as in bindtextdomain(). Optional languages is a list of strings, where each string is a language code.

If localedir is not given, then the default system locale directory is used. [2] If languages is not given, then the following environment variables are searched: LANGUAGE, LC_ALL, LC_MESSAGES, and LANG. The first one returning a non-empty value is used for the languages variable. The environment variables should contain a colon-separated list of languages, which will be split on the colon to produce the expected list of language code strings.

find() then expands and normalizes the languages, and then iterates through them, searching for an existing file built of these components:

    localedir/language/LC_MESSAGES/domain.mo

The first such file name that exists is returned by find(). If no such file is found, then None is returned. If all is given, it returns a list of all file names, in the order in which they appear in the languages list or the environment variables.
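The search can be observed with a throwaway directory tree. This is a minimal sketch: the domain name demoapp is hypothetical, and the .mo file is an empty placeholder, which is enough because find() only checks that the file exists and never opens it.

```python
import gettext
import os
import tempfile

with tempfile.TemporaryDirectory() as localedir:
    # Build localedir/fr/LC_MESSAGES/demoapp.mo as an empty placeholder.
    mo_dir = os.path.join(localedir, "fr", "LC_MESSAGES")
    os.makedirs(mo_dir)
    mo_path = os.path.join(mo_dir, "demoapp.mo")
    with open(mo_path, "wb"):
        pass

    # "fr_FR" is expanded to several candidates, one of which is "fr".
    found = gettext.find("demoapp", localedir, languages=["fr_FR", "de"])

    # all=True returns every match as a list instead of the first one.
    found_all = gettext.find("demoapp", localedir,
                             languages=["fr_FR", "de"], all=True)

    # No German catalog exists, so the result is None.
    missing = gettext.find("demoapp", localedir, languages=["de"])

    print(found == mo_path, len(found_all), missing)
```

Passing languages explicitly keeps the result independent of the LANGUAGE, LC_ALL, LC_MESSAGES and LANG variables of the machine running the code.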

gettext.translation(domain, localedir=None, languages=None, class_=None, fallback=False)

Return a *Translations instance based on the domain, localedir, and languages, which are first passed to find() to get a list of the associated .mo file paths. Instances with identical .mo file names are cached. The actual class instantiated is class_ if provided, otherwise GNUTranslations. The class's constructor must take a single file object argument.

If multiple files are found, later files are used as fallbacks for earlier ones. To allow setting the fallback, copy.copy() is used to clone each translation object from the cache; the actual instance data is still shared with the cache.

If no .mo file is found, this function raises OSError if fallback is false (which is the default), and returns a NullTranslations instance if fallback is true.
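The effect of the fallback flag can be seen by pointing translation() at an empty directory. A minimal sketch, assuming a hypothetical domain named demoapp for which no catalog exists:

```python
import gettext
import tempfile

with tempfile.TemporaryDirectory() as empty_localedir:
    # Without fallback, a missing catalog is reported as an OSError subclass.
    try:
        gettext.translation("demoapp", empty_localedir, languages=["fr"])
    except OSError as exc:
        error_name = type(exc).__name__

    # With fallback=True the caller receives a NullTranslations instance.
    null_catalog = gettext.translation(
        "demoapp", empty_localedir, languages=["fr"], fallback=True
    )

print(error_name)
print(type(null_catalog).__name__, null_catalog.gettext("Save"))
```

Using fallback=True is convenient during development, when catalogs may not have been compiled yet, because messages simply pass through untranslated.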

Changed in version 3.3: IOError used to be raised; it is now an alias of OSError.

Changed in version 3.11: The codeset parameter was removed.

gettext.install(domain, localedir=None, *, names=None)

This installs the function _() in Python's builtins namespace, based on domain and localedir which are passed to the function translation().

For the names parameter, please see the description of the translation object's install() method.

As seen below, you usually mark the strings in your application that are candidates for translation, by wrapping them in a call to the _() function, like this:

    print(_('This string will be translated.'))

For convenience, you want the _() function to be installed in Python's builtins namespace, so it is easily accessible in all modules of your application.

Changed in version 3.11: names is now a keyword-only parameter.

The NullTranslations class

Translation classes are what actually implement the translation of original source file message strings to translated message strings. The base class used by all translation classes is NullTranslations; this provides the basic interface you can use to write your own specialized translation classes. Here are the methods of NullTranslations:

class gettext.NullTranslations(fp=None)

Takes an optional file object fp, which is ignored by the base class. Initializes "protected" instance variables _info and _charset which are set by derived classes, as well as _fallback, which is set through add_fallback(). It then calls self._parse(fp) if fp is not None.

_parse(fp)

No-op in the base class, this method takes file object fp, and reads the data from the file, initializing its message catalog. If you have an unsupported message catalog file format, you should override this method to parse your format.

add_fallback(fallback)

Add fallback as the fallback object for the current translation object. A translation object should consult the fallback if it cannot provide a translation for a given message.

gettext(message)

If a fallback has been set, forward gettext() to the fallback. Otherwise, return message. Overridden in derived classes.
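The interplay of a derived class and add_fallback() can be sketched with an in-memory catalog. DictTranslations is a hypothetical class written for this illustration; it is not part of the gettext module.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical catalog backed by a plain dict instead of a .mo file."""

    def __init__(self, table):
        super().__init__()  # fp is None, so _parse() is not called
        self._table = dict(table)

    def gettext(self, message):
        if message in self._table:
            return self._table[message]
        # The base class consults the fallback, or returns message unchanged.
        return super().gettext(message)


regional = DictTranslations({"Elevator": "Lift"})
general = DictTranslations({"Elevator": "Elevator (general)", "Color": "Colour"})
regional.add_fallback(general)

print(regional.gettext("Elevator"))  # found in the first catalog
print(regional.gettext("Color"))     # found in the fallback
print(regional.gettext("Truck"))     # found nowhere, returned unchanged
```

Delegating to super().gettext() keeps the fallback handling in one place, so the subclass only has to describe how its own catalog is searched.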

ngettext(singular, plural, n)

If a fallback has been set, forward ngettext() to the fallback. Otherwise, return singular if n is 1; return plural otherwise. Overridden in derived classes.

pgettext(context, message)

If a fallback has been set, forward pgettext() to the fallback. Otherwise, return message unchanged. Overridden in derived classes.

Added in version 3.8.

npgettext(context, singular, plural, n)

If a fallback has been set, forward npgettext() to the fallback. Otherwise, return singular if n is 1; return plural otherwise. Overridden in derived classes.

Added in version 3.8.

info()

Return a dictionary containing the metadata found in the message catalog file.

charset()

Return the encoding of the message catalog file.

install(names=None)

This method installs gettext() into the built-in namespace, binding it to _.

If the names parameter is given, it must be a sequence containing the names of functions you want to install in the builtins namespace in addition to _(). Supported names are 'gettext', 'ngettext', 'pgettext', and 'npgettext'.
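The effect of names can be checked directly on the builtins module. This is a minimal sketch that uses a NullTranslations instance as a stand-in for a real catalog and removes the installed names afterwards.

```python
import builtins
import gettext

catalog = gettext.NullTranslations()  # stand-in for a real catalog
catalog.install(names=["ngettext"])

# Both names are now reachable through the builtins module.
greeting = builtins._("Hello")
label = builtins.ngettext("%d file", "%d files", 3) % 3
print(greeting, label)

# Remove the names again so this demonstration leaves builtins untouched.
del builtins._, builtins.ngettext
```

In an application the names would normally stay installed for the lifetime of the process; the cleanup here only keeps the demonstration free of side effects.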

Note that this is only one way, albeit the most convenient way, to make the _() function available to your application. Because it affects the entire application globally, and specifically the built-in namespace, localized modules should never install _(). Instead, they should use this code to make _() available to their module:

    import gettext
    t = gettext.translation('mymodule', ...)
    _ = t.gettext

This puts _() only in the module's global namespace and so only affects calls within this module.

Changed in version 3.8: Added 'pgettext' and 'npgettext'.

The GNUTranslations class

The gettext module provides one additional class derived from NullTranslations: GNUTranslations. This class overrides _parse() to enable reading GNU gettext format .mo files in both big-endian and little-endian format.

GNUTranslations parses optional metadata out of the translation catalog. It is convention with GNU gettext to include metadata as the translation for the empty string. This metadata is in RFC 822-style key: value pairs, and should contain the Project-Id-Version key. If the key Content-Type is found, then the charset property is used to initialize the "protected" _charset instance variable, defaulting to None if not found. If the charset encoding is specified, then all message ids and message strings read from the catalog are converted to Unicode using this encoding, else ASCII is assumed.

Since message ids are read as Unicode strings too, all *gettext() methods will assume message ids as Unicode strings, not byte strings.

The entire set of key/value pairs is placed into a dictionary and set as the "protected" _info instance variable.

If the .mo file's magic number is invalid, the major version number is unexpected, or if other problems occur while reading the file, instantiating a GNUTranslations class can raise OSError.
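The metadata handling described above can be demonstrated without msgfmt by assembling a tiny catalog in memory. The sketch below is an illustration under stated assumptions: build_mo() is a hypothetical helper, it writes only the little-endian layout with no hash table, and the header values are invented for the example.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Return bytes of a little-endian GNU .mo file for a str-to-str mapping."""
    # Entries are stored sorted by message id.
    items = sorted((k.encode("utf-8"), v.encode("utf-8"))
                   for k, v in messages.items())
    count = len(items)
    header_size = 7 * 4
    ids_start = header_size + 16 * count  # two index tables, 8 bytes per entry
    ids_blob = strs_blob = b""
    id_index, str_index = [], []
    for msgid, msgstr in items:
        id_index.append((len(msgid), len(ids_blob)))
        str_index.append((len(msgstr), len(strs_blob)))
        ids_blob += msgid + b"\0"    # every string is NUL-terminated
        strs_blob += msgstr + b"\0"
    strs_start = ids_start + len(ids_blob)
    # magic, version, entry count, id table offset, string table offset,
    # hash table size, hash table offset
    out = struct.pack("<7I", 0x950412DE, 0, count,
                      header_size, header_size + 8 * count, 0, 0)
    for length, offset in id_index:
        out += struct.pack("<2I", length, ids_start + offset)
    for length, offset in str_index:
        out += struct.pack("<2I", length, strs_start + offset)
    return out + ids_blob + strs_blob


catalog_bytes = build_mo({
    # Metadata is stored as the translation of the empty string.
    "": "Project-Id-Version: demo 1.0\n"
        "Content-Type: text/plain; charset=UTF-8\n",
    "Hello": "Bonjour",
})
cat = gettext.GNUTranslations(io.BytesIO(catalog_bytes))

print(cat.charset())                   # taken from Content-Type
print(cat.info()["project-id-version"])  # header keys are stored lower-cased
print(cat.gettext("Hello"), cat.gettext("Goodbye"))
```

For real projects the catalog should be produced by msgfmt or msgfmt.py; building the bytes by hand is only useful for tests and for seeing what the parser expects.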

class gettext.GNUTranslations

The following methods are overridden from the base class implementation:

gettext(message)

Look up the message id in the catalog and return the corresponding message string, as a Unicode string. If there is no entry in the catalog for the message id, and a fallback has been set, the look up is forwarded to the fallback's gettext() method. Otherwise, the message id is returned.

ngettext(singular, plural, n)

Do a plural-forms lookup of a message id. singular is used as the message id for purposes of lookup in the catalog, while n is used to determine which plural form to use. The returned message string is a Unicode string.

If the message id is not found in the catalog, and a fallback is specified, the request is forwarded to the fallback's ngettext() method. Otherwise, when n is 1 singular is returned, and plural is returned in all other cases.

Here is an example:

    import os
    from gettext import GNUTranslations

    n = len(os.listdir('.'))
    # somefile is a binary file object for an existing .mo catalog
    cat = GNUTranslations(somefile)
    message = cat.ngettext(
        'There is %(num)d file in this directory',
        'There are %(num)d files in this directory',
        n) % {'num': n}

pgettext(context, message)

Look up the context and message id in the catalog and return the corresponding message string, as a Unicode string. If there is no entry in the catalog for the message id and context, and a fallback has been set, the look up is forwarded to the fallback's pgettext() method. Otherwise, the message id is returned.

Added in version 3.8.

npgettext(context, singular, plural, n)

Do a plural-forms lookup of a message id. singular is used as the message id for purposes of lookup in the catalog, while n is used to determine which plural form to use.

If the message id for context is not found in the catalog, and a fallback is specified, the request is forwarded to the fallback's npgettext() method. Otherwise, when n is 1 singular is returned, and plural is returned in all other cases.

Added in version 3.8.

Solaris message catalog support

The Solaris operating system defines its own binary .mo file format, but since no documentation can be found on this format, it is not supported at this time.

The Catalog constructor

GNOME uses a version of the gettext module by James Henstridge, but this version has a slightly different API. Its documented usage was:

    import gettext
    cat = gettext.Catalog(domain, localedir)
    _ = cat.gettext
    print(_('hello world'))

For compatibility with this older module, the function Catalog() is an alias for the translation() function described above.

One difference between this module and Henstridge's: his catalog objects supported access through a mapping API, but this appears to be unused and so is not currently supported.

Internationalizing your programs and modules

Internationalization (I18N) refers to the operation by which a program is made aware of multiple languages. Localization (L10N) refers to the adaptation of your program, once internationalized, to the local language and cultural habits. In order to provide multilingual messages for your Python programs, you need to take the following steps:

  1. prepare your program or module by specially marking translatable strings

  2. run a suite of tools over your marked files to generate raw message catalogs

  3. create language-specific translations of the message catalogs

  4. use the gettext module so that message strings are properly translated

In order to prepare your code for I18N, you need to look at all the strings in your files. Any string that needs to be translated should be marked by wrapping it in _('...') --- that is, a call to the function _. For example:

    filename = 'mylog.txt'
    message = _('writing a log message')
    with open(filename, 'w') as fp:
        fp.write(message)

In this example, the string 'writing a log message' is marked as a candidate for translation, while the strings 'mylog.txt' and 'w' are not.

There are a few tools to extract the strings meant for translation. The original GNU gettext only supported C or C++ source code but its extended version xgettext scans code written in a number of languages, including Python, to find strings marked as translatable. Babel is a Python internationalization library that includes a pybabel script to extract and compile message catalogs. François Pinard's program called xpot does a similar job and is available as part of his po-utils package.

(Python also includes pure-Python versions of these programs, called pygettext.py and msgfmt.py; some Python distributions will install them for you. pygettext.py is similar to xgettext, but only understands Python source code and cannot handle other programming languages such as C or C++. pygettext.py supports a command-line interface similar to xgettext; for details on its use, run pygettext.py --help. msgfmt.py is binary compatible with GNU msgfmt. With these two programs, you may not need the GNU gettext package to internationalize your Python applications.)

xgettext, pygettext, and similar tools generate .po files that are message catalogs. They are structured human-readable files that contain every marked string in the source code, along with a placeholder for the translated versions of these strings.

Copies of these .po files are then handed over to the individual human translators who write translations for every supported natural language. They send back the completed language-specific versions as a <language-name>.po file that's compiled into a machine-readable .mo binary catalog file using the msgfmt program. The .mo files are used by the gettext module for the actual translation processing at run-time.

How you use the gettext module in your code depends on whether you are internationalizing a single module or your entire application. The next two sections will discuss each case.

Localizing your module

If you are localizing your module, you must take care not to make global changes, e.g. to the built-in namespace. You should not use the GNU gettext API but instead the class-based API.

Let's say your module is called "spam" and the module's various natural language translation .mo files reside in /usr/share/locale in GNU gettext format. Here's what you would put at the top of your module:

    import gettext
    t = gettext.translation('spam', '/usr/share/locale')
    _ = t.gettext

Localizing your application

If you are localizing your application, you can install the _() function globally into the built-in namespace, usually in the main driver file of your application. This will let all your application-specific files just use _('...') without having to explicitly install it in each file.

In the simple case then, you need only add the following bit of code to the main driver file of your application:

    import gettext
    gettext.install('myapplication')

If you need to set the locale directory, you can pass it into the install() function:

    import gettext
    gettext.install('myapplication', '/usr/share/locale')

Changing languages on the fly

If your program needs to support many languages at the same time, you may want to create multiple translation instances and then switch between them explicitly, like so:

    import gettext

    lang1 = gettext.translation('myapplication', languages=['en'])
    lang2 = gettext.translation('myapplication', languages=['fr'])
    lang3 = gettext.translation('myapplication', languages=['de'])

    # start by using language1
    lang1.install()

    # ... time goes by, user selects language 2
    lang2.install()

    # ... more time goes by, user selects language 3
    lang3.install()

Deferred translations

In most coding situations, strings are translated where they are coded. Occasionally however, you need to mark strings for translation, but defer actual translation until later. A classic example is:

    animals = ['mollusk',
               'albatross',
               'rat',
               'penguin',
               'python', ]
    # ...
    for a in animals:
        print(a)

Here, you want to mark the strings in the animals list as being translatable, but you don't actually want to translate them until they are printed.

Here is one way you can handle this situation:

    def _(message): return message

    animals = [_('mollusk'),
               _('albatross'),
               _('rat'),
               _('penguin'),
               _('python'), ]

    del _

    # ...
    for a in animals:
        print(_(a))

This works because the dummy definition of _() simply returns the string unchanged. And this dummy definition will temporarily override any definition of _() in the built-in namespace (until the del command). Take care, though, if you have a previous definition of _() in the local namespace.

Note that the second use of _() will not identify "a" as being translatable to the gettext program, because the parameter is not a string literal.

Another way to handle this is with the following example:

    def N_(message): return message

    animals = [N_('mollusk'),
               N_('albatross'),
               N_('rat'),
               N_('penguin'),
               N_('python'), ]

    # ...
    for a in animals:
        print(_(a))

In this case, you are marking translatable strings with the function N_(), which won't conflict with any definition of _(). However, you will need to teach your message extraction program to look for translatable strings marked with N_(). xgettext, pygettext, pybabel extract, and xpot all support this through the use of the -k command-line switch. The choice of N_() here is totally arbitrary; it could have just as easily been MarkThisStringForTranslation().
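The deferred pattern can be run end to end with a stand-in catalog. In this sketch UpperTranslations is a hypothetical class that "translates" by upper-casing, which makes it easy to see where translation actually happens.

```python
import gettext


def N_(message):
    """Mark message for extraction without translating it."""
    return message


# The list holds the original strings; nothing is translated yet.
ANIMALS = [N_("mollusk"), N_("albatross"), N_("python")]


class UpperTranslations(gettext.NullTranslations):
    """Hypothetical stand-in catalog that upper-cases every message."""

    def gettext(self, message):
        return message.upper()


_ = UpperTranslations().gettext

# Translation happens here, at the point of use.
translated = [_(name) for name in ANIMALS]
print(ANIMALS)
print(translated)
```

Keeping the untranslated strings in the list means they can still be used as stable keys, for example in configuration files or logs, while the user only ever sees the translated form.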

Acknowledgements

The following people contributed code, feedback, design suggestions, previous implementations, and valuable experience to the creation of this module:

  • Peter Funk

  • James Henstridge

  • Juan David Ibáñez Palomar

  • Marc-André Lemburg

  • Martin von Löwis

  • François Pinard

  • Barry Warsaw

  • Gustavo Niemeyer

Footnotes

[1]

The default locale directory is system dependent; for example, on Red Hat Linux it is /usr/share/locale, but on Solaris it is /usr/lib/locale. The gettext module does not try to support these system dependent defaults; instead its default is sys.base_prefix/share/locale (see sys.base_prefix). For this reason, it is always best to call bindtextdomain() with an explicit absolute path at the start of your application.

[2]

See the footnote for bindtextdomain() above.