Issue20087

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Mismatch between glibc and X11 locale.alias
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.7, Python 3.6, Python 3.5, Python 2.7

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Arfrever, benjamin.peterson, lemburg, licht-t, loewis, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2013-12-28 09:29 by serhiy.storchaka, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Pull Requests
URL      Status  Linked
PR 422           benjamin.peterson, 2017-03-03 07:48
PR 713   merged  benjamin.peterson, 2017-03-19 06:18
PR 6708  merged  serhiy.storchaka, 2018-05-05 15:22
PR 6713  merged  miss-islington, 2018-05-06 05:47
PR 6714  merged  miss-islington, 2018-05-06 05:49
PR 6717  merged  serhiy.storchaka, 2018-05-06 07:19
Messages (33)
msg207025 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) - Date: 2013-12-28 09:29
The locale module uses a locale alias table derived from the X11 locale.alias file to map bare locale names without encodings to locale names with encodings. However, the glibc default encoding for a locale sometimes differs from the one used in X11 locale.alias.

Here is the full table of differences:

                 glibc                 X11 locale.alias
az_az            az_AZ.UTF-8           az_AZ.ISO8859-9E
ca_ad            ca_AD.ISO8859-15      ca_AD.ISO8859-1
ca_fr            ca_FR.ISO8859-15      ca_FR.ISO8859-1
ca_it            ca_IT.ISO8859-15      ca_IT.ISO8859-1
cy_gb            cy_GB.ISO8859-14      cy_GB.ISO8859-1
en_in            en_IN.UTF-8           en_IN.ISO8859-1
et_ee            et_EE.ISO8859-1       et_EE.ISO8859-15
fi_fi            fi_FI.ISO8859-1       fi_FI.ISO8859-15
gd_gb            gd_GB.ISO8859-15      gd_GB.ISO8859-1
hi_in            hi_IN.UTF-8           hi_IN.ISCII-DEV
iu_ca            iu_CA.UTF-8           iu_CA.NUNACOM-8
iw_il            iw_IL.ISO8859-8       he_IL.ISO8859-8
ka_ge            ka_GE.GEORGIAN_PS     ka_GE.GEORGIAN-ACADEMY
lo_la            lo_LA.UTF-8           lo_LA.MULELAO-1
mi_nz            mi_NZ.ISO8859-13      mi_NZ.ISO8859-1
nr_za            nr_ZA.UTF-8           nr_ZA.ISO8859-1
nso_za           nso_ZA.UTF-8          nso_ZA.ISO8859-15
ru_ru            ru_RU.ISO8859-5       ru_RU.UTF-8
rw_rw            rw_RW.UTF-8           rw_RW.ISO8859-1
sq_al            sq_AL.ISO8859-1       sq_AL.ISO8859-2
ss_za            ss_ZA.UTF-8           ss_ZA.ISO8859-1
ta_in            ta_IN.UTF-8           ta_IN.TSCII-0
tg_tj            tg_TJ.KOI8_T          tg_TJ.KOI8-C
th_th            th_TH.TIS_620         th_TH.ISO8859-11
tn_za            tn_ZA.UTF-8           tn_ZA.ISO8859-15
ts_za            ts_ZA.UTF-8           ts_ZA.ISO8859-1
tt_ru            tt_RU.UTF-8           tt_RU.TATAR-CYR
ur_pk            ur_PK.UTF-8           ur_PK.CP1256
uz_uz            uz_UZ.ISO8859-1       uz_UZ.UTF-8
uz_uz@cyrillic   uz_UZ.UTF-8@cyrillic  uz_UZ.UTF-8
vi_vn            vi_VN.UTF-8           vi_VN.TCVN
zh_cn            zh_CN.GB2312          zh_CN.gb2312
zh_tw            zh_TW.BIG5            zh_TW.big5
zh_tw.euctw      zh_TW.EUC_TW          zh_TW.eucTW

For example, with the en_IN locale:

    >>> import locale, _locale
    >>> _locale.setlocale(locale.LC_CTYPE)
    'en_IN'
    >>> locale.getlocale()
    ('en_IN', 'ISO8859-1')
    >>> locale.nl_langinfo(locale.CODESET)
    'UTF-8'
    >>> locale.setlocale(locale.LC_CTYPE, locale.getlocale())
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/serhiy/py/cpython/Lib/locale.py", line 592, in setlocale
        return _setlocale(category, locale)
    locale.Error: unsupported locale setting
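A table like the one above could be produced by comparing two alias mappings. The following is only a minimal sketch: `diff_alias_tables` is a hypothetical helper (it is not part of the locale module or of makelocalealias.py), and the sample dictionaries are small illustrative excerpts, with `de_de` added as an entry on which both sources agree.

```python
def diff_alias_tables(glibc, x11):
    """Return {alias: (glibc_value, x11_value)} for aliases that appear
    in both tables but map to different locale names."""
    return {
        alias: (glibc[alias], x11[alias])
        for alias in sorted(glibc.keys() & x11.keys())
        if glibc[alias] != x11[alias]
    }


# Illustrative excerpts only, not the real tables.
glibc_sample = {
    "en_in": "en_IN.UTF-8",
    "ca_ad": "ca_AD.ISO8859-15",
    "de_de": "de_DE.ISO8859-1",
}
x11_sample = {
    "en_in": "en_IN.ISO8859-1",
    "ca_ad": "ca_AD.ISO8859-1",
    "de_de": "de_DE.ISO8859-1",
}

differences = diff_alias_tables(glibc_sample, x11_sample)
for alias, (glibc_name, x11_name) in differences.items():
    print(f"{alias:<8} {glibc_name:<20} {x11_name}")
```

Only aliases present in both mappings are compared here; entries that exist in just one source would need separate handling.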
msg288871 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) - Date: 2017-03-03 08:06
A test is needed for a few common locales (en_IN, ru_RU) and maybe for unusual locales (uz_uz, uz_uz@cyrillic).

I would prefer to have a separate issue that updates the aliases table to glibc 2.24.
msg289174 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) - Date: 2017-03-07 16:53
I agree that it's reasonable to have glibc's aliases override the X.org ones, but this patch makes some pretty significant changes to Python's default assumptions with respect to default encodings for several locales.

While some changes obviously make sense (e.g. 'ca_AD.ISO8859-1' to 'ca_AD.ISO8859-15'), others are less clear (e.g. 'cy_GB.ISO8859-1' to 'cy_GB.ISO8859-14' or 'tg_TJ.KOI8-C' to 'tg_TJ.KOI8-T' or several of the moves from ISO encodings to UTF-8). Is there some reference for why glibc chose different values than X.org for these?

I also don't understand why some "xx.utf-8" locale mappings were removed - I don't think we should remove those, unless they are no longer needed due to some other logic implying these mappings.

Since these are major changes, we need an appropriate warning in the NEWS file (and the "What's New" document), an update of the top comment (under "### Database") to mention that the glibc database takes precedence and where to find it,
msg289176 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) - Date: 2017-03-07 17:23
> 'cy_GB.ISO8859-1' to 'cy_GB.ISO8859-14'

This looks like just fixing an error. The default West European ISO8859-1 is changed to the Celtic ISO8859-14, which looks like a better option for Welsh.

> 'tg_TJ.KOI8-C' to 'tg_TJ.KOI8-T'

KOI8-C is not supported by Python, but KOI8-T is. I don't know what KOI8-C means; there are several rarely used, incompatible encodings with this name.

> I also don't understand why some "xx.utf-8" locale mappings were removed - I don't think we should remove those, unless they are no longer needed due to some other logic implying these mappings.

The aliases table is a table of exceptions. The removed entries are no longer exceptional.
msg289179 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) - Date: 2017-03-07 18:29
On 07.03.2017 18:23, Serhiy Storchaka wrote:
>> 'cy_GB.ISO8859-1' to 'cy_GB.ISO8859-14'
>
> Looks as just fixing an error. The default West-European ISO8859-1 is changed to Celtic cy_GB.ISO8859-14. This looks better option for Welsh.
>
>> 'tg_TJ.KOI8-C' to 'tg_TJ.KOI8-T'
>
> KOI8-C is not supported by Python, but KOI8-T is supported. I don't know what KOI8-C means, there are several rarely used incompatible encodings with this name.

While all this may make sense, I'm missing some more reasoning behind the differences between X.org and glibc.

This change also looks strange:

    -    'ka_ge':                                'ka_GE.GEORGIAN-ACADEMY',
    +    'ka_ge':                                'ka_GE.GEORGIAN_PS',
         'ka_ge.georgianacademy':                'ka_GE.GEORGIAN-ACADEMY',
         'ka_ge.georgianps':                     'ka_GE.GEORGIAN-PS',
         'ka_ge.georgianrs':                     'ka_GE.GEORGIAN-ACADEMY',

Why is GEORGIAN_PS written with an underscore whereas the other mappings use dashes?

Or this one:

    -    'fi_fi':                                'fi_FI.ISO8859-15',
    +    'fi_fi':                                'fi_FI.ISO8859-1',

Why would a locale switch away from an encoding having the Euro sign to one without it?

Or why is this latin variant removed:

    -    'nan_tw@latin':                         'nan_TW.UTF-8@latin',

Why should Russians switch back to ISO?

    -    'ru_ru':                                'ru_RU.UTF-8',
    +    'ru_ru':                                'ru_RU.ISO8859-5',

or from ISO to KOI?

    -    'russian':                              'ru_RU.ISO8859-5',
    +    'russian':                              'ru_RU.KOI8-R',

The more I look at these changes, the more I believe we should not simply take everything we find in the files for granted. They obviously both have bugs.

>> I also don't understand why some "xx.utf-8" locale mappings were removed - I don't think we should remove those, unless they are no longer needed due to some other logic implying these mappings.
>
> The aliases table is a table of exceptions. Removed entries no longer are exceptional.

It's not a table of exceptions, it's a table mapping commonly used locale settings to ones which the lib C understands :-)

But regardless, I checked the code and it is already smart enough to convert lib C incompatible spellings such as "utf8" to "UTF-8", so these entries can indeed be removed, but only if the locale is otherwise listed.

In some cases, it's probably better to drop the ".utf8" to have more generic mappings, e.g.

    +    'bhb_in.utf8':                          'bhb_IN.UTF-8',

or

         'de_li.utf8':                           'de_LI.UTF-8',

though I'd expect that mapping to be:

         'de_li':                           'de_LI.ISO8859-1',

as for all other "de" entries.
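The spelling conversion mentioned above can be observed through `locale.normalize()`. This is a small sketch, assuming a CPython 3 standard library; the exact contents of the alias table differ between Python versions, so the snippet prints the results rather than promising particular strings.

```python
import locale

# Different spellings of the same encoding should normalize to one
# libc-style locale name. The exact result depends on the alias table
# shipped with the running Python version.
spellings = ("en_US.utf8", "en_US.UTF-8", "en_us.utf-8")
normalized = {name: locale.normalize(name) for name in spellings}

for name, result in normalized.items():
    print(f"{name:<12} -> {result}")
```

`normalize()` only consults Python's own tables; it does not call setlocale() and does not check whether the resulting locale is installed.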
msg289205 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) - Date: 2017-03-08 06:27
Why is the X11 locale alias map used at all? It seems like it can only create confusion with libc.
msg289210 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) - Date: 2017-03-08 07:20
Not all platforms use glibc 2.24 as libc.

Ideally, most of the entries should not even exist. We should ask libc for the default encoding if it is not included in the locale name. The aliases table should be used only for mapping commonly used locale names that libc does not support to ones that it does support.
msg289222 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) - Date: 2017-03-08 09:25
On 08.03.2017 08:20, Serhiy Storchaka wrote:
> Not all platforms use glibc 2.24 as libc.

True. Many don't even use glibc.

> Ideally most of entries should even not exist. We should ask libc for the default encoding if it is not included in the locale name. The aliases table should be used only for mapping commonly used but unsupported by libc locales to supported by libc locales.

I think you have a wrong understanding of what this alias table is used for: we need it to determine the lib C compatible locale name without using lib C APIs such as setlocale(), since these are not thread safe and have side-effects for the whole process.

The alias table is there to avoid having to go to the lib C to ask it indirectly for more details. Unfortunately, there are no cross-platform lib C APIs which would allow querying these details without also changing the locale settings of the process.

I know that Python still plays the usual "save current locale, run setlocale(), revert to previous locale" trick in a couple of places and this works if Python is the only thread running, but it doesn't when embedded into other applications.

Regarding the patch: we cannot simply use the output from the script to set new values. The changes have to be manually reviewed as well.

E.g. this entry in the table is clearly a typo:

    'en_zw.utf8':                           'en_ZS.UTF-8',

(it should read en_ZW.UTF-8)

This entry appears wrong as well:

    'eo':                                   'eo_XX.ISO8859-3',

(XX is not a valid country ISO code)

How should we go about this? Mark all the problems in the PR?
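The "save current locale, run setlocale(), revert to previous locale" trick mentioned above can be written as a context manager. This is a sketch only: `temporary_locale` is a hypothetical helper name, and, as the message points out, the approach is not thread safe because setlocale() changes state for the whole process.

```python
import contextlib
import locale


@contextlib.contextmanager
def temporary_locale(category, name):
    """Temporarily switch `category` to locale `name`, then restore it.

    Not thread safe: setlocale() affects the entire process.
    """
    saved = locale.setlocale(category)  # query only, changes nothing
    try:
        yield locale.setlocale(category, name)
    finally:
        locale.setlocale(category, saved)  # revert to the previous locale


previous = locale.setlocale(locale.LC_CTYPE)
with temporary_locale(locale.LC_CTYPE, "C") as active:
    print("inside the block:", active)
print("after the block: ", locale.setlocale(locale.LC_CTYPE))
```

The "C" locale is used here because it is always available; any other name raises locale.Error if that locale is not installed.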
msg289223 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) - Date: 2017-03-08 09:37
The problem is that the table can give incorrect results on non-Linux platforms (or on Linux with an old glibc).
msg289231 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) - Date: 2017-03-08 11:35
On 08.03.2017 07:27, Benjamin Peterson wrote:
> Why is the X11 locale alias map used at all? It seems like it can only create confusion with libc.

Because it was the only such maintained mapping available at the time. It's also used for the X.org system, which has a rather strong focus on user interfaces where locales matter a lot, unlike the lib C :-)
msg289232 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) - Date: 2017-03-08 11:48
On 08.03.2017 10:37, Serhiy Storchaka wrote:
> The problem is that that table can get incorrect result for non-Linux platforms (or for Linux with old glibc).

Sure, it's a best effort approach.

Also note that on today's systems you often don't have the full set of locales available anymore - instead these have to either be installed separately or generated on the target system.

Our locale database works on all these systems, regardless of what's installed or not.
msg289242 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) - Date: 2017-03-08 15:33
Why was the PR merged while we were still discussing it?
msg289277 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) - Date: 2017-03-09 07:15
"eo_XX" is just something that appears in the X11 locale.alias file. My change doesn't add that; it was already there. (for Esperanto, which I suppose explains the "XX")

Most of the changes you identify are the glibc aliases taking precedence over the X11 ones, e.g. glibc has "fi_FI ISO-8859-1" while the X11 locale list has "fi_FI.ISO8859-15". That seems correct to me as far as the intent of this change is concerned.

How do you propose to pick and choose what we use from the X11 locale alias list?
msg289282 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) - Date: 2017-03-09 10:23
On 09.03.2017 08:15, Benjamin Peterson wrote:
> "eo_XX" is just something that appears in the X11 locale.alias file. My change doesn't add that; it was already there. (for Esperanto, which I suppose explains the "XX")

Yes, I know. That was an example of a bug in the X.org list.

> Most of the changes you identify the glibc aliases taking precedence over the X11 ones. e.g., glibc has "fi_FI ISO-8859-1" while the X11 locale list has "fi_FI.ISO8859-15". That seems correct to me as far as the intent of this change is concerned.

No, it's not correct. ISO-8859-1 is the older version of Latin-1 without the Euro sign. ISO8859-15 adds it.

> How do you propose to pick and choose what we use from the X11 locale alias list?

We have to go through the list one by one to check whether the mapping update makes sense and is correct.

This will be difficult in a few cases where the glibc mapping switches to UTF-8 from an ISO encoding. We'll have to find evidence that this change does indeed make sense.

My take on this is that the X.org folks know better than the glibc folks, since the former have to deal with end users that rely on the locale settings a lot more than applications using glibc for getting an initial locale setting right.

Also note that you are parsing the SUPPORTED file from glibc (in slightly processed form):

https://github.com/bminor/glibc/blob/master/localedata/SUPPORTED

This file does not provide a locale alias mapping as the routine in makelocalealias.py suggests. Instead it's a list of locales to install by default:

https://github.com/bminor/glibc/blob/73dfd088936b9237599e4ab737c7ae2ea7d710e1/localedata/Makefile

In glibc you can define both the locale and the encoding separately when creating a locale using localedef and the file simply provides the default parameters to pass to this tool.

As such, I don't see how you can derive a default alias meaning from the file.

It's simply an indication of what glibc would have installed in case it were installed from source, but that's hardly ever the case. On today's systems only a bare subset of locales is installed and more added as necessary, so you rarely have all the locales defined in SUPPORTED installed on a system.

So the file doesn't even provide a hint at what could be installed on the system ("locale -a" gives you that list).

Here's the history:

https://github.com/bminor/glibc/commits/master/localedata/SUPPORTED

It's merely a list of additions and removals from the default set. Nothing more. It does provide a list of known and supported locales, but no usable or authoritative encoding information (locales are defined using Unicode, so the encoding is a parameter and not predefined).

Overall, I believe the file is pretty useless to use as basis for an alias table providing encoding information. It may provide some ideas for corrections, but should not override the X.org one by default.

On the other hand, you have the locale.alias master file:

https://cgit.freedesktop.org/xorg/lib/libX11/tree/nls/locale.alias.pre

together with the history of why changes were made and when. This is an authoritative resource and people are making changes against it from the user perspective.

I'd suggest making the override optional in makelocalealias.py via a command line switch and using this for manually adding or fixing X.org entries.

If you absolutely want to parse the glibc file per default as well, please only let it add new entries, not override existing ones. As we've seen in the patch, those overrides need to be carefully reviewed.
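The "add new entries, not override existing ones" policy suggested above could look roughly like this. It is a sketch under stated assumptions: `merge_aliases` and its `override` flag are hypothetical and do not describe the actual makelocalealias.py, and the sample values (in particular the encoding shown for en_il) are illustrative.

```python
def merge_aliases(x11, glibc, override=False):
    """Merge glibc-derived entries into the X.org-derived table.

    With override=False, glibc may only add aliases that X.org lacks.
    Returns the merged table and a sorted list of conflicting aliases,
    which are the entries that would need manual review.
    """
    merged = dict(x11)
    conflicts = []
    for alias, value in glibc.items():
        if alias not in merged:
            merged[alias] = value          # new entry: always accepted
        elif merged[alias] != value:
            conflicts.append(alias)        # disagreement: flag it
            if override:
                merged[alias] = value
    return merged, sorted(conflicts)


# Illustrative sample data, not the real tables.
x11 = {"fi_fi": "fi_FI.ISO8859-15", "en_in": "en_IN.ISO8859-1"}
glibc = {"fi_fi": "fi_FI.ISO8859-1", "en_il": "en_IL.UTF-8"}

merged, conflicts = merge_aliases(x11, glibc)
print(merged)
print("needs review:", conflicts)
```

Returning the conflicts separately keeps the automatic step conservative while still surfacing every disagreement for a reviewer.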
msg289283 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) - Date: 2017-03-09 10:41
> Why is the X11 locale alias map used at all? It seems like it can only create confusion with libc.

Originally only the X11 locale alias map was used. Support for the glibc locale alias map was added 2.5 years ago (issue20079).
msg289284 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) - Date: 2017-03-09 10:47
The SUPPORTED file from glibc is used for determining the default encoding for locales that don't include it explicitly. For example en_IN uses UTF-8 rather than ISO8859-1.
msg289286 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) - Date: 2017-03-09 12:42
On 09.03.2017 11:47, Serhiy Storchaka wrote:
> The SUPPORTED file from glibc is used for determining the default encoding  for locales that don't include it explicitly. For example en_IN uses UTF-8 rather than ISO8859-1.

No, the glibc locales don't say anything about default encodings used in a locale:

http://manpages.ubuntu.com/manpages/wily/en/man5/locale.5.html

These encodings are just used for determining the default set of locale.encoding variants to install on the system, nothing more:

https://github.com/bminor/glibc/blob/73dfd088936b9237599e4ab737c7ae2ea7d710e1/localedata/Makefile#L204

glibc does have a locale.alias file:

https://github.com/bminor/glibc/blob/73dfd088936b9237599e4ab737c7ae2ea7d710e1/intl/locale.alias

which uses the X.org format, but this is completely out of date and declared obsolete.

Serhiy: If you believe that there's anything authoritative about the glibc SUPPORTED file in terms of defining the commonly used encoding in a locale, please provide references. These should also clarify why the glibc encoding is the correct one compared to the X.org mapping.

It doesn't help, trying to interpret things into such build files. We need a database that is being actively maintained and has a track record of representing what people actually use in their locales. The only one I know is the X.org one.
msg289290 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) - Date: 2017-03-09 13:07
The original issue is issue29571. The locale module returned the encoding ISO8859-1 for locale en_IN (as in the X11 locale alias map), but glibc uses UTF-8 (as in the glibc SUPPORTED file).
msg289340 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) - Date: 2017-03-10 07:37
Do you believe this program should work?

    import locale, os
    for l in open("/usr/share/i18n/SUPPORTED"):
        alias, encoding = l.strip().split()
        locale.setlocale(locale.LC_ALL, alias)
        try:
            enc = locale.getlocale()[1]
        except ValueError:
            continue # not in table
        normalized = enc.replace("ISO", "ISO-"). \
                         replace("_", "-"). \
                         replace("euc", "EUC-"). \
                         replace("big5", "big5-").upper()
        assert normalized == locale.nl_langinfo(locale.CODESET)

After my change it does: the encoding returned from getlocale() is the one actually being used by glibc. It fails dramatically on earlier versions of Python (for example on the en_IN example from #29571.) I don't understand why Python needs to editorialize whatever choices libc or the system administrator has made.

Is getlocale() expected to return something different from the underlying C locale?

In fact, why have this table at all instead of using nl_langinfo to return the encoding for the current locale?
msg289377 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) - Date: 2017-03-10 15:29
On 10.03.2017 08:37, Benjamin Peterson wrote:
> Do you believe this program should work?
>
> import locale, os
> for l in open("/usr/share/i18n/SUPPORTED"):
>     alias, encoding = l.strip().split()
>     locale.setlocale(locale.LC_ALL, alias)
>     try:
>         enc = locale.getlocale()[1]
>     except ValueError:
>         continue # not in table
>     normalized = enc.replace("ISO", "ISO-"). \
>                      replace("_", "-"). \
>                      replace("euc", "EUC-"). \
>                      replace("big5", "big5-").upper()
>     assert normalized == locale.nl_langinfo(locale.CODESET)
>
> After my change it does: the encoding returned from getlocale() is the one actually being used by glibc. It fails dramatically on earlier versions of Python (for example on the en_IN example from #29571.) I don't understand why Python needs to editorialize whatever choices libc or the system administrator has made.

Your program essentially tests what alias is configured on your particular system. It will fail on older systems (with a different or no version of SUPPORTED), it will fail on systems that do not have all locales installed, it will fail on systems that use the X.org aliases table as basis rather than some list of supported locales of glibc, or custom alias tables.

What we want in Python is a consistent mapping of aliases to locales across all (Unix based) Python installations, just like what we have for encoding aliases and those mappings should be taken from a support alias database, not a list of default installations on some glibc version.

Also note that a lot of these discussions are really academic, since locales should always be specified with encoding.

While Unix gravitates to UTF-8 for all system related things, users still use other encodings a lot for their daily operations, as you can see in the X.org aliases file.

This is why defaulting to UTF-8 for locales (as e.g. is done for many locales in the glibc default installs) is not a good idea. Locales affect user work products. What's fine for command line interfacing or piping is not necessarily fine for e.g. documents created by users.

So to answer your question: No, I don't believe that SUPPORTED has any authority for our purposes and thus don't think that the program can be considered a valid test case.

The SUPPORTED file can serve as an extra resource for fixing bugs in the table, but nothing more.

> Is getlocale() expected to return something different from the underlying C locale?

getlocale() will return whatever is currently configured via setlocale().

Of course, it can return something different from what some glibc SUPPORTED lists as default installation encoding, if you don't provide the encoding when using setlocale(), but it will always default to the same locale and encoding on all platforms where you run Python.

> In fact, why have this table at all instead of using nl_langinfo to return the encoding for the current locale?

The table is meant to normalize locale names and enrich them with default encodings from a well known database of such aliases, where necessary. As mentioned above the locale setting should ideally include the encoding as well, so that any such guesses are not necessary.

Regarding nl_langinfo():

nl_langinfo() will only work if you have called setlocale() already, since a process always starts up in the C locale without this call.

If you don't have a problem with calling setlocale() for testing the default locale settings (e.g. Python is not embedded, you don't have other threads running, no APIs which use locale information called yet, setlocale() was already called to setup the locale, etc.), you can use the approach taken by getpreferredencoding(), which is to temporarily set the locale to the default.

Going forward, I think that the following changes make sense:

* from ISO8859-1 to ISO8859-15 (the -15 version adds the Euro sign)
* casing changes e.g. 'zh_CN.gb2312' to 'zh_CN.GB2312'
* fixes which undo removal of modifiers such as 'uz_uz@cyrillic' -> 'uz_UZ.UTF-8' to 'uz_UZ.UTF-8@cyrillic'

As for the other changes: please undo them and also revert the unconditional use of glibc mappings overriding the X.org ones, as mentioned earlier in the thread.

We can readd some of the modifications later on if there's evidence that they actually do make sense.

Thanks,
--
Marc-Andre Lemburg
eGenix.com
msg289386 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) - Date: 2017-03-10 16:01
I feel there is something wrong with the current locale design. See issue504219, issue10466, issue20088, issue25191, issue29571.
msg289439 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) - Date: 2017-03-11 07:55
I'm still confused about what getlocale() is supposed to do. Why do we attempt to return an encoding anyway if the underlying setlocale call doesn't return one? Is getlocale() not supposed to be a simple wrapper over the C locale? If not, how is one supposed to get the encoding associated with the C locale?

The old alias table code meant that the encoding returned from getlocale() could be related to or completely unrelated to the actual C locale. Misunderstanding this results in issues like #29571.
msg289787 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) - Date: 2017-03-17 21:30
The main purpose of the alias table is to support normalization and this is used for getdefaultencoding() which was created to be able to determine the default encoding based on what X.org uses as default without doing temporary setlocale() tricks.

Now, normalization also happens when passing a locale value to the underlying setlocale(), mainly to avoid many common bugs due to setlocale() being extremely picky about the locale value. A side effect of this is that normalization will also kick in to add the encoding in case no encoding is given in the parameter.

Note that no normalization is necessary to simply set the default locale configured on the system. In such a case, you'd run setlocale('LC_ALL') and get what's configured.

If you run the lib C setlocale() with a locale without encoding, the encoding used depends entirely on what's configured on the system. The SUPPORTED file only gives a hint at what glibc thinks it should install per default, but any admin or distributor could change these settings simply by running localedef with some other encoding (charmap in locale speak).

I suppose that we could resolve some of the confusion by adding a parameter to disable this normalization in setlocale().
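The side effect described above, where normalization supplies an encoding when none was given, can be seen with `locale.normalize()` alone. This is a small sketch, assuming a CPython 3 standard library; which encoding gets added depends on the alias table of the running Python version, and "xx_YY" is a made-up name used only to show the fallback.

```python
import locale

bare = "en_US"                     # no encoding in the name
with_encoding = locale.normalize(bare)
print(bare, "->", with_encoding)   # the alias table supplies an encoding

# A name the alias table does not know is returned unchanged.
unknown = "xx_YY"                  # hypothetical, not a real locale
unchanged = locale.normalize(unknown)
print(unknown, "->", unchanged)
```

Nothing here touches the process locale: `normalize()` is a pure table lookup, so the encoding it reports may differ from the one libc would actually use for the bare name.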
msg290131 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) - Date: 2017-03-24 20:18
New changeset df8280838f52d6ec45ba03ef734b0dec8a9c43fb by Benjamin Peterson in branch 'master':
bpo-20087: Revert "make the glibc alias table take precedence over the X11 one (#422)" (#713)
https://github.com/python/cpython/commit/df8280838f52d6ec45ba03ef734b0dec8a9c43fb
msg290273 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) - Date: 2017-03-24 22:42
New changeset 02371e0ed1ee82ec73e7d363bcf2ed40cde1397a by Benjamin Peterson in branch 'master':
make the glibc alias table take precedence over the X11 one (#422)
https://github.com/python/cpython/commit/02371e0ed1ee82ec73e7d363bcf2ed40cde1397a
msg316214 - Author: Licht Takeuchi (licht-t) * - Date: 2018-05-05 13:41
Hi all,

The locale list in the latest Ubuntu 18.04 contains en_IL as a valid locale, but Python cannot resolve it. This causes a test failure in pandas:
https://github.com/pandas-dev/pandas/issues/20957

en_IL has significant impact because it is an English locale and is now supported in the latest Ubuntu. Is there any plan to add only en_IL?
(Note that I've already created the PR: https://github.com/python/cpython/pull/6707)

```
(pandas-dev) [pandas] locale -a
C
C.UTF-8
en_AG
en_AG.utf8
en_AU.utf8
en_BW.utf8
en_CA.utf8
en_DK.utf8
en_GB.utf8
en_HK.utf8
en_IE.utf8
en_IL
en_IL.utf8
en_IN
en_IN.utf8
en_NG
en_NG.utf8
en_NZ.utf8
en_PH.utf8
en_SG.utf8
en_US.utf8
en_ZA.utf8
en_ZM
en_ZM.utf8
en_ZW.utf8
ja_JP.utf8
POSIX
```
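One way to spot names like en_IL ahead of time is to compare the bare names reported by `locale -a` against Python's alias table. This is a rough sketch: `unknown_to_alias_table` is a hypothetical helper, the sample list is a hand-picked subset of the output above plus one made-up name, and whether en_IL is reported depends on the Python version in use.

```python
import locale


def unknown_to_alias_table(names):
    """Return the bare locale names (no encoding part) that have no
    entry in locale.locale_alias, compared case-insensitively."""
    return [
        name for name in names
        if "." not in name and name.lower() not in locale.locale_alias
    ]


# Subset of the `locale -a` output above, plus one made-up name
# ("xx_YY") that no alias table should contain.
installed = ["en_AG", "en_IL", "en_IN", "en_US.utf8", "xx_YY"]
missing = unknown_to_alias_table(installed)
print(missing)
```

Names that already carry an encoding are skipped, since the alias table is mainly needed to pick an encoding for bare names.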
msg316216 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) - Date: 2018-05-05 15:28
Benjamin's patch did two things: 1) made the glibc alias table take precedence over the X11 one; 2) updated the alias mapping with the new glibc. The first part is controversial, but updating the alias mapping with a new glibc is done regularly.

PR 6708 updates it with glibc 2.27. This adds 39 new aliases and fixes issue32781 and issue33432.
msg316224 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) - Date: 2018-05-06 05:46
New changeset cedc9b74202d8c1ae39bca261cbb45d42ed54d45 by Serhiy Storchaka in branch 'master':
bpo-20087: Update locale alias mapping with glibc 2.27 supported locales. (GH-6708)
https://github.com/python/cpython/commit/cedc9b74202d8c1ae39bca261cbb45d42ed54d45
msg316226 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) - Date: 2018-05-06 07:20
New changeset 6049bda21b607acc90bbabcc604997e794e8aee1 by Serhiy Storchaka (Miss Islington (bot)) in branch '3.7':
[3.7] bpo-20087: Update locale alias mapping with glibc 2.27 supported locales. (GH-6708) (GH-6713)
https://github.com/python/cpython/commit/6049bda21b607acc90bbabcc604997e794e8aee1
msg316227 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) - Date: 2018-05-06 07:20
New changeset b1c70d0ffbb235def1deab62a744ffd9b5253924 by Serhiy Storchaka (Miss Islington (bot)) in branch '3.6':
[3.6] bpo-20087: Update locale alias mapping with glibc 2.27 supported locales. (GH-6708) (GH-6714)
https://github.com/python/cpython/commit/b1c70d0ffbb235def1deab62a744ffd9b5253924
msg316228 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) - Date: 2018-05-06 07:51
New changeset a55ac801f749a731250f3c7c1db7d546d22ae032 by Serhiy Storchaka in branch '2.7':
[2.7] bpo-20087: Update locale alias mapping with glibc 2.27 supported locales. (GH-6708). (GH-6717)
https://github.com/python/cpython/commit/a55ac801f749a731250f3c7c1db7d546d22ae032
msg316234 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) - Date: 2018-05-06 11:38
Thanks, Serhiy.
msg387148 - Author: Marc-Andre Lemburg (lemburg) * (Python committer) - Date: 2021-02-17 12:12
I believe we can close this old issue.

The discussion was certainly a useful one. I guess we should stop updating the alias table automatically and instead add new aliases or change existing ones based on more research and using the X11 files as well as glibc and other resources to help.
History
Date                 User               Action  Args
2022-04-11 14:57:56  admin              set     github: 64286
2021-02-17 12:12:03  lemburg            set     status: open -> closed; resolution: fixed; messages: +msg387148; stage: patch review -> resolved
2019-03-05 15:26:32  vstinner           unlink  issue29571 dependencies
2018-05-06 11:38:05  lemburg            set     messages: +msg316234
2018-05-06 07:51:51  serhiy.storchaka   set     messages: +msg316228
2018-05-06 07:20:44  serhiy.storchaka   set     messages: +msg316227
2018-05-06 07:20:14  serhiy.storchaka   set     messages: +msg316226
2018-05-06 07:19:19  serhiy.storchaka   set     pull_requests: +pull_request6409
2018-05-06 05:49:25  miss-islington     set     pull_requests: +pull_request6407
2018-05-06 05:47:25  miss-islington     set     pull_requests: +pull_request6406
2018-05-06 05:46:24  serhiy.storchaka   set     messages: +msg316224
2018-05-05 15:28:51  serhiy.storchaka   set     messages: +msg316216
2018-05-05 15:22:57  serhiy.storchaka   set     keywords: +patch; stage: test needed -> patch review; pull_requests: +pull_request6401
2018-05-05 13:41:00  licht-t            set     nosy: +licht-t; messages: +msg316214
2018-05-05 12:42:29  serhiy.storchaka   link    issue33432 superseder
2018-02-15 10:03:40  serhiy.storchaka   link    issue32781 superseder
2017-03-24 22:42:45  benjamin.peterson  set     messages: +msg290273
2017-03-24 20:18:34  benjamin.peterson  set     messages: +msg290131
2017-03-19 06:18:12  benjamin.peterson  set     pull_requests: +pull_request633
2017-03-18 06:31:35  serhiy.storchaka   set     pull_requests: -pull_request602
2017-03-17 21:30:17  lemburg            set     messages: +msg289787
2017-03-17 21:00:34  larry              set     pull_requests: +pull_request602
2017-03-11 07:55:04  benjamin.peterson  set     messages: +msg289439
2017-03-10 16:01:12  serhiy.storchaka   set     messages: +msg289386
2017-03-10 15:29:14  lemburg            set     messages: +msg289377
2017-03-10 07:37:04  benjamin.peterson  set     messages: +msg289340
2017-03-09 13:07:38  serhiy.storchaka   set     messages: +msg289290
2017-03-09 12:42:38  lemburg            set     messages: +msg289286
2017-03-09 10:47:43  serhiy.storchaka   set     messages: +msg289284
2017-03-09 10:41:55  serhiy.storchaka   set     messages: +msg289283
2017-03-09 10:23:28  lemburg            set     messages: +msg289282
2017-03-09 07:15:30  benjamin.peterson  set     messages: +msg289277
2017-03-08 15:33:41  lemburg            set     messages: +msg289242
2017-03-08 11:48:32  lemburg            set     messages: +msg289232
2017-03-08 11:35:17  lemburg            set     messages: +msg289231
2017-03-08 09:37:13  serhiy.storchaka   set     messages: +msg289223
2017-03-08 09:25:21  lemburg            set     messages: +msg289222
2017-03-08 07:20:53  serhiy.storchaka   set     messages: +msg289210
2017-03-08 06:27:06  benjamin.peterson  set     nosy: +benjamin.peterson; messages: +msg289205
2017-03-07 18:29:01  lemburg            set     messages: +msg289179
2017-03-07 17:23:04  serhiy.storchaka   set     messages: +msg289176
2017-03-07 16:53:07  lemburg            set     messages: +msg289174
2017-03-06 09:41:12  serhiy.storchaka   link    issue29571 dependencies
2017-03-06 09:39:59  serhiy.storchaka   set     stage: test needed; versions: + Python 3.6, Python 3.7, - Python 3.4
2017-03-03 08:06:38  serhiy.storchaka   set     messages: +msg288871
2017-03-03 07:48:57  benjamin.peterson  set     pull_requests: +pull_request351
2014-09-30 15:45:29  serhiy.storchaka   set     versions: + Python 3.5, - Python 3.3
2013-12-28 20:46:45  Arfrever           set     nosy: +Arfrever
2013-12-28 09:29:49  serhiy.storchaka   create