
This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: Improved check-localedef script

From: Mike FABIAN <mfabian at redhat dot com>
To: Rafal Luzynski <digitalfreak at lingonborough dot com>
Cc: Zack Weinberg <zackw at panix dot com>, GNU C Library <libc-alpha at sourceware dot org>
Date: Fri, 04 Aug 2017 11:50:00 +0200
Subject: Re: Improved check-localedef script
Authentication-results: sourceware.org; auth=none
Authentication-results: ext-mx01.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com
Authentication-results: ext-mx01.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=mfabian at redhat dot com
Dmarc-filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 587CB8124F
References: <CAKCAbMjLN7SMWwveXVokSCttqso+r+1AttpFEpDBdJcSyiuQ4Q@mail.gmail.com> <s9d60e3bspn.fsf@redhat.com> <26692227.553011.1501838716734@poczta.nazwa.pl>

Rafal Luzynski <digitalfreak@lingonborough.com> wrote:

> 4.08.2017 11:14 Mike FABIAN <mfabian@redhat.com> wrote:
>> But even though U+20AC cannot be converted to ISO-8859-1, the
>> ca_ES.ISO-8859-1 locale still works because it is transliterated:
>>
>> $ LC_ALL=ca_ES locale -k currency_symbol charmap
>> currency_symbol="EUR"
>> charmap="ISO-8859-1"
>>
>> So this does not cause an actual problem.
>
> So the "€" character is actually representable in ISO-8859-1 because
> we can convert it to "EUR".  Looks like a false positive then.

Yes.

>> The ca_ES source file is not ASCII, it has
>>
>> % català
>> lang_name "<U0063><U0061><U0074><U0061><U006C><U00E0>"
>>
>> So maybe I could just convert the file to UTF-8
>> and change “% Charset: ISO-8859-1” into “% Charset: UTF-8”
>> to get rid of the check-localedef warning.
>>
>> Would that be OK?
>
> I think that no, it's not OK.  If I understand correctly the
> "source file is ASCII" sentence means that the individual characters:
> '<', '2', '0', 'A', 'C', '>' are ASCII.

Yes.

> They may describe something more complex like <U00E0>.  But even this
> is not UTF-8 because UTF-8 would be <C3> <A0> (UTF-8 is 8-bit).  The
> closest charset would be UCS-2 or simply a generic Unicode.

My understanding at the moment is that the “% Charset: ...” comment
indicates the encoding used to write the source file. So something like
“<U20AC>” is definitely ASCII. Non-ASCII stuff in locale source files
seems to exist only in comments at the moment.

-- 
Mike FABIAN <mfabian@redhat.com>
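
The distinction discussed above (the bytes of the source file versus the characters that the <Uxxxx> names stand for) can be illustrated with a minimal Python sketch. It is not part of check-localedef or glibc; the helper names expand_ucs_names and representable are hypothetical and exist only for this illustration, and the sketch does not model glibc's transliteration of "€" to "EUR".

```python
import re

# Hypothetical pattern for the <Uxxxx> symbolic names used in locale sources.
_UCS_NAME = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def expand_ucs_names(text: str) -> str:
    """Replace each <Uxxxx> name with the Unicode character it denotes."""
    return _UCS_NAME.sub(lambda m: chr(int(m.group(1), 16)), text)


def representable(ch: str, charset: str) -> bool:
    """Return True if ch can be encoded directly in charset (no transliteration)."""
    try:
        ch.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# The lang_name line from the ca_ES source, exactly as quoted in the message.
source_line = 'lang_name "<U0063><U0061><U0074><U0061><U006C><U00E0>"'

# The line as written consists only of ASCII characters ...
is_ascii_source = source_line.isascii()

# ... even though it describes a string containing a non-ASCII character.
expanded = expand_ucs_names(source_line)

# U+00E0 encoded as UTF-8 is the two bytes C3 A0.
a_grave_utf8 = expand_ucs_names("<U00E0>").encode("utf-8")

# U+20AC has no direct ISO-8859-1 encoding, while U+00E0 does.
euro_in_latin1 = representable(expand_ucs_names("<U20AC>"), "iso-8859-1")
a_grave_in_latin1 = representable(expand_ucs_names("<U00E0>"), "iso-8859-1")
```

Under these assumptions, is_ascii_source is true for the lang_name line, and the only non-ASCII text in the quoted ca_ES excerpt is the "% català" comment, which matches the observation in the message.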

Follow-Ups:
- Re: Improved check-localedef script
  - From: Rafal Luzynski

References:
- Improved check-localedef script
  - From: Zack Weinberg
- Re: Improved check-localedef script
  - From: Mike FABIAN
- Re: Improved check-localedef script
  - From: Rafal Luzynski
