Re: [Python-Dev] #pragmas in Python source code

Andrew M. Kuchling <akuchlin@mems-exchange.org>
Fri, 14 Apr 2000 15:37:01 -0400 (EDT)


Fredrik Lundh writes:
>    if the programmer wants to convert between a unicode
>    string and a buffer containing encoded text, she needs
>    to spell it out.  the codecs are never called "under the
>    hood"

Watching the successive weekly Unicode patchsets, each one fixing some
obscure corner case that turned out to be buggy -- '%s' % ustr,
concatenating literals, int()/float()/long(), comparisons -- I'm
beginning to agree with Fredrik.  Automatically making Unicode strings
and regular strings interoperate looks like it requires many changes
all over the place, and I worry if it's possible to catch them all in
time.

Maybe we should consider being more conservative, and just having the
Unicode built-in type, the unicode() built-in function, and the u"..."
notation, and then leaving all responsibility for conversions up to
the user.  On the other hand, *some* default conversion seems needed,
because it seems draconian to make open(u"abcfile") fail with a
TypeError.

(While I want to see Python 1.6 expedited, I'd also not like to see it
saddled with a system that proves to have been a mistake, or one
that's a maintenance burden.  If forced to choose between delaying and
getting it right, the latter wins.)

>why not just assume that the *ENTIRE SOURCE FILE* uses a single
>encoding, and let the tokenizer (or more likely, a conversion stage
>before the tokenizer) convert the whole thing to unicode.

To reinforce Fredrik's point here, note that XML only supports
encodings at the level of an entire file (or external entity). You
can't tell an XML parser that a file is in UTF-8, except for this one
element whose contents are in Latin1.

-- 
A.M. Kuchling
http://starship.python.net/crew/amk/
Dream casts a human shadow, when it occurs to him to do so.
  -- From SANDMAN: "Season of Mists", episode 0
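
As a rough illustration of the two ideas discussed above (explicit conversions only, and one encoding for an entire source file), here is a minimal sketch. It assumes modern Python 3 syntax rather than the Python 1.6 API under discussion, and the sample strings and variable names are made up for the example.

```python
# Sketch only: Python 3 syntax, not the Python 1.6 API discussed in the thread.

text = "caf\u00e9"            # a Unicode string
data = text.encode("utf-8")   # the programmer spells out the conversion

# Mixing the two types without naming a codec is rejected,
# i.e. no codec is called "under the hood".
try:
    text + data
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Whole-file decoding: a single encoding is applied to the entire
# source buffer in a conversion stage, before any tokenizing happens.
source_bytes = 's = "caf\u00e9"\n'.encode("latin-1")  # hypothetical file contents
source_text = source_bytes.decode("latin-1")
```

Under these assumptions the concatenation raises TypeError, while the explicit encode and decode calls round-trip the text unchanged.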

