[Python-Dev] Unicode input issues
M.-A. Lemburg <mal@lemburg.com>
Mon, 10 Apr 2000 17:32:17 +0200
Guido van Rossum wrote:
>
> Thinking about entering Japanese into raw_input() in IDLE more, I
> thought I figured a way to give Takeuchi a Unicode string when he
> enters Japanese characters.
>
> I added an experimental patch to the readline method of the PyShell
> class: if the line just read, when converted to Unicode, has fewer
> characters but still compares equal (and no exceptions happen during
> this test) then return the Unicode version.
>
> This doesn't currently work because the built-in raw_input() function
> requires that the readline() call it makes internally returns an 8-bit
> string. Should I relax that requirement in general? (I could also
> just replace __builtin__.[raw_]input with more liberal versions
> supplied by IDLE.)
>
> I also discovered that the built-in unicode() function is not
> idempotent: unicode(unicode('a')) returns u'\000a'. I think it should
> special-case this and return u'a' !

Good idea. I'll fix this in the next round.

> Finally, I believe we need a way to discover the encoding used by
> stdin or stdout. I have to admit I know very little about the file
> wrappers that Marc wrote -- is it easy to get the encoding out of
> them?

I'm not sure what you mean: the name of the input encoding ?
Currently, only the names of the encoding and decoding functions
are available to be queried.

> IDLE should probably emulate this, as it's encoding is clearly
> UTF-8 (at least when using Tcl 8.1 or newer).

It should be possible to redirect sys.stdin/stdout using
the codecs.EncodedFile wrapper. Some tests show that raw_input()
doesn't seem to use the redirected sys.stdin though...

>>> sys.stdin = EncodedFile(sys.stdin, 'utf-8', 'latin-1')
>>> s = raw_input()
äöü
>>> s
'\344\366\374'
>>> s = sys.stdin.read()
äöü
>>> s
'\303\244\303\266\303\274\012'

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/
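
The recoding that the sys.stdin.read() call shows in the session above can
be reproduced without a terminal. What follows is only a minimal sketch
under assumptions that are not in the original message: it uses Python 3
semantics, an in-memory io.BytesIO stands in for a Latin-1 stdin, and the
names raw, recoded and data are illustrative.

```python
import codecs
import io

# Stand-in for a stdin that delivers Latin-1 bytes (assumption: the
# original session used a real terminal, not an in-memory buffer).
raw = io.BytesIO("äöü\n".encode("latin-1"))

# data_encoding='utf-8', file_encoding='latin-1': bytes read from the
# underlying file are decoded as Latin-1 and handed back as UTF-8.
recoded = codecs.EncodedFile(raw, "utf-8", "latin-1")

data = recoded.read()
print(data)  # b'\xc3\xa4\xc3\xb6\xc3\xbc\n'
```

The six bytes before the newline are the same UTF-8 sequence as the octal
escapes \303\244\303\266\303\274 in the session, while the raw_input()
result there still holds the untouched Latin-1 bytes \344\366\374.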