[Python-Dev] re: Unicode as argument for 8-bit strings
M.-A. Lemburg mal@lemburg.com
Sat, 08 Apr 2000 11:51:32 +0200
Bill Tutt wrote:
>
> > There has been a bug report about the treatment of Unicode
> > objects together with 8-bit format strings. The current
> > implementation converts the Unicode object to UTF-8 and then
> > inserts this value in place of the %s....
> >
> > I'm inclined to change this to have '...%s...' % u'abc'
> > return u'...abc...' since this is just another case of
> > coercing data to the "bigger" type to avoid information loss.
> >
> > Thoughts ?
>
> Suddenly returning a Unicode string from an operation that was an 8-bit
> string is likely to give some code extreme fits of despondency.
>
> Converting to UTF-8 didn't give you any data loss, however it certainly
> might be unexpected to now find UTF-8 characters in what the user originally
> thought was a binary string containing whatever they had wanted it to contain.

Well, the design is to always coerce to Unicode when 8-bit
string objects and Unicode objects meet. This is done for
all string methods and that's the reason I'm also implementing
this for %-formatting (internally this is just another string
method).

> Throwing an exception would at the very least force the user to make a
> decision one way or the other about what they want to do with the data.
> They might want to do a codepage translation, or something else. (aka Hey,
> here's a bug I just found for you!)

True; but Guido's intention was to have strings and Unicode
interoperate without too much user intervention.

> In what other cases are you suddenly returning a Unicode string object from
> which previously returned a string object?

All string methods automatically coerce to Unicode when they
see a Unicode argument, e.g. " ".join(("abc", u"def")) will
return u"abc def".

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/
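
For readers who want to experiment with the coercion rule discussed above, here is a minimal sketch written for a current Python 3 interpreter. It is an illustration only and not the interpreter's actual implementation. Python 3 `bytes` stands in for the 8-bit string type and `str` for the Unicode type. The helper names `to_unicode`, `coerce_join` and `coerce_format`, and the ASCII default encoding, are assumptions made up for this example.

```python
# Python 3 sketch: bytes plays the role of the 8-bit string type and str
# the role of the Unicode type. All three helpers are hypothetical and
# exist only to illustrate the "coerce to the bigger type" rule.

def to_unicode(value, encoding="ascii"):
    """Decode 8-bit data to Unicode; pass Unicode through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)  # the encoding is an assumption
    return value


def coerce_join(sep, items, encoding="ascii"):
    """Join items; the result is Unicode if any operand is Unicode."""
    items = list(items)
    if any(isinstance(x, str) for x in [sep, *items]):
        # At least one Unicode operand: promote everything to Unicode.
        return to_unicode(sep, encoding).join(
            to_unicode(x, encoding) for x in items
        )
    # Only 8-bit operands: the result stays 8-bit.
    return sep.join(items)


def coerce_format(fmt, arg, encoding="ascii"):
    """Apply a single %s; the result is Unicode if either side is Unicode."""
    if isinstance(fmt, str) or isinstance(arg, str):
        return to_unicode(fmt, encoding) % to_unicode(arg, encoding)
    return fmt % arg


print(repr(coerce_join(b" ", (b"abc", "def"))))
print(repr(coerce_format(b"...%s...", "abc")))
print(repr(coerce_format(b"...%s...", b"abc")))
```

Python 3 itself does not coerce in this situation: there, mixing `bytes` and `str` in `join` or in %-formatting raises a `TypeError`, which is closer to the exception-raising behaviour Bill Tutt suggests.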