[Python-Dev] Generalised String Coercion
"Martin v. Löwis" martin at v.loewis.de
Sun Aug 7 15:06:27 CEST 2005
Reinhold Birkenfeld wrote:
> FWIW, I've already drafted a patch for the former. It lets you write to
> file.encoding and honors this when writing Unicode strings to it.

I don't like that approach. You shouldn't be allowed to change the
encoding mid-stream (except perhaps under very specific circumstances).

As I see it, the buffer of an encoded file becomes split, at least for
input: there are bytes which have been read and not yet decoded, and
there are characters which have been decoded but not yet consumed.
If you change the encoding mid-stream, you would have to undo decoding
that was already done, resetting the stream to the real "current"
position.

For output, the situation is similar: before changing to a new encoding,
or before changing from unicode output to byte output, you have to
flush the codec first: it may be that the codec has buffered some
state which needs to be completely processed before a new codec
can be applied to the stream.

Another issue is seeking: given the many different kinds of buffers,
seeking becomes fairly complex. Ideally, seeking should apply to
application-level positions, i.e. when you tell the current position,
it should be in terms of data already consumed by the application.
Perhaps seeking in an encoded stream should not be supported at all.

Finally, you also have to consider Universal Newlines: you can apply
them either on the byte stream, or on the character stream. I think
the conceptually right approach would be to do universal newlines on
the character stream.

Regards,
Martin
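
The buffering described above for input and output can be observed in a
small sketch. It assumes a present-day Python 3 and the incremental codec
classes in the standard codecs module, which are not the patch under
discussion; the sample strings and variable names are illustrative only.

```python
import codecs

# Input side: bytes that have been read but not yet decoded.
decoder = codecs.getincrementaldecoder("utf-8")()
data = "é".encode("utf-8")            # a two-byte UTF-8 sequence
first = decoder.decode(data[:1])      # incomplete sequence: no character yet
pending, _flags = decoder.getstate()  # the undecoded byte sits inside the decoder
second = decoder.decode(data[1:])     # the sequence completes, the character appears

# Output side: a stateful codec has to be flushed before another
# codec (or raw bytes) can safely follow on the same stream.
encoder = codecs.getincrementalencoder("iso2022_jp")()
body = encoder.encode("日本")          # leaves the encoder in a shifted state
tail = encoder.encode("", final=True)  # flush: emits whatever ends that state

print(repr(first), pending, repr(second))
print(body, tail)
```

Swapping the decoder after the first call would lose or misread the byte
held in `pending`, and writing bytes from a different codec directly after
`body` without `tail` would leave the output in the wrong shift state.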