Jython: Upper-ASCII characters '\351' from chr(233)

Maurice Bauhahn bauhahnm at clara.net
Wed Apr 25 17:00:23 EDT 2001


Thank you, Steve. Maybe it would help if I could explain what I am doing. I'm
trying to write a programme to transcode eight-bit encodings to Unicode
encodings (the Cambodian/Khmer language) and then do letter-pair frequency
studies. Since I will be comparing characters (not integers) against the key,
I need to have characters as the key. That effort has in fact now been
successful. My next problem, discussed elsewhere in comp.lang.python, is to
import Unicode escaped characters/strings from another 8-bit encoded file into a
Jython dictionary. (Presumably the solution is in the codecs module.)

Steve Holden wrote:

> "Maurice Bauhahn" <bauhahnm at clara.net> wrote in message
> news:mailman.988158281.19384.python-list at python.org...
> > Thank you very much for your persistent help.
> >
> > I was able to get the 8th bit characters to act as keys...with a somewhat
> > complex construction: chr(int(linesplit[0])). Linesplit had decimal
> > numbers in text format.
> >
> Would this shed any light on your original question or help in solving your
> problem more compactly? Note that this is CPython, not Jython, but
> portability should make all this work in both implementations. From what
> I've read, it seems to be your need to see decimal numbers in the source
> which led you to these contortions.
>

This solved the problem I first encountered (which was probably an artifact of
something on the same line!).

>
> Your original assertion that
>
> """
> >>> chr(127)
> '?' (in fact a character like a house)
> """
>
> is quite correct, but I don't see why a weird printable representation makes
> a character unsuitable for use as a dictionary key. Maybe I missed your
> point. Anyway ...
>

I do not worry about the shape of the characters (why, those of Khmer are much
more novel in any case ;-)).

>
> >>> # Construct a string of all chars from 0 to 255
> >>> chars = "".join(map(chr, [i for i in range(256)]))
> >>> # Use decimal value to access single characters
> >>> # and use them as dictionary keys
> >>> dict = {}
> >>> dict[chars[233]] = "Two hundred thirty-three"
> >>> dict[chars[27]] = "escape"
> >>> dict["\033"]
> 'escape'
> >>> dict["\351"]
> 'Two hundred thirty-three'
> >>> dict
> {'\033': 'escape', '\351': 'Two hundred thirty-three'}
> >>>
>
> In other words, having constructed the chars[] list, you can index it with
> decimal numbers to get the characters you want. chars could equally have
> been a list of single-character strings, with the same effect.

Yes, it is the list of single-character strings that I am using (now
successfully).

>
>
> If this doesn't help you at all, please feel free to ignore my rantings.
>

Thank you for the questions and desire to help!

>
> regards
>  Steve
>
> >          linesplit = split('\t',encodingline)
> >          if (len(linesplit) > 5):
> >             try:
> >                templist = linesplit[2:4]
> >                templist.append(split(';|:',linesplit[4]))
> >                templist.append(strip(linesplit[5]))
> >                encodedict[chr(int(linesplit[0]))] = templist
> >                print templist
> >             except ValueError:
> >                logerror('My error', linesplit[0])
> >          else:
> >             logerror('Not >5 fields long', linesplit)
> >

--
Maurice Bauhahn
2 Meadow Way
Dorney Reach
MAIDENHEAD
SL6 0DS
United Kingdom

Home Tel: +44(0)1628 626068
Work Tel: +44(0)1932 878404
Home Email: bauhahnm at clara.net
Work Email: mbauhahn at brio.com
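
For reference, the parsing loop quoted at the end of the message can be sketched in current Python 3 roughly as follows. This is only a sketch under assumptions: the field layout is inferred from the quoted code, the names `parse_encoding_line` and `decode_escapes` and the sample line are hypothetical, and the `unicode_escape` codec is just one possible way to handle the Unicode-escaped strings mentioned in the message, not necessarily what was eventually used.

```python
import re


def parse_encoding_line(line):
    """Return (key, fields) for one tab-separated line, or None if too short.

    Assumed layout (hypothetical, inferred from the quoted loop): field 0 is a
    decimal character code, fields 2-3 are kept as-is, field 4 is a list
    separated by ';' or ':', and field 5 is free text.
    """
    parts = line.split("\t")
    if len(parts) <= 5:
        return None
    key = chr(int(parts[0]))  # raises ValueError if field 0 is not a number
    fields = parts[2:4]
    fields.append(re.split("[;:]", parts[4]))
    fields.append(parts[5].strip())
    return key, fields


def decode_escapes(text):
    """Turn literal \\uXXXX escapes in an ASCII string into real characters."""
    return text.encode("ascii").decode("unicode_escape")


encodedict = {}
# Hypothetical sample line; field 3 holds a literal backslash-u escape.
sample = "233\tx\tname\t\\u1780\ta;b:c\t note \n"
parsed = parse_encoding_line(sample)
if parsed is not None:
    key, fields = parsed
    encodedict[key] = fields
```

In Python 3, `chr(233)` is the Unicode character U+00E9 rather than an 8-bit byte string, so the dictionary key here is already a text character; the Jython 2.x behaviour discussed in the thread differs on that point.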


More information about the Python-list mailing list
