EUC-JP codec fails to properly decode the "㎝" character #95734

Closed as not planned
Labels: stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)
@Klim314

Description


Minor bug with decoding of EUC-JP character "㎝".

Bug report

The character "㎝" is part of the JIS X 0208 encoding. The Python core libraries include the EUC-JP encoding, which represents the JIS X 0208, JIS X 0212, and JIS X 0201 encodings. However, attempting to decode the "㎝" character with the EUC-JP codec results in a decoding error.

Example

As taken from https://stackoverflow.com/questions/73255012/python-fails-to-decode-euc-jp-strings-with-the-character:

print(b"58\xad\xd1".decode("EUC-JP"))

throws

Traceback (most recent call last):
  File "<pyshell#53>", line 1, in <module>
    print(b"58\xad\xd1".decode("EUC-JP"))
UnicodeDecodeError: 'euc_jp' codec can't decode byte 0xad in position 2: illegal multibyte sequence

However, decoding with alternative codecs works:

content = b"\xa5\xb5\xa5\xa4\xa5\xba\xa1\xa7XL \xcc\xf377\xad\xd1\xa1\xdf\xcc\xf358\xad\xd1"
print(b"58\xad\xd1".decode("euc_jisx0213"))
>58㎝
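The comparison above can be sketched as a small script that tries both codecs on the same bytes, plus one possible workaround. The helper names `try_decode` and `decode_euc_jp_lenient` are hypothetical and not part of the standard library; only the `euc_jp` and `euc_jisx0213` codec names come from the report. This is a minimal sketch, not a recommended fix.

```python
def try_decode(data, codec_names=("euc_jp", "euc_jisx0213")):
    """Return {codec name: decoded text, or the UnicodeDecodeError raised}."""
    results = {}
    for name in codec_names:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError as exc:
            # Keep the exception so the caller can inspect byte and position.
            results[name] = exc
    return results


def decode_euc_jp_lenient(data):
    """Hypothetical helper: strict EUC-JP first, euc_jisx0213 only on failure."""
    try:
        return data.decode("euc_jp")
    except UnicodeDecodeError:
        return data.decode("euc_jisx0213")


if __name__ == "__main__":
    sample = b"58\xad\xd1"  # the failing bytes from the report
    for name, outcome in try_decode(sample).items():
        print(f"{name}: {outcome!r}")
    print(decode_euc_jp_lenient(sample))
```

The fallback is attempted only after the strict decode fails, on the assumption that the two codecs may not agree on every byte sequence; input that already decodes as EUC-JP keeps its current result.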

Your environment

  • CPython versions tested on: 3.9, 3.10
  • Operating system and architecture: Windows x64
