Open
Description
Bug report
#83518 changed handling of non-ASCII characters in encodings.normalize_encoding(), but it is still inconsistent with codecs.lookup(), and not even self-consistent. For example:
>>> import encodings
>>> encodings.normalize_encoding('a¤b')
'a_b'
>>> encodings.normalize_encoding('aæb')
'ab'
>>> encodings.normalize_encoding('a-¤')
'a'
>>> encodings.normalize_encoding('a-æ')
'a_'
>>> encodings.normalize_encoding('a-¤-b')
'a_b'
>>> encodings.normalize_encoding('a-æ-b')
'a__b'
You can even get an underscore at the end or repeated underscores in the middle.
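For comparison, here is a minimal sketch of what one self-consistent rule could look like. The helper name normalize_consistently is hypothetical, and the rule it applies is an assumption chosen for illustration: every character other than an ASCII letter, an ASCII digit, or '.' counts as a separator. It is not the behaviour of codecs.lookup() and not necessarily the fix adopted in CPython.

```python
import re


def normalize_consistently(name: str) -> str:
    """Hypothetical normalizer, not part of the stdlib.

    Assumed rule: any run of characters other than ASCII letters,
    ASCII digits, or '.' collapses into a single underscore, and
    leading/trailing underscores are dropped.
    """
    # Collapse each run of separator characters into one underscore.
    collapsed = re.sub(r"[^A-Za-z0-9.]+", "_", name)
    # Never leave an underscore at either end of the result.
    return collapsed.strip("_")


# The same inputs as in the report above.
for sample in ["a¤b", "aæb", "a-¤", "a-æ", "a-¤-b", "a-æ-b"]:
    print(repr(sample), "->", repr(normalize_consistently(sample)))
```

Under this assumed rule, '¤' and 'æ' are treated identically, so the paired inputs give the same result and no output ends with an underscore or contains a doubled one.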
cc @malemburg, @vstinner, @shihai1991