unicodedata: is_normalized claims nothing is normalized in any form when using the 3.2.0 database #101372

Open
@zahlman

Description


Bug report

Python 3.8 adds the is_normalized function to the unicodedata module; it is also available as a method on the legacy unicodedata.ucd_3_2_0 database. It is supposed to check whether a string is equal to its normalization in a given form, but without having to normalize and compare.

However, the legacy version does not maintain the expected invariant. In fact, it reports that every single-character string is not normalized, regardless of the normalization form chosen. Presumably, the result is the same for every non-empty string. (It appears that the empty string works because it is special-cased at lines 871-874.)

Example:

    >>> import unicodedata
    >>> unicodedata.ucd_3_2_0.normalize('NFC', '!') == '!'
    True
    >>> unicodedata.ucd_3_2_0.is_normalized('NFC', '!')
    False
    >>> any(unicodedata.ucd_3_2_0.is_normalized(form, chr(x)) for form in ('NFC', 'NFD', 'NFKC', 'NFKD') for x in range(0x110000))
    False
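The expected invariant can also be written as a small checker that compares is_normalized against an explicit normalize-and-compare. This is only a sketch: the helper name invariant_violations and the sample strings are hypothetical and not part of the report.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")

def invariant_violations(db, samples, forms=FORMS):
    """Return (form, sample) pairs where db.is_normalized disagrees
    with an explicit normalize-and-compare on the same database."""
    violations = []
    for form in forms:
        for sample in samples:
            expected = db.normalize(form, sample) == sample
            if db.is_normalized(form, sample) != expected:
                violations.append((form, sample))
    return violations

# Hypothetical samples: empty string, ASCII, precomposed and decomposed "e acute".
samples = ["", "!", "\u00e9", "e\u0301"]
current = invariant_violations(unicodedata, samples)
legacy = invariant_violations(unicodedata.ucd_3_2_0, samples)
```

On an affected build, legacy should list every non-empty sample that is in fact normalized in the given form, while current should be empty.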

The bug appears to be at lines 801-804 of unicodedata.c:

    /* UCD 3.2.0 is requested, quickchecks must be disabled. */
    if (UCD_Check(self)) {
        return NO;
    }

I believe the NO should say MAYBE instead. The NO value appears to indicate that the quickcheck has determined that the string is not normalized, contrary to both the comment and the expected behaviour.
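The difference between NO and MAYBE can be illustrated with a simplified Python model. It assumes that the caller treats YES and NO as final answers and falls back to normalize-and-compare on MAYBE. The names QuickCheck and is_normalized_model are hypothetical, and this is not the actual C implementation.

```python
import unicodedata
from enum import Enum

class QuickCheck(Enum):
    YES = "yes"      # definitely normalized
    MAYBE = "maybe"  # undecided, a full check is needed
    NO = "no"        # definitely not normalized

def is_normalized_model(form, text, legacy_quickcheck):
    """Model of is_normalized for the legacy database, given the value
    the quick check returns when UCD 3.2.0 is requested (assumption)."""
    if text == "":
        # Mirrors the special case for the empty string mentioned above.
        return True
    if legacy_quickcheck is QuickCheck.YES:
        return True
    if legacy_quickcheck is QuickCheck.NO:
        return False
    # MAYBE: fall back to the slow path, normalize and compare.
    return unicodedata.ucd_3_2_0.normalize(form, text) == text

reported = is_normalized_model("NFC", "!", QuickCheck.NO)      # behaviour described in the report
proposed = is_normalized_model("NFC", "!", QuickCheck.MAYBE)   # behaviour with the suggested change
```

Under this model, returning NO short-circuits to a negative answer for every non-empty string, whereas MAYBE defers to the full comparison.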

Your environment

    $ python
    Python 3.8.10 (default, Nov 14 2022, 12:59:47)
    [GCC 9.4.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
