html.parser(convert_charrefs=False) silently drops ampersand (&) on invalid named entities, causing data loss #140875

Closed
Assignees
serhiy-storchaka
Labels
3.13 (bugs and security fixes), 3.14 (bugs and security fixes), 3.15 (new features, bugs and security fixes), stdlib (Standard Library Python modules in the Lib/ directory), type-bug (An unexpected behavior, bug, or error)
@T90REAL

Bug report

Bug description:

When `HTMLParser` is initialized with `convert_charrefs=False`, it behaves incorrectly when processing an invalid named entity reference (e.g., `&A`, which is not a valid HTML entity). The parser silently drops the `&` character and passes only the subsequent `A` to `handle_data`. I think this indicates a silent data loss problem.

    from html.parser import HTMLParser

    class MyParser(HTMLParser):
        def handle_data(self, data):
            print(f"handle_data received: {data!r}")

    parser_false = MyParser(convert_charrefs=False)
    parser_false.feed('&A')
    parser_false.close()

    handle_data received: 'A'
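To make the loss easier to see, it may help to record every text-related callback and compare the two `convert_charrefs` modes on the same input. The following is only a sketch: `EventCollector`, `collect_events`, and `data_text` are hypothetical helper names, not part of `html.parser`, and what `convert_charrefs=False` reports for `&A` depends on whether the interpreter in use already contains a fix for this issue.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Hypothetical helper: records every text-related callback in order."""

    def __init__(self, *, convert_charrefs):
        super().__init__(convert_charrefs=convert_charrefs)
        self.events = []

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_entityref(self, name):
        # Only called when convert_charrefs is False.
        self.events.append(("entityref", name))

    def handle_charref(self, name):
        # Only called when convert_charrefs is False.
        self.events.append(("charref", name))


def collect_events(markup, convert_charrefs):
    """Feed markup to a fresh parser, close it, and return the recorded events."""
    parser = EventCollector(convert_charrefs=convert_charrefs)
    parser.feed(markup)
    parser.close()
    return parser.events


def data_text(events):
    """Concatenate everything that reached handle_data."""
    return "".join(value for kind, value in events if kind == "data")


if __name__ == "__main__":
    for flag in (True, False):
        events = collect_events("&A", flag)
        print(f"convert_charrefs={flag}: {events}")
        if "&" not in data_text(events):
            print("  -> the '&' never reached handle_data")
```

On an affected interpreter, the `convert_charrefs=False` run should show the `&` missing from the data, while the `convert_charrefs=True` run keeps it.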

CPython versions tested on:

3.12

Operating systems tested on:

Linux
