gh-135661: Fix parsing start and end tags in HTMLParser #135930

Open
serhiy-storchaka wants to merge 2 commits into python:main from serhiy-storchaka:htmlparser-tag

Conversation

serhiy-storchaka (Member) commented Jun 25, 2025, edited by bedevere-app (bot)

  • Whitespace is no longer accepted between `</` and the tag name. E.g. `</ script>` does not end the script section.

  • Vertical tabulation (`\v`) and non-ASCII whitespace characters are no longer recognized as whitespace. The only whitespace characters are `\t\n\r\f ` (tab, line feed, carriage return, form feed and space).

  • The null character (U+0000) no longer ends the tag name.

  • An end tag can have attributes and slashes after the tag name. It no longer ends after the first `>` in a quoted attribute value. E.g. `</script/foo=">"/>`.

  • Multiple slashes and whitespace characters between the last attribute and the closing `>` are now accepted in both start and end tags. E.g. `<a foo=bar/ //>`.

  • Multiple `=` between an attribute name and its value are no longer collapsed. E.g. `<a foo==bar>` produces attribute "foo" with value "=bar".

  • Whitespace between the `=` separator and the attribute name or value is no longer ignored. E.g. `<a foo =bar>` produces two attributes, "foo" and "=bar", both with value None; `<a foo= bar>` produces two attributes: "foo" with value "" and "bar" with value None.
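The cases above can be probed with a small event logger, sketched below. The names `EventLogger`, `collect_events`, `is_tag_whitespace` and `SAMPLES` are hypothetical helpers for illustration and are not part of this PR. The output for the sample inputs depends on whether the interpreter includes this change, so no expected output is shown for them.

```python
import re
from html.parser import HTMLParser


class EventLogger(HTMLParser):
    """Hypothetical helper: records parser callbacks as tuples."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def collect_events(markup):
    """Feed markup to a fresh parser and return the recorded events."""
    parser = EventLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


# The whitespace set named in the description: tab, LF, CR, FF and space.
# This regex is an illustration of that set, not the parser's own pattern.
TAG_WHITESPACE = re.compile(r"[ \t\n\r\f]")


def is_tag_whitespace(ch):
    """Return True if ch is one of the five whitespace characters."""
    return TAG_WHITESPACE.fullmatch(ch) is not None


# Inputs taken from the description; results vary by Python version.
SAMPLES = [
    "<script>x</ script>",
    "<a foo=bar/ //>",
    "<a foo==bar>",
    "<a foo =bar>",
    "<a foo= bar>",
]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(repr(sample), "->", collect_events(sample))
    # Python's \s is broader than the five-character set above:
    # it matches \v, while the set above does not.
    print(re.fullmatch(r"\s", "\v") is not None, is_tag_whitespace("\v"))
```

Running the script on an interpreter with and without the change and comparing the printed events is one way to see which of the listed behaviours differ.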

@serhiy-storchaka (Member, Author)

I tried to minimize the changes and to split this PR into several PRs, but they would not be independent, and all of these changes are needed to fix the possible XSS.

I am planning further refactoring, but that is only for the main branch.

Reviewers

@ezio-melotti (code owner): awaiting requested review

Assignees
No one assigned
Labels
awaiting core review, needs backport to 3.13 (bugs and security fixes), needs backport to 3.14 (bugs and security fixes)
Projects
None yet
Milestone
No milestone

1 participant
@serhiy-storchaka
