gh-135462: Fix quadratic complexity in processing special input in HTMLParser #135464


serhiy-storchaka (Member) commented Jun 13, 2025 (edited by bedevere-app bot)

serhiy-storchaka force-pushed the htmlparser-quadratic-complexity branch from 77d5125 to c87cb49 on June 13, 2025 13:05

serhiy-storchaka (Member, Author) commented:

The solution has been written in a way that simplifies backporting. There are other issues, and the code will be refactored in new versions after fixing them.

Comment on lines +724 to +727
    @support.requires_resource('cpu')
    def test_eof_no_quadratic_complexity(self):
        # Each of these examples used to take about an hour.
        # Now they take a fraction of a second.
Member commented:

If they now take a fraction of a second, is there a reason to require the cpu resource?

My understanding is that:

  • with the requires_resource('cpu') decorator:
    • this test would normally be skipped
    • in case of regression, we won't notice unless the cpu resource is enabled
  • without the decorator:
    • this test is always run and completes quickly
    • in case of regression, the test will time out or fail and expose the problem
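
The trade-off in the list above can be illustrated with a small stand-in. This is only a sketch: `ENABLED_RESOURCES`, this `requires_resource`, `ExampleTests`, and `run_with` are hypothetical names written for illustration. They are not CPython's `test.support` implementation, which takes the enabled resources from the test runner's `-u` option.

```python
import functools
import unittest

# Hypothetical stand-in for the set of resources enabled on the test runner.
ENABLED_RESOURCES = set()


def requires_resource(name):
    """Skip the decorated test unless `name` is enabled (illustrative only)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if name not in ENABLED_RESOURCES:
                raise unittest.SkipTest(f"resource {name!r} is not enabled")
            return func(*args, **kwargs)
        return wrapper
    return decorator


class ExampleTests(unittest.TestCase):
    @requires_resource("cpu")
    def test_expensive(self):
        # Placeholder for a CPU-heavy check.
        self.assertEqual(sum(range(1000)), 499500)


def run_with(resources):
    """Run ExampleTests with the given resources; return (run, skipped)."""
    ENABLED_RESOURCES.clear()
    ENABLED_RESOURCES.update(resources)
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
    result = unittest.TestResult()
    suite.run(result)
    return result.testsRun, len(result.skipped)
```

With no resources enabled the gated test is reported as skipped, so a regression in it would go unnoticed on that run. With `"cpu"` enabled the same test actually executes.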

serhiy-storchaka (Member, Author) replied:

In total, they take 1.3 seconds on my computer. All other tests take 0.1-0.2 seconds. It is a waste of time to run them several times for every update of any PR. Some buildbots are slower than my computer.

I think that it is enough to run this test only on the fastest buildbots. We already use requires_resource('cpu') in similar tests.
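
For readers who want to check the scaling on their own interpreter, here is a rough timing sketch. It is not the test from this PR: `time_parse` and `growth_ratio` are hypothetical helpers, and the repeated unclosed `<!--` fragment in the usage note is an assumed example of the kind of input the issue title calls "special input".

```python
import time
from html.parser import HTMLParser


def time_parse(source):
    """Return the seconds taken to feed `source` to HTMLParser and close it."""
    parser = HTMLParser()
    start = time.perf_counter()
    parser.feed(source)
    parser.close()
    return time.perf_counter() - start


def growth_ratio(fragment, n):
    """Compare the parse time for n and 2*n repetitions of `fragment`."""
    small = time_parse(fragment * n)
    large = time_parse(fragment * (2 * n))
    # Guard against a timer reading of zero on very small inputs.
    return large / small if small > 0 else float("inf")
```

Calling `growth_ratio("<!--", n)` with a moderately large `n` gives a crude signal: a ratio near 2 suggests linear behavior, while a ratio near 4 suggests quadratic behavior. Timings are noisy, so treat a single measurement as a hint rather than proof, and keep `n` small on interpreters that may not include the fix.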

ezio-melotti reacted with thumbs up emoji
('data', '\n<img src="URL>'),
('comment', '/img'),
('endtag', 'html<')])
('data', '\n')])
Member commented:

It seems that now everything after the first </html> is ignored (except the \n). This is technically a change in behavior, which should be fine if the new behavior matches the HTML5 specs, but maybe it should be noted in the whatsnew.

There also seem to be other minor changes in behavior that, if they follow the specs, might not need to be documented (a generic "Some additional invalid constructs are now handled according to the HTML5 specs." might be enough).

serhiy-storchaka reacted with thumbs up emoji
serhiy-storchaka (Member, Author) replied:

In this case, a double-quoted attribute value is never closed. This is the eof-in-tag parse error: https://html.spec.whatwg.org/multipage/parsing.html#parse-error-eof-in-tag

I have updated the NEWS entry.
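
To see which events a given interpreter emits for such input, a small collector like the following can help. It is a sketch, not code from this PR: `EventCollector` and `collect_events` are hypothetical names, and the tuples only loosely mirror the event lists quoted in the diff above.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Record parser callbacks as tuples (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("starttag", tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(("endtag", tag))

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_comment(self, data):
        self.events.append(("comment", data))


def collect_events(source):
    """Feed `source`, close the parser, and return the recorded events."""
    parser = EventCollector()
    parser.feed(source)
    parser.close()
    return parser.events
```

Running `collect_events('<p>text</p>\n<img src="URL>')`, where the double-quoted value is never closed, on interpreters with and without this change shows how the handling of the trailing unclosed tag differs. The exact events depend on the Python version, so none are listed here.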

ezio-melotti reacted with thumbs up emoji
serhiy-storchaka and others added 2 commits June 13, 2025 17:17
Co-authored-by: Ezio Melotti <ezio.melotti@gmail.com>
sethmlarson (Contributor) left a comment:

Approach seems sensible to me.

serhiy-storchaka merged commit 6eb6c5d into python:main Jun 13, 2025
43 checks passed
miss-islington-app commented:

Thanks @serhiy-storchaka for the PR 🌮🎉. I'm working now to backport this PR to: 3.9, 3.10, 3.11, 3.12, 3.13, 3.14.
🐍🍒⛏🤖

serhiy-storchaka deleted the htmlparser-quadratic-complexity branch June 13, 2025 16:57
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jun 13, 2025
… in HTMLParser (pythonGH-135464)
End-of-file errors are now handled according to the HTML5 specs -- comments and declarations are automatically closed, tags are ignored.
(cherry picked from commit 6eb6c5d)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jun 13, 2025
… in HTMLParser (pythonGH-135464)
End-of-file errors are now handled according to the HTML5 specs -- comments and declarations are automatically closed, tags are ignored.
(cherry picked from commit 6eb6c5d)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

bedevere-app commented:

GH-135481 is a backport of this pull request to the 3.14 branch.

bedevere-app bot removed the needs backport to 3.14 (bugs and security fixes) label Jun 13, 2025
miss-islington-app commented:

Sorry, @serhiy-storchaka, I could not cleanly backport this to 3.12 due to a conflict.
Please backport using cherry_picker on the command line.

    cherry_picker 6eb6c5dbfb528bd07d77b60fd71fd05d81d45c41 3.12

bedevere-app commented:

GH-135482 is a backport of this pull request to the 3.13 branch.

miss-islington-app commented:

Sorry, @serhiy-storchaka, I could not cleanly backport this to 3.11 due to a conflict.
Please backport using cherry_picker on the command line.

    cherry_picker 6eb6c5dbfb528bd07d77b60fd71fd05d81d45c41 3.11

bedevere-app bot removed the needs backport to 3.13 (bugs and security fixes) label Jun 13, 2025
miss-islington-app commented:

Sorry, @serhiy-storchaka, I could not cleanly backport this to 3.10 due to a conflict.
Please backport using cherry_picker on the command line.

    cherry_picker 6eb6c5dbfb528bd07d77b60fd71fd05d81d45c41 3.10

miss-islington-app commented:

Sorry, @serhiy-storchaka, I could not cleanly backport this to 3.9 due to a conflict.
Please backport using cherry_picker on the command line.

    cherry_picker 6eb6c5dbfb528bd07d77b60fd71fd05d81d45c41 3.9

serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this pull request Jun 13, 2025
…l input in HTMLParser (pythonGH-135464)
End-of-file errors are now handled according to the HTML5 specs -- comments and declarations are automatically closed, tags are ignored.
(cherry picked from commit 6eb6c5d)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

bedevere-app commented:

GH-135483 is a backport of this pull request to the 3.12 branch.

bedevere-app bot removed the needs backport to 3.12 (only security fixes) label Jun 13, 2025
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this pull request Jun 13, 2025
…l input in HTMLParser (pythonGH-135464)
End-of-file errors are now handled according to the HTML5 specs -- comments and declarations are automatically closed, tags are ignored.
(cherry picked from commit 6eb6c5d)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

bedevere-app commented:

GH-135484 is a backport of this pull request to the 3.11 branch.

bedevere-app bot removed the needs backport to 3.11 (only security fixes) label Jun 13, 2025
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this pull request Jun 13, 2025
…l input in HTMLParser (pythonGH-135464)
End-of-file errors are now handled according to the HTML5 specs -- comments and declarations are automatically closed, tags are ignored.
(cherry picked from commit 6eb6c5d)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

bedevere-app commented:

GH-135485 is a backport of this pull request to the 3.10 branch.

bedevere-app bot removed the needs backport to 3.10 (only security fixes) label Jun 13, 2025
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this pull request Jun 13, 2025
… input in HTMLParser (pythonGH-135464)
End-of-file errors are now handled according to the HTML5 specs -- comments and declarations are automatically closed, tags are ignored.
(cherry picked from commit 6eb6c5d)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

bedevere-app commented:

GH-135486 is a backport of this pull request to the 3.9 branch.

bedevere-app bot removed the needs backport to 3.9 (only security fixes) label Jun 13, 2025
serhiy-storchaka added a commit that referenced this pull request Jun 13, 2025
…t in HTMLParser (GH-135464) (GH-135482)
End-of-file errors are now handled according to the HTML5 specs -- comments and declarations are automatically closed, tags are ignored.
(cherry picked from commit 6eb6c5d)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

serhiy-storchaka added a commit that referenced this pull request Jun 13, 2025
…t in HTMLParser (GH-135464) (GH-135481)
End-of-file errors are now handled according to the HTML5 specs -- comments and declarations are automatically closed, tags are ignored.
(cherry picked from commit 6eb6c5d)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

lkollar pushed a commit to lkollar/cpython that referenced this pull request Jun 19, 2025
… in HTMLParser (pythonGH-135464)
End-of-file errors are now handled according to the HTML5 specs -- comments and declarations are automatically closed, tags are ignored.

ambv pushed a commit that referenced this pull request Jul 3, 2025
…t in HTMLParser (GH-135464) (GH-135484)
End-of-file errors are now handled according to the HTML5 specs -- comments and declarations are automatically closed, tags are ignored.
(cherry picked from commit 6eb6c5d)

ambv pushed a commit that referenced this pull request Jul 3, 2025
…t in HTMLParser (GH-135464) (GH-135485)
End-of-file errors are now handled according to the HTML5 specs -- comments and declarations are automatically closed, tags are ignored.
(cherry picked from commit 6eb6c5d)

ambv pushed a commit that referenced this pull request Jul 3, 2025
… in HTMLParser (GH-135464) (GH-135486)
End-of-file errors are now handled according to the HTML5 specs -- comments and declarations are automatically closed, tags are ignored.
(cherry picked from commit 6eb6c5d)

ambv pushed a commit that referenced this pull request Jul 3, 2025
…t in HTMLParser (GH-135464) (GH-135483)
End-of-file errors are now handled according to the HTML5 specs -- comments and declarations are automatically closed, tags are ignored.
(cherry picked from commit 6eb6c5d)

Pranjal095 pushed a commit to Pranjal095/cpython that referenced this pull request Jul 12, 2025
… in HTMLParser (pythonGH-135464)
End-of-file errors are now handled according to the HTML5 specs -- comments and declarations are automatically closed, tags are ignored.
Reviewers
sethmlarson approved these changes
ezio-melotti approved these changes

Labels
type-security (A security issue)

3 participants
@serhiy-storchaka @sethmlarson @ezio-melotti
