[3.13] gh-133767: Fix use-after-free in the unicode-escape decoder with an error handler (GH-129648) #133944


Merged
encukou merged 2 commits into python:3.13 from serhiy-storchaka:backport-9f69a58-3.13
May 20, 2025

Conversation

serhiy-storchaka (Member) commented May 12, 2025 (edited by bedevere-app bot)

If an error handler is used, a new bytes object is created and set as the object attribute of the UnicodeDecodeError, and that bytes object then replaces the original data. A pointer into the decoded data becomes invalid once that temporary bytes object is destroyed, so we need another way to return the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().

_PyBytes_DecodeEscape() does not have this issue, because it does not use the error handler registry, but it should be changed for compatibility with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)
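
The switch of input described above can be made visible from Python. The following is a minimal sketch, not part of the patch: the handler name `demo.swap_input` is hypothetical, and the example relies on the decoder re-reading the `object` attribute of the `UnicodeDecodeError` after the handler returns. The input deliberately contains no invalid escape such as `\z`, so it does not exercise the bug itself.

```python
import codecs


def swap_input(exc):
    """Hypothetical handler: swap the bytes being decoded for new ones."""
    if isinstance(exc, UnicodeDecodeError):
        # The decoder re-reads exc.object after the handler returns and
        # continues on these bytes instead of the original data.
        exc.object = b"replaced"
        # Insert no replacement text and resume at offset 0 of the new input.
        return ("", 0)
    raise exc


codecs.register_error("demo.swap_input", swap_input)

# b"\\xZZ" is a malformed \x escape, so the error handler is invoked.
result = codecs.decode(b"\\xZZ", "unicode_escape", "demo.swap_input")
print(result)
```

Any pointer the decoder kept into the bytes it was reading before or during the handler call is only as long-lived as the object that owns those bytes, which is why reporting the first invalid escape by pointer is fragile here.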

serhiy-storchaka (Member, Author) commented May 13, 2025 (edited)

Removing _PyUnicode_DecodeUnicodeEscapeInternal and _PyBytes_DecodeEscape breaks the C ABI. We have the following options:

  • Remove them anyway. They are not intended for use in user code and are guarded against it; you have to try hard to use them in your code.
  • Add stub functions which:
    • Always raise an exception.
    • Work as before, but may ignore invalid escape sequences (set first_escape_char to NULL).
    • Work as before, but may raise an exception if an invalid escape sequence is found.
    • Work as before, but may set first_escape_char to some other point in the input if an invalid escape sequence is found.

"May" means that it depends on the error handler and on whether an encoding error happens before an invalid escape sequence.
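
For context, the first invalid escape matters because it is what lets a caller report a warning. The sketch below uses only the public `unicode_escape` codec, not the private C functions discussed here, and assumes a CPython version in which the codec reports an unknown escape such as `\z` as a warning rather than an error (a `DeprecationWarning` in the branches this PR targets).

```python
import codecs
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    text = codecs.decode(b"\\z", "unicode_escape")

# The unknown escape is kept literally: a backslash followed by "z".
print(repr(text))
# The codec reports the invalid escape as a warning, not as an error.
print([w.category.__name__ for w in caught])
```

A stub that ignores invalid escape sequences would still return the same text; only the warning would be lost.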

serhiy-storchaka (Member, Author) commented:

@Yhg1s, what do you prefer?

encukou (Member) commented:

I'm not @Yhg1s, but I'd vote for making them raise.

serhiy-storchaka (Member, Author) commented:

The next beta is close, so I implemented a simple solution with minimal impact. Users of _PyUnicode_DecodeUnicodeEscapeInternal() may lose some warnings if they use a non-strict error handler. Nothing changes for users of _PyBytes_DecodeEscape().

Anything simpler could break user code that actually uses these functions. Anything more complex increases the probability of introducing bugs in untested code.
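
The trade-off can be pictured with a toy model. This is a sketch under stated assumptions, not CPython's implementation: `toy_decode`, its handler signature, and the rule that the offset is dropped once a handler has replaced the input are all made up for illustration. It shows why a character value is a safe thing to report, while a position in the input is not.

```python
HEX_DIGITS = b"0123456789abcdefABCDEF"


def toy_decode(data, handler=None):
    """Toy escape decoder (NOT CPython's code): supports \\n and \\xHH only.

    Returns (text, first_invalid_char, first_invalid_index).
    """
    out = []
    first_invalid_char = None   # a value: stays valid whatever happens to `data`
    first_invalid_index = None  # an offset: only meaningful for the original `data`
    input_replaced = False
    i = 0
    while i < len(data):
        if data[i] != 0x5C or i + 1 == len(data):  # ordinary byte or lone backslash
            out.append(chr(data[i]))
            i += 1
            continue
        kind = chr(data[i + 1])
        if kind == "n":
            out.append("\n")
            i += 2
        elif kind == "x":
            digits = data[i + 2:i + 4]
            if len(digits) == 2 and all(d in HEX_DIGITS for d in digits):
                out.append(chr(int(digits, 16)))
                i += 4
            elif handler is None:
                raise ValueError(f"truncated \\xXX escape at offset {i}")
            else:
                # The handler supplies replacement text, new input, and a resume offset.
                replacement, data, i = handler(data, i)
                out.append(replacement)
                # Offsets into the old input no longer refer to anything useful.
                input_replaced = True
                first_invalid_index = None
        else:
            # Unknown escape: keep it literally and remember the first one.
            if first_invalid_char is None:
                first_invalid_char = kind
                first_invalid_index = None if input_replaced else i + 1
            out.append("\\" + kind)
            i += 2
    return "".join(out), first_invalid_char, first_invalid_index


# No handler call: both the character and its offset are reported.
print(toy_decode(b"\\z\\x41"))
# Handler replaces the input: the character survives, the offset is dropped.
print(toy_decode(b"\\z\\xZZ", lambda data, pos: ("?", b"tail", 0)))
```

In this model a caller that only has the offset loses the ability to warn after a handler has run, which mirrors the "may lose some warnings" caveat for non-strict error handlers.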

encukou (Member) left a comment:

This looks good, thank you!

encukou merged commit 6279eb8 into python:3.13 on May 20, 2025
39 checks passed
miss-islington-app commented:

Thanks @serhiy-storchaka for the PR, and @encukou for merging it 🌮🎉. I'm working now to backport this PR to: 3.9, 3.10, 3.11, 3.12.
🐍🍒⛏🤖

miss-islington-app commented:

Sorry, @serhiy-storchaka and @encukou, I could not cleanly backport this to 3.12 due to a conflict.
Please backport using cherry_picker on the command line.

cherry_picker 6279eb8c076d89d3739a6edb393e43c7929b429d 3.12

miss-islington-app commented:

Sorry, @serhiy-storchaka and @encukou, I could not cleanly backport this to 3.11 due to a conflict.
Please backport using cherry_picker on the command line.

cherry_picker 6279eb8c076d89d3739a6edb393e43c7929b429d 3.11

miss-islington-app commented:

Sorry, @serhiy-storchaka and @encukou, I could not cleanly backport this to 3.10 due to a conflict.
Please backport using cherry_picker on the command line.

cherry_picker 6279eb8c076d89d3739a6edb393e43c7929b429d 3.10

miss-islington-app commented:

Sorry, @serhiy-storchaka and @encukou, I could not cleanly backport this to 3.9 due to a conflict.
Please backport using cherry_picker on the command line.

cherry_picker 6279eb8c076d89d3739a6edb393e43c7929b429d 3.9

serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this pull request on May 20, 2025:
…der with an error handler (pythonGH-129648) (pythonGH-133944)
(cherry picked from commit 9f69a58)
(cherry picked from commit 6279eb8)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
bedevere-app commented:

GH-134337 is a backport of this pull request to the 3.12 branch.

bedevere-app bot removed the "needs backport to 3.12" (only security fixes) label on May 20, 2025
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this pull request on May 20, 2025:
…der with an error handler (pythonGH-129648) (pythonGH-133944)
(cherry picked from commit 9f69a58)
(cherry picked from commit 6279eb8)
(cherry picked from commit a75953b)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
bedevere-app commented:

GH-134341 is a backport of this pull request to the 3.11 branch.

bedevere-app bot removed the "needs backport to 3.11" (only security fixes) label on May 20, 2025
bedevere-app commented:

GH-134345 is a backport of this pull request to the 3.10 branch.

bedevere-app bot removed the "needs backport to 3.10" (only security fixes) label on May 20, 2025
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this pull request on May 20, 2025:
…der with an error handler (pythonGH-129648) (pythonGH-133944)
(cherry picked from commit 9f69a58)
(cherry picked from commit 6279eb8)
(cherry picked from commit a75953b)
(cherry picked from commit 0c33e5b)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this pull request on May 20, 2025:
…er with an error handler (pythonGH-129648) (pythonGH-133944)
(cherry picked from commit 9f69a58)
(cherry picked from commit 6279eb8)
(cherry picked from commit a75953b)
(cherry picked from commit 0c33e5b)
(cherry picked from commit 8b528ca)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
bedevere-app commented:

GH-134346 is a backport of this pull request to the 3.9 branch.

bedevere-app bot removed the "needs backport to 3.9" (only security fixes) label on May 20, 2025
Reviewers

encukou approved these changes
pablogsal (code owner): awaiting requested review
lysnikolaou (code owner): awaiting requested review
Yhg1s: awaiting requested review

Assignees

encukou

Labels
type-security (A security issue)
Projects
None yet
Milestone
No milestone
2 participants
@serhiy-storchaka, @encukou
