[3.13] gh-133767: Fix use-after-free in the unicode-escape decoder with an error handler (GH-129648) #133944
…der with an error handler (pythonGH-129648)
If the error handler is used, a new bytes object is created to set as the object attribute of UnicodeDecodeError, and that bytes object then replaces the original data. A pointer to the decoded data will become invalid after destroying that temporary bytes object. So we need another way to return the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().
_PyBytes_DecodeEscape() does not have such an issue, because it does not use the error handlers registry, but it should be changed for compatibility with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
serhiy-storchaka commented May 13, 2025 • edited
Removing
"May" means that it depends on the error handler and on whether an encoding error happens before an invalid escape sequence.
@Yhg1s, what do you prefer?
I'm not @Yhg1s, but I'd vote for making them raise.
The next beta is close, so I implemented a simple solution with minimal impact. The users of Anything simpler can break user code if it actually uses these functions. Anything more complex increases the probability of adding bugs in untested code.
This looks good, thank you!
6279eb8
into python:3.13
Thanks @serhiy-storchaka for the PR, and @encukou for merging it 🌮🎉. I'm working now to backport this PR to: 3.9, 3.10, 3.11, 3.12.
Sorry, @serhiy-storchaka and @encukou, I could not cleanly backport this to
…der with an error handler (pythonGH-129648) (pythonGH-133944)
If the error handler is used, a new bytes object is created to set as the object attribute of UnicodeDecodeError, and that bytes object then replaces the original data. A pointer to the decoded data will become invalid after destroying that temporary bytes object. So we need another way to return the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().
_PyBytes_DecodeEscape() does not have such an issue, because it does not use the error handlers registry, but it should be changed for compatibility with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)
(cherry picked from commit 6279eb8)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
GH-134337 is a backport of this pull request to the 3.12 branch.
…der with an error handler (pythonGH-129648) (pythonGH-133944)
If the error handler is used, a new bytes object is created to set as the object attribute of UnicodeDecodeError, and that bytes object then replaces the original data. A pointer to the decoded data will become invalid after destroying that temporary bytes object. So we need another way to return the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().
_PyBytes_DecodeEscape() does not have such an issue, because it does not use the error handlers registry, but it should be changed for compatibility with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)
(cherry picked from commit 6279eb8)
(cherry picked from commit a75953b)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
GH-134341 is a backport of this pull request to the 3.11 branch.
GH-134345 is a backport of this pull request to the 3.10 branch.
…der with an error handler (pythonGH-129648) (pythonGH-133944)
If the error handler is used, a new bytes object is created to set as the object attribute of UnicodeDecodeError, and that bytes object then replaces the original data. A pointer to the decoded data will become invalid after destroying that temporary bytes object. So we need another way to return the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().
_PyBytes_DecodeEscape() does not have such an issue, because it does not use the error handlers registry, but it should be changed for compatibility with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)
(cherry picked from commit 6279eb8)
(cherry picked from commit a75953b)
(cherry picked from commit 0c33e5b)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
…er with an error handler (pythonGH-129648) (pythonGH-133944)
If the error handler is used, a new bytes object is created to set as the object attribute of UnicodeDecodeError, and that bytes object then replaces the original data. A pointer to the decoded data will become invalid after destroying that temporary bytes object. So we need another way to return the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().
_PyBytes_DecodeEscape() does not have such an issue, because it does not use the error handlers registry, but it should be changed for compatibility with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)
(cherry picked from commit 6279eb8)
(cherry picked from commit a75953b)
(cherry picked from commit 0c33e5b)
(cherry picked from commit 8b528ca)
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
GH-134346 is a backport of this pull request to the 3.9 branch.
If the error handler is used, a new bytes object is created to set as the object attribute of UnicodeDecodeError, and that bytes object then replaces the original data. A pointer to the decoded data will become invalid after destroying that temporary bytes object. So we need another way to return the first invalid escape from _PyUnicode_DecodeUnicodeEscapeInternal().
_PyBytes_DecodeEscape() does not have such an issue, because it does not use the error handlers registry, but it should be changed for compatibility with _PyUnicode_DecodeUnicodeEscapeInternal().
(cherry picked from commit 9f69a58)
unicode_escape decoder with error handler #133767