gh-121284: Fix email address header folding with parsed encoded-word #122754


Merged
encukou merged 8 commits into python:main from medmunds:fix-issue-121284 on Mar 18, 2025


medmunds (Contributor) commented on Aug 6, 2024 (edited):

Fixes #121284.

[This fixes a #security-issue. PSRT instructed me to handle the fix publicly.]

Email generators using email.policy.default may convert an RFC 2047 encoded-word to unencoded form during header refolding. In a structured header, this could allow 'specials' chars outside a quoted-string, leading to invalid address headers and enabling spoofing. This change ensures a parsed encoded-word that contains specials is kept as an encoded-word while the header is refolded.

The issue is very similar to PR #122753 (and has the same security implications), but this PR involves refolding an encoded-word, while the other PR involves refolding a quoted-string. The two cases require different fixes.
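A minimal sketch of how the problem described above could be checked, assuming a short `max_line_length` to force refolding. The helper name `roundtrip_addresses` and the sample address are illustrative only and are not part of the PR. On an unpatched interpreter the re-parsed header may contain more than one address; on a patched one it should contain exactly one.

```python
from email import policy
from email.message import EmailMessage
from email.parser import Parser


def roundtrip_addresses(to_value, max_line_length=40):
    """Set a To header, serialize it (which refolds it), and re-parse.

    Returns the addresses parsed before serialization, the serialized
    message text, and the addresses parsed from that text.
    """
    refold_policy = policy.default.clone(max_line_length=max_line_length)
    msg = EmailMessage(policy=refold_policy)
    msg["To"] = to_value
    before = msg["To"].addresses
    wire = msg.as_string()
    after = Parser(policy=policy.default).parsestr(wire)["To"].addresses
    return before, wire, after


# Hypothetical input: one address whose encoded display name hides a comma.
to_value = "=?utf-8?q?A_v=C3=A9ry_long_name_with=2C_comma?= <to@example.com>"
before, wire, after = roundtrip_addresses(to_value)
print(wire)
print("addresses before:", len(before), "after:", len(after))
```

If the two counts differ, the comma escaped from the encoded-word during refolding and was read back as an address separator.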

medmunds (Contributor, Author) commented:

Requesting a review from @serhiy-storchaka, who recently reviewed a similar security fix around encoded words in email headers (#122233).

And pinging @bitdancer, who probably knows the most about this section of the code.

(Also hoping both of you might be able to review PR #122753, which tries to fix a related security issue first reported 5 years ago.)

medmunds (Contributor, Author) commented:

(A nicer fix would be to decide separately whether each refolded segment needs rfc2047 encoding, quoted-string handling, or no special treatment. But that would require giving _refold_parse_tree() info about whether it's working in a structured or unstructured header, which seems too involved for a security patch.)

medmunds (Contributor, Author) commented on Jan 19, 2025 (edited):

Btw, although this may seem like it's too obscure to matter much, it's actually pretty easy to stumble into vulnerable code. E.g., calling email.utils.formataddr() (correctly!) with user-supplied content can generate exactly the sort of (valid) encoded-word that gets mishandled by email.policy.default.
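As a rough illustration of that point, the sketch below calls `email.utils.formataddr()` with a made-up display name that mixes a non-ASCII character with a comma. The name and address are hypothetical. The result is a valid encoded-word whose decoded text contains a 'special', which is the kind of input the refolding code mishandled.

```python
from email.header import decode_header, make_header
from email.utils import formataddr

# Hypothetical user-supplied display name: non-ASCII plus a comma.
user_supplied_name = "Zoë, Admin"
address = formataddr((user_supplied_name, "zoe@example.com"))
print(address)

# Split off the display-name part and decode it again.
encoded_name = address.rsplit(" <", 1)[0]
decoded_name = str(make_header(decode_header(encoded_name)))
print(decoded_name)
```

The comma is invisible in the encoded form, so the header is safe as produced; it only becomes a problem if a later refolding step writes the decoded text back unencoded and unquoted.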

encukou (Member) left a review comment:

By my reading of the code, this is correct.
I'd appreciate an additional review from an email expert, but if there are no objections, I plan to merge this next week.

medmunds and sethmlarson reacted with thumbs up emoji
bitdancer (Member) commented:

Please give me another day to look at this before you merge. I think it's fine, but there are a couple things I want to check.

medmunds and encukou reacted with thumbs up emoji

bitdancer (Member) commented:

OK, there is actually a pretty straightforward solution to this problem, and the functionality is better:

diff --git a/Lib/email/_header_value_parser.py b/Lib/email/_header_value_parser.py
index c0e856306d3..9a51b943733 100644
--- a/Lib/email/_header_value_parser.py
+++ b/Lib/email/_header_value_parser.py
@@ -1053,7 +1053,7 @@ def get_fws(value):
     fws = WhiteSpaceTerminal(value[:len(value)-len(newvalue)], 'fws')
     return fws, newvalue
 
-def get_encoded_word(value):
+def get_encoded_word(value, terminal_type='vtext'):
     """ encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
 
     """
@@ -1092,7 +1092,7 @@ def get_encoded_word(value):
             ew.append(token)
             continue
         chars, *remainder = _wsp_splitter(text, 1)
-        vtext = ValueTerminal(chars, 'vtext')
+        vtext = ValueTerminal(chars, terminal_type)
         _validate_xtext(vtext)
         ew.append(vtext)
         text = ''.join(remainder)
@@ -1134,7 +1134,7 @@ def get_unstructured(value):
         valid_ew = True
         if value.startswith('=?'):
             try:
-                token, value = get_encoded_word(value)
+                token, value = get_encoded_word(value, 'utext')
             except _InvalidEwError:
                 valid_ew = False
             except errors.HeaderParseError:
@@ -1163,7 +1163,7 @@ def get_unstructured(value):
         # the parser to go in an infinite loop.
         if valid_ew and rfc2047_matcher.search(tok):
             tok, *remainder = value.partition('=?')
-        vtext = ValueTerminal(tok, 'vtext')
+        vtext = ValueTerminal(tok, 'utext')
         _validate_xtext(vtext)
         unstructured.append(vtext)
         value = ''.join(remainder)
@@ -2813,7 +2813,7 @@ def _refold_parse_tree(parse_tree, *, policy):
             continue
         tstr = str(part)
         if not want_encoding:
-            if part.token_type == 'ptext':
+            if part.token_type in ('ptext', 'vtext'):
                 # Encode if tstr contains special characters.
                 want_encoding = not SPECIALSNL.isdisjoint(tstr)
             else:

At this point I no longer remember what 'vtext' was supposed to stand for, but 'utext' is obviously 'unstructured text token' ;) The documentation of these bits of the code could use some improvement (not to mention the code itself!), but this fixes the problem pretty much in the way the original code was intended to work, if we imagine that the failure to check whether or not we were dealing with structured text was a bug as opposed to me forgetting about the distinction ;)
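The specials check in the last hunk of the diff can be modeled in isolation. This is a simplified sketch, not the library code: `SPECIALS_AND_NEWLINES` and `must_stay_encoded` are names invented here, the character set is written out locally from the RFC 5322 specials plus CR and LF, and the other conditions that `_refold_parse_tree()` applies (such as non-ASCII handling) are left out.

```python
# RFC 5322 "specials" plus CR and LF, defined locally for illustration.
SPECIALS_AND_NEWLINES = set('()<>@,:;.\\"[]') | {"\r", "\n"}


def must_stay_encoded(token_type, text):
    """Model only the specials check from the patched refolding code.

    'vtext' (from an encoded-word parsed in a structured header) is now
    treated like 'ptext': if it contains specials it must be emitted as
    an RFC 2047 encoded-word. 'utext' (from an unstructured header) is
    not subject to this particular check.
    """
    if token_type in ("ptext", "vtext"):
        return not SPECIALS_AND_NEWLINES.isdisjoint(text)
    return False


print(must_stay_encoded("vtext", "name with, comma"))  # structured header
print(must_stay_encoded("utext", "name with, comma"))  # unstructured header
print(must_stay_encoded("vtext", "plain name"))
```

Tagging the terminal at parse time is what lets the refolding step tell the two header kinds apart without being told which kind of header it is working on.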

medmunds reacted with thumbs up emoji

medmunds and others added 2 commits on March 5, 2025 10:32
… encoded-word
[Better fix from @bitdancer.]
Co-authored-by: R David Murray <rdmurray@bitdance.com>
medmunds (Contributor, Author) commented:

Nice! That feels much better, and allows unencoding encoded-words in refolding when it's safe. I've updated the PR with your change.

bitdancer (Member) left a review comment:

LGTM

bitdancer added the labels "needs backport to 3.9" (only security fixes), "needs backport to 3.10" (only security fixes), "needs backport to 3.11" (only security fixes), "needs backport to 3.12" (only security fixes), and "needs backport to 3.13" (bugs and security fixes) on Mar 17, 2025
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request on Mar 18, 2025:
…-word (pythonGH-122754)
Email generators using email.policy.default may convert an RFC 2047 encoded-word to unencoded form during header refolding. In a structured header, this could allow 'specials' chars outside a quoted-string, leading to invalid address headers and enabling spoofing. This change ensures a parsed encoded-word that contains specials is kept as an encoded-word while the header is refolded.
[Better fix from @bitdancer.]
---------
(cherry picked from commit 295b53d)
Co-authored-by: Mike Edmunds <medmunds@gmail.com>
Co-authored-by: R David Murray <rdmurray@bitdance.com>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
bedevere-app commented:

GH-131403 is a backport of this pull request to the 3.13 branch.

bedevere-app (bot) removed the "needs backport to 3.13" (bugs and security fixes) label on Mar 18, 2025
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request on Mar 18, 2025:
…-word (pythonGH-122754)
bedevere-app commented:

GH-131404 is a backport of this pull request to the 3.12 branch.

bedevere-app (bot) removed the "needs backport to 3.12" (only security fixes) label on Mar 18, 2025
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request on Mar 18, 2025:
…-word (pythonGH-122754)
miss-islington-app commented:

Sorry, @medmunds and @encukou, I could not cleanly backport this to 3.10 due to a conflict.
Please backport using cherry_picker on the command line.

cherry_picker 295b53df2aa18deb625a7da41f7e4babfe6ef34b 3.10

bedevere-app commented:

GH-131405 is a backport of this pull request to the 3.11 branch.

miss-islington-app commented:

Sorry, @medmunds and @encukou, I could not cleanly backport this to 3.9 due to a conflict.
Please backport using cherry_picker on the command line.

cherry_picker 295b53df2aa18deb625a7da41f7e4babfe6ef34b 3.9

encukou added a commit to encukou/cpython that referenced this pull request on Mar 18, 2025:
…encoded-word (pythonGH-122754)
encukou added a commit to encukou/cpython that referenced this pull request on Mar 18, 2025:
…encoded-word (pythonGH-122754)
bedevere-app commented:

GH-131411 is a backport of this pull request to the 3.10 branch.

bedevere-app (bot) removed the "needs backport to 3.10" (only security fixes) label on Mar 18, 2025
encukou added a commit to encukou/cpython that referenced this pull request on Mar 18, 2025:
…ncoded-word (pythonGH-122754)
encukou added a commit to encukou/cpython that referenced this pull request on Mar 18, 2025:
…ncoded-word (pythonGH-122754)
encukou added a commit to encukou/cpython that referenced this pull request on Mar 18, 2025:
…ncoded-word (pythonGH-122754)
bedevere-app commented:

GH-131412 is a backport of this pull request to the 3.9 branch.

bedevere-app (bot) removed the "needs backport to 3.9" (only security fixes) label on Mar 18, 2025
bitdancer added a commit that referenced this pull request on Mar 18, 2025:
…d-word (GH-122754) (#131403)
bitdancer added a commit that referenced this pull request on Mar 18, 2025:
…d-word (GH-122754) (#131404)
colesbury pushed a commit to colesbury/cpython that referenced this pull request on Mar 20, 2025:
…-word (pythonGH-122754)
ambv pushed a commit that referenced this pull request on Apr 3, 2025:
…d-word (GH-122754) (GH-131405)
ambv pushed a commit that referenced this pull request on Apr 3, 2025:
…d-word (GH-122754) (GH-131411)
ambv pushed a commit that referenced this pull request on Apr 3, 2025:
…-word (GH-122754) (GH-131412)
gentoo-bot pushed a commit to gentoo/cpython that referenced this pull request on Apr 9, 2025:
…ncoded-word (pythonGH-122754) (pythonGH-131412)
seehwan pushed a commit to seehwan/cpython that referenced this pull request on Apr 16, 2025:
…-word (pythonGH-122754)
Reviewers

encukou approved these changes
bitdancer approved these changes
serhiy-storchaka: awaiting requested review

Assignees

encukou

Labels

topic-email, type-security (A security issue)

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues:

email: invalid RFC 2047 address header after refolding with email.policy.default

4 participants: @medmunds, @bitdancer, @encukou, @sethmlarson
