gh-83938, gh-122476: Stop incorrectly RFC 2047 encoding non-ASCII email addresses #122540


Open
medmunds wants to merge 6 commits into python:main from medmunds:fix-issues-83938-122476

medmunds (Contributor) commented Aug 1, 2024 (edited):

This PR fixes gh-83938 and fixes gh-122476, which have the same underlying issue.

Email generators had been incorrectly flattening non-ASCII email addresses to RFC 2047 encoded-word format, leaving them undeliverable. (RFC 2047 prohibits use of encoded-word in an addr-spec.) This change raises a ValueError when attempting to flatten an EmailMessage with a non-ASCII addr-spec and a policy with utf8=False. (Exception: If the non-ASCII address originated from parsing a message, it will be flattened as originally parsed, without error.)
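As a rough illustration of which part of an address is affected, the sketch below scans a message for non-ASCII addr-specs before it is flattened. The helper `non_ascii_addr_specs` and the `ADDRESS_HEADERS` tuple are hypothetical names invented for this example; they are not part of the email package or of this PR. Note that a non-ASCII display name is not flagged, since only the addr-spec is off limits for encoded-words.

```python
from email.message import EmailMessage

# Hypothetical list of headers to inspect; adjust for your own messages.
ADDRESS_HEADERS = ("From", "To", "Cc", "Bcc", "Reply-To", "Sender")


def non_ascii_addr_specs(msg):
    """Return every addr-spec in msg's address headers that is not pure ASCII.

    Hypothetical helper for illustration only.
    """
    found = []
    for name in ADDRESS_HEADERS:
        for header in msg.get_all(name, []):
            # Address headers created under the default policy expose
            # their parsed mailboxes through the .addresses attribute.
            for address in header.addresses:
                if not address.addr_spec.isascii():
                    found.append(address.addr_spec)
    return found


msg = EmailMessage()
msg["To"] = "Jöhn <jöhn@example.com>, ok@example.com"
print(non_ascii_addr_specs(msg))
```

A caller could use a check like this to decide up front whether a utf8=True policy is needed, rather than relying on an exception at flattening time.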

Non-ASCII email addresses are supported when using a policy with utf8=True (such as email.policy.SMTPUTF8) under RFCs 6531 and 6532.
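A minimal sketch of the utf8=True path, using only the existing email.policy.SMTPUTF8 policy. The address and subject are made-up example values. With this policy the header is written as raw UTF-8 rather than as an encoded-word.

```python
from email.message import EmailMessage
from email.policy import SMTPUTF8

msg = EmailMessage()
msg["To"] = "jöhn@example.com"  # example address with a non-ASCII localpart
msg["Subject"] = "Hello"
msg.set_content("Hi")

# Flatten with a utf8=True policy: the address is emitted as UTF-8 bytes.
raw = msg.as_bytes(policy=SMTPUTF8)
print(raw.decode("utf-8"))
```

Delivering such a message also requires that the receiving SMTP server advertise the SMTPUTF8 extension.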

Non-ASCII email address domains (but not localparts) can also be used with non-SMTPUTF8 policies by encoding the domain as an IDNA A-label. (The email package does not perform this encoding, because it cannot know whether the caller wants IDNA 2003, IDNA 2008, or some other variant such as UTS-46.)
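One way a caller might do that encoding is sketched below with Python's built-in "idna" codec, which implements IDNA 2003. This is an assumption made for the example: callers who need IDNA 2008 or UTS-46 behaviour would substitute a different encoder. The helper name `domain_to_a_label` is hypothetical.

```python
def domain_to_a_label(addr_spec):
    """Return addr_spec with its domain converted to an IDNA A-label.

    Hypothetical helper. It uses the stdlib "idna" codec (IDNA 2003)
    and leaves the localpart untouched.
    """
    local_part, sep, domain = addr_spec.rpartition("@")
    if not sep:
        raise ValueError(f"not an addr-spec: {addr_spec!r}")
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local_part}@{ascii_domain}"


print(domain_to_a_label("user@bücher.example"))
```

The result can be used with utf8=False policies only if the localpart is ASCII as well.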


📚 Documentation preview 📚: https://cpython-previews--122540.org.readthedocs.build/

medmunds requested a review from a team as a code owner August 1, 2024 00:35
medmunds force-pushed the fix-issues-83938-122476 branch 2 times, most recently from d1f0bdc to 2e0696c August 1, 2024 00:43

medmunds (Contributor, Author) commented:

This is based on #81074 (comment):

we should probably be raising an error if the rendering policy does not have utf8=True and we don't have an "original source line" from parsing a message (which is the case here), rather than using the incorrect RFC2047 encoding.

Checking part.token_type == 'addr-spec' seemed like the simplest approach.

An alternative would be to introduce a new NonASCIIDomainLiteralDefect paralleling NonASCIILocalPartDefect and apply it in _header_value_parser.get_domain_literal(). And add NonASCIIAddrSpecDefect as a superclass of both. Then change _refold_parse_tree() to check any(isinstance(d, NonASCIIAddrSpecDefect) for d in part.all_defects) (and perhaps move it up with the other UnicodeEncodeError logic). (If we go this direction, PR #122477 will also need an update.)
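For illustration, that alternative could look roughly like the sketch below. All three classes here are local stand-ins: NonASCIIAddrSpecDefect and NonASCIIDomainLiteralDefect do not exist in email.errors, and the real NonASCIILocalPartDefect would have to be re-parented inside the standard library. The function takes a plain list of defects in place of part.all_defects.

```python
from email import errors


class NonASCIIAddrSpecDefect(errors.HeaderDefect):
    """Hypothetical common base for non-ASCII addr-spec defects."""


class NonASCIILocalPartDefect(NonASCIIAddrSpecDefect):
    """Local stand-in for email.errors.NonASCIILocalPartDefect, re-parented."""


class NonASCIIDomainLiteralDefect(NonASCIIAddrSpecDefect):
    """Hypothetical defect for a domain literal containing non-ASCII."""


def has_non_ascii_addr_spec(defects):
    # Mirrors the check proposed for _refold_parse_tree(), applied to a
    # plain list of defects instead of part.all_defects.
    return any(isinstance(d, NonASCIIAddrSpecDefect) for d in defects)


defects = [
    errors.InvalidHeaderDefect("unrelated defect"),
    NonASCIIDomainLiteralDefect("domain literal contains non-ASCII"),
]
print(has_non_ascii_addr_spec(defects))
```

The appeal of this design is that the refolding code tests for one base class rather than inspecting token types.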

Also, I think charset == 'unknown-8bit' is only possible in _refold_parse_tree() when the non-ASCII characters resulted from parsing an existing message: see the UndecodableBytesDefect logic just above the new code. (The added tests seem to confirm this.)

medmunds force-pushed the fix-issues-83938-122476 branch from 2e0696c to cbedf5d August 1, 2024 01:11
picnixz changed the title from "gh-83938: Stop incorrectly RFC 2047 encoding non-ASCII email addresses" to "gh-83938, gh-122476: Stop incorrectly RFC 2047 encoding non-ASCII email addresses" Dec 3, 2024

bitdancer (Member) left a comment:

Your analysis is solid and the fix looks great. We'll need a follow-on PR to have smtplib handle the new error, but that should be a trivial PR.

bedevere-app commented:

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase "I have made the requested changes; please review again". I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

And if you don't make the requested changes, you will be poked with soft cushions!

medmunds force-pushed the fix-issues-83938-122476 branch from 5d60c1c to bd6845d April 1, 2025 20:14

medmunds (Contributor, Author) commented:

I have made the requested changes; please review again

bedevere-app commented:

Thanks for making the requested changes!

@bitdancer: please review the changes made to this pull request.

bedevere-app bot requested a review from bitdancer April 1, 2025 20:26

bitdancer (Member) left a comment:

Sorry it took me so long to get back to this.

    else:
        raise errors.InvalidMailboxError(
            "Non-ASCII address requires policy with utf8=True:"
            " '{}'".format(part)

Would f"Non-ASCII mailbox {str(part)!r} is invalid under current policy setting (utf8=False)" be clearer, do you think?

@@ -0,0 +1,5 @@
The :mod:`email` module no longer incorrectly encodes non-ASCII characters

Duplicate news item.

bedevere-app commented:

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase "I have made the requested changes; please review again". I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

Reviewers

bitdancer requested changes

Development

Successfully merging this pull request may close these issues.

EmailMessage bad encoding for non-ASCII localpart
EmailMessage bad encoding for international domain

3 participants
@medmunds, @bitdancer, @ZeroIntensity
