gh-102988: Detect email address parsing errors and return empty tuple to indicate the parsing error (old API) #102990


Closed
tdwyer wants to merge 2 commits into python:main from tdwyer:issues102988

tdwyer
Contributor

Pull Request title

gh-102988: This PR detects parsing errors and returns an empty tuple to indicate the error. It also updates test_email.py to check for these bugs, and adds some other wacky address headers from the examples in RFC 2822 to make sure they are parsed correctly.

I realize that this PR does not actually track down the bug and fix it. It simply detects that the error has happened and returns a parsing error. However, Lib/email/utils.py is a much simpler file than Lib/email/_parseaddr.py, so this change is much easier to review. Additionally, there are actually multiple bugs causing the erroneous output. Tracing the code flow for each and fixing them would be error-prone, considering all of the wacky stuff that RFC 2822 allows in address headers. Finally, this change is actually rather simple.

@tdwyer requested a review from a team as a code owner on March 24, 2023 04:32
@bedevere-bot

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

@ghost commented Mar 24, 2023 (edited by ghost)

All commit authors signed the Contributor License Agreement.

@arhadthedev changed the title from "Detect parsing errors and return empty tuple to indicate the parsing error" to "gh-102988: Detect parsing errors and return empty tuple to indicate the parsing error" on Mar 24, 2023
@arhadthedev added the stdlib (Python modules in the Lib dir) and topic-email labels on Mar 24, 2023
@bitdancer changed the title from "gh-102988: Detect parsing errors and return empty tuple to indicate the parsing error" to "gh-102988: Detect email address parsing errors and return empty tuple to indicate the parsing error (old API)" on Mar 24, 2023

Member

@gpshead left a comment

thanks for the PR!

    for v in fieldvalues:
        s = str(v).replace('\\(', '').replace('\\)', '')
        if s.count('(') != s.count(')'):
            fieldvalues.remove(v)
Member

This loop modifies the list it is iterating over, which makes its exact behavior hard to reason about: removing an item can mean the next item is skipped, and appending an item can make the loop iterate over it. Furthermore, remove is O(n): you're re-finding the item in order to remove it. This code also modifies the passed-in list in place before returning it.

A better pattern is to build up a new list and return that, never modifying the input.
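
The skipping behavior described in this comment is easy to reproduce. The following is only a minimal sketch: the helper names `_balanced`, `drop_unbalanced_in_place` and `drop_unbalanced_copy` are hypothetical and do not appear in the PR, and the balance check simply mirrors the parenthesis count from the quoted loop.

```python
def _balanced(value):
    """Return True when unescaped '(' and ')' occur equally often."""
    s = value.replace('\\(', '').replace('\\)', '')
    return s.count('(') == s.count(')')


def drop_unbalanced_in_place(values):
    # Anti-pattern: remove() shifts later items left while the loop's
    # internal index still advances, so the item right after each
    # removed one is never examined. The caller's list is also mutated.
    for v in values:
        if not _balanced(v):
            values.remove(v)
    return values


def drop_unbalanced_copy(values):
    # Preferred: build a new list and leave the argument untouched.
    return [v for v in values if _balanced(v)]


data = ['a(', 'b(', 'c']
print(drop_unbalanced_in_place(list(data)))  # ['b(', 'c']: 'b(' was skipped
print(drop_unbalanced_copy(data))            # ['c']
```

Two consecutive bad values are enough to trigger the problem, which is why the in-place version lets 'b(' through.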

Member

Something like:

    accepted_values = []
    for v in fieldvalues:
        s = v.replace('\\(', '').replace('\\)', '')
        if s.count('(') != s.count(')'):
            v = ('', '')
        accepted_values.append(v)
    return accepted_values

CharlieZhao95 and tdwyer reacted with thumbs up emoji
    """Validate the parsed values are syntactically correct"""
    for v in parsedvalues:
        if '[' in v[1]:
            parsedvalues.remove(v)
Member

same design comment as above, don't modify the list being iterated over and don't modify the argument in place.
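
Applied to the post-parse check, the same pattern might look like the sketch below. This is not the PR's code: the function name is made up for illustration, the parameter name follows a suggestion made later in this review, and substituting ('', '') for a rejected pair (rather than dropping it, as the quoted loop does) is an assumption borrowed from the reviewer's earlier snippet.

```python
def post_parse_validation_sketch(parsed_email_address_tuples):
    """Return a new list; pairs whose address contains '[' become ('', '')."""
    accepted = []
    for realname, addr in parsed_email_address_tuples:
        if '[' in addr:
            # Assumption: mark the failure with an empty pair instead of
            # silently dropping the entry.
            accepted.append(('', ''))
        else:
            accepted.append((realname, addr))
    return accepted


parsed = [('Alice', 'alice@example.org'), ('', 'bob@[example.com')]
print(post_parse_validation_sketch(parsed))
# [('Alice', 'alice@example.org'), ('', '')]
```

The input list is left unchanged, so the caller can still inspect what was rejected.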

tdwyer reacted with thumbs up emoji

    n = 0
    for v in fieldvalues:
        n += str(v).count(',') + 1
Member

get rid of this str(v)

tdwyer reacted with thumbs up emoji

def getaddresses(fieldvalues):
    """Return a list of (REALNAME, EMAIL) for each fieldvalue."""
    fieldvalues = _pre_parse_validation(fieldvalues)
    all = COMMASPACE.join(str(v) for v in fieldvalues)
Member

The existing code already calls str(v) here... this is a problem. It means the input can be anything. But we probably cannot change it in a patch release for a security fix.

To avoid propagating this mistake into more lines of code spread out, I suggest changing this function to do:

    fieldvalues = [str(v) for v in fieldvalues]

on the first line and get rid of all subsequent str(v) calls on anything from fieldvalues in this function or in functions it calls.

tdwyer reacted with thumbs up emoji
        parsedvalues.append(('', ''))

    return parsedvalues


def getaddresses(fieldvalues):
    """Return a list of (REALNAME, EMAIL) for each fieldvalue."""
Member

add documentation mentioning that fieldvalues that could not be parsed may cause a ('', '') item to be returned in their place.
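
From the caller's side, the documented behavior would mean checking for the ('', '') placeholder. The sketch below assumes that convention; `PARSE_FAILURE` and `split_results` are hypothetical names, and only `email.utils.getaddresses` is a real stdlib function.

```python
from email.utils import getaddresses

# Hypothetical name for the placeholder pair discussed in this review.
PARSE_FAILURE = ('', '')


def split_results(pairs):
    """Separate usable (realname, address) pairs from failure markers."""
    good = [p for p in pairs if p != PARSE_FAILURE]
    return good, len(pairs) - len(good)


pairs = getaddresses(['Alice <alice@example.org>'])
print(split_results(pairs))  # ([('Alice', 'alice@example.org')], 0)
```

Which malformed inputs actually produce the placeholder depends on the Python version and on how this change lands, so the example only parses a well-formed header.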

tdwyer reacted with thumbs up emoji
def _pre_parse_validation(fieldvalues):
    """Validate the field values are syntactically correct"""
    for v in fieldvalues:
        s = str(v).replace('\\(', '').replace('\\)', '')
Member

See later comments on str(v) being bad, we can get rid of it here.

tdwyer reacted with thumbs up emoji
@@ -106,12 +106,42 @@ def formataddr(pair, charset='utf-8'):
    return address


def _pre_parse_validation(fieldvalues):
Member

The name "fieldvalues" is non-specific. I realize it comes from the name in getaddresses(), but we should make it clearer what these are. email_address_fields perhaps?

Related: I don't think this docstring adds meaningful value. Naming the function and parameter right, along with it being short code, is sufficiently self-explanatory for this internal function. Get rid of the docstring.

tdwyer reacted with thumbs up emoji
def _post_parse_validation(parsedvalues):
    """Validate the parsed values are syntactically correct"""
    for v in parsedvalues:
        if '[' in v[1]:
Member

Add a comment explaining why only [ is not allowed, ideally with a link to the relevant RFC or similar.

tdwyer reacted with thumbs up emoji
    return fieldvalues


def _post_parse_validation(parsedvalues):
Member

parsed_email_address_tuples perhaps?

tdwyer reacted with thumbs up emoji
@bedevere-bot

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase "I have made the requested changes; please review again". I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

tdwyer reacted with thumbs up emoji

@gpshead self-assigned this on Apr 28, 2023
@gpshead requested a review from bitdancer on April 28, 2023 21:53

@theta682 left a comment

stylistic suggestions

eq(utils.getaddresses(['alice@example.org(<bob@example.com>']),
   [('', '')])
eq(utils.getaddresses(['alice@example.org)<bob@example.com>']),
   [('' ,'')])

Suggested change:
-   [('' ,'')])
+   [('','')])

eq(utils.getaddresses(['alice@example.org)<bob@example.com>']),
   [('' ,'')])
eq(utils.getaddresses(['alice@example.org<<bob@example.com>']),
   [('' ,'')])

Suggested change:
-   [('' ,'')])
+   [('','')])

eq(utils.getaddresses(['alice@example.org<<bob@example.com>']),
   [('' ,'')])
eq(utils.getaddresses(['alice@example.org><bob@example.com>']),
   [('' ,'')])

Suggested change:
-   [('' ,'')])
+   [('','')])

eq(utils.getaddresses(['alice@example.org><bob@example.com>']),
   [('' ,'')])
eq(utils.getaddresses(['alice@example.org@<bob@example.com>']),
   [('' ,'')])

Suggested change:
-   [('' ,'')])
+   [('','')])

eq(utils.getaddresses(['alice@example.org,<bob@example.com>']),
   [('', 'alice@example.org'), ('', 'bob@example.com')])
eq(utils.getaddresses(['alice@example.org;<bob@example.com>']),
   [('' ,'')])

Suggested change:
-   [('' ,'')])
+   [('','')])

eq(utils.parseaddr(['alice@example.org;<bob@example.com>']),
   ('' ,''))
eq(utils.parseaddr(['alice@example.org:<bob@example.com>']),
   ('' ,''))

Suggested change:
-   ('' ,''))
+   [('','')])

eq(utils.parseaddr(['alice@example.org:<bob@example.com>']),
   ('' ,''))
eq(utils.parseaddr(['alice@example.org.<bob@example.com>']),
   ('' ,''))

Suggested change:
-   ('' ,''))
+   [('','')])

eq(utils.parseaddr(['alice@example.org.<bob@example.com>']),
   ('' ,''))
eq(utils.parseaddr(['alice@example.org"<bob@example.com>']),
   ('' ,''))

Suggested change:
-   ('' ,''))
+   [('','')])

eq(utils.parseaddr(['alice@example.org"<bob@example.com>']),
   ('' ,''))
eq(utils.parseaddr(['alice@example.org[<bob@example.com>']),
   ('' ,''))

Suggested change:
-   ('' ,''))
+   [('','')])

eq(utils.parseaddr(['alice@example.org[<bob@example.com>']),
   ('' ,''))
eq(utils.parseaddr(['alice@example.org]<bob@example.com>']),
   ('' ,''))

Suggested change:
-   ('' ,''))
+   [('','')])

Reviewers

@theta682 left review comments
@CharlieZhao95 left review comments
@gpshead requested changes

Assignees

@gpshead

Labels
awaiting changes, stdlib (Python modules in the Lib dir), topic-email, type-security (A security issue)

6 participants
@tdwyer, @bedevere-bot, @CharlieZhao95, @gpshead, @theta682, @arhadthedev
