Issue 31831

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: EmailMessage.add_attachment(filename="long or spécial") crashes or produces invalid output
Type: behavior
Stage: resolved
Components: email
Versions: Python 3.7, Python 3.6
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: barry, calimeroteknik, r.david.murray
Priority: normal
Keywords:

Created on 2017-10-20 22:22 by calimeroteknik, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (8)
msg304684 - Author: calimeroteknik (calimeroteknik) Date: 2017-10-20 22:22
The following code excerpt demonstrates a crash:

    import email.message
    mail = email.message.EmailMessage()
    mail.add_attachment(
        b"test",
        maintype="text",
        subtype="plain",
        filename="I thought I could put a few words in the filename but apparently it does not go so well.txt"
    )
    print(mail)

Output on Python 3.7.0a1: https://gist.github.com/altendky/33c235e8a693235acd0551affee0a4f6
Output on Python 3.6.2: https://oremilac.tk/paste/python-rfc2231-oops.log

Additionally, a behavioral issue is demonstrated by replacing the filename in the above with:

    filename = "What happens if we try French in here? touché!.txt"

This results in the following output (headers):

    Content-Type: text/plain
    Content-Transfer-Encoding: base64
    Content-Disposition: attachment;
    filename*=utf-8''What%20happens%20if%20we%20try%20French%20in%20here%3F%20touch%C3%A9%21.txt
    MIME-Version: 1.0

Instead of, for example, this correct output (by Mozilla Thunderbird here):

    Content-Type: text/plain; charset=UTF-8;
     name="=?UTF-8?Q?What_happens_if_we_try_French_in_here=3f_touch=c3=a9!.txt?="
    Content-Transfer-Encoding: base64
    Content-Disposition: attachment;
     filename*0*=utf-8''%57%68%61%74%20%68%61%70%70%65%6E%73%20%69%66%20%77%65;
     filename*1*=%20%74%72%79%20%46%72%65%6E%63%68%20%69%6E%20%68%65%72%65%3F;
     filename*2*=%20%74%6F%75%63%68%C3%A9%21%2E%74%78%74

Issues to note here:
- the "filename" parameter is not indented, so mail clients ignore it
- the "filename" parameter is not split according to RFC 2231

The relevant standard is exemplified in section 4.1 of https://tools.ietf.org/html/rfc2231#page-5

Python 3.4.6 and 3.5.4 simply do not wrap anything, which works with  but is not conformant to standards.

Solving all of the above would imply correctly splitting any header.
Function "set_param" in /usr/lib/python*/email/message.py looked like a place to look.
Unfortunately I do not understand what's going on there very well.

As yet another misbehaviour to note, try repeating the above print statement twice.
The result is not identical, and the second time you get the following output:

    Content-Type: text/plain
    Content-Transfer-Encoding: base64
    Content-Disposition: attachment;
    *=utf-8''What%20happens%20if%20we%20try%20French%20in%20here%3F%20touch%C3%A9%21.txt
    MIME-Version: 1.0

It would appear that "filename" has disappeared.
The issue does not reveal itself with simple values for the 'filename' argument, e.g. "test.txt".

PS: The above output also illustrates this (much more minor) issue: https://bugs.python.org/issue25235
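For readers unfamiliar with the RFC 2231 continuation scheme mentioned in the report, here is a minimal hand-rolled sketch of it. It is not the email package's implementation: the helper name `rfc2231_sections` and its `chunk` parameter are hypothetical, and the sketch simply percent-encodes the whole value, then cuts it into numbered `name*N*=` sections without splitting a `%XX` triplet.

```python
from urllib.parse import quote, unquote


def rfc2231_sections(name, value, charset="utf-8", chunk=40):
    """Hypothetical helper: split `value` into RFC 2231 continuation sections."""
    if chunk < 3:
        raise ValueError("chunk must leave room for one %XX triplet")
    # Percent-encode everything except unreserved characters.
    encoded = quote(value, safe="", encoding=charset)
    pieces = []
    while encoded:
        cut = min(chunk, len(encoded))
        if cut < len(encoded):
            # Never cut through a %XX triplet: back up to just before the '%'.
            if encoded[cut - 1] == "%":
                cut -= 1
            elif encoded[cut - 2] == "%":
                cut -= 2
        pieces.append(encoded[:cut])
        encoded = encoded[cut:]
    sections = []
    for index, piece in enumerate(pieces):
        # Only the first section carries the charset''language prefix.
        prefix = f"{charset}''" if index == 0 else ""
        sections.append(f"{name}*{index}*={prefix}{piece}")
    return sections


value = "What happens if we try French in here? touché!.txt"
sections = rfc2231_sections("filename", value)

# Continuation lines must start with whitespace to belong to the same header.
print("Content-Disposition: attachment;\n " + ";\n ".join(sections))

# Reassemble the sections and decode them to check the round trip.
payload = "".join(section.split("=", 1)[1] for section in sections)
print(unquote(payload.split("''", 1)[1]))
```

Each continuation line starts with a single space, which is exactly the indentation the report found missing.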
msg304685 - Author: calimeroteknik (calimeroteknik) Date: 2017-10-20 22:27
Erratum: the output generated by Python 3.5 and 3.4 causes line wraps in the SMTP delivery chain, which cause exactly the same breakage as in later versions: the crucially needed indentation of one space ends up being absent.
msg304691 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-10-21 00:12
Does the patch in gh-3488 fix this?  I think it should, or if it doesn't that's a bug in the PR patch.
msg304705 - Author: calimeroteknik (calimeroteknik) Date: 2017-10-21 13:49
I confirm that, as for the crash, the patch in gh-3488 fixes it.
The first code excerpt in my initial report now outputs the following valid headers:

    Content-Type: text/plain
    Content-Transfer-Encoding: base64
    Content-Disposition: attachment;
     filename*0*=utf-8''I%20thought%20I%20could%20put%20a%20few%20words%20in%20th;
     filename*1*=e%20filename%20but%20apparently%20it%20does%20not%20go%20so%20we;
     filename*2*=ll.txt
    MIME-Version: 1.0

However, when Unicode is added and the filename is short, things don't look right. This code:

    import email.message
    mail = email.message.EmailMessage()
    mail.add_attachment(b"test", maintype="text", subtype="plain", filename="é.txt")
    print(mail)

results in these headers:

    Content-Type: text/plain
    Content-Transfer-Encoding: base64
    Content-Disposition: attachment; filename="é.txt"
    MIME-Version: 1.0

To begin with, it is easy to deduce that there is no way to know that this 'é' character is UTF-8.
It is two 8-bit values, at least one of which is detectably outside of 7-bit US-ASCII.

Quoting https://tools.ietf.org/html/rfc2231#page-4:
> a lightweight encoding mechanism is needed to accommodate 8-bit information in parameter values.

The 8-bit encoding goes straight through instead of undergoing the encoding process, which seems required in my interpretation of RFC 2231.
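The fixed behaviour for the long filename can be checked with a short script along these lines. This is a sketch that assumes a Python version containing the gh-3488 change; the variable names `long_name` and `wire` are illustrative, and the exact split points of the sections are not assumed.

```python
import email.message

long_name = ("I thought I could put a few words in the filename "
             "but apparently it does not go so well.txt")

mail = email.message.EmailMessage()
mail.add_attachment(b"test", maintype="text", subtype="plain",
                    filename=long_name)

# as_string() is what ends up on the wire, unlike print(), which goes
# through __str__().
wire = mail.as_string()

# Show only the folded filename parameter lines.
for line in wire.splitlines():
    if "filename*" in line:
        print(repr(line))

# Serializing twice should give the same text (the original report saw
# the "filename" name vanish on the second print).
print(mail.as_string() == wire)
```

The lines printed by the loop should each begin with a space, which marks them as continuations of the Content-Disposition header.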
msg304706 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-10-21 13:55
You are correct, that is a bug.  Presumably I forgot to check for non-ascii when the parameter value doesn't need to be folded.  I'm not sure when I'll have time to look at this, unfortunately :(.  If you can see how to fix it, you could submit a PR against my PR branch, I think.
msg304726 - Author: calimeroteknik (calimeroteknik) Date: 2017-10-22 00:48
In the end there is no bug; I was just confused by the output of print() on the EmailMessage.
I noticed that in email/_header_value_parser.py policy.utf8 was True.
The reason is found in email/message.py line 970 (class MIMEPart):

    def __str__(self):
        return self.as_string(policy=self.policy.clone(utf8=True))

print() uses __str__(), which is why this happens.
I didn't dig out the exact reason since there are so many delegated calls.
In any case, the flattened message in smtplib.SMTP does contain what as_string() returns, which means that policy.utf8 is only forced when using print().
Sorry for the false alarm.

I can guess that the intention in forcing policy.utf8=True in __str__() was that SMTPUTF8 output is visually prettier than any ASCII-armored text.
After additional fuzzing, checking the output with EmailMessage.as_string(), everything seems OK.
That's a +1 for gh-3488, which fixes this bug.
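The difference between print() and as_string() described in this message can be reproduced with a sketch like the following, again assuming a Python version that includes the gh-3488 change. The names `shown` and `wire` are illustrative only.

```python
import email.message

mail = email.message.EmailMessage()
mail.add_attachment(b"test", maintype="text", subtype="plain",
                    filename="é.txt")

# str() / print() go through __str__, which clones the policy with utf8=True,
# so the non-ASCII character is shown as-is.
shown = str(mail)

# as_string() uses the message's own policy (utf8=False by default), so the
# parameter is RFC 2231 encoded.
wire = mail.as_string()

for label, text in (("str()", shown), ("as_string()", wire)):
    for line in text.splitlines():
        if "filename" in line:
            print(label, "->", line.strip())

# Passing the utf8=True clone explicitly mirrors what __str__ does.
same = mail.as_string(policy=mail.policy.clone(utf8=True)) == shown
print(same)
```

Only the `as_string()` form is what smtplib would transmit, so that is the output to inspect when checking conformance.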
msg304756 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-10-22 16:27
Great, thank you for that research.  And yes, that's exactly why __str__ uses utf8=True, the "picture" of the message is much more readable.  I will commit that PR soon.
msg307532 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2017-12-04 00:51
The PR has been committed.
History
Date                 User            Action  Args
2022-04-11 14:58:53  admin           set     github: 76012
2017-12-04 00:51:27  r.david.murray  set     status: open -> closed
                                             versions: - Python 3.4, Python 3.5
                                             type: crash -> behavior
                                             messages: +msg307532
                                             resolution: fixed
                                             stage: resolved
2017-10-22 16:27:10  r.david.murray  set     messages: +msg304756
2017-10-22 00:48:12  calimeroteknik  set     messages: +msg304726
2017-10-21 13:55:36  r.david.murray  set     messages: +msg304706
2017-10-21 13:49:54  calimeroteknik  set     messages: +msg304705
2017-10-21 00:12:30  r.david.murray  set     messages: +msg304691
2017-10-20 22:27:45  calimeroteknik  set     messages: +msg304685
                                             versions: + Python 3.4, Python 3.5
2017-10-20 22:22:56  calimeroteknik  create
