gh-92081: Fix for email.generator.Generator with whitespace between encoded words. #92281
abadger commented May 5, 2022 • edited
This is a work in progress because #92081 has one other issue that also needs to be fixed: whitespace at the start of the Subject is being omitted as well. I'm leaning towards fixing that one in the decoder, but I'm still reading through the RFCs to see whether they have anything to say about that.

Note: the other issue has been fixed here as well and this is ready for review.
@@ -1628,7 +1629,7 @@ def test_address_display_names(self):
  'Lôrem ipsum dôlôr sit amet, cônsectetuer adipiscing. '
  'Suspendisse pôtenti. Aliquam nibh. Suspendisse pôtenti.',
  '=?utf-8?q?L=C3=B4rem_ipsum_d=C3=B4l=C3=B4r_sit_amet=2C_c'
- '=C3=B4nsectetuer?=\n =?utf-8?q?adipiscing=2E_Suspendisse'
+ '=C3=B4nsectetuer?=\n =?utf-8?q?_adipiscing=2E_Suspendisse'
Note: the data in the unit test was buggy. If you run the original output through decode_header(), you'll find that it is missing the space between cônsectetuer and adipiscing.
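The effect described in this comment can be reproduced with the decoder alone. The following is a minimal sketch, not part of the patch: the two header strings are abbreviated stand-ins modeled on the test data, and `decode` is a hypothetical helper name.

```python
from email.header import decode_header, make_header


def decode(raw):
    """Decode an RFC 2047 header value into a plain str."""
    return str(make_header(decode_header(raw)))


# Abbreviated stand-ins for the test data (assumed, not the full strings).
# Old expected output: the space sits unencoded between two encoded words.
old = "=?utf-8?q?c=C3=B4nsectetuer?=\n =?utf-8?q?adipiscing?="
# New expected output: the space is encoded as "_" inside the second word.
new = "=?utf-8?q?c=C3=B4nsectetuer?=\n =?utf-8?q?_adipiscing?="

print(decode(old))  # cônsectetueradipiscing
print(decode(new))  # cônsectetuer adipiscing
```

The decoder drops the whitespace that separates two adjacent encoded words, so only the second form round-trips with the space intact.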
This fix is now ready to be reviewed.
Hey @warsaw, this fix is ready for review if you have some cycles to spare for thinking about email and the stdlib.
alex-pobeditel-2004 commented Dec 12, 2023
@abadger I tested this fix on my project and it worked like a charm. Inexpressible thanks! Will use your patch until this is merged into CPython :)
email.generator.Generator currently does not handle whitespace between encoded words correctly when the encoded words span multiple lines. The current generator will create an encoded word for each line. If the end of the line happens to correspond with the end of a real word in the plaintext, the generator will place an unencoded space at the start of the subsequent lines to represent the whitespace between the plaintext words.

A compliant decoder will strip all the whitespace from between two encoded words, which leads to missing spaces in the round-tripped output.

The fix for this is to make sure that whitespace between two encoded words ends up inside of one or the other of the encoded words. This fix places the space inside of the second encoded word.

A second problem happens with continuation lines. A continuation line that starts with whitespace and is followed by a non-encoded word is fine because the newline between such continuation lines is defined as condensing to a single space character. When the continuation line starts with whitespace followed by an encoded word, however, the RFCs specify that the word is run together with the encoded word on the previous line. This is because normal words are folded on syntactic breaks but encoded words are not.

The solution to this is to add the whitespace to the start of the encoded word on the continuation line.

Test cases are from python#92081
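One way to check whether a given Python build still exhibits the first problem is to round-trip a long non-ASCII Subject through the generator and the parser. This is a sketch under assumptions: `roundtrip_subject` is a hypothetical helper, the sample text is borrowed from the test data, and the result depends on whether the interpreter includes this fix.

```python
from email import message_from_string, policy
from email.message import EmailMessage


def roundtrip_subject(subject):
    """Serialize a message carrying `subject`, parse it back, and return
    (serialized text, decoded Subject)."""
    msg = EmailMessage()
    msg["Subject"] = subject
    wire = msg.as_string()  # non-ASCII text is folded into encoded words
    parsed = message_from_string(wire, policy=policy.default)
    return wire, str(parsed["Subject"])


subject = (
    "Lôrem ipsum dôlôr sit amet, cônsectetuer adipiscing. "
    "Suspendisse pôtenti. Aliquam nibh. Suspendisse pôtenti."
)
wire, decoded = roundtrip_subject(subject)
print(wire)
print("round-trip preserved:", decoded == subject)
```

On an interpreter without the fix, the decoded Subject is expected to differ from the input only by spaces lost at line boundaries.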
Thanks for your contribution to Python!
…ween encoded words. (pythonGH-92281)
* Fix for email.generator.Generator with whitespace between encoded words.
* Rename a variable so it's not confused with the final variable.
(cherry picked from commit a6fdb31)
Co-authored-by: Toshio Kuratomi <a.badger@gmail.com>
GH-119245 is a backport of this pull request to the 3.13 branch.
GH-119246 is a backport of this pull request to the 3.12 branch.
…tween encoded words. (GH-92281) (#119245)
* Fix for email.generator.Generator with whitespace between encoded words.
* Rename a variable so it's not confused with the final variable.
(cherry picked from commit a6fdb31)
Co-authored-by: Toshio Kuratomi <a.badger@gmail.com>
…tween encoded words. (GH-92281) (#119246)
* Fix for email.generator.Generator with whitespace between encoded words.
* Rename a variable so it's not confused with the final variable.
(cherry picked from commit a6fdb31)
Co-authored-by: Toshio Kuratomi <a.badger@gmail.com>
…ween encoded words. (python#92281)
* Fix for email.generator.Generator with whitespace between encoded words.
* Rename a variable so it's not confused with the final variable.
email.generator.Generator currently does not handle whitespace between
encoded words correctly when the encoded words span multiple lines. The
current generator will create an encoded word for each line. If the end
of the line happens to correspond with the end of a real word in the
plaintext, the generator will place an unencoded space at the start of
the subsequent lines to represent the whitespace between the plaintext
words.
A compliant decoder will strip all the whitespace from between two
encoded words which leads to missing spaces in the round-tripped
output.
The fix for this is to make sure that whitespace between two encoded
words ends up inside of one or the other of the encoded words. This
fix places the space inside of the second encoded word.
Test case from #92081