Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-121650: Encode newlines in headers, and verify headers are sound#122233

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged

Conversation

encukou
Copy link
Member

@encukouencukou commentedJul 24, 2024
edited by bedevere-appbot
Loading

Re:#121812

Hello@basbloemsaat,

I've spent the day reading through the email module, and RFCs, and I believe I found a better place to fix the issue.
This involved lots of experimentation, so I'm sending an alternative PR rather than a review on yours.

  • The generator (writer) verifies that the representation of each header is sound (a parser won't treat it as multiple headers, start-of-body, or part of another header). That should cover customfold() implementations orHeader subclasses.

    • However, some user out there is probably misusing such header injection in working code, so, I added a policy attribute to turn it back.
  • Newlines areencoded infold(), just like undecodable bytes and other special characters.

Overall, this means that we treat newlines as validcontent of headers, but “escape” them when such a header isserialized to text.

This PR is a proof of concept. It needs tests and documentation, but I'm out of time for today, and I wanted to share what I have.

Does this look reasonable to you?

encukouand others added2 commitsJuly 24, 2024 15:30
This should fail for custom fold() implementations that aren't carefulabout newlines.
Per RFC 2047:> [...] these encoding schemes allow the> encoding of arbitrary octet values, mail readers that implement this> decoding should also ensure that display of the decoded data on the> recipient's terminal will not cause unwanted side-effectsIt seems that the "quoted-word" scheme is a valid way to includea newline character in a header value, just like we already allowundecodable bytes or control characters.They do need to be properly quoted when serialized to text, though.---Credit for an earlier attempt:Co-Authored-By: Bas Bloemsaat <bas@bloemsaat.org>
@basbloemsaat
Copy link
Contributor

That sounds entirely reasonable, and conforms to the RFCs.

Two points:

  1. what would be the use of keeping this check in email.policy.header_store_parse ?
  2. I found one case that is not covered by this (contrived, I admit):
fromemailimportmessage_from_stringemail_in="""Subject: foo <bar>\nBCC: injected@example.comTo: incoming+tag@me.example.comFrom: External Sender <sender@them.example.com>message body"""msg=message_from_string(email_in)print(msg)

@encukou
Copy link
MemberAuthor

what would be the use of keeping this check in email.policy.header_store_parse ?

This is in the branch that handles strings (rather than custom Header object). I'm not clears what kind of format that string is supposed to be in.
Keeping the check means that if someone relied on this, they'll get the same error as before. Also, the indication that something is wrong will come earlier.

I found one case that is not covered by this (contrived, I admit)

That\n there is Python syntax, by the time it gets tomessage_from_string it's the same as a “real” newline.
If you use a raw stringr""", or read from a file, the header remains on a single line.

basbloemsaat reacted with thumbs up emoji

@encukouencukou added needs backport to 3.12only security fixes needs backport to 3.13bugs and security fixes labelsJul 27, 2024
I'm not touching other instances in this file, since this PR mightbe backported to very old versions.
@encukouencukou marked this pull request as ready for reviewJuly 29, 2024 13:18
@encukouencukou requested a review froma team as acode ownerJuly 29, 2024 13:18
@encukou
Copy link
MemberAuthor

@serhiy-storchaka, would you like to review this?
@warsaw,@bitdancer,@maxking, asemail experts, do you have any comments?

@encukouencukou added needs backport to 3.8 needs backport to 3.9only security fixes needs backport to 3.10only security fixes needs backport to 3.11only security fixes 🔨 test-with-buildbotsTest PR w/ buildbots; report in status section labelsJul 29, 2024
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by@encukou for commitaf41733 🤖

If you want to schedule another build, you need to add the🔨 test-with-buildbots label again.

@bedevere-botbedevere-bot removed the 🔨 test-with-buildbotsTest PR w/ buildbots; report in status section labelJul 29, 2024
Copy link
Member

@serhiy-storchakaserhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM.

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
@bedevere-appbedevere-appbot removed the needs backport to 3.11only security fixes labelAug 2, 2024
ambv pushed a commit to ambv/cpython that referenced this pull requestAug 2, 2024
…s are sound (pythonGH-122233)Per RFC 2047:> [...] these encoding schemes allow the> encoding of arbitrary octet values, mail readers that implement this> decoding should also ensure that display of the decoded data on the> recipient's terminal will not cause unwanted side-effectsIt seems that the "quoted-word" scheme is a valid way to includea newline character in a header value, just like we already allowundecodable bytes or control characters.They do need to be properly quoted when serialized to text, though.This should fail for custom fold() implementations that aren't carefulabout newlines.(cherry picked from commit0976339)Co-authored-by: Petr Viktorin <encukou@gmail.com>Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
@bedevere-app
Copy link

GH-122609 is a backport of this pull request to the3.10 branch.

@bedevere-appbedevere-appbot removed the needs backport to 3.10only security fixes labelAug 2, 2024
ambv pushed a commit to ambv/cpython that referenced this pull requestAug 2, 2024
… are sound (pythonGH-122233)Per RFC 2047:> [...] these encoding schemes allow the> encoding of arbitrary octet values, mail readers that implement this> decoding should also ensure that display of the decoded data on the> recipient's terminal will not cause unwanted side-effectsIt seems that the "quoted-word" scheme is a valid way to includea newline character in a header value, just like we already allowundecodable bytes or control characters.They do need to be properly quoted when serialized to text, though.This should fail for custom fold() implementations that aren't carefulabout newlines.(cherry picked from commit0976339)Co-authored-by: Petr Viktorin <encukou@gmail.com>Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
@bedevere-app
Copy link

GH-122610 is a backport of this pull request to the3.9 branch.

@bedevere-appbedevere-appbot removed the needs backport to 3.9only security fixes labelAug 2, 2024
ambv pushed a commit to ambv/cpython that referenced this pull requestAug 2, 2024
… are sound (pythonGH-122233)Per RFC 2047:> [...] these encoding schemes allow the> encoding of arbitrary octet values, mail readers that implement this> decoding should also ensure that display of the decoded data on the> recipient's terminal will not cause unwanted side-effectsIt seems that the "quoted-word" scheme is a valid way to includea newline character in a header value, just like we already allowundecodable bytes or control characters.They do need to be properly quoted when serialized to text, though.This should fail for custom fold() implementations that aren't carefulabout newlines.(cherry picked from commit0976339)Co-authored-by: Petr Viktorin <encukou@gmail.com>Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
@bedevere-app
Copy link

GH-122611 is a backport of this pull request to the3.8 branch.

Yhg1s pushed a commit that referenced this pull requestAug 6, 2024
…sound (GH-122233) (#122484)gh-121650: Encode newlines in headers, and verify headers are sound (GH-122233)GH-GH- Encode header parts that contain newlinesPer RFC 2047:> [...] these encoding schemes allow the> encoding of arbitrary octet values, mail readers that implement this> decoding should also ensure that display of the decoded data on the> recipient's terminal will not cause unwanted side-effectsIt seems that the "quoted-word" scheme is a valid way to includea newline character in a header value, just like we already allowundecodable bytes or control characters.They do need to be properly quoted when serialized to text, though.GH-GH- Verify that email headers are well-formedThis should fail for custom fold() implementations that aren't carefulabout newlines.(cherry picked from commit0976339)Co-authored-by: Petr Viktorin <encukou@gmail.com>Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Yhg1s pushed a commit that referenced this pull requestAug 6, 2024
…sound (GH-122233) (#122599)*gh-121650: Encode newlines in headers, and verify headers are sound (GH-122233)- Encode header parts that contain newlinesPer RFC 2047:> [...] these encoding schemes allow the> encoding of arbitrary octet values, mail readers that implement this> decoding should also ensure that display of the decoded data on the> recipient's terminal will not cause unwanted side-effectsIt seems that the "quoted-word" scheme is a valid way to includea newline character in a header value, just like we already allowundecodable bytes or control characters.They do need to be properly quoted when serialized to text, though.- Verify that email headers are well-formedThis should fail for custom fold() implementations that aren't carefulabout newlines.Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>(cherry picked from commit0976339)* Document changes as made in 3.12.5
hroncok pushed a commit to fedora-python/cpython that referenced this pull requestAug 6, 2024
…s are soundpythongh-121650: Encode newlines in headers, and verify headers are sound (pythonGH-122233)Encode header parts that contain newlinesPer RFC 2047:> [...] these encoding schemes allow the> encoding of arbitrary octet values, mail readers that implement this> decoding should also ensure that display of the decoded data on the> recipient's terminal will not cause unwanted side-effectsIt seems that the "quoted-word" scheme is a valid way to includea newline character in a header value, just like we already allowundecodable bytes or control characters.They do need to be properly quoted when serialized to text, though.Verify that email headers are well-formedThis should fail for custom fold() implementations that aren't carefulabout newlines.(cherry picked from commit0976339)Co-authored-by: Petr Viktorin <encukou@gmail.com>Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
frenzymadness pushed a commit to frenzymadness/cpython that referenced this pull requestAug 13, 2024
…s are sound (pythonGH-122233)Per RFC 2047:> [...] these encoding schemes allow the> encoding of arbitrary octet values, mail readers that implement this> decoding should also ensure that display of the decoded data on the> recipient's terminal will not cause unwanted side-effectsIt seems that the "quoted-word" scheme is a valid way to includea newline character in a header value, just like we already allowundecodable bytes or control characters.They do need to be properly quoted when serialized to text, though.This should fail for custom fold() implementations that aren't carefulabout newlines.(cherry picked from commit0976339)Co-authored-by: Petr Viktorin <encukou@gmail.com>Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
frenzymadness pushed a commit to fedora-python/cpython that referenced this pull requestAug 15, 2024
…s are sound (pythonGH-122233)Per RFC 2047:> [...] these encoding schemes allow the> encoding of arbitrary octet values, mail readers that implement this> decoding should also ensure that display of the decoded data on the> recipient's terminal will not cause unwanted side-effectsIt seems that the "quoted-word" scheme is a valid way to includea newline character in a header value, just like we already allowundecodable bytes or control characters.They do need to be properly quoted when serialized to text, though.This should fail for custom fold() implementations that aren't carefulabout newlines.(cherry picked from commit0976339)Co-authored-by: Petr Viktorin <encukou@gmail.com>Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
stratakis pushed a commit to stratakis/cpython that referenced this pull requestAug 15, 2024
…s are sound (pythonGH-122233)Per RFC 2047:> [...] these encoding schemes allow the> encoding of arbitrary octet values, mail readers that implement this> decoding should also ensure that display of the decoded data on the> recipient's terminal will not cause unwanted side-effectsIt seems that the "quoted-word" scheme is a valid way to includea newline character in a header value, just like we already allowundecodable bytes or control characters.They do need to be properly quoted when serialized to text, though.This should fail for custom fold() implementations that aren't carefulabout newlines.(cherry picked from commit0976339)Co-authored-by: Petr Viktorin <encukou@gmail.com>Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
hrnciar added a commit to hrnciar/cpython that referenced this pull requestAug 16, 2024
 headers are sound (pythonGH-122233)Per RFC 2047:> [...] these encoding schemes allow the> encoding of arbitrary octet values, mail readers that implement this> decoding should also ensure that display of the decoded data on the> recipient's terminal will not cause unwanted side-effectsIt seems that the "quoted-word" scheme is a valid way to includea newline character in a header value, just like we already allowundecodable bytes or control characters.They do need to be properly quoted when serialized to text, though.This should fail for custom fold() implementations that aren't carefulabout newlines.(cherry picked from commit0976339)Co-authored-by: Petr Viktorin <encukou@gmail.com>Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>This patch also contains modified commit cherry picked fromc5bba85.This commit was backported to simplify the backport of the other commitfixing CVE. The only modification is a removal of one test case whichtests multiple changes in Python 3.7 and it wasn't working properlywith Python 3.6 where we backported only one change.Co-authored-by: bsiem <52461103+bsiem@users.noreply.github.com>
hrnciar added a commit to fedora-python/cpython that referenced this pull requestAug 16, 2024
 headers are sound (pythonGH-122233)Per RFC 2047:> [...] these encoding schemes allow the> encoding of arbitrary octet values, mail readers that implement this> decoding should also ensure that display of the decoded data on the> recipient's terminal will not cause unwanted side-effectsIt seems that the "quoted-word" scheme is a valid way to includea newline character in a header value, just like we already allowundecodable bytes or control characters.They do need to be properly quoted when serialized to text, though.This should fail for custom fold() implementations that aren't carefulabout newlines.(cherry picked from commit0976339)This patch also contains modified commit cherry picked fromc5bba85.This commit was backported to simplify the backport of the other commitfixing CVE. The only modification is a removal of one test case whichtests multiple changes in Python 3.7 and it wasn't working properlywith Python 3.6 where we backported only one change.Co-authored-by: Petr Viktorin <encukou@gmail.com>Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>Co-authored-by: bsiem <52461103+bsiem@users.noreply.github.com>
hrnciar added a commit to fedora-python/cpython that referenced this pull requestAug 20, 2024
 headers are sound (pythonGH-122233)Per RFC 2047:> [...] these encoding schemes allow the> encoding of arbitrary octet values, mail readers that implement this> decoding should also ensure that display of the decoded data on the> recipient's terminal will not cause unwanted side-effectsIt seems that the "quoted-word" scheme is a valid way to includea newline character in a header value, just like we already allowundecodable bytes or control characters.They do need to be properly quoted when serialized to text, though.This should fail for custom fold() implementations that aren't carefulabout newlines.(cherry picked from commit0976339)This patch also contains modified commit cherry picked fromc5bba85.This commit was backported to simplify the backport of the other commitfixing CVE. The only modification is a removal of one test case whichtests multiple changes in Python 3.7 and it wasn't working properlywith Python 3.6 where we backported only one change.Co-authored-by: Petr Viktorin <encukou@gmail.com>Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>Co-authored-by: bsiem <52461103+bsiem@users.noreply.github.com>
blhsing pushed a commit to blhsing/cpython that referenced this pull requestAug 22, 2024
…ound (pythonGH-122233)## Encode header parts that contain newlinesPer RFC 2047:> [...] these encoding schemes allow the> encoding of arbitrary octet values, mail readers that implement this> decoding should also ensure that display of the decoded data on the> recipient's terminal will not cause unwanted side-effectsIt seems that the "quoted-word" scheme is a valid way to includea newline character in a header value, just like we already allowundecodable bytes or control characters.They do need to be properly quoted when serialized to text, though.## Verify that email headers are well-formedThis should fail for custom fold() implementations that aren't carefulabout newlines.Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
ambv added a commit that referenced this pull requestSep 4, 2024
…ound (GH-122233) (#122611)Per RFC 2047:> [...] these encoding schemes allow the> encoding of arbitrary octet values, mail readers that implement this> decoding should also ensure that display of the decoded data on the> recipient's terminal will not cause unwanted side-effectsIt seems that the "quoted-word" scheme is a valid way to includea newline character in a header value, just like we already allowundecodable bytes or control characters.They do need to be properly quoted when serialized to text, though.This should fail for custom fold() implementations that aren't carefulabout newlines.(cherry picked from commit0976339)Co-authored-by: Petr Viktorin <encukou@gmail.com>Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
ambv added a commit that referenced this pull requestSep 4, 2024
…sound (GH-122233) (#122608)Per RFC 2047:> [...] these encoding schemes allow the> encoding of arbitrary octet values, mail readers that implement this> decoding should also ensure that display of the decoded data on the> recipient's terminal will not cause unwanted side-effectsIt seems that the "quoted-word" scheme is a valid way to includea newline character in a header value, just like we already allowundecodable bytes or control characters.They do need to be properly quoted when serialized to text, though.Verify that email headers are well-formed.This should fail for custom fold() implementations that aren't carefulabout newlines.(cherry picked from commit0976339)Co-authored-by: Petr Viktorin <encukou@gmail.com>Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
ambv added a commit that referenced this pull requestSep 4, 2024
…sound (GH-122233) (#122609)Per RFC 2047:> [...] these encoding schemes allow the> encoding of arbitrary octet values, mail readers that implement this> decoding should also ensure that display of the decoded data on the> recipient's terminal will not cause unwanted side-effectsIt seems that the "quoted-word" scheme is a valid way to includea newline character in a header value, just like we already allowundecodable bytes or control characters.They do need to be properly quoted when serialized to text, though.This should fail for custom fold() implementations that aren't carefulabout newlines.(cherry picked from commit0976339)Co-authored-by: Petr Viktorin <encukou@gmail.com>Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
ambv added a commit that referenced this pull requestSep 4, 2024
…ound (GH-122233) (#122610)Per RFC 2047:> [...] these encoding schemes allow the> encoding of arbitrary octet values, mail readers that implement this> decoding should also ensure that display of the decoded data on the> recipient's terminal will not cause unwanted side-effectsIt seems that the "quoted-word" scheme is a valid way to includea newline character in a header value, just like we already allowundecodable bytes or control characters.They do need to be properly quoted when serialized to text, though.This should fail for custom fold() implementations that aren't carefulabout newlines.(cherry picked from commit0976339)Co-authored-by: Petr Viktorin <encukou@gmail.com>Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
brainhoard-github pushed a commit to distro-core-curated-mirrors/poky-contrib that referenced this pull requestSep 16, 2024
Changelog:https://docs.python.org/release/3.12.5/whatsnew/changelog.htmlInclude security fixCVE-2024-6923Reference:https://nvd.nist.gov/vuln/detail/CVE-2024-6923python/cpython#122233(From OE-Core rev: 777cad793a5b07d392b1d9875530fb5480e75863)Signed-off-by: Vijay Anusuri <vanusuri@mvista.com>Signed-off-by: Steve Sakoman <steve@sakoman.com>
hrnciar added a commit to fedora-python/cpython that referenced this pull requestApr 23, 2025
…s are sound (pythonGH-122233)Per RFC 2047:> [...] these encoding schemes allow the> encoding of arbitrary octet values, mail readers that implement this> decoding should also ensure that display of the decoded data on the> recipient's terminal will not cause unwanted side-effectsIt seems that the "quoted-word" scheme is a valid way to includea newline character in a header value, just like we already allowundecodable bytes or control characters.They do need to be properly quoted when serialized to text, though.This should fail for custom fold() implementations that aren't carefulabout newlines.(cherry picked from commit0976339)This patch also contains modified commit cherry picked fromc5bba85.This commit was backported to simplify the backport of the other commitfixing CVE. The only modification is a removal of one test case whichtests multiple changes in Python 3.7 and it wasn't working properlywith Python 3.6 where we backported only one change.Co-authored-by: Petr Viktorin <encukou@gmail.com>Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>Co-authored-by: bsiem <52461103+bsiem@users.noreply.github.com>
hroncok pushed a commit to fedora-python/cpython that referenced this pull requestJul 4, 2025
…s are sound (pythonGH-122233)Per RFC 2047:> [...] these encoding schemes allow the> encoding of arbitrary octet values, mail readers that implement this> decoding should also ensure that display of the decoded data on the> recipient's terminal will not cause unwanted side-effectsIt seems that the "quoted-word" scheme is a valid way to includea newline character in a header value, just like we already allowundecodable bytes or control characters.They do need to be properly quoted when serialized to text, though.This should fail for custom fold() implementations that aren't carefulabout newlines.(cherry picked from commit0976339)This patch also contains modified commit cherry picked fromc5bba85.This commit was backported to simplify the backport of the other commitfixing CVE. The only modification is a removal of one test case whichtests multiple changes in Python 3.7 and it wasn't working properlywith Python 3.6 where we backported only one change.Co-authored-by: Petr Viktorin <encukou@gmail.com>Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>Co-authored-by: bsiem <52461103+bsiem@users.noreply.github.com>
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@serhiy-storchakaserhiy-storchakaserhiy-storchaka approved these changes

Assignees

@encukouencukou

Labels
None yet
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

5 participants
@encukou@basbloemsaat@bedevere-bot@ambv@serhiy-storchaka

[8]ページ先頭

©2009-2025 Movatter.jp