[3.8] bpo-43882 - urllib.parse should sanitize urls containing ASCII newline and tabs. (GH-25595) #25726
…e and tabs. (pythonGH-25595)
* issue43882 - urllib.parse should sanitize urls containing ASCII newline and tabs.
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
(cherry picked from commit 76cd81d)
Co-authored-by: Senthil Kumaran <senthil@uthcode.com>
@orsenthil: Status check is done, and it's a failure ❌.
Lib/urllib/parse.py Outdated
@@ -443,6 +451,7 @@ def urlsplit(url, scheme='', allow_fragments=True):
     if '?' in url:
         url, query = url.split('?', 1)
     _checknetloc(netloc)
+    url = _remove_unsafe_bytes_from_url(url)
I think we need to do this up front, right after _coerce_args(url, ...) (and should do the same in 3.9 and 3.10).
By this point in 3.8 we've potentially allowed characters to slip through into the query and fragment of http: URLs.
At a minimum, if we aren't going to do this right after _coerce_args, before looking in the cache, it needs to be done right before any splitting happens in this branch of code.
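The ordering concern above can be illustrated with a toy splitter. This is only a sketch: `strip_unsafe`, `_split`, `split_then_sanitize` and `sanitize_then_split` are hypothetical helpers written for this illustration and are not the code in Lib/urllib/parse.py.

```python
# Illustrative toy only: hypothetical helpers, not the urllib.parse implementation.
UNSAFE_CHARS = ("\t", "\r", "\n")


def strip_unsafe(text):
    """Delete every ASCII tab, carriage return and line feed from text."""
    for ch in UNSAFE_CHARS:
        text = text.replace(ch, "")
    return text


def _split(url):
    """Split off the fragment, then the query; return (rest, query, fragment)."""
    fragment = query = ""
    if "#" in url:
        url, fragment = url.split("#", 1)
    if "?" in url:
        url, query = url.split("?", 1)
    return url, query, fragment


def split_then_sanitize(url):
    """Late sanitizing: only what is left over after splitting gets cleaned."""
    rest, query, fragment = _split(url)
    return strip_unsafe(rest), query, fragment


def sanitize_then_split(url):
    """Early sanitizing: the whole input is cleaned before any splitting."""
    return _split(strip_unsafe(url))
```

With an input such as "http://h/pa\nth?q\n=1#fr\tag", the late variant returns a query and fragment that still contain the newline and tab, while the early variant returns clean components.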
The test case should be updated to include an invalid character in each of the five portions of the URL.
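One way to check all five components at once is sketched below. It assumes an interpreter whose urllib.parse.urlsplit already strips tabs and newlines from the input; the sample URLs and the helper name `components_that_differ` are made up for this illustration and are not taken from the PR's test suite.

```python
from urllib.parse import urlsplit

# Hypothetical sample data: a clean URL plus one variant per component,
# each with a single tab, CR or LF embedded in that component.
CLEAN = "http://www.python.org/path?query=1#frag"
DIRTY = {
    "scheme": "ht\ttp://www.python.org/path?query=1#frag",
    "netloc": "http://www.py\nthon.org/path?query=1#frag",
    "path": "http://www.python.org/pa\rth?query=1#frag",
    "query": "http://www.python.org/path?que\nry=1#frag",
    "fragment": "http://www.python.org/path?query=1#fr\tag",
}


def components_that_differ(clean=CLEAN, dirty=DIRTY):
    """Return the names of components whose dirty URL parses differently."""
    expected = urlsplit(clean)
    return [name for name, url in dirty.items() if urlsplit(url) != expected]
```

On an interpreter that includes the fix, the returned list should be empty; any name that does appear points at a component where the characters slipped through.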
> allowed characters to slip through into query and fragment
I had a doubt initially: I thought (from practice) that newlines and tabs are percent-encoded when they appear in the query or fragment, rather than removed up front.
We will have to find a spec that unambiguously states what should be done.
We've already got that unambiguous spec: https://url.spec.whatwg.org/#concept-basic-url-parser step 3. Strip these three characters no matter what, before the parsing state machine starts.
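A minimal sketch of that unconditional stripping step, covering both str and bytes input, might look like the following. The name `remove_tab_and_newline` is hypothetical and this is not the helper used in the PR; the three characters are ASCII tab, line feed and carriage return.

```python
# Hypothetical helper for illustration; not the function added by this PR.
_TAB_OR_NEWLINE = "\t\n\r"
# str.translate deletes any character whose ordinal maps to None.
_STR_TABLE = {ord(ch): None for ch in _TAB_OR_NEWLINE}


def remove_tab_and_newline(url):
    """Return url with every ASCII tab, LF and CR removed (str or bytes)."""
    if isinstance(url, (bytes, bytearray)):
        # bytes.translate(None, delete) drops the listed byte values.
        return url.translate(None, _TAB_OR_NEWLINE.encode("ascii"))
    return url.translate(_STR_TABLE)
```

Because the removal happens on the raw input, it does not matter which component the characters would later have landed in.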
bedevere-bot commented May 1, 2021
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests, along with any other requests in other reviews from core developers, that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase: I have made the requested changes; please review again. And if you don't make the requested changes, you will be put in the comfy chair!
Note: this will miss 3.8.10, but as a security fix it will be included in 3.8.11 later in the year.
Lib/test/test_urlparse.py Outdated
self.assertEqual(p.geturl(), "x-new-scheme://www.python.org/…
# Remove ASCII tabs and newlines from input as bytes, any scheme.
url = b"x-new-scheme\t://www.python.org/java\nscript:\talert('msg\r\n')/?query\n=\tsomething#frag\nment"
Updated test cases to verify removal in all parts of the URL.
I have made the requested changes; please review again.
Sorry, I can't merge this PR. Reason:
This can be merged now, @ambv. Thank you.
bedevere-bot commented May 5, 2021
@ambv: Please replace
Thanks @miss-islington for the PR, and @ambv for merging it 🌮🎉. I'm working now to backport this PR to: 3.6, 3.7.
Thanks! ✨ 🍰 ✨
…newline and tabs. (pythonGH-25595) (pythonGH-25726)
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
(cherry picked from commit 76cd81d)
Co-authored-by: Senthil Kumaran <senthil@uthcode.com>
Co-authored-by: Senthil Kumaran <skumaran@gatech.edu>
(cherry picked from commit 515a7bc)
Co-authored-by: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
bedevere-bot commented May 5, 2021
GH-25923 is a backport of this pull request to the 3.7 branch.
bedevere-bot commented May 5, 2021
GH-25924 is a backport of this pull request to the 3.6 branch.
Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
(cherry picked from commit 76cd81d)
Co-authored-by: Senthil Kumaran <senthil@uthcode.com>
https://bugs.python.org/issue43882