gh-76960: Fix urljoining with an empty query string. #5645
Conversation
the-knights-who-say-ni commented Feb 12, 2018
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA). Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. Thanks again for your contribution and we look forward to looking at it!
Bump: I've signed the CLA. How do I get that check rerun? @the-knights-who-say-ni, if I @ you, will it do the thing?
Previously, urllib.parse.urljoin with a relative URL of the form '?' would result in no change to the URL, even though it should clear the query string. This solves that case and variations on it.
Removing reviewers who were added on account of a bad rebase. Sorry!
Force pushing to do an update is almost never the right thing to do and too often ruins a PR by bringing in other commits. Removing the extraneous commits did not remove the reviewer requests, so I did the latter. With the PR back to your commits, I checked the box to run GHA tests so Senthil can see them.
This PR is stale because it has been open for 30 days with no activity.
ghost commented Feb 9, 2023
The following commit authors need to sign the Contributor License Agreement:
if not query:
    # since urlparse doesn't leave any evidence of whether there was a bare
    # '?' with an empty query string, we need to check whether it was there.
    has_empty_query = url[0] == '?' or url.startswith(scheme + ':?')
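The comment in the patch can be illustrated with a small standalone sketch. It is not the code from the PR: has_empty_query below is a hypothetical helper written for this illustration, and it only mirrors the two string shapes the patch looks for.

```python
from urllib.parse import urlsplit


def has_empty_query(url):
    """Hypothetical helper: True if `url` has a bare '?' right at the start
    or right after its scheme, i.e. the two shapes the patch checks for."""
    parts = urlsplit(url)
    if parts.query:
        # A non-empty query is already visible in the parsed result.
        return False
    return url.startswith('?') or url.startswith(parts.scheme + ':?')


# urlsplit() reports the same empty query for both inputs, which is why
# the raw string has to be inspected instead of the parsed result.
print(urlsplit('').query == urlsplit('?').query)
print(has_empty_query('?'), has_empty_query('http:?'), has_empty_query(''))
```

The helper uses startswith('?') rather than url[0] so that it does not raise on an empty string; that is a choice made for this sketch only.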
What is this url.startswith(scheme + ':?') special case requirement for, and where is this referenced?
The behavior in Ruby and Golang was different for this scenario (but consistent).
    require 'uri'
    base_url = 'https://www.example.com/?a=b'
    relative_url = 'https:?'
    url = URI.join(base_url, relative_url).to_s
    puts url
https:?
And with Golang: https://go.dev/play/p/lui16M9pFyo
    package main

    import (
        "fmt"
        "net/url"
    )

    func main() {
        base, _ := url.Parse("https://example.com/?a=b")
        u, _ := url.Parse("http:?")
        fmt.Println(base.ResolveReference(u))
    }
http:?
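For comparison, a Python counterpart of the Ruby snippet might look like the sketch below. It uses the same inputs as the Ruby example; the printed value depends on the Python version in use, so no expected output is shown here.

```python
from urllib.parse import urljoin

base_url = 'https://www.example.com/?a=b'
relative_url = 'https:?'

# Same inputs as the Ruby snippet; the result depends on the Python version.
result = urljoin(base_url, relative_url)
print(result)
```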
If the condition or url.startswith(scheme + ':?') is removed from this patch, we can consider this PR, as it then brings the expected behavior seen across other language libraries.
From https://datatracker.ietf.org/doc/html/rfc3986#section-5.2.2:
    -- A non-strict parser may ignore a scheme in the reference
    -- if it is identical to the base URI's scheme.
For now, urllib.parse behaves as a non-strict parser (there is a special test for this). We can add an option to switch this, but this is a different feature.
But testing only url.startswith(scheme + ':?') is not enough, because http:?#z should clear the query as well. And there are similar cases with other empty components. I am working on a larger and more general PR.
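The strict versus non-strict distinction can be made concrete with a minimal sketch of the resolution algorithm from RFC 3986, sections 5.2.2 to 5.3, using the splitting regex from Appendix B of the RFC. This is an illustration only, not urllib.parse's implementation and not the code from this PR or from the later fix; split_uri, remove_dot_segments, and resolve are names chosen for this sketch.

```python
import re

# Splitting regex from RFC 3986, Appendix B. Unlike urlsplit(), it yields
# None for an absent component and '' for a present-but-empty one.
_URI_RE = re.compile(
    r'^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?')


def split_uri(uri):
    """Return (scheme, authority, path, query, fragment)."""
    m = _URI_RE.match(uri)
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)


def remove_dot_segments(path):
    """RFC 3986, section 5.2.4."""
    out = []
    while path:
        if path.startswith('../'):
            path = path[3:]
        elif path.startswith('./') or path.startswith('/./'):
            path = path[2:]
        elif path == '/.':
            path = '/'
        elif path.startswith('/../') or path == '/..':
            path = '/' + path[4:]
            if out:
                out.pop()
        elif path in ('.', '..'):
            path = ''
        else:
            # Move the first segment (with its leading '/') to the output.
            end = path.find('/', 1)
            if end == -1:
                end = len(path)
            out.append(path[:end])
            path = path[end:]
    return ''.join(out)


def resolve(base, ref, strict=True):
    """Resolve `ref` against `base` following RFC 3986, section 5.2.2."""
    bs, ba, bp, bq, _ = split_uri(base)
    rs, ra, rp, rq, rf = split_uri(ref)
    if not strict and rs == bs:
        rs = None  # non-strict: ignore a scheme identical to the base's
    if rs is not None:
        ts, ta, tp, tq = rs, ra, remove_dot_segments(rp), rq
    else:
        ts = bs
        if ra is not None:
            ta, tp, tq = ra, remove_dot_segments(rp), rq
        else:
            ta = ba
            if rp == '':
                tp = bp
                # Only an undefined (None) query inherits the base query;
                # an empty-but-present one replaces it.
                tq = rq if rq is not None else bq
            else:
                if rp.startswith('/'):
                    tp = remove_dot_segments(rp)
                elif ba is not None and bp == '':
                    tp = remove_dot_segments('/' + rp)
                else:
                    tp = remove_dot_segments(bp[:bp.rfind('/') + 1] + rp)
                tq = rq
    # Recomposition, RFC 3986 section 5.3.
    result = ''
    if ts is not None:
        result += ts + ':'
    if ta is not None:
        result += '//' + ta
    result += tp
    if tq is not None:
        result += '?' + tq
    if rf is not None:
        result += '#' + rf
    return result


base = 'http://a/b/c?d=e'
print(resolve(base, '?'))                       # http://a/b/c?
print(resolve(base, 'http:?', strict=True))     # http:?
print(resolve(base, 'http:?', strict=False))    # http://a/b/c?
print(resolve(base, 'http:?#z', strict=False))  # http://a/b/c?#z
```

In this sketch the strict mode reproduces the http:? result reported for Ruby and Go, while the non-strict mode treats http:? like a bare '?'. Read literally, the RFC recomposition keeps the trailing '?' for an empty query. The Go snippet earlier in the thread resolves http:? against an https base, so the schemes differ and both modes of this sketch would return http:? for those inputs.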
The addition of url.startswith(scheme + ':?') doesn't look correct to me.
bedevere-bot commented Apr 21, 2023
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase
@@ -437,6 +437,12 @@ def test_urljoins(self):
        # issue 23703: don't duplicate filename
        self.checkJoin('a', 'b', 'b')
        # issue 32779: clear the query string when joining with '?'
        self.checkJoin('http://a/b/c?d=e', '?', 'http://a/b/c')
        self.checkJoin('http://a/b/c?d=e', 'http:?', 'http://a/b/c')
This is the test case that @orsenthil suggests is wrong, as it doesn't match other languages, which result in http:? as the output.
Yes, that's correct. We need to remove this. Perhaps I will modify this and bring this PR to a close.
Thank you for your PR @thetorpedodog. But this issue was fixed in a more general way by #123273.
https://bugs.python.org/issue32779