Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork33.3k
GH-126367: Fix urllib.request.url2pathname() colon handling in URLs on Windows#127752
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
Most changes to Pythonrequire a NEWS entry. Add one using theblurb_it web app or theblurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
| url=url.replace(':','|') | ||
| ifnot'|'inurl: | ||
| url=url.replace(':','|',1) | ||
| ifnot'|'inurlor ('|'inurlandnot ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Can this not be simplified? I had a hard time understanding the code:
url=url.replace(':','|',1)ifnot'|'inurlor ('|'inurlandnot (url.startswith('|')# error caseor'/|'inurl# error caseor (len(url)>2andurl[0]=='/'andurl[1].isalpha()andurl[2]=='|'))):
barneygale left a comment• edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Thanks for taking a look at this
It's perhaps worth taking a step back to think about what we're aiming to do here, and why Windows drive paths need special handling.
The simplest way to create afile: URI is to prependfile: to a path, so:
foo/bar.txtbecomesfile:foo/bar.txt(relative file URI, not widely supported)c:/foobecomesfile:c:/foo(Windows)/etc/hostsbecomesfile:/etc/hosts(POSIX)
For conformance withRFC 1738, folks began adding an empty authority section for absolute POSIX paths, so:
/etc/hostsbecomesfile:///etc/hosts(orfile://localhost/etc/hostsmore explicitly)
This works nicely on POSIX because absolute paths must start with a slash, just like thepath component of URLs.
But what about Windows? A path likeC:/foo can't be directly mapped to a URL path component because it doesn't start with a slash. The solution folks came up with: just add a slash prefix!
C:/foobecomesfile:///C:/foo(orfile://localhost/C:/foomore explicitly)
Our task in this PR is to undo that operation. We need to:
- Detect when a URL'spath component begins with a slash and then a Windows drive, like
/A:/foo/baror/B|/doo- Remove the leading slash if so
- If second character is
|, convert it to:
Once those steps are done, all that remains is to convert slashes and unquote URL entities.
As far as detecting a Windows drives goes, I suggest having a look at the implementation ofos.path.splitroot() oros.path.isabs() inntpath.py:
Lines 80 to 95 in5c89adf
| defisabs(s): | |
| """Test whether a path is absolute""" | |
| s=os.fspath(s) | |
| ifisinstance(s,bytes): | |
| sep=b'\\' | |
| altsep=b'/' | |
| colon_sep=b':\\' | |
| double_sep=b'\\\\' | |
| else: | |
| sep='\\' | |
| altsep='/' | |
| colon_sep=':\\' | |
| double_sep='\\\\' | |
| s=s[:3].replace(altsep,sep) | |
| # Absolute: UNC, device, and paths with a drive and root. | |
| returns.startswith(colon_sep,1)ors.startswith(double_sep) |
We don't need to check whether the drive letter is alphabetic IMHO. It's probably sufficient to check that 3rd character in the URL path is a colon or a pipe (after checking that the first character is a slash).
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase And if you don't make the requested changes, you will be put in the comfy chair! |
barneygale commentedDec 10, 2024
Don't worry about breaking the following test case - there's no reason for cpython/Lib/test/test_urllib.py Lines 1498 to 1499 in5c89adf
|
barneygale commentedDec 10, 2024
All this code smells funny and should be binned IMO: Lines 28 to 40 in5c89adf
|
samtstephens commentedDec 10, 2024
Hi, thank you for the detailed response, I really appreciate it!! We will look into these changes and we'll see if we can come up with a better solution. I can definitely say we felt a bit out of our depth, not having a lot of in-depth knowledge with how the filesshould look, so this background is appreciated. |
barneygale commentedDec 10, 2024
You didn't pick the easiest issue to work on to be honest 😅 file URIs are a bit of a dark art... |
barneygale commentedFeb 5, 2025
Hey@samtstephens, is there anything I can do to help with this? Ta |
barneygale commentedMar 18, 2025
I'm keen to get this sorted in time for python 3.14, so I've opened a PR here:#131428 |
barneygale commentedMar 18, 2025
Closing this PR as the bug has been fixed elsewhere. Thanks all the same for taking a look! |
Uh oh!
There was an error while loading.Please reload this page.