Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

GH-126367: Fix urllib.request.url2pathname() colon handling in URLs on Windows#127752

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Closed
samtstephens wants to merge2 commits intopython:mainfromsamtstephens:main

Conversation

@samtstephens
Copy link

@samtstephenssamtstephens commentedDec 9, 2024
edited by bedevere-appbot
Loading

barneygale reacted with hooray emoji
@bedevere-app
Copy link

Most changes to Pythonrequire a NEWS entry. Add one using theblurb_it web app or theblurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply theskip news label instead.

url=url.replace(':','|')
ifnot'|'inurl:
url=url.replace(':','|',1)
ifnot'|'inurlor ('|'inurlandnot (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Can this not be simplified? I had a hard time understanding the code:

url=url.replace(':','|',1)ifnot'|'inurlor ('|'inurlandnot (url.startswith('|')# error caseor'/|'inurl# error caseor (len(url)>2andurl[0]=='/'andurl[1].isalpha()andurl[2]=='|'))):

@barneygalebarneygale self-assigned thisDec 10, 2024
Copy link
Contributor

@barneygalebarneygale left a comment
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thanks for taking a look at this

It's perhaps worth taking a step back to think about what we're aiming to do here, and why Windows drive paths need special handling.

The simplest way to create afile: URI is to prependfile: to a path, so:

  • foo/bar.txt becomesfile:foo/bar.txt (relative file URI, not widely supported)
  • c:/foo becomesfile:c:/foo (Windows)
  • /etc/hosts becomesfile:/etc/hosts (POSIX)

For conformance withRFC 1738, folks began adding an empty authority section for absolute POSIX paths, so:

  • /etc/hosts becomesfile:///etc/hosts (orfile://localhost/etc/hosts more explicitly)

This works nicely on POSIX because absolute paths must start with a slash, just like thepath component of URLs.

But what about Windows? A path likeC:/foo can't be directly mapped to a URL path component because it doesn't start with a slash. The solution folks came up with: just add a slash prefix!

  • C:/foo becomesfile:///C:/foo (orfile://localhost/C:/foo more explicitly)

Our task in this PR is to undo that operation. We need to:

  1. Detect when a URL'spath component begins with a slash and then a Windows drive, like/A:/foo/bar or/B|/doo
    • Remove the leading slash if so
  2. If second character is|, convert it to:

Once those steps are done, all that remains is to convert slashes and unquote URL entities.

As far as detecting a Windows drives goes, I suggest having a look at the implementation ofos.path.splitroot() oros.path.isabs() inntpath.py:

defisabs(s):
"""Test whether a path is absolute"""
s=os.fspath(s)
ifisinstance(s,bytes):
sep=b'\\'
altsep=b'/'
colon_sep=b':\\'
double_sep=b'\\\\'
else:
sep='\\'
altsep='/'
colon_sep=':\\'
double_sep='\\\\'
s=s[:3].replace(altsep,sep)
# Absolute: UNC, device, and paths with a drive and root.
returns.startswith(colon_sep,1)ors.startswith(double_sep)

We don't need to check whether the drive letter is alphabetic IMHO. It's probably sufficient to check that 3rd character in the URL path is a colon or a pipe (after checking that the first character is a slash).

@bedevere-app
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phraseI have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

And if you don't make the requested changes, you will be put in the comfy chair!

@barneygale
Copy link
Contributor

Don't worry about breaking the following test case - there's no reason forurl2pathname() to be stricter about Windows drives than functions inntpath.py. It would be good ifurl2pathname() didn't raiseOSError at all.

# Non-ASCII drive letter
self.assertRaises(IOError,fn,"///\u00e8|/")

@barneygale
Copy link
Contributor

All this code smells funny and should be binned IMO:

# Windows itself uses ":" even in URLs.
url=url.replace(':','|')
ifnot'|'inurl:
# No drive specifier, just convert slashes
# make sure not to convert quoted slashes :-)
returnurllib.parse.unquote(url.replace('/','\\'))
comp=url.split('|')
iflen(comp)!=2orcomp[0][-1]notinstring.ascii_letters:
error='Bad URL: '+url
raiseOSError(error)
drive=comp[0][-1]
tail=urllib.parse.unquote(comp[1].replace('/','\\'))
returndrive+':'+tail

@samtstephens
Copy link
Author

Hi, thank you for the detailed response, I really appreciate it!! We will look into these changes and we'll see if we can come up with a better solution. I can definitely say we felt a bit out of our depth, not having a lot of in-depth knowledge with how the filesshould look, so this background is appreciated.

barneygale reacted with thumbs up emoji

@barneygale
Copy link
Contributor

You didn't pick the easiest issue to work on to be honest 😅 file URIs are a bit of a dark art...

@barneygale
Copy link
Contributor

Hey@samtstephens, is there anything I can do to help with this? Ta

@barneygale
Copy link
Contributor

I'm keen to get this sorted in time for python 3.14, so I've opened a PR here:#131428

@barneygale
Copy link
Contributor

Closing this PR as the bug has been fixed elsewhere. Thanks all the same for taking a look!

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@barneygalebarneygalebarneygale requested changes

+1 more reviewer

@nineteendonineteendonineteendo left review comments

Reviewers whose approvals may not affect merge requirements

Assignees

@barneygalebarneygale

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

3 participants

@samtstephens@barneygale@nineteendo

[8]ページ先頭

©2009-2025 Movatter.jp