Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

GH-89812: Addpathlib._PathBase#106337

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
barneygale merged 37 commits intopython:mainfrombarneygale:gh-89812-omgtarpath
Sep 30, 2023

Conversation

barneygale
Copy link
Contributor

@barneygalebarneygale commentedJul 2, 2023
edited
Loading

Add privatepathlib._PathBase class. This will be used by an experimental PyPI package to incubate atarfile.TarPath class.

Implementation notes

Inpathlib.py:

  • SplitPath into_PathBase andPath
  • Implement_PathBase.is_junction() andis_mount() usingstat(). In the latter case, restore the implementation deleted ingh-86943: implementpathlib.WindowsPath.is_mount() #31458
  • Implement_PathBase.resolve() usingabsolute() andreadlink()
  • Implement_PathBase._scandir() usingiterdir()

Intest_pathlib.py:

  • AddPathBaseTest to exercise the_PathBase class directly
  • SplitPathTest intoDummyPathTest andPathTest. The former exercises a representative subclass of_PathBase.
  • AddDummyPathWithSymlinksTest to exercise a similar subclass, this time with symlink support.

Questionable bits

Inpathlib.py, the_PathBase.is_junction() and_PathBase.resolve() methods are new code. The latter is very performance-sensitive and may need more work.

Future work

  • Makepathlib._PathBase public. There's still at least a couple of things to do first:
    • MakePurePath._flavour public
    • Figure out what to do with_PathBase.__hash__(),__eq__(), and other comparison methods.

CAM-Gerlach reacted with thumbs up emoji
@barneygale
Copy link
ContributorAuthor

barneygale commentedJul 7, 2023
edited
Loading

@brettcannon a long time ago ina PR far, far away, you said:

My current thinking (while I work through my PR review backlog to reach this PR), is this should be used forzipfile.Path. I also wouldn't be opposed to that happening this PR if@barneygale wanted to give that a shot while he waits for a review from me.

I couldn't make this work forzipfile.Path due to a couple of fundamental differences in the interface, e.g. theroot attribute stores a string inpathlib.PurePath, but aZipFile instance inzipfile.Path. It's possible to make it work with a couple of deprecations/removals, or an entirely newZipPath class.

So in this PR I've chosentarfile.TarPath as the initial target. I think it does a good job of showing the viability of thepathlib._VirtualPath interface.

However, I have a bunch of open questions in the PR description about howTarPath itself should work. It's not as straightforward as.zip files, because tarballs support symlinks and hardlinks, and becausetarfile.TarFile doesn't support reading and writing the same file.

These questions make me think thattarfile.TarPath should first be gestated in a PyPI package before we add it to the standard library. But that leaves a bit of a chicken-and-egg problem with gettingpathlib._VirtualPath in.

Do you have any advice on the best way forward?

Thanks!

@brettcannon
Copy link
Member

These questions make me think thattarfile.TarPath should first be gestated in a PyPI package before we add it to the standard library. But that leaves a bit of a chicken-and-egg problem with gettingpathlib._VirtualPath in.

You have a couple of options:

  1. Start a discussion on discuss.python.org fortarfile and see if there's clear consensus on what to do.
  2. Make_VirtualPath as private and do your experimental PyPI project knowing it might be perpetually tied to a specific Python version (and make that clear to users of the package).
  3. Go with your gut and hope for the best (and err on the side of leaving things out, especially if they can be added in later). 😅
barneygale reacted with thumbs up emoji

@merwok
Copy link
Member

I am unsure about the name VirtualPath!

The way I’ve seen it used, a TarPath class, or SshPath, or S3Path would be examples of virtual filesystems: I give them paths that look like regular filesystem paths and they do custom operations to return info or data.

Their base class is not itself a virtual path, it’s an (abstract?) base class!

barneygale reacted with thumbs up emoji

@barneygale
Copy link
ContributorAuthor

Fair point! How aboutPathBase? It squares with the names of base classes inio, I think.

merwok reacted with hooray emoji

@barneygale
Copy link
ContributorAuthor

Alternatively,pathlib.abc.VirtualPath?

@merwok
Copy link
Member

merwok commentedJul 10, 2023
edited
Loading

Not a fan of deep modules, and even.abc.VirtualPath doesn’t seem great to me.
I have posted on discuss about VirtualPath vs PathBase!

barneygale reacted with thumbs up emoji

@barneygalebarneygale changed the titleGH-89812: Addtarfile.TarPathGH-89812: Addpathlib._PathBaseJul 12, 2023
@barneygalebarneygale removed the request for review fromethanfurmanJuly 12, 2023 19:04
@RonnyPfannschmidt
Copy link

just stumbled uppon this and im wondering if a concept of a public Accessor of a path might make sense
where an accessor provides the implementation, then for example tarfile/zipfile paths could just have the accessor

additionally it would be trivial to have "roots" at a path or a "root" with a directory file descriptor/network protocol

@barneygale
Copy link
ContributorAuthor

just stumbled uppon this and im wondering if a concept of a public Accessor of a path might make sense where an accessor provides the implementation, then for example tarfile/zipfile paths could just have the accessor

We actually used to have accessor objects, but their interface was much the same asPath objects, and so they amounted to nothing more than a layer of indirection. They were removed in#25701.

If we add accessors back in, most users would need to subclasstwo classes to implement their own path objects: the accessor class and the path class. The latter would be necessary for any user who wants to add their own path methods, for example.

@RonnyPfannschmidt
Copy link

those "accessors" where nothing more than aliases, after all the implementation as done had no power over what the pure path portion can do with the real world

a actualy "accessor" would be able to be tied to something like a directory file descriptor, or a path object, or a entirely different thing thats not a filesystem

and yes there is some need for synchronization between concrete path capabilities and the accessors,

however there is certain equivalence classes to keep in mind (Posix, Windows), so in a lot of cases my impression is that there will be more "accessors" than paths

@barneygale
Copy link
ContributorAuthor

Intriguing. Could you sketch out the accessor API?

@merwok
Copy link
Member

(in the forums rather than a PR discussion please)

ethanfurman reacted with thumbs up emoji

@RonnyPfannschmidt
Copy link

i can dedicate some time to it later

a basic concept of the accessor version ought to consider something like:

  • windows paths

  • posix paths

  • unc paths

  • directory file descriptor paths (both secured and unsecured)

  • paths that pretend to be a chroot on top of another path/pathlike

  • archives

  • ssh/scp

  • urls/http

  • distinctions between readonly and readwrite accessors

  • capabilities of accessors (like users/perms/xattrs)

  • wrappers (like union fs) to support interesting testing
    (imagine a path that maps onto the real fs but pretends certain files are deleted/replaced)

    additionally for purely virtual accessors it might be interesting if parts of the path could be objects (like dates), but thats currently out of scope

@barneygale
Copy link
ContributorAuthor

barneygale commentedSep 13, 2023
edited
Loading

To summarise my thoughts on the above: in many cases this PR helps bring those things about, and I don't think it hinders any of those things. But let's chat more in a forum thread.

What's left in this PR:

On the latter point, there are still two subtle and related problems:

  • _PathBase.resolve(strict=True) can raiseOSError with the wrong values (errno, path). For example,resolve('/etc/hosts/foo') should raiseNotADirectoryError(path='/etc/hosts'), but instead raisesFileNotFoundError(path='/etc/hosts/foo')
  • To avoid an infinite loop, the implementation of_PathBase.readlink() must not callself.resolve(), and so users must supply a path with no symlinks, except in the final position. This is a bit different toPath.readlink().

Both of these issues could be solved if we can provide some way tostat() a path without resolvingany symlinks. Theresolve() method would call thisstat() method first, and only callreadlink() if the stat result indicates a symlink. A couple of options spring to mind:

  • We could allow_PathBase.stat(follow_symlinks=None), meaning "where possible, don't followany symlinks in the path; the path I've provided is the direct physical path". But it's not particularly obvious, andfollow_symlinks=None means something different inglob() andrglob().
  • We could add some other keyword-only argument to_PathBase.stat() with the same effect.
  • We could add a new method --direct_stat(),stat1(),dstat()?

@barneygale
Copy link
ContributorAuthor

Bah, pushed a commit labelled "WIP". In fact it contains a solution to the above problem: stash a_resolving flag on path objects, which meansresolve() can safely callstat() andreadlink() without hitting infinite recursion. No weird APIs for users.

This will require further refactoring in another PR.
@barneygale
Copy link
ContributorAuthor

Hey@AA-Turner and@gst, thank you for your earlier reviews, very much appreciated. Please could you give it another pass, if you can spare the time?

I feel OK about_PathBase.resolve(): it's entirely private for the timebeing, and I intend to add further tests and optimization before we drop the underscore from_PathBase, so if we can't spot any clear defects I suspect it's fine to land? Grateful for your guidance!

gst reacted with thumbs up emoji

@AA-TurnerAA-Turner self-requested a reviewSeptember 24, 2023 03:04
Copy link
Member

@AA-TurnerAA-Turner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thanks! Just two notes. I think_split_stack makes sense a method.

A

barneygale reacted with heart emoji
@barneygalebarneygale merged commit89966a6 intopython:mainSep 30, 2023
@barneygale
Copy link
ContributorAuthor

Thank you so much for the reviews, everyone :-)

ap-- and ofek reacted with hooray emoji

@AA-Turner
Copy link
Member

Congratulations Barney, a lot of hard work on your part paid off!

A

ofek, barneygale, and jamesmyatt reacted with heart emoji

Glyphack pushed a commit to Glyphack/cpython that referenced this pull requestSep 2, 2024
Add private `pathlib._PathBase` class. This will be used by an experimental PyPI package to incubate a `tarfile.TarPath` class.Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@AA-TurnerAA-TurnerAA-Turner approved these changes

@ofekofekofek left review comments

@gstgstgst approved these changes

Assignees
No one assigned
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

9 participants
@barneygale@brettcannon@merwok@ap--@RonnyPfannschmidt@AA-Turner@gst@ofek@bedevere-bot

[8]ページ先頭

©2009-2025 Movatter.jp