Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

bpo-46227: Add pathlib.Path.walk method#30340

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation

zmievsa
Copy link
Contributor

@zmievsazmievsa commentedJan 2, 2022
edited by bedevere-bot
Loading

@the-knights-who-say-ni

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept this contribution by verifying everyone involved has signed thePSF contributor agreement (CLA).

CLA Missing

Our records indicate the following people have not signed the CLA:

@Ovsyanka83

For legal reasons we need all the people listed to sign the CLA before we can look at your contribution. Please followthe steps outlined in the CPython devguide to rectify this issue.

If you have recently signed the CLA, please wait at least one business day
before our records are updated.

You cancheck yourself to see if the CLA has been received.

Thanks again for the contribution, we look forward to reviewing it!

Copy link
Member

@sobolevnsobolevn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thank you for submitting a PR!

Please, note that testing newly added code is probably a must. Adding tests will increase your chances to get this new feature accepted.

If you have any problems with CLA (which is also required), please, feel free to ping me - I will try to help you 🙂

AlexWaygood and zmievsa reacted with thumbs up emoji
@AlexWaygood
Copy link
Member

+1 to what@sobolevn said — there's a great guide to writing testshere. New tests for this should probably be added tothis file.

zmievsa reacted with thumbs up emoji

@barneygale
Copy link
Contributor

barneygale commentedJan 3, 2022
edited
Loading

I'd support adding some way to walk a directory hierarchy in pathlib.

But I'm not a huge fan of theos.walk() API, specifically the fact that it's aGenerator[Tuple[str,list[str],list[str]], which is clunky if you haven't already memorized it. Pathlib usually smooths over stuff like that.

Here's a possible API for discussion, still usingos.walk() under the hood:

diff --git a/Lib/pathlib.py b/Lib/pathlib.pyindex f1a33178e2..23de30abc6 100644--- a/Lib/pathlib.py+++ b/Lib/pathlib.py@@ -1008,15 +1008,19 @@ def samefile(self, other_path):             other_st = self._accessor.stat(other_path)         return os.path.samestat(st, other_st)-    def iterdir(self):+    def iterdir(self, recurse=False, dirs=True, files=True):         """Iterate over the files in this directory.  Does not yield any         result for the special paths '.' and '..'.         """-        for name in self._accessor.listdir(self):-            if name in {'.', '..'}:-                # Yielding a path object for these makes little sense-                continue-            yield self._make_child_relpath(name)+        for root, dirnames, filenames in os.walk(self):+            if dirs:+                for name in dirnames:+                    yield type(self)(root, name)+            if files:+                for name in filenames:+                    yield type(self)(root, name)+            if not recurse:+                break      def glob(self, pattern):         """Iterate over this subtree and yield all existing files (of any

@zmievsa
Copy link
ContributorAuthor

zmievsa commentedJan 4, 2022
edited
Loading

@barneygale I agree that os.walk API seems pretty clunky. I do, however, believe that the one you're offering is basicallyPath.glob("**/*") or[p for p in Path.glob("**/*") if p.is_file()]. Your approach would also require us to add 6 arguments to iterdir (the ones you offered + the ones os.walk already has), some of which (like followlinks=True and recurse=False, or recurse=False and onerrors=None) contradict each other.

So I believe that we should not add walk's logic into iterdir. We could make something like iterwalk that does all the same things as os.walk with a bit of a simpler api. However, I am not sure how to simplify os.walk API without losing its core features.

There's the issue that os.walk provides you with dir-by-dir iteration by yielding root which is pretty useful in many situations while your solution does not.

Also, os.walk distinguishes between dirs and files so you don't have to and provides you with a simple way to prevent recursion into certain subtrees by removing elements from dirs list, which is pretty hard to get with any simpler APIs. I will commit the changes that add this feature into my implementation tomorrow.

@barneygale
Copy link
Contributor

barneygale commentedJan 4, 2022
edited
Loading

Thanks. I didn't know you could edit the list of directories to prevent recursion - neat!

One more thing to consider: in part for performance reasons, pathlib internally represents a path as a drive, root, and list of string parts. This avoids some costly splitting and joining on directory separators -- for example,Path.parent doesn't parse anything or create any new string objects, whereasos.path.dirname() does.

Pathlib has historically re-implemented algorithms likeos.path.realpath(),os.path.expanduser() etc, and modified them to work with the (drive, root, parts) representation. The idea that this is faster than round-tripping via strings (and hence re-parsing / re-normalising).

In the last couple releases I've removed pathlib's ownrealpath() andexpanduser() implementations, and made it call through toos.path instead. This solved a couple minor bugs and removed a bit of duplication in the CPython codebase, but might have cost some performance. More recently I've been wondering where to draw the line on that sort of thing.

In that case of your patch, the key line isroot_path = Path(root). By using the main Path initialiser you're asking pathlib to re-parse and re-normalise a path. Compare this to thePath.glob() implementation, which does everything it can to avoid re-parsing paths.

I suspect this is all fine, and that practicality beats purity here.

@zmievsa
Copy link
ContributorAuthor

zmievsa commentedJan 4, 2022
edited
Loading

My implementation certainly requires quite a bit of review -- I've never commited to pathlib or to CPython in general. Implementation-wise, I could just reimplement os.walk or just fix the issues in my implementation -- either way is fine with me. But I think the most important issue at hand is the API for the Path.walk.

barneygale reacted with thumbs up emoji

@zmievsa
Copy link
ContributorAuthor

Added the changes I've mentioned. All the features of os.walk are supported now. Let's move our discussion here:
https://discuss.python.org/t/add-pathlib-path-walk-method/

@zmievsa
Copy link
ContributorAuthor

I will add the tests soon. If anyone has any ideas about what tests we need specifically, I'd love to hear them.

@github-actions
Copy link

This PR is stale because it has been open for 30 days with no activity.

@github-actionsgithub-actionsbot added the staleStale PR or inactive for long period of time. labelFeb 8, 2022
Copy link
Contributor

@MaxwellDupreMaxwellDupre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Could you add some text to docs, please?
Otherwise how will users know this is available and how to use.

@zmievsazmievsa closed thisMay 8, 2022
@zmievsazmievsaforce-pushed thebpo-46227/add-pathlib.Path.walk-method branch fromebf6811 toa82baedCompareMay 8, 2022 14:39
@bedevere-bot
Copy link

Every change to Pythonrequires a NEWS entry.

Please, add it using theblurb_it Web app or theblurb command-line tool.

@Ovsyanka83Ovsyanka83mannequin mentioned this pull requestMay 8, 2022
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@sobolevnsobolevnsobolevn left review comments

@MaxwellDupreMaxwellDupreMaxwellDupre requested changes

Assignees
No one assigned
Labels
awaiting core reviewstaleStale PR or inactive for long period of time.type-featureA feature request or enhancement
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

7 participants
@zmievsa@the-knights-who-say-ni@AlexWaygood@barneygale@bedevere-bot@sobolevn@MaxwellDupre

[8]ページ先頭

©2009-2025 Movatter.jp