Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork33.4k
GH-119169: Skip reversing sibling directories inos.[f]walk(topdown=False)#119473
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
…wn=False)`In `os.walk(topdown=False)`, don't bother reversing `walk_dirs`. This meansthat sibling directories are visited in a different order, but 1) thatorder is arbitrary and comes from `os.scandir()`, and 2) unlike in top-downmode, users can't influence which directories are visited or in what order.This change caused `test_walk_bottom_up` to fail. I think this test madeassertions that were too specific and relied on `os.scandir()` returningthings in a specific order, and the test code is pretty hard to understandonce you get into the details. I've replaced it with a version of the sametest from `test_pathlib_abc.py`.
os.walk(topdown=False)os.[f]walk(topdown=False)barneygale commentedJul 6, 2024
Hey@serhiy-storchaka, just for context: this change doesn't provide much of a speedup in itself, but it enables more significant speedups seen inGH-119186. I've separated it out here because it's an observable change that breaks a test. But I suspect that doesn't matter as the |
serhiy-storchaka commentedJul 8, 2024
If |
barneygale commentedJul 8, 2024
It presently visits sibling directories in the same order they appear in What I'm proposing here is that we don'tneed to visit sibling directories in the same order they appear in
Perhaps a worked example would be helpful? I can put one together. |
barneygale commentedJul 8, 2024 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Here's an example, first with the current >>>os.mkdir('mydir')>>>os.mkdir('mydir/subdir1')>>>os.mkdir('mydir/subdir2')>>>forentryinos.walk('mydir',topdown=False):...print(entry)... ('mydir/subdir2', [], [])('mydir/subdir1', [], [])('mydir', ['subdir2','subdir1'], []) And with this PR: >>>forentryinos.walk('mydir',topdown=False):...print(entry)... ('mydir/subdir1', [], [])('mydir/subdir2', [], [])('mydir', ['subdir2','subdir1'], []) Note how the entry for If we think the new behaviour is acceptable (e.g. because we never guaranteed the order of sibling visits) then it unlocks an optimization: we can add paths to the stackwhile scanning directories rather than in a separate loop afterwards. |
serhiy-storchaka commentedJul 8, 2024
I think that it would be confusing if The fact that the current code needs to reverse the lists of directories is just an implementation detail. It is not exposed to the user. If the code used BTW, how much it makes difference? What it in comparison of using |
barneygale commentedJul 8, 2024
barneygale commentedJul 8, 2024
Hm, the |
Uh oh!
There was an error while loading.Please reload this page.
In
os.walk(topdown=False), don't bother reversingwalk_dirs. This means that sibling directories are visited in a different order, but 1) that order is arbitrary and comes fromos.scandir(), and 2) unlike in top-down mode, users can't influence which directories are visited or in what order.This change caused
test_walk_bottom_upto fail. I think this test made assertions that were too specific and relied onos.scandir()returning things in a specific order. The test code is pretty hard to understand once you get into the details. I've replaced it with a version of the same test fromtest_pathlib_abc.py. The new test checks that every directory is yieldedbefore all its ancestors, andafter all its descendants, but does not check the order of its yieldedsiblings.os.walk()#119169