Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork33.4k
gh-101196: Make isdir/isfile/exists faster on Windows#101324
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
eryksun commentedJan 25, 2023
A builtin |
Co-authored-by: Eryk Sun <eryksun@gmail.com>
eryksun commentedJan 25, 2023
A few cases that are handled by
|
mdboom commentedJan 25, 2023
I guess this means we don't have test coverage for them...? In any event, I agree we should probably try to cover these cases. I'll see what can be done.
My initial implementation from this morning did that, which I like because it doesn't repeat what's clearly battle-tested code, but it made thingsslower in the event of symlinks. I was almost ok with that given that that isn't the common case, but your solution seemed to make everything a little faster (ignoring these new corner cases). |
eryksun commentedJan 25, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There shouldn't be an issue with speed for common cases as long as you only fall back on
|
Lib/test/test_ntpath.py Outdated
| @unittest.skipIf(sys.platform!='win32',"drive letters are a windows concept") | ||
| deftest_isfile_driveletter(self): | ||
| current_drive=os.path.splitdrive(os.path.abspath(__file__))[0]+"\\" | ||
| self.assertFalse(os.path.isfile(current_drive)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
The reason this is false is because a relative drive path like "C:" resolves to the working directory on the drive. Did you want to test a volume device path liker'\\.\C:' instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Ah, yes. And this is exactly the thing that (erroneously) returnsTrue without checking the error code fromGetFileInformationByHandleEx.
mdboom commentedJan 26, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
@eryksun: Thanks for all of your help and pointers. I've implemented support for these corner cases by calling I'm a little stuck on writing unit tests for them. I was able to cover I'm not sure how to recreate the case you describe in For |
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
zooba left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Looks fine to me. Any other POVs?
erlend-aasland commentedFeb 6, 2023
Looks good to me, but I'll add that there's a lot of code duplication in here; it might be worth it to refactor the code shortly after landing this PR, so the added code is more maintainable and readable. |
mdboom commentedFeb 6, 2023
I'll take a first pass at reducing some of this duplication. |
zooba commentedFeb 6, 2023
My read of the duplication is that there's just enough difference that it's probably not worth it. The main change I'd like to see (but wasn't requiring) is to add an option to If all the additional functions were using the same OS call, it would make sense to merge them all. But they're subtly different each time, and since this is for perf, we don't want to introduce more switches. |
erlend-aasland commentedFeb 6, 2023 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
+1 (I suggest adding that in a follow-up PR; let's land this.)
True that. |
eryksun commentedFeb 7, 2023
Michael, are you still working on refactoring the code? Steve and Erlend seem to be in agreement that it's ready to merge as is. I think it's ready. |
zooba commentedFeb 8, 2023
Just heard that it's good to go as it is. |
miss-islington commentedFeb 8, 2023
miss-islington commentedFeb 8, 2023
miss-islington commentedFeb 8, 2023
Sorry,@mdboom and@zooba, I could not cleanly backport this to |
miss-islington commentedFeb 8, 2023
Sorry@mdboom and@zooba, I had trouble checking out the |
zooba commentedFeb 8, 2023
This ought to be safe to backport from a public API perspective, but I'll let the bot figure out how easy it'll be. |
miss-islington commentedFeb 8, 2023
miss-islington commentedFeb 8, 2023
Sorry,@mdboom and@zooba, I could not cleanly backport this to |
zooba commentedFeb 8, 2023
Okay, it's not going to backport trivially :) If anyone's up for it, go ahead, otherwise I'll take a look when I get time. |
hauntsaninja commentedFeb 9, 2023
(It's relatively rare that we backport performance improvements, so if it ends up being complicated maybe we shouldn't) |
* main: (82 commits)pythongh-101670: typo fix in PyImport_ExtendInittab() (python#101723)pythonGH-99293: Document that `Py_TPFLAGS_VALID_VERSION_TAG` shouldn't be used. (#pythonGH-101736) no-issue: Add Dong-hee Na as the cjkcodecs codeowner (pythongh-101731)pythongh-101678: Merge math_1_to_whatever() and math_1() (python#101730)pythongh-101678: refactor the math module to use special functions from c11 (pythonGH-101679)pythongh-85984: Remove legacy Lib/pty.py code. (python#92365)pythongh-98831: Use opcode metadata for stack_effect() (python#101704)pythongh-101283: Version was just released, so should be changed in 3.11.3 (pythonGH-101719)pythongh-101283: Fix use of unbound variable (pythonGH-101712)pythongh-101283: Improved fallback logic for subprocess with shell=True on Windows (pythonGH-101286)pythongh-101277: Port more itertools static types to heap types (python#101304)pythongh-98831: Modernize CALL and family (python#101508)pythonGH-101696: invalidate type version tag in `_PyStaticType_Dealloc` (python#101697)pythongh-100221: Fix creating dirs in `make sharedinstall` (pythonGH-100329)pythongh-101670: typo fix in PyImport_AppendInittab() (pythonGH-101672)pythongh-101196: Make isdir/isfile/exists faster on Windows (pythonGH-101324)pythongh-101614: Don't treat python3_d.dll as a Python DLL when checking extension modules for incompatibility (pythonGH-101615)pythongh-100933: Improve `check_element` helper in `test_xml_etree` (python#100934)pythonGH-101578: Normalize the current exception (pythonGH-101607)pythongh-47937: Note that Popen attributes are read-only (python#93070) ...
Uh oh!
There was an error while loading.Please reload this page.
This makes
isdir,isfileandexistsfaster on Windows, by making fewer API calls than callingos.statand then looking at that result.On the following microbenchmarks:
Microbenchmarks
I get the following results:
So therefore, these operations are around 13%-28% faster, depending on whether the file exists, is a symlink etc.
NOTE: This does not improve the performance of
pathlib.Path.is_dirand friends. The behavior there is slightly different in terms of error handling, but I think a similar approach could also be applied there. If this PR is acceptable, I plan to do that as follow-up work.