GH-102613: Improve performance of `pathlib.Path.rglob()` #104244


barneygale (Contributor) commented May 6, 2023 (edited)

Stop de-duplicating results in `_RecursiveWildcardSelector`. A new `_DoubleRecursiveWildcardSelector` class is introduced which performs de-duplication, but this is used _only_ for patterns with multiple non-adjacent `**` segments, such as `path.glob('**/foo/**')`. By avoiding the use of a set in most cases, `PurePath.__hash__()` is not called, and so paths do not need to be parsed and (case-) normalised.
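To illustrate why patterns with multiple non-adjacent `**` segments can yield the same path more than once, here is a toy model that expands `**/foo/**` over an in-memory directory tree. It is only a sketch: `TREE`, `subtree`, `recursive`, `literal` and `glob_recursive_foo_recursive` are hypothetical names made up for this example, and none of this is the selector code in `pathlib`.

```python
# Toy directory tree (directories only): foo/ and foo/foo/
TREE = {"foo": {"foo": {}}}


def subtree(tree, path):
    """Return the nested dict for the directory at `path` (a tuple of names)."""
    node = tree
    for name in path:
        node = node[name]
    return node


def recursive(tree, path):
    """Yield `path` and every directory beneath it, as a `**` segment would."""
    yield path
    for name in subtree(tree, path):
        yield from recursive(tree, path + (name,))


def literal(tree, path, name):
    """Yield `path/name` if that directory exists."""
    if name in subtree(tree, path):
        yield path + (name,)


def glob_recursive_foo_recursive(tree):
    """Naively expand `**/foo/**` with no de-duplication."""
    for start in recursive(tree, ()):
        for match in literal(tree, start, "foo"):
            yield from recursive(tree, match)


results = list(glob_recursive_foo_recursive(TREE))
unique = set(results)  # hashing every result is the step the PR avoids in the common case
```

In this model `foo/foo` is reached twice: once as a descendant of the outer `foo` match and once as a `foo` match in its own right. A pattern with a single `**` segment walks each directory once, so no set is needed there.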

Also merge adjacent `**` segments in patterns.
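The merging of adjacent `**` segments, and the resulting rule for when de-duplication is still required, can be sketched as follows. The helper names `merge_adjacent_recursive` and `needs_deduplication` are assumptions made for illustration and do not appear in the PR; the actual change lives in `pathlib`'s selector construction.

```python
def merge_adjacent_recursive(pattern_parts):
    """Collapse runs of consecutive '**' segments into a single '**'."""
    merged = []
    for part in pattern_parts:
        if part == "**" and merged and merged[-1] == "**":
            continue  # skip a '**' that directly follows another '**'
        merged.append(part)
    return merged


def needs_deduplication(pattern_parts):
    """True if more than one '**' remains after merging adjacent ones."""
    return merge_adjacent_recursive(pattern_parts).count("**") > 1


print(merge_adjacent_recursive(["**", "**", "*"]))
print(needs_deduplication(["**", "foo", "**"]))
print(needs_deduplication(["**", "**", "*"]))
```

Under this sketch, `**/**/*` reduces to `**/*` and takes the path without de-duplication, while `**/foo/**` and `**/*/**` still contain two separated `**` segments and keep the de-duplicating path.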

Timings:

$ ./python -m timeit -s 'from pathlib import Path; p = Path()' 'list(p.glob("**/*"))'
1 loop, best of 5: 197 msec per loop   # before
2 loops, best of 5: 146 msec per loop  # after
--> 35% faster

$ ./python -m timeit -s 'from pathlib import Path; p = Path()' 'list(p.glob("**/**/*"))'
1 loop, best of 5: 1.77 sec per loop   # before
2 loops, best of 5: 146 msec per loop  # after
--> 12x faster

$ ./python -m timeit -s 'from pathlib import Path; p = Path()' 'list(p.glob("**/*/**"))'
1 loop, best of 5: 738 msec per loop   # before
1 loop, best of 5: 731 msec per loop   # after
--> about the same
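For anyone wanting to repeat a comparison like this from a script instead of the command line, a minimal sketch using the standard `timeit` module is below. The temporary tree built by `make_tree` is an assumption made so the example is self-contained; the timings above were taken against the author's working directory, so absolute numbers will not match.

```python
import tempfile
import timeit
from pathlib import Path

PATTERNS = ("**/*", "**/**/*", "**/*/**")


def make_tree(root, depth=3, width=3):
    """Create a small nested directory tree with one file per directory."""
    (root / "file.txt").touch()
    if depth == 0:
        return
    for i in range(width):
        child = root / f"dir{i}"
        child.mkdir()
        make_tree(child, depth - 1, width)


def time_glob(root, pattern, repeat=3):
    """Return the best wall-clock time in seconds for one full glob of `pattern`."""
    return min(timeit.repeat(lambda: list(root.glob(pattern)), number=1, repeat=repeat))


def demo():
    """Time each pattern against a throwaway tree and return {pattern: seconds}."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        make_tree(root)
        return {pattern: time_glob(root, pattern) for pattern in PATTERNS}


if __name__ == "__main__":
    for pattern, seconds in demo().items():
        print(f"{pattern!r}: {seconds * 1000:.2f} msec")
```

Running the same script under an interpreter built before and after the change gives a before/after pair for each pattern.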

barneygale merged commit c0ece3d into python:main May 7, 2023
jbower-fb pushed a commit to jbower-fb/cpython that referenced this pull request May 8, 2023
carljm added a commit to carljm/cpython that referenced this pull request May 9, 2023
Reviewers

JelleZijlstra approved these changes

Labels
performance (Performance or resource usage), topic-pathlib

3 participants
@barneygale, @JelleZijlstra, @bedevere-bot
