Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

GH-114847: Speed upposixpath.realpath()#114848

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
barneygale merged 23 commits intopython:mainfrombarneygale:gh-114847
Apr 5, 2024

Conversation

@barneygale
Copy link
Contributor

@barneygalebarneygale commentedFeb 1, 2024
edited
Loading

Apply the following optimizations toposixpath.realpath():

  • Remove use of recursion
  • Construct child paths directly rather than usingjoin()
  • Useos.getcwd[b]() rather thanabspath()
  • Usestartswith(sep) rather thanisabs()
  • Use slicing rather thansplit()

$ ./python -m timeit -s"from os.path import realpath""realpath('.')"100000 loops, best of 5: 2.74 usec per loop# before200000 loops, best of 5: 1.53 usec per loop# after# --> 1.79x faster$ ./python -m timeit -s"from os.path import realpath""realpath('..')"100000 loops, best of 5: 2.9  usec per loop# before200000 loops, best of 5: 1.71 usec per loop# after# --> 1.7x faster$ ./python -m timeit -s"from os.path import realpath""realpath('../..')" 50000 loops, best of 5: 4.1  usec per loop# before200000 loops, best of 5: 1.99 usec per loop# after# --> 2.06x faster$ ./python -m timeit -s"from os.path import realpath""realpath('Lib/test')"50000 loops, best of 5: 8.06 usec per loop# before50000 loops, best of 5: 6.19 usec per loop# after# --> 1.3x faster$ ./python -m timeit -s"from os.path import realpath""realpath('Lib/test/nonexistent/..')"20000 loops, best of 5: 12.6 usec per loop# before50000 loops, best of 5:  9.4 usec per loop# after# --> 1.34x faster$ ./python -m timeit -s"from os.path import realpath""realpath('/home/barney')"50000 loops, best of 5: 6.41 usec per loop# before50000 loops, best of 5: 4.62 usec per loop# after# --> 1.39x faster

Apply the following optimizations to `posixpath.realpath()`:- Remove use of recursion- Directly construct child paths rather than using `join()`- Use `os.getcwd[b]()` rather than `abspath()`- Use `startswith(sep)` rather than `isabs()`
@barneygalebarneygale added the performancePerformance or resource usage labelFeb 1, 2024
@barneygale
Copy link
ContributorAuthor

I've added a few wordy comments because:

  1. When I first read the code, it took me some time to really understand whatseen is doing
  2. Applying tail call optimisation (this PR) further obscures the purpose ofseen

Hopefully it's a bit easier for the next person to grok with the comments in place.

Copy link
Member

@gpsheadgpshead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I like that this gets rid of recursion.

@barneygale
Copy link
ContributorAuthor

Given that symlinks are rare, perhaps I've actually slowed things down by checking inseen prior to runninglstat(). I'll see if reverting that change is faster.

@encukou
Copy link
Member

The introduction of aquerying flag that's reset once changes behaviour, sincerest can return back into an existing path.

Given a directory and a link to it:

$ mkdir a_dir$ ln -s a_dir a_link

with Python 3.12 I get:

>>> os.path.realpath('foo/../a_link')'/tmp/links/a_dir'

With this PR:

>>> os.path.realpath('foo/../a_link')'/tmp/links/a_link'

The previous behaviour matches coreutils:

$ realpath -m foo/../a_link/tmp/links/a_dir
barneygale reacted with thumbs up emoji

@encukou
Copy link
Member

Thanks. I see the other use ofquerying -- a symlink loop -- retains previous behaviour (which is tested), even though it doesn't match coreutils'realpath -m :/

$ mkdir a_dir$ ln -s a_dir a_link$ ln -s loop1 loop2$ ln -s loop2 loop1$ realpath -m loop1/../a_link//tmp/links/a_dir
>>> os.path.realpath('foo/../loop1/../a_link/')'/tmp/links/a_link'

Guess that's one bug we need to let live.

Copy link
Member

@encukouencukou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM, thank you!

Co-authored-by: Petr Viktorin <encukou@gmail.com>
@barneygale
Copy link
ContributorAuthor

Thank you for the reviews!

@barneygalebarneygaleenabled auto-merge (squash)April 5, 2024 12:06
@barneygalebarneygale merged commitabfa16b intopython:mainApr 5, 2024
@barneygale
Copy link
ContributorAuthor

Thanks. I see the other use ofquerying -- a symlink loop -- retains previous behaviour (which is tested), even though it doesn't match coreutils'realpath -m :/

(For posterity:) I opened an issue about this:#117546

diegorusso pushed a commit to diegorusso/cpython that referenced this pull requestApr 17, 2024
Apply the following optimizations to `posixpath.realpath()`:- Remove use of recursion- Construct child paths directly rather than using `join()`- Use `os.getcwd[b]()` rather than `abspath()`- Use `startswith(sep)` rather than `isabs()`- Use slicing rather than `split()`Co-authored-by: Petr Viktorin <encukou@gmail.com>
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@gpsheadgpsheadgpshead left review comments

@encukouencukouencukou approved these changes

Assignees

No one assigned

Labels

performancePerformance or resource usage

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

3 participants

@barneygale@encukou@gpshead

[8]ページ先頭

©2009-2025 Movatter.jp