Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-132380: Use unicode hash/compare for the mcache.#133669

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Open
nascheme wants to merge5 commits intopython:main
base:main
Choose a base branch
Loading
fromnascheme:mcache-str-hash

Conversation

nascheme
Copy link
Member

@naschemenascheme commentedMay 8, 2025
edited
Loading

This allows the type lookup cache to work with non-interned strings.

pyperformance results

This allows cache to work with non-interned strings.
@picnixz
Copy link
Member

(just adding the skip news label so that you don't get pinged by the bot everytime you push, saying "failed checks")

Ensure we don't read the cache in the case the 'name' argument is anon-exact unicode string.  It's possible to have an overridden `__eq__`method and it using `_PyUnicode_Equal()` in that case would be wrong.
@ngoldbaum
Copy link
Contributor

ngoldbaum commentedMay 8, 2025
edited
Loading

I cherry-picked this PR onto the 3.14 branch and built CPython and LibCST. I had to combineInstagram/LibCST#1324 andInstagram/LibCST#1295 and also back out the Python-level caching I added inInstagram/LibCST#1295, since that's unnecessary with this PR.

I see substantially improved multithreaded scaling, although there's still some contention. Looking at the profiles, it seems like the contention is coming from GC pauses?

Here's the profile I record on 3.14b1 and the 3.14 branch with this PR applied, respectively:

https://share.firefox.dev/43bHvhH

https://share.firefox.dev/3GM0Xdu

Here's the profile using multiprocessing:

https://share.firefox.dev/42QR32I

@ngoldbaum
Copy link
Contributor

(this is on a mac so I can't easily get python-level profiles and line numbers in LibCST's Python code)

@nascheme
Copy link
MemberAuthor

Thanks for the testing. I tested LibCST on Linux and also see a performance improvement. Running the following command in the numpy source folder:

python3 -m libcst.tool codemod --no-format strip_strings_from_types.StripStringsCommand numpy/_core

I get the elapsed run times:

base 3.13, getattr(): 30.02 sec
base 3.13, type_lookup(): 20.94 sec
3.13 + this PR, getattr(): 18.48 sec

@nascheme
Copy link
MemberAuthor

@colesbury This uses unicode string hash/compare instead of using the string pointer value. Unlike your suggestion, this doesn't use a separate lookup loop for the non-interned case, it just always uses the string hash/compare. pyperformance results show a small slowdown (0.4 %?), I was expecting worse since the non-interned case is so uncommon.

I can try a separate loop if you think that's worth pursing. The advantage of this approach is that it's fairly simple code-wise and I think would be a candidate to backport to 3.13 and 3.14. Perhaps for 3.15 we should try a per-thread cache.

Handle common cases early.
Copy link
Contributor

@colesburycolesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I don't think we should do it this way. I don't think it's worth suffering even a small performance penalty for a rare case (non-interned lookup keys), when we can support that without any performance hit.

@@ -0,0 +1,2 @@
For free-threaded build, allow non-interned strings to be cached in the type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I don't think this caches non-interned strings. It seems to me that it allows non-interned strings as the lookup key, but the cache still only contains interned strings.

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@colesburycolesburycolesbury left review comments

@markshannonmarkshannonAwaiting requested review from markshannonmarkshannon is a code owner

Assignees
No one assigned
Labels
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

4 participants
@nascheme@picnixz@ngoldbaum@colesbury

[8]ページ先頭

©2009-2025 Movatter.jp