gh-132380: Use unicode hash/compare for the mcache. #133669
base: main
Conversation
This allows the cache to work with non-interned strings.
(just adding the skip news label so that you don't get pinged by the bot every time you push, saying "failed checks")
Ensure we don't read the cache in the case the 'name' argument is a non-exact unicode string. It's possible for it to have an overridden `__eq__` method, and using `_PyUnicode_Equal()` in that case would be wrong.
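The hazard can be illustrated in Python (a sketch, not CPython's C code; `WeirdName` is a hypothetical class): a `str` subclass can override `__eq__`, so comparing raw unicode contents, which is what `_PyUnicode_Equal()` does at the C level, can disagree with the object's own notion of equality.

```python
# Why the cache must not be consulted for a non-exact unicode key:
# a str subclass may override __eq__, so raw content comparison
# can disagree with the object's own equality semantics.
class WeirdName(str):
    def __eq__(self, other):
        return False              # never equal, not even to itself
    __hash__ = str.__hash__       # keep hashing by contents

key = WeirdName("spam")
print(str.__eq__(key, "spam"))    # True  -> raw contents match
print(key == "spam")              # False -> overridden __eq__ wins
```

A cache hit based on raw content comparison would silently bypass the subclass's `__eq__`, which is why the guard restricts the fast path to exact `str` instances.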
ngoldbaum commented May 8, 2025
I cherry-picked this PR onto the 3.14 branch and built CPython and LibCST. I had to combine Instagram/LibCST#1324 and Instagram/LibCST#1295, and also back out the Python-level caching I added in Instagram/LibCST#1295, since that's unnecessary with this PR. I see substantially improved multithreaded scaling, although there's still some contention. Looking at the profiles, it seems like the contention is coming from GC pauses? Here are the profiles I recorded on 3.14b1 and on the 3.14 branch with this PR applied, respectively: https://share.firefox.dev/43bHvhH https://share.firefox.dev/3GM0Xdu Here's the profile using multiprocessing:
(This is on a Mac, so I can't easily get Python-level profiles and line numbers in LibCST's Python code.)
Thanks for the testing. I tested LibCST on Linux and also see a performance improvement. Running the following command in the numpy source folder:
I get the elapsed run times: base 3.13, getattr(): 30.02 sec
@colesbury This uses unicode string hash/compare instead of using the string pointer value. Unlike your suggestion, this doesn't use a separate lookup loop for the non-interned case; it just always uses the string hash/compare. pyperformance results show a small slowdown (0.4%?); I was expecting worse, since the non-interned case is so uncommon. I can try a separate loop if you think that's worth pursuing. The advantage of this approach is that it's fairly simple code-wise, and I think it would be a candidate to backport to 3.13 and 3.14. Perhaps for 3.15 we should try a per-thread cache.
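A minimal Python sketch of the idea described above (names and the cache size are illustrative assumptions, not CPython's actual implementation): the cache is indexed by the name's hash and entries are compared by value, so a non-interned string with the same contents can hit an entry that was cached under the interned name.

```python
# Illustrative model of a hash/compare method cache (not CPython code).
CACHE_SIZE = 8
cache = [None] * CACHE_SIZE       # each slot: (version, name, value)

def cache_lookup(version, name):
    entry = cache[hash(name) % CACHE_SIZE]
    if entry is not None:
        e_version, e_name, value = entry
        # Compare by contents instead of `e_name is name`, so the
        # identity (interning) of the key no longer matters.
        if e_version == version and e_name == name:
            return value
    return None                   # miss: caller does the slow lookup

def cache_store(version, name, value):
    cache[hash(name) % CACHE_SIZE] = (version, name, value)

cache_store(1, "append", "<slot for append>")
key = "".join(["app", "end"])     # same contents, distinct object
print(cache_lookup(1, key))       # hits despite not being interned
```

Under the old pointer-identity scheme, the lookup with `key` above would always miss, forcing the slow MRO walk for every non-interned name.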
Handle common cases early.
I don't think we should do it this way. I don't think it's worth suffering even a small performance penalty for a rare case (non-interned lookup keys), when we can support that without any performance hit.
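The alternative being suggested might look like this sketch (my reading of the comment, not actual patch code): branch once on whether the key is interned, keep the identity fast path for interned names, and fall back to value comparison only for the rare non-interned key, so the common case pays nothing extra.

```python
import sys

def cache_lookup(cache, version, name, name_is_interned):
    entry = cache[hash(name) % len(cache)]
    if entry is None:
        return None
    e_version, e_name, value = entry
    if e_version != version:
        return None
    if name_is_interned:
        # Common case: interned names can compare by identity alone.
        return value if e_name is name else None
    # Rare case: non-interned key, compare by contents instead.
    return value if e_name == name else None

cache = [None] * 8
interned = sys.intern("append")
cache[hash(interned) % 8] = (1, interned, "<slot>")
print(cache_lookup(cache, 1, interned, True))                   # fast path
print(cache_lookup(cache, 1, "".join(["app", "end"]), False))   # slow path
```

In real C code the interned check would be a flag test on the string object rather than a parameter passed in; it is a flag here only for clarity.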
@@ -0,0 +1,2 @@
For the free-threaded build, allow non-interned strings to be cached in the type
I don't think this caches non-interned strings. It seems to me that it allows non-interned strings as the lookup key, but the cache still only contains interned strings.
This allows the type lookup cache to work with non-interned strings.
pyperformance results
_PyType_LookupRef
#132380