Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

GH-132657: Add lock-free set contains implementation#132290

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Draft
nascheme wants to merge12 commits intopython:main
base:main
Choose a base branch
Loading
fromnascheme:set_lockfree_contains

Conversation

nascheme
Copy link
Member

@naschemenascheme commentedApr 8, 2025
edited
Loading

This roughly follows what was done for dictobject to make a lock-free lookup operation. On a benchmark runningset.__contains__ in a tight loop, this is 1.5x faster on my computer. In the bm_deepcopy benchmark, the gains are very modest, between 1 to 2% faster. On the "set_contains" scaling benchmark, the results are much better. Also, the multi-threaded scaling of "copy" and "deepcopy" seem to be measurably improved.

Summary of changes:

  • refactorset_lookkey() intoset_do_lookup() which now takes a function pointer that does the entry comparison. This is similar to dictobject anddo_lookup(). In an optimized build, the comparison function is inlined and there should be no performance cost to this.
  • changeset_do_lookup to return a status separately from the entry value.
  • addset_compare_frozenset() and use if the object is a frozenset. For the free-threaded build, this avoids some overhead (locking, atomic operations, incref/decref on key)
  • useFT_ATOMIC_* macros as needed for atomic loads and stores
  • use a deferred free on the set table array, if shared (only on free-threaded build, normal build always does an immediate free)
  • for free-threaded build, use explicit for loop to zero the table, rather thanmemcpy().
  • when mutating the set, assignso->table to NULL while the change is a happening. Assign the real table array after the change is done.

Free-threading scaling benchmark results from the attached scripts (result for 6 cores in parallel). This is a modified version of theftscalingbenchmark.py script.

basethis PR
dict_contains4.0x faster4.0x faster
tuple_contains5.4x faster5.3x faster
list_contains7.1x faster6.1x faster
frozenset_contains1.0x faster5.9x faster
frozenset_contains_dunder6.4x faster3.9x faster
set_contains1.0x slower5.4x faster
set_contains_dunder1.4x faster5.6x faster
shallow_copy1.9x faster3.7x faster
deepcopy2.5x faster3.5x faster

ftscaling_set.py.txt

@naschemenascheme changed the titleAdd lock-free set contains implementionGH-132657: Add lock-free set contains implementationJul 12, 2025
@naschemenascheme added performancePerformance or resource usage topic-free-threading 🔨 test-with-buildbotsTest PR w/ buildbots; report in status section labelsJul 12, 2025
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by@nascheme for commit70a1c1f 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F132290%2Fmerge

If you want to schedule another build, you need to add the🔨 test-with-buildbots label again.

@bedevere-botbedevere-bot removed the 🔨 test-with-buildbotsTest PR w/ buildbots; report in status section labelJul 12, 2025
}
Py_ssize_t ep_hash = ep->hash;
if (ep_hash == hash) {
if (PyUnicode_CheckExact(startkey)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This optimization was introduced to avoid the check on mutating tables and the incref/decref on thestartkey. Since that is not relevant for the frozenset, we can could perhaps remove this fast path. (there will still be a minor gain because unicode_eq is used directly, but thePyUnicode_CheckExact check also takes time for the non-unicode cases).

Seeeendebakpt@93035c4, textHacked up version...

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Okay. Based on my benchmarking, the difference is small. So, removing that special case seems better.

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@eendebakpteendebakpteendebakpt left review comments

@kumaraditya303kumaraditya303kumaraditya303 left review comments

@rhettingerrhettingerAwaiting requested review from rhettingerrhettinger will be requested when the pull request is marked ready for reviewrhettinger is a code owner

Assignees
No one assigned
Labels
performancePerformance or resource usagetopic-free-threading
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

4 participants
@nascheme@bedevere-bot@eendebakpt@kumaraditya303

[8]ページ先頭

©2009-2025 Movatter.jp