This roughly follows what was done for dictobject to make a lock-free lookup operation. On a benchmark running set.__contains__ in a tight loop, this is 1.5x faster on my computer. In the bm_deepcopy benchmark, the gains are very modest, between 1 and 2% faster. On the "set_contains" scaling benchmark, the results are much better. Also, the multi-threaded scaling of "copy" and "deepcopy" seems to be measurably improved.
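A tight-loop measurement of this kind can be sketched as follows. This is a minimal illustration, not the benchmark used for the 1.5x figure; the function name `bench_contains`, the set size, and the loop count are assumptions chosen for the example.

```python
import timeit


def bench_contains(n_items=1000, loops=200_000):
    """Return the average seconds per call of set.__contains__ (hypothetical helper)."""
    s = set(range(n_items))
    contains = s.__contains__   # bound method, called directly as in the description
    key = n_items // 2          # a key that is known to be present
    # timeit runs the statement `loops` times and returns the total elapsed seconds.
    total = timeit.timeit(
        "contains(key)",
        globals={"contains": contains, "key": key},
        number=loops,
    )
    return total / loops


if __name__ == "__main__":
    print(f"{bench_contains() * 1e9:.1f} ns per call")
```

Running the same script on a build with and without the change, and comparing the per-call times, gives a single-threaded speedup ratio.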

Summary of changes:

- refactor set_lookkey() into set_do_lookup(), which now takes a function pointer that does the entry comparison. This is similar to dictobject and do_lookup(). In an optimized build, the comparison function is inlined and there should be no performance cost to this.
- change set_do_lookup to return a status separately from the entry value.
- add set_compare_frozenset() and use it if the object is a frozenset. For the free-threaded build, this avoids some overhead (locking, atomic operations, incref/decref on key).
- use FT_ATOMIC_* macros as needed for atomic loads and stores.
- use a deferred free on the set table array if it is shared (only on the free-threaded build; the normal build always does an immediate free).
- for the free-threaded build, use an explicit for loop to zero the table, rather than memcpy().
- when mutating the set, assign so->table to NULL while the change is happening. Assign the real table array after the change is done.
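The first three items (a lookup routine parameterized by a comparison function, a status returned separately from the entry, and a cheaper comparison for frozensets) can be modeled in a few lines of Python. This is only a toy model under stated assumptions, not the C code in Objects/setobject.c: the names `Status`, `compare_set`, `compare_frozenset`, `do_lookup` and `build_table` are hypothetical, the table uses plain linear probing, and a `threading.Lock` stands in for the locking and refcount work done for mutable sets.

```python
import threading
from enum import Enum


class Status(Enum):
    FOUND = "found"
    EMPTY = "empty"


_lock = threading.Lock()  # toy stand-in for the per-object locking on mutable sets


def compare_set(entry_key, key):
    # Mutable set: the comparison is guarded, because the table may change.
    with _lock:
        return entry_key == key


def compare_frozenset(entry_key, key):
    # Frozenset: the table cannot change, so no guard is needed.
    return entry_key == key


def do_lookup(table, key, key_hash, compare):
    """Probe `table`; return (status, index) so status is separate from the entry."""
    mask = len(table) - 1        # table size is assumed to be a power of two
    i = key_hash & mask
    for _ in range(len(table)):
        entry = table[i]
        if entry is None:
            return Status.EMPTY, i
        entry_hash, entry_key = entry
        if entry_hash == key_hash and (entry_key is key or compare(entry_key, key)):
            return Status.FOUND, i
        i = (i + 1) & mask       # simple linear probing (an assumption of this model)
    return Status.EMPTY, None    # table is full and the key is absent


def build_table(keys, size=8):
    """Build a toy open-addressing table of (hash, key) slots."""
    assert len(keys) < size, "toy table must keep at least one empty slot"
    table = [None] * size
    for k in keys:
        status, i = do_lookup(table, k, hash(k), compare_frozenset)
        if status is Status.EMPTY:
            table[i] = (hash(k), k)
    return table
```

A caller picks the comparison function once, based on the object type, and the probing loop itself stays identical for both cases.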

Free-threading scaling benchmark results from the attached scripts (result for 6 cores in parallel). This is a modified version of the ftscalingbenchmark.py script.

| Benchmark | base | this PR |
| --- | --- | --- |
| dict_contains | 4.0x faster | 4.0x faster |
| tuple_contains | 5.4x faster | 5.3x faster |
| list_contains | 7.1x faster | 6.1x faster |
| frozenset_contains | 1.0x faster | 5.9x faster |
| frozenset_contains_dunder | 6.4x faster | 3.9x faster |
| set_contains | 1.0x slower | 5.4x faster |
| set_contains_dunder | 1.4x faster | 5.6x faster |
| shallow_copy | 1.9x faster | 3.7x faster |
| deepcopy | 2.5x faster | 3.5x faster |
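For readers unfamiliar with this kind of measurement, a scaling factor can be estimated roughly as below. This is a simplified sketch and not the attached ftscaling_set.py script: the names `work` and `scaling_factor` are hypothetical, and it assumes the reported "x faster" means the time to run the work sequentially divided by the time to run the same work split across threads.

```python
import threading
import time


def work(s, key, loops):
    # Tight loop of membership tests on a shared container.
    for _ in range(loops):
        key in s


def scaling_factor(n_threads=6, loops=200_000):
    """Return sequential_time / parallel_time for the same total amount of work."""
    s = frozenset(range(1000))
    key = 500

    # Baseline: run every thread's share of the work one after another.
    t0 = time.perf_counter()
    for _ in range(n_threads):
        work(s, key, loops)
    sequential = time.perf_counter() - t0

    # Parallel: the same work, one share per thread, all touching the same object.
    threads = [
        threading.Thread(target=work, args=(s, key, loops))
        for _ in range(n_threads)
    ]
    t0 = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    parallel = time.perf_counter() - t0

    return sequential / parallel


if __name__ == "__main__":
    print(f"{scaling_factor():.1f}x faster")
```

On a build with the GIL enabled the ratio is expected to stay near 1.0; only a free-threaded build can approach the number of cores.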

ftscaling_set.py.txt

Issue: copy.copy and copy.deepcopy scale poorly with free-threading #132657

wip: lock-free set contains

ff1d60d

kumaraditya303 reviewed

Apr 13, 2025

Objects/setobject.c (outdated, resolved)

nascheme mentioned this pull request

Jul 11, 2025

gh-132657: Avoid locks and refcounts in frozenset operations #136107

Open

nascheme added 8 commits

July 11, 2025 16:52

Use FT_ATOMIC_* macros.

55ab02a

This makes for longer code vs using the custom LOAD_*/STORE_* macros. However, I think this makes the code clearer.

Increase items and loops for set test.

7df8f02

Re-order some atomic store operations.

157cd60

Add and use set_compare_frozenset().

4c3596c

Merge 'origin/main' into set_lockfree_contains

6efe562

Fix _PyMem_FreeDelayed() calls, need size.

8ff7dbd

Fix frozenset contains method.

87278ef

Add NEWS.

b2affbf

nascheme changed the title ~~Add lock-free set contains implemention~~ GH-132657: Add lock-free set contains implementation

Jul 12, 2025

bedevere-appbot mentioned this pull request

Jul 12, 2025

copy.copy and copy.deepcopy scale poorly with free-threading #132657

Open

Re-generate clinic output.

70a1c1f

nascheme added the performance, topic-free-threading, and 🔨 test-with-buildbots labels

Jul 12, 2025

bedevere-bot commented Jul 12, 2025

🤖 New build scheduled with the buildbot fleet by @nascheme for commit 70a1c1f 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F132290%2Fmerge

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

bedevere-bot removed the 🔨 test-with-buildbots label

Jul 12, 2025

eendebakpt reviewed

Jul 13, 2025

Misc/NEWS.d/next/Core_and_Builtins/2025-07-11-19-57-27.gh-issue-132657.vwDuO2.rst (outdated, resolved)

Objects/setobject.c Outdated

		}
		Py_ssize_t ep_hash = ep->hash;
		if (ep_hash == hash) {
		if (PyUnicode_CheckExact(startkey)

eendebakpt (Contributor) commented Jul 13, 2025

This optimization was introduced to avoid the check on mutating tables and the incref/decref on the startkey. Since that is not relevant for the frozenset, we could perhaps remove this fast path. (There will still be a minor gain because unicode_eq is used directly, but the PyUnicode_CheckExact check also takes time for the non-unicode cases.)

See eendebakpt@93035c4, "Hacked up version..."

nascheme (Member, Author) commented Jul 14, 2025

Okay. Based on my benchmarking, the difference is small. So, removing that special case seems better.

nascheme added 2 commits

July 14, 2025 10:48

Better markup in NEWS.

64b17af

Remove unicode case for set_compare_frozenset.

6c339f4

Labels

performance

Performance or resource usage

topic-free-threading

4 participants

GH-132657: Add lock-free set contains implementation #132290

nascheme commented Apr 8, 2025 (edited)