This uses critical sections to make heapq methods that update the heap thread-safe when the GIL is disabled. This is accomplished by using the @critical_section clinic directive.

cc:@mpage @colesbury

Issue:Audit all built-in modules for thread safety #116738

pythongh-116738: Make _heapq module thread-safe

ec3cade

Copy link

python-cla-botbot commentedJun 2, 2025•
edited
Loading

All commit authors signed the Contributor License Agreement.

bedevere-appbot mentioned this pull request

Jun 2, 2025

Audit all built-in modules for thread safety#116738

Open

colesbury requested review frommpage andcolesbury

June 2, 2025 16:06

StanFromIreland reviewed

Jun 2, 2025

View reviewed changes

Copy link

Contributor

StanFromIreland left a comment•
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

There is a lot of duplication in tests because of min/max heaps, why not organize with subtests?

Also for testing the heap, it is probably best to reuse the existing methods fromtest_heapq

cpython/Lib/test/test_heapq.py

Lines 106 to 116 in0558275

	defcheck_invariant(self,heap):
	# Check the heap invariant.
	forpos,iteminenumerate(heap):
	ifpos:# pos 0 has no parent
	parentpos= (pos-1)>>1
	self.assertTrue(heap[parentpos]<=item)

	defcheck_max_invariant(self,heap):
	forpos,iteminenumerate(heap[1:],start=1):
	parentpos= (pos-1)>>1
	self.assertGreaterEqual(heap[parentpos],item)

ZeroIntensity added the topic-free-threading label

Jun 2, 2025

ZeroIntensity self-requested a review

June 2, 2025 17:10

ZeroIntensity reviewed

Jun 2, 2025

View reviewed changes

Copy link

Member

ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I think it would also be a good idea to get rid of the borrowed reference usage here.

Modules/_heapqmodule.cShow resolvedHide resolved

yoney marked this pull request as ready for review

June 2, 2025 17:38

yoney requested a review fromrhettinger as acode owner

June 2, 2025 17:38

bedevere-appbot added the awaiting review label

Jun 2, 2025

colesbury added the needs backport to 3.14bugs and security fixes label

Jun 2, 2025

Copy link

Contributor

colesbury commentedJun 2, 2025

The Modules/_heapqmodule.c change looks good to me. I haven't looked through the tests yet. I don't think should change the implementation to avoid borrowed references: I'd rather keep the change small and limited to the thread safety fix rather than try to "clean" things up, and I'm not really convinced that avoiding borrowed references here would make things better.

@yoney - would you please add add a NEWS entry viablurb. You can useuvx blurb, if you have the uv tools, or pip install it, orblurb-it.

Copy link

Member

ZeroIntensity commentedJun 2, 2025

I'd rather keep the change small and limited to the thread safety fix rather than try to "clean" things up, and I'm not really convinced that avoiding borrowed references here would make things better.

Ok, but we should definitely do this in a follow-up (possibly only for 3.15). There are definitely some things here that aren't safe. For example:

lastelt=PyList_GET_ITEM(heap,n-1) ;Py_INCREF(lastelt);if (PyList_SetSlice(heap,n-1,n,NULL)) {Py_DECREF(lastelt);returnNULL;}n--;if (!n)returnlastelt;returnitem=PyList_GET_ITEM(heap,0);

A finalizer could either release the critical section or explicitly clear the list, which could cause thatPyList_GET_ITEM call to returnNULL. I guess that's not related to borrowed references, though--more of a problem withPyList_GET_ITEM not doing validation.

There's also some incredibly horrible things going on, like this:

returnitem=PyList_GET_ITEM(heap,0);PyList_SET_ITEM(heap,0,lastelt);

returnitem starts out as a borrowed reference, but then has the ownership implicitly handed off throughPyList_SET_ITEM, which doesn't decref it.

pythongh-116738: Add news entry in Misc/NEWS/next

01b2be4

Copy link

ContributorAuthor

yoney commentedJun 2, 2025

Ok, but we should definitely do this in a follow-up (possibly only for 3.15). There are definitely some things here that aren't safe.

@ZeroIntensity I agree that there are things we should follow up on. I initially tried to address some of them as part of the free-threading change, but it introduces complexity to the review and makes the free-threading change harder to review, so I decided to follow up on those issues separately.

ZeroIntensity reviewed

Jun 2, 2025

View reviewed changes

Copy link

Member

ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Nitpicks

Lib/test/test_free_threading/test_heapq.py OutdatedShow resolvedHide resolved

Lib/test/test_free_threading/test_heapq.pyShow resolvedHide resolved

Lib/test/test_free_threading/test_heapq.py OutdatedShow resolvedHide resolved

Misc/NEWS.d/next/Core_and_Builtins/2025-06-02-13-57-40.gh-issue-116738.ycJsL8.rst OutdatedShow resolvedHide resolved

rhettinger removed their request for review

June 3, 2025 05:20

mpage reviewed

Jun 3, 2025

View reviewed changes

Modules/_heapqmodule.c Outdated

		@@ -128,7 +129,7 @@ Push item onto heap, maintaining the heap invariant.

		staticPyObject*
		_heapq_heappush_impl(PyObjectmodule,PyObjectheap,PyObject*item)
		/[clinic end generated code: output=912c094f47663935 input=7c69611f3698aceb]/
		/[clinic end generated code: output=912c094f47663935 input=f7a4f03ef8d52e67]/
		{
		if (PyList_Append(heap,item))

Copy link

Contributor

mpageJun 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

PyList_Append acquires a critical section onheap:

cpython/Objects/listobject.c

Lines 531 to 543 in1ffe913

	int
	PyList_Append(PyObjectop,PyObjectnewitem)
	{
	if (PyList_Check(op)&& (newitem!=NULL)) {
	intret;
	Py_BEGIN_CRITICAL_SECTION(op);
	ret=_PyList_AppendTakeRef((PyListObject*)op,Py_NewRef(newitem));
	Py_END_CRITICAL_SECTION();
	returnret;
	}
	PyErr_BadInternalCall();
	return-1;
	}

I think this is probably OK from a correctness perspective: if someone sneaks in and modifies the heap betweenPyList_Append releasing the critical section and us reacquiring the lock onheap the call tosiftdown should still execute correctly. However, it's not great for performance. It might be worth inlining the implementation ofPyList_Append here, sans the calls to acquire/release the critical section, to address that.

Copy link

Contributor

colesburyJun 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I think#128126 avoids the problem we used to have where the secondPy_BEGIN_CRITICAL_SECTION() would trigger a release and re-acquisition of the lock.

There's still some extra unnecessary overhead, so using_PyList_AppendTakeRef(), which assumes the caller holds the lock, make sense to me.

Copy link

ContributorAuthor

yoneyJun 3, 2025•
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

@mpage,@colesbury, Thank you for the review, updated to use_PyList_AppendTakeRef()

yoney added2 commits

June 3, 2025 13:05

pythongh-116738: Address the review comments

68f3a26

pythongh-116738: Address the review comments

41d145a

Copy link

ContributorAuthor

yoney commentedJun 3, 2025

There is a lot of duplication in tests because of min/max heaps, why not organize with subtests?

@StanFromIreland Thank you so much for your review! I've already refactored the code and moved some repeated parts into separate functions while addressing the other comments. I'm not sure if subtests will provide more code reuse here. What do you think?

StanFromIreland reviewed

Jun 4, 2025

View reviewed changes

Lib/test/test_free_threading/test_heapq.py Outdated

Comment on lines 182 to 224

		defis_min_heap_property_satisfied(self,heap:list[object])->bool:
		"""
		The value of a parent node should be less than or equal to the
		values of its children.
		"""
		returnself.is_heap_property_satisfied(heap,HeapKind.MIN)

		defis_max_heap_property_satisfied(self,heap:list[object])->bool:
		"""
		The value of a parent node should be greater than or equal to the
		values of its children.
		"""
		returnself.is_heap_property_satisfied(heap,HeapKind.MAX)

		@staticmethod
		defis_heap_property_satisfied(
		heap:list[object],heap_kind:HeapKind
		)->bool:
		"""
		Check if the heap property is satisfied.
		"""
		op=operator.leifheap_kind==HeapKind.MINelseoperator.ge
		# position 0 has no parent
		forposinrange(1,len(heap)):
		parent_pos= (pos-1)>>1
		ifnotop(heap[parent_pos],heap[pos]):
		returnFalse

		returnTrue

		@staticmethod
		defis_sorted_ascending(lst:list[object])->bool:
		"""
		Check if the list is sorted in ascending order (non-decreasing).
		"""
		returnall(lst[i-1]<=lst[i]foriinrange(1,len(lst)))

		@staticmethod
		defis_sorted_descending(lst:list[object])->bool:
		"""
		Check if the list is sorted in descending order (non-increasing).
		"""
		returnall(lst[i-1]>=lst[i]foriinrange(1,len(lst)))

Copy link

Contributor

StanFromIrelandJun 4, 2025•
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Suggested change

	defis_min_heap_property_satisfied(self,heap:list[object])->bool:
	"""
	Thevalueofaparentnodeshouldbelessthanorequaltothe
	valuesofitschildren.
	"""
	returnself.is_heap_property_satisfied(heap,HeapKind.MIN)

	defis_max_heap_property_satisfied(self,heap:list[object])->bool:
	"""
	Thevalueofaparentnodeshouldbegreaterthanorequaltothe
	valuesofitschildren.
	"""
	returnself.is_heap_property_satisfied(heap,HeapKind.MAX)

	@staticmethod
	defis_heap_property_satisfied(
	heap:list[object],heap_kind:HeapKind
	)->bool:
	"""
	Checkiftheheappropertyissatisfied.
	"""
	op=operator.leifheap_kind==HeapKind.MINelseoperator.ge
	# position 0 has no parent
	forposinrange(1,len(heap)):
	parent_pos= (pos-1)>>1
	ifnotop(heap[parent_pos],heap[pos]):
	returnFalse

	returnTrue

	@staticmethod
	defis_sorted_ascending(lst:list[object])->bool:
	"""
	Checkifthelistissortedinascendingorder (non-decreasing).
	"""
	returnall(lst[i-1]<=lst[i]foriinrange(1,len(lst)))

	@staticmethod
	defis_sorted_descending(lst:list[object])->bool:
	"""
	Checkifthelistissortedindescendingorder (non-increasing).
	"""
	returnall(lst[i-1]>=lst[i]foriinrange(1,len(lst)))
	defcheck_invariant(self,heap):
	# Check the heap invariant.
	forpos,iteminenumerate(heap):
	ifpos:# pos 0 has no parent
	parentpos= (pos-1)>>1
	self.assertTrue(heap[parentpos]<=item)

	defcheck_max_invariant(self,heap):
	forpos,iteminenumerate(heap[1:],start=1):
	parentpos= (pos-1)>>1
	self.assertGreaterEqual(heap[parentpos],item)

Reuse our existing checks for tests:https://github.com/python/cpython/blob/main/Lib/test/test_heapq.py ?

I don't see a need for the sorted methods too, we can just check the invariant which will simplify this quite a bit? Using the enum for two options also seems over the top, why not just'min'/'max'?

Copy link

ContributorAuthor

yoneyJun 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Using the enum for two options also seems over the top, why not just'min'/'max'

I believe it might be possible to replace all enum usages with strings, and this could be fine in the test. However, I'm not quite sure that removing just 3 lines of enum definition would improve the code. Personally, I would prefer to keep the enum.

pythongh-116738: Remove type hints

a22da8f

ZeroIntensity approved these changes

Jun 5, 2025

View reviewed changes

Copy link

Member

ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM with two tiny complaints

Lib/test/test_free_threading/test_heapq.py OutdatedShow resolvedHide resolved

bedevere-appbot added awaiting core review and removed awaiting review labels

Jun 5, 2025

pythongh-116738: Address the review comments

75a1d3a

ZeroIntensity reviewed

Jun 5, 2025

View reviewed changes

Modules/_heapqmodule.c

		if (PyList_Append(heap,item))
		// In a free-threaded build, the heap is locked at this point.
		// Therefore, calling _PyList_AppendTakeRef() is safe and no overhead.
		if (_PyList_AppendTakeRef((PyListObject*)heap,Py_XNewRef(item)))

Copy link

Member

ZeroIntensityJun 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Wait, why is this nowXNewRef?item can never beNULL.

Copy link

ContributorAuthor

yoneyJun 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I wasn't sure about "item can never beNULL" and in the previous version,PyList_Append() had aNULL check. That's why I thought this would be safe. However,_PyList_AppendTakeRef() expects anon-NULL value foritem so it doesn't really make it safer.

Can I assume that "item can never beNULL" because_heapq_heappush_impl() is called from clinic-generated code?

Copy link

Contributor

mpageJun 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I think it might be possible foritem to be NULL if you vectorcall this from a C extension? If we want to mirror the previous implementation we should add a check thatitem is notNULL:

if (newitem==NULL) {PyErr_BadInternalCall();returnNULL}

mpage reviewed

Jun 5, 2025

View reviewed changes

Lib/test/test_free_threading/test_heapq.py

		barrier=Barrier(nthreads)

		defwrapper_func(*args):
		# Wait for all threadss to reach this point before proceeding.

Copy link

Contributor

mpageJun 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Suggested change

	# Wait for allthreadss to reach this point before proceeding.
	# Wait for allthreads to reach this point before proceeding.

Modules/_heapqmodule.c

		if (PyList_Append(heap,item))
		// In a free-threaded build, the heap is locked at this point.
		// Therefore, calling _PyList_AppendTakeRef() is safe and no overhead.
		if (_PyList_AppendTakeRef((PyListObject*)heap,Py_XNewRef(item)))

Copy link

Contributor

mpageJun 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I think it might be possible foritem to be NULL if you vectorcall this from a C extension? If we want to mirror the previous implementation we should add a check thatitem is notNULL:

if (newitem==NULL) {PyErr_BadInternalCall();returnNULL}

Modules/_heapqmodule.c

		if (PyList_Append(heap,item)) {
		// In a free-threaded build, the heap is locked at this point.
		// Therefore, calling _PyList_AppendTakeRef() is safe and no overhead.
		if (_PyList_AppendTakeRef((PyListObject*)heap,Py_XNewRef(item))) {

Copy link

Contributor

mpageJun 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Same comment aboutPy_XNewRef applies here too.

Labels

awaiting core review needs backport to 3.14

bugs and security fixes

topic-free-threading

6 participants

Movatterモバイル変換

Uh oh!

gh-116738: Make _heapq module thread-safe#135036

Are you sure you want to change the base?

gh-116738: Make _heapq module thread-safe#135036

Conversation

yoney commentedJun 2, 2025• edited by bedevere-appbotLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

python-cla-botbot commentedJun 2, 2025• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

StanFromIreland left a comment• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ZeroIntensity left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

colesbury commentedJun 2, 2025

Uh oh!

ZeroIntensity commentedJun 2, 2025

Uh oh!

yoney commentedJun 2, 2025

Uh oh!

ZeroIntensity left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

yoneyJun 3, 2025• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

yoney commentedJun 3, 2025

Uh oh!

StanFromIrelandJun 4, 2025• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ZeroIntensity left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

yoney commentedJun 2, 2025•
edited by bedevere-appbot
Loading

python-cla-botbot commentedJun 2, 2025•
edited
Loading

StanFromIreland left a comment•
edited
Loading

yoneyJun 3, 2025•
edited
Loading

StanFromIrelandJun 4, 2025•
edited
Loading