gh-116738: Make _heapq module thread-safe #135036


Open
yoney wants to merge 4 commits into python:main from yoney:ft_heapq

yoney (Contributor) commented Jun 2, 2025

This uses critical sections to make heapq methods that update the heap thread-safe when the GIL is disabled. This is accomplished by using the @critical_section clinic directive.

cc: @mpage @colesbury
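As a rough illustration of the behaviour this change is meant to guarantee, the sketch below pushes onto one shared list from several threads and then checks the heap invariant. It is not taken from the PR's test suite; the names THREAD_COUNT, ITEMS_PER_THREAD, push_items, is_min_heap and run_concurrent_pushes are hypothetical and chosen only for this example.

```python
import heapq
import random
import threading

# Hypothetical sizes, chosen only for this sketch.
THREAD_COUNT = 4
ITEMS_PER_THREAD = 500


def push_items(heap, items):
    # Each worker pushes its own items onto the shared heap.
    for item in items:
        heapq.heappush(heap, item)


def is_min_heap(heap):
    # Every non-root element must be >= its parent.
    return all(heap[(pos - 1) >> 1] <= heap[pos] for pos in range(1, len(heap)))


def run_concurrent_pushes():
    heap = []
    chunks = [
        [random.random() for _ in range(ITEMS_PER_THREAD)]
        for _ in range(THREAD_COUNT)
    ]
    threads = [
        threading.Thread(target=push_items, args=(heap, chunk))
        for chunk in chunks
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return heap, chunks
```

With a thread-safe heappush, the returned list should contain every pushed item and satisfy is_min_heap; on a GIL-disabled build without such protection, either check could in principle fail.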

python-cla-bot (bot) commented Jun 2, 2025

All commit authors signed the Contributor License Agreement.

StanFromIreland (Contributor) left a comment

There is a lot of duplication in the tests because of min/max heaps; why not organize them with subtests?

Also, for testing the heap, it is probably best to reuse the existing methods from test_heapq:

    def check_invariant(self, heap):
        # Check the heap invariant.
        for pos, item in enumerate(heap):
            if pos:  # pos 0 has no parent
                parentpos = (pos-1) >> 1
                self.assertTrue(heap[parentpos] <= item)

    def check_max_invariant(self, heap):
        for pos, item in enumerate(heap[1:], start=1):
            parentpos = (pos-1) >> 1
            self.assertGreaterEqual(heap[parentpos], item)
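One possible shape for the subtest idea, as a minimal sketch: a single test loops over a min-heap case and a max-heap case and checks each inside its own subTest. The class name HeapInvariantTest, the helper check_heap and the sample data are assumptions made for this example, and the max-heap is built by sorting in descending order rather than with any heapq max-heap function.

```python
import heapq
import operator
import unittest


class HeapInvariantTest(unittest.TestCase):
    def check_heap(self, heap, ordered):
        # ordered(parent, child) must hold for every non-root position.
        for pos in range(1, len(heap)):
            parentpos = (pos - 1) >> 1
            self.assertTrue(ordered(heap[parentpos], heap[pos]))

    def test_invariants(self):
        data = [5, 3, 8, 1, 9, 2, 7]
        min_heap = list(data)
        heapq.heapify(min_heap)
        # A list sorted in descending order is always a valid max-heap.
        max_heap = sorted(data, reverse=True)
        cases = [
            ("min", min_heap, operator.le),
            ("max", max_heap, operator.ge),
        ]
        for name, heap, ordered in cases:
            # A failure in one case is reported without hiding the other.
            with self.subTest(kind=name):
                self.check_heap(heap, ordered)
```

Whether this reduces duplication more than shared helper functions do depends on how different the min and max test bodies really are.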

ZeroIntensity (Member) left a comment

I think it would also be a good idea to get rid of the borrowed reference usage here.

yoney reacted with thumbs up emoji
yoney marked this pull request as ready for review June 2, 2025 17:38
yoney requested a review from rhettinger as a code owner June 2, 2025 17:38
colesbury added the "needs backport to 3.14" (bugs and security fixes) label Jun 2, 2025
colesbury (Contributor) commented

The Modules/_heapqmodule.c change looks good to me. I haven't looked through the tests yet. I don't think we should change the implementation to avoid borrowed references: I'd rather keep the change small and limited to the thread safety fix rather than try to "clean" things up, and I'm not really convinced that avoiding borrowed references here would make things better.

@yoney - would you please add a NEWS entry via blurb? You can use uvx blurb, if you have the uv tools, or pip install it, or use blurb-it.

yoney reacted with thumbs up emoji

ZeroIntensity (Member) commented

> I'd rather keep the change small and limited to the thread safety fix rather than try to "clean" things up, and I'm not really convinced that avoiding borrowed references here would make things better.

Ok, but we should definitely do this in a follow-up (possibly only for 3.15). There are definitely some things here that aren't safe. For example:

    lastelt = PyList_GET_ITEM(heap, n-1);
    Py_INCREF(lastelt);
    if (PyList_SetSlice(heap, n-1, n, NULL)) {
        Py_DECREF(lastelt);
        return NULL;
    }
    n--;

    if (!n)
        return lastelt;
    returnitem = PyList_GET_ITEM(heap, 0);

A finalizer could either release the critical section or explicitly clear the list, which could cause that PyList_GET_ITEM call to return NULL. I guess that's not related to borrowed references, though; it's more of a problem with PyList_GET_ITEM not doing validation.

There are also some incredibly horrible things going on, like this:

    returnitem = PyList_GET_ITEM(heap, 0);
    PyList_SET_ITEM(heap, 0, lastelt);

returnitem starts out as a borrowed reference, but then has the ownership implicitly handed off through PyList_SET_ITEM, which doesn't decref it.
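For readers less familiar with the C code, the control flow being discussed can be sketched in Python, where reference counting is invisible. This is only an approximation written for this note: pop_smallest and sift_down_from_root are hypothetical names, and the comments mark which C call each step loosely corresponds to.

```python
def sift_down_from_root(heap):
    # Restore the min-heap invariant after the root was replaced.
    pos, n = 0, len(heap)
    while True:
        child = 2 * pos + 1
        if child >= n:
            break
        right = child + 1
        if right < n and heap[right] < heap[child]:
            child = right  # pick the smaller of the two children
        if heap[pos] <= heap[child]:
            break
        heap[pos], heap[child] = heap[child], heap[pos]
        pos = child


def pop_smallest(heap):
    lastelt = heap.pop()   # C: PyList_GET_ITEM, Py_INCREF, PyList_SetSlice
    if not heap:           # C: if (!n) return lastelt;
        return lastelt
    returnitem = heap[0]   # C: PyList_GET_ITEM(heap, 0), a borrowed reference
    heap[0] = lastelt      # C: PyList_SET_ITEM(heap, 0, lastelt), no decref
    sift_down_from_root(heap)
    return returnitem
```

In Python the assignment to heap[0] keeps returnitem alive through the local variable; in the C version the same effect relies on PyList_SET_ITEM not decrementing the old slot, which is the implicit hand-off described above.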

yoney reacted with thumbs up emoji

yoney (Contributor, Author) commented

> Ok, but we should definitely do this in a follow-up (possibly only for 3.15). There are definitely some things here that aren't safe.

@ZeroIntensity I agree that there are things we should follow up on. I initially tried to address some of them as part of the free-threading change, but that made the change more complex and harder to review, so I decided to follow up on those issues separately.

ZeroIntensity reacted with thumbs up emoji

ZeroIntensity (Member) left a comment

Nitpicks

    heap = list(range(OBJECT_COUNT))
    shuffle(heap)

    def heapify_func(heap: list[int]):

Member:

We generally avoid type hints in tests because there's nothing that checks them.

mpage reacted with thumbs up emoji
yoney (Contributor, Author):

@ZeroIntensity Thank you so much for your review! I've addressed all the other comments except this one. I think type hints can still be useful for readability, and there might be checks for them in the future. Do you think I should remove them, or would it be okay to keep them?

Member:

I agree in general, but this has been brought up again and again, and as far as I know has always been rejected for the standard library. Here's the most recent thread on the topic: https://discuss.python.org/t/static-type-annotations-in-cpython/65068.

But maybe this is worth doing in tests? @AlexWaygood, as a typing expert and as someone in the thread that I linked, do you have any thoughts?

Member:

Unchecked type hints tend to go out of date astonishingly quickly due to the code being updated but people forgetting to update the type hints, and that often ends up being pretty confusing for readers. I agree that type hints can be great documentation, but I'd prefer no documentation to confusing or out-of-date documentation :-)

rhettinger removed their request for review June 3, 2025 05:20

@@ -128,7 +129,7 @@ Push item onto heap, maintaining the heap invariant.

 static PyObject *
 _heapq_heappush_impl(PyObject *module, PyObject *heap, PyObject *item)
-/*[clinic end generated code: output=912c094f47663935 input=7c69611f3698aceb]*/
+/*[clinic end generated code: output=912c094f47663935 input=f7a4f03ef8d52e67]*/
 {
     if (PyList_Append(heap, item))
Contributor:

PyList_Append acquires a critical section on heap:

    int
    PyList_Append(PyObject *op, PyObject *newitem)
    {
        if (PyList_Check(op) && (newitem != NULL)) {
            int ret;
            Py_BEGIN_CRITICAL_SECTION(op);
            ret = _PyList_AppendTakeRef((PyListObject *)op, Py_NewRef(newitem));
            Py_END_CRITICAL_SECTION();
            return ret;
        }
        PyErr_BadInternalCall();
        return -1;
    }

I think this is probably OK from a correctness perspective: if someone sneaks in and modifies the heap between PyList_Append releasing the critical section and us reacquiring the lock on heap, the call to siftdown should still execute correctly. However, it's not great for performance. It might be worth inlining the implementation of PyList_Append here, sans the calls to acquire/release the critical section, to address that.

yoney reacted with thumbs up emoji
Contributor:

I think #128126 avoids the problem we used to have where the second Py_BEGIN_CRITICAL_SECTION() would trigger a release and re-acquisition of the lock.

There's still some extra unnecessary overhead, so using _PyList_AppendTakeRef(), which assumes the caller holds the lock, makes sense to me.
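The pattern under discussion, a public entry point that takes the lock versus an internal helper that assumes the caller already holds it, can be sketched in Python. This is a loose analogy only: LockedMinHeap and its methods are hypothetical, and threading.RLock is a rough stand-in for a per-object critical section, not a description of how CPython implements one.

```python
import threading


class LockedMinHeap:
    """Hypothetical Python analogue; not how _heapq is implemented."""

    def __init__(self):
        # RLock lets the same thread re-acquire a lock it already holds.
        self._lock = threading.RLock()
        self._items = []

    def append(self, item):
        # Public entry point: takes the lock itself, like PyList_Append.
        with self._lock:
            self._append_unlocked(item)

    def _append_unlocked(self, item):
        # Assumes the caller already holds self._lock, in the spirit of
        # _PyList_AppendTakeRef.
        self._items.append(item)

    def push(self, item):
        # One acquisition covers both the append and the sift.
        with self._lock:
            self._append_unlocked(item)
            self._sift_toward_root(len(self._items) - 1)

    def _sift_toward_root(self, pos):
        # Assumes the caller holds self._lock.
        items = self._items
        while pos > 0:
            parentpos = (pos - 1) >> 1
            if items[parentpos] <= items[pos]:
                break
            items[parentpos], items[pos] = items[pos], items[parentpos]
            pos = parentpos

    def snapshot(self):
        # Copy the contents under the lock for inspection.
        with self._lock:
            return list(self._items)
```

Calling append from inside push would still be correct here, since the lock is re-entrant, but it would pay for a second acquire and release that the unlocked helper avoids.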

yoney reacted with thumbs up emoji
yoney (Contributor, Author) commented Jun 3, 2025

@mpage, @colesbury, thank you for the review; updated to use _PyList_AppendTakeRef().

yoney (Contributor, Author) commented

> There is a lot of duplication in tests because of min/max heaps, why not organize with subtests?

@StanFromIreland Thank you so much for your review! I've already refactored the code and moved some repeated parts into separate functions while addressing the other comments. I'm not sure if subtests will provide more code reuse here. What do you think?

Reviewers

mpage left review comments
ZeroIntensity left review comments
AlexWaygood left review comments
StanFromIreland left review comments
colesbury: awaiting requested review


6 participants: @yoney, @colesbury, @ZeroIntensity, @mpage, @AlexWaygood, @StanFromIreland
