gh-133885: Use locks instead of critical sections for _zstd #134289


Merged
colesbury merged 18 commits into python:main from emmatyping:zstd-mutex-locks
May 23, 2025

Conversation

@emmatyping (Member) commented May 19, 2025 (edited by bedevere-app bot)

Move from using critical sections to locks for the (de)compression methods. Since the methods allow other threads to run, we should use a lock rather than a critical section.

I'm not totally sure about the removal of the critical sections in the _get_(C|D)Dict functions, but it should be safe to yield to other threads in an initializer, as the values cannot be modified (ZstdDicts are immutable).

Also remove critical sections around code that is only run in the initializer.
@ZeroIntensity (Member) left a comment
This seems like a downgrade. Though rare, functions like PyMem_Malloc can call arbitrary Python code, so we lose our protection against deadlocks. What's the rationale here?

@colesbury self-requested a review May 20, 2025 17:52
@colesbury (Contributor)

@ZeroIntensity - PyMem_Malloc is not allowed to call back into Python.

@ZeroIntensity (Member)

Oh, that's news to me. When did that change?

@colesbury (Contributor)

@ZeroIntensity - that's always been the case, as far as I know. A lot of things would break if PyMem_Malloc were allowed to call back into Python. PyObject_GC_Malloc used to be able to call back into Python (via triggering GC), but that hasn't been the case since 3.12.

@ZeroIntensity (Member)

PyObject_GC_Malloc used to be able to call back into Python (via triggering GC), but that hasn't been the case since 3.12.

Ah, that's what I was thinking of.

We do need to document this contract somewhere. I'm imagining a case where an embedder sets their own allocator that calls Python code, because they have an attached thread state. The documentation doesn't say you can't do that, as far as I can tell.

@ZeroIntensity (Member) left a comment

Ok, I understand this a little better now, sorry!

I'm a bit iffy about the lock ordering here. I think the lock is only held across non-reentrant calls? PyErr_SetString and friends seem to be safe, but it looks like they might be able to re-enter if someone did weird things to the exception state. I wouldn't worry about it, really, but it's worth bringing up.

@dpdani (Contributor)

Are we sure about the implications of implicit locking in de/compressor contexts?
I believe we might carefully implement this now only to discourage it later.

If two threads concurrently write into a compressor context with implicit locking they might succeed in corrupting their data. Say:

  1. thread 1 writes <html>
  2. thread 2 writes a png file
  3. thread 1 writes </html>

Or pick whichever other interleaving of the two threads.
The corruption is only evident after decompressing back the data: from zstd's point of view there is nothing wrong with the above.

This is just one example, but I think there's reasonable expectation that users won't want to share de/compressor contexts anyways.

I think raising an exception when detecting concurrent use is a more desirable outcome.
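The interleaving above can be simulated deterministically. This is a minimal sketch; zlib's `compressobj` is used here as a stand-in for a zstd compression context (an assumption for portability — the property is the same for any streaming compressor): the interleaved stream decompresses without any error from the codec, yet the contents are corrupted from the application's point of view.

```python
import zlib

# One shared compression context; the three compress() calls model the
# interleaved writes of thread 1, thread 2, thread 1.
c = zlib.compressobj()
stream = b"".join([
    c.compress(b"<html>"),       # thread 1
    c.compress(b"\x89PNG..."),   # thread 2 interleaves its write
    c.compress(b"</html>"),      # thread 1
    c.flush(),
])

# The codec sees nothing wrong: the stream round-trips cleanly...
data = zlib.decompress(stream)
print(data)  # b'<html>\x89PNG...</html>' -- the png spliced into the HTML
```

The corruption is only visible to the application after decompression, exactly as described above.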

@emmatyping (Member, Author)

If two threads concurrently write into a compressor context with implicit locking they might succeed in corrupting their data. Say:

  1. thread 1 writes <html>
  2. thread 2 writes a png file
  3. thread 1 writes </html>

I think I would say don't write different data to the same stream with threading. I can append things to a shared list that makes it invalid for what I want but that is a programmer error. I don't particularly see a reason to share a compression context between threads, but I do think we might see people move a compression context into a thread to do all the compression in another thread, and we should try to support that.
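The "move a compression context into a thread" pattern mentioned here can be sketched as follows. This is a hedged illustration, not the module's API: zlib's `compressobj` stands in for a zstd context, and the queue-based worker is one common way to give a single thread ownership of the context.

```python
import queue
import threading
import zlib

def compression_worker(jobs: "queue.Queue[bytes | None]", out: list) -> None:
    # The context is created and used only in this thread, so no other
    # thread ever touches it and no cross-thread locking is needed.
    comp = zlib.compressobj()
    while (chunk := jobs.get()) is not None:
        out.append(comp.compress(chunk))
    out.append(comp.flush())

jobs: "queue.Queue[bytes | None]" = queue.Queue()
compressed: list = []
worker = threading.Thread(target=compression_worker, args=(jobs, compressed))
worker.start()

for chunk in (b"hello ", b"world"):
    jobs.put(chunk)     # other threads only hand raw bytes to the owner
jobs.put(None)          # sentinel: no more data
worker.join()

assert zlib.decompress(b"".join(compressed)) == b"hello world"
```

Because the context never leaves the worker, this usage is safe regardless of whether the module uses locks or critical sections internally.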

This also reverts the setdefault usage, which we don't need anymore, and refactors the _zstd_load_(c,d)_dict functions to not use goto and to properly lock around dictionary usage.
@dpdani (Contributor)

Oh yes, I agree.

Let me explain a bit better what my intention was with python-zstandard.

Instead of using a lock-based approach, I used a flag to signify that a de/compressor context was in use by some thread.
If another thread tried to set the same de/compressor's flag (with an atomic compare-and-set operation) while it was still in use, then it would raise an exception.

The differences with using locks are:

  1. a ConcurrentUseException is raised when appropriate, which should
  2. make users aware that data corruption may occur, caused by multithreading.

And, in both approaches:

  1. no segmentation fault can occur in the underlying zstd implementation.

IIUC, you alluded to an ownership transfer model. Note that this approach is not in contrast with that model. If a de/compressor object is created by one thread and then transferred to a different thread for use, no exception is raised. The only problematic usage that the exception would signal is when a thread calls into a context method while another thread was already using the context.

In fact, this is not even in contrast with other thread synchronization mechanisms that don't use ownership transfer (e.g. barriers): if the program makes sure that there is no concurrent access, then the exception is never raised.

Of course, the choice is up to you. I just wanted to make sure we considered both approaches.
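The flag-based scheme described above can be sketched in pure Python. Assumptions are flagged in the comments: `GuardedCompressor` and `ConcurrentUseError` are hypothetical names (the thread mentions python-zstandard using an atomic compare-and-set; here a non-blocking lock acquire plays that role), and zlib again stands in for a zstd context.

```python
import threading
import zlib

class ConcurrentUseError(RuntimeError):
    """Hypothetical exception signalling concurrent use of one context."""

class GuardedCompressor:
    def __init__(self):
        self._comp = zlib.compressobj()   # stand-in for a zstd context
        self._in_use = threading.Lock()   # the "in use" flag

    def compress(self, data: bytes) -> bytes:
        # Non-blocking acquire emulates an atomic compare-and-set on the
        # flag: fail loudly instead of silently serializing the callers.
        if not self._in_use.acquire(blocking=False):
            raise ConcurrentUseError("compressor is in use by another thread")
        try:
            return self._comp.compress(data)
        finally:
            self._in_use.release()

c = GuardedCompressor()
c.compress(b"ok")        # sequential use (or ownership transfer): no error
c._in_use.acquire()      # simulate another thread being inside compress()
try:
    c.compress(b"boom")
except ConcurrentUseError as exc:
    print("raised:", exc)
finally:
    c._in_use.release()
```

As noted above, a thread that is handed the object after another thread has finished with it never trips the flag; only truly overlapping calls raise.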

@ZeroIntensity (Member)

Hmm, I don't think we should be inventing object ownership models here. That makes sense for a third-party, but that sort of thing should get its own proposal for the stdlib as a whole.

Locking should be sufficient, right? If torn writes at the method-level are an issue, we should just document that users need to have an additional layer of synchronization when using compressors concurrently.

- Assert locks are held in `_get_(C/D)Dict`
- Update self->d_dict outside `Py_BEGIN_ALLOW_THREADS`
- Removed critical section AC I missed
@ZeroIntensity (Member)

!buildbot nogil refleak

@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @ZeroIntensity for commit d142aa8 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F134289%2Fmerge

The command will test the builders whose names match the following regular expression: nogil refleak

The builders matched are:

  • AMD64 CentOS9 NoGIL Refleaks PR
  • PPC64LE Fedora Rawhide NoGIL refleaks PR
  • s390x Fedora Rawhide NoGIL refleaks PR
  • aarch64 Fedora Rawhide NoGIL refleaks PR
  • AMD64 Fedora Rawhide NoGIL refleaks PR

@ZeroIntensity (Member) left a comment

Yay, buildbots passed! LGTM too, apart from one (speculative) concern about re-entrancy.

@emmatyping (Member, Author)

!buildbot nogil refleak

@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @emmatyping for commit 69a755b 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F134289%2Fmerge

The command will test the builders whose names match the following regular expression: nogil refleak

The builders matched are:

  • AMD64 CentOS9 NoGIL Refleaks PR
  • PPC64LE Fedora Rawhide NoGIL refleaks PR
  • AMD64 Fedora Rawhide NoGIL refleaks PR
  • s390x Fedora Rawhide NoGIL refleaks PR
  • aarch64 Fedora Rawhide NoGIL refleaks PR

@ZeroIntensity (Member) left a comment

LGTM assuming buildbots pass

@emmatyping (Member, Author)

Hm, that CentOS 9 buildbot doesn't have zstd installed, but it is failing due to unrelated issues.

@emmatyping (Member, Author)

(FWIW, I ran the refleak hunter locally on a nogil build and it passed)

@emmatyping (Member, Author)

Opened #134557 for the free-threaded buildbot refleak

@colesbury merged commit 8dbc119 into python:main May 23, 2025
46 of 49 checks passed
@miss-islington-app

Thanks @emmatyping for the PR, and @colesbury for merging it 🌮🎉. I'm working now to backport this PR to 3.14.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request May 23, 2025

…thon gh-134289) Move from using critical sections to locks for the (de)compression methods. Since the methods allow other threads to run, we should use a lock rather than a critical section. (cherry picked from commit 8dbc119) Co-authored-by: Emma Smith <emma@emmatyping.dev>
@bedevere-app

GH-134560 is a backport of this pull request to the 3.14 branch.

@bedevere-app bot removed the needs backport to 3.14 (bugs and security fixes) label May 23, 2025
@emmatyping (Member, Author)

Thanks all for the reviews! Thinking through free threading reentrancy at the C level is something I'm still getting a feel for. Chatting about this fix has definitely helped with my understanding :)


colesbury pushed a commit that referenced this pull request May 23, 2025

…h-134289) (gh-134560) Move from using critical sections to locks for the (de)compression methods. Since the methods allow other threads to run, we should use a lock rather than a critical section. (cherry picked from commit 8dbc119) Co-authored-by: Emma Smith <emma@emmatyping.dev>
Reviewers

@kumaraditya303 left review comments

@colesbury approved these changes

@ZeroIntensity approved these changes

6 participants
@emmatyping @colesbury @ZeroIntensity @dpdani @bedevere-bot @kumaraditya303
