Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-128942: make arraymodule.c free-thread safe#128943

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
kumaraditya303 merged 40 commits intopython:mainfromtom-pytel:fix-issue-128942
Feb 27, 2025

Conversation

tom-pytel
Copy link
Contributor

@tom-pyteltom-pytel commentedJan 17, 2025
edited by bedevere-appbot
Loading

This PR is a work in progress but it currently fixes the crashes mentioned in the related issue so I am sending it up now to elicit feedback to make sure I am on the right path. So far the changes have been made functional but as minimal as possible to facilitate review. The critical sections are fairly broad so optimization may follow if functionality is confirmed. The changes are mostly additions of critical sections to publicly visible methods and slots gotten from the two structures listed below (as well as a few internal functions):

staticPyMethodDefarray_methods[]= {ARRAY_ARRAY_APPEND_METHODDEFARRAY_ARRAY_BUFFER_INFO_METHODDEFARRAY_ARRAY_BYTESWAP_METHODDEFARRAY_ARRAY_CLEAR_METHODDEFARRAY_ARRAY___COPY___METHODDEFARRAY_ARRAY_COUNT_METHODDEFARRAY_ARRAY___DEEPCOPY___METHODDEFARRAY_ARRAY_EXTEND_METHODDEFARRAY_ARRAY_FROMFILE_METHODDEFARRAY_ARRAY_FROMLIST_METHODDEFARRAY_ARRAY_FROMBYTES_METHODDEFARRAY_ARRAY_FROMUNICODE_METHODDEFARRAY_ARRAY_INDEX_METHODDEFARRAY_ARRAY_INSERT_METHODDEFARRAY_ARRAY_POP_METHODDEFARRAY_ARRAY___REDUCE_EX___METHODDEFARRAY_ARRAY_REMOVE_METHODDEFARRAY_ARRAY_REVERSE_METHODDEFARRAY_ARRAY_TOFILE_METHODDEFARRAY_ARRAY_TOLIST_METHODDEFARRAY_ARRAY_TOBYTES_METHODDEFARRAY_ARRAY_TOUNICODE_METHODDEFARRAY_ARRAY___SIZEOF___METHODDEF    {"__class_getitem__",Py_GenericAlias,METH_O|METH_CLASS,PyDoc_STR("See PEP 585")},    {NULL,NULL}/*sentinel*/};staticPyType_Slotarray_slots[]= {    {Py_tp_dealloc,array_dealloc},    {Py_tp_repr,array_repr},    {Py_tp_getattro,PyObject_GenericGetAttr},    {Py_tp_doc, (void*)arraytype_doc},    {Py_tp_richcompare,array_richcompare},    {Py_tp_iter,array_iter},    {Py_tp_methods,array_methods},    {Py_tp_members,array_members},    {Py_tp_getset,array_getsets},    {Py_tp_alloc,PyType_GenericAlloc},    {Py_tp_new,array_new},    {Py_tp_traverse,array_tp_traverse},/*assequence*/    {Py_sq_length,array_length},    {Py_sq_concat,array_concat},    {Py_sq_repeat,array_repeat},    {Py_sq_item,array_item},    {Py_sq_ass_item,array_ass_item},    {Py_sq_contains,array_contains},    {Py_sq_inplace_concat,array_inplace_concat},    {Py_sq_inplace_repeat,array_inplace_repeat},/*asmapping*/    {Py_mp_length,array_length},    {Py_mp_subscript,array_subscr},    {Py_mp_ass_subscript,array_ass_subscr},/*asbuffer*/    {Py_bf_getbuffer,array_buffer_getbuf},    {Py_bf_releasebuffer,array_buffer_relbuf},    {0,NULL},};

The following public methods/slots have been modified directly:

* array_array_buffer_info   = ARRAY_ARRAY_BUFFER_INFO_METHODDEF* array_array_byteswap      = ARRAY_ARRAY_BYTESWAP_METHODDEF* array_array_clear         = ARRAY_ARRAY_CLEAR_METHODDEF* array_array_count         = ARRAY_ARRAY_COUNT_METHODDEF* array_array_fromlist      = ARRAY_ARRAY_FROMLIST_METHODDEF  / CRITICAL2* array_array_fromunicode   = ARRAY_ARRAY_FROMUNICODE_METHODDEF* array_array_index         = ARRAY_ARRAY_INDEX_METHODDEF* array_array_pop           = ARRAY_ARRAY_POP_METHODDEF* array_array_remove        = ARRAY_ARRAY_REMOVE_METHODDEF* array_array_reverse       = ARRAY_ARRAY_REVERSE_METHODDEF* array_array_tofile        = ARRAY_ARRAY_TOFILE_METHODDEF* array_array_tolist        = ARRAY_ARRAY_TOLIST_METHODDEF* array_array_tobytes       = ARRAY_ARRAY_TOBYTES_METHODDEF* array_array_tounicode     = ARRAY_ARRAY_TOUNICODE_METHODDEF* array_array___sizeof__    = ARRAY_ARRAY___SIZEOF___METHODDEF* array_repr                = Py_tp_repr* array_richcompare         = Py_tp_richcompare  / CRITICAL2* array_new                 = Py_tp_new* array_length              = Py_sq_length* array_concat              = Py_sq_concat  / CRITICAL2* array_repeat              = Py_sq_repeat* array_item                = Py_sq_item* array_ass_item            = Py_sq_ass_item* array_contains            = Py_sq_contains* array_inplace_repeat      = Py_sq_inplace_repeat* array_length              = Py_mp_length* array_subscr              = Py_mp_subscript  -> array_item* array_ass_subscr          = Py_mp_ass_subscript  / CRITICAL2* array_buffer_getbuf       = Py_bf_getbuffer* array_buffer_relbuf       = Py_bf_releasebuffer

The following have not been modified but depend for safety on the functions they call:

+ array_array_append        = ARRAY_ARRAY_APPEND_METHODDEF  -> ins+ array_array___copy__      = ARRAY_ARRAY___COPY___METHODDEF  -> array_slice+ array_array___deepcopy__  = ARRAY_ARRAY___DEEPCOPY___METHODDEF  -> array_array___copy__+ array_array_extend        = ARRAY_ARRAY_EXTEND_METHODDEF  -> array_do_extend  / CRITICAL2+ array_array_frombytes     = ARRAY_ARRAY_FROMBYTES_METHODDEF  -> frombytes  / CRITICAL2+ array_array_insert        = ARRAY_ARRAY_INSERT_METHODDEF  -> ins+ array_array___reduce_ex__ = ARRAY_ARRAY___REDUCE_EX___METHODDEF  -> array_array_tolist_impl, array_array_tobytes_impl+ array_inplace_concat      = Py_sq_inplace_concat  -> array_do_extend  / CRITICAL2

The following look safe as is:

- array_array_fromfile      = ARRAY_ARRAY_FROMFILE_METHODDEF  -> array_array_frombytes- array_dealloc             = Py_tp_dealloc- array_iter                = Py_tp_iter- array_tp_traverse         = Py_tp_traverse

And the following non-public utility functions have been made safe:

* array_slice* array_do_extend  / CRITICAL2* ins* frombytes  / CRITICAL2

Here are a few questions needing guidance:

  • Changed clinic sig ofarray_array_frombytes() so I could lock the original source object during op (maybe its a writeable bytearray). Want to prevent modification of data during the operation but not sure this will even do this so not sure is necessary or even desired. Should I remove this and leave just thePyObject_GetBuffer() for protection? Also for this added a forward declaration ofarray_array_frombytes() forarray_array_fromfile_impl() instead of moving stuff around for minimal impact.

  • Inarray_new there may or may not be an initializer object, if there is not then theargs tuple is used as a stand-in for the critical section lock. Is there a better way to do this?

Misc:

  • array_richcompare was incrementing refcount ofPy_True andPy_False, removed that as they are immortal.

  • The following were larger / slightly more complicated functions to check so should double checkarray_array_frombytes,array_richcompare,array_ass_subscr andarray_new.

More work is still needed, like the array iterator and more testing and verification.

erlend-aasland reacted with thumbs up emoji
@bedevere-app
Copy link

Most changes to Pythonrequire a NEWS entry. Add one using theblurb_it web app or theblurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply theskip news label instead.

@tom-pytel
Copy link
ContributorAuthor

Requesting a look from@ZeroIntensity.

Copy link
Member

@ZeroIntensityZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thanks for doing this! At a glance, don't usegoto with critical sections--it's messier and can be error-prone (IIRC there's problems with it compiling on MSVC too). Instead, create a wrapper function like this:

staticPyObject*array_something(arrayobject*self,PyObject*arg){PyObject*res;Py_BEGIN_CRITICAL_SECTION(self);res=array_something_lock_held();// the actual implementation functionPy_END_CRITICAL_SECTION();returnres;}

For functions that are defined with theArgument Clinic, you can use@critical_section to do this even faster. (Seewhat I did for thessl module.)

@tom-pytel
Copy link
ContributorAuthor

tom-pytel commentedJan 17, 2025
edited
Loading

Made requested changes, also made arrayiter safe.

Also found a good old-fashioned non-free-threaded bug which probably goes back a long way - if try toarrayiter.__setstate__() when it is exhausted then segfault, fixed here obviously.

Question remaining about need to lock bytes source object infrombytes() or if holding the buffer is sufficient?

@erlend-aasland
Copy link
Contributor

Also found a good old-fashioned non-free-threaded bug which probably goes back a long way - if try toarrayiter.__setstate__() when it is exhausted then segfault, fixed here obviously.

Could you separate that out from this PR and submit a dedicated PR for this bug? We might want to backport it.

@erlend-aasland
Copy link
Contributor

Regarding the style nits: please make sure all new code (and inmost cases, also changed code) followPEP-7.

@tom-pytel
Copy link
ContributorAuthor

Still would like to know ifarray_array_frombytes() really needs to lock thebytes object or if just holding the buffer is sufficient. The idea being trying to prevent the CONTENTS of the buffer changing during the operation. Just holding buffer is not enough if is writable but not sure locking object would help that anyway.

Copy link
Member

@ZeroIntensityZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

We're getting there! Now, we're overlocking things a bit. This isn't always the rule, but in general you only need to lock something if you want to avoid inconsistent state (making it a "critical section"!).

That, and keep in mind that things that have a_Py orPy prefix are generally thread safe themselves and will lock if they need to. You don't need to lock for every call to Python.

@kumaraditya303
Copy link
Contributor

./python -mtest test_array -R 3:3Using random seed: 19191006670:00:00 load avg: 36.22 Run 1 test sequentially in a single process0:00:00 load avg: 36.22 [1/1] test_arraybeginning 6 repetitions. Showing number of leaks (. for 0 or less, X for 10 or more)123:456XX. ...0:00:16 load avg: 28.56 [1/1] test_array passed== Tests result: SUCCESS ==1 test OK.Total duration: 16.1 secTotal tests: run=886 skipped=1Total test files: run=1/1Result: SUCCESS

No refleaks.

ZeroIntensity and tom-pytel reacted with hooray emoji

@kumaraditya303kumaraditya303enabled auto-merge (squash)February 27, 2025 13:37
@tom-pytel
Copy link
ContributorAuthor

No refleaks.

Thx for the onboarding with this@ZeroIntensity

ZeroIntensity reacted with heart emoji

@kumaraditya303kumaraditya303 merged commit8ba0d7b intopython:mainFeb 27, 2025
42 checks passed
@colesbury
Copy link
Contributor

The PR regressed free threading performance substantially on the scimark benchmarks from pyperformance, so I'm going to revert the PR for now. We'll need to figure out a way of makingarraymodule.c thread-safe without regressing perf.

tom-pytel and ZeroIntensity reacted with thumbs up emoji

colesbury added a commit to colesbury/cpython that referenced this pull requestFeb 28, 2025
…)"The change regressed performance on scimark benchmarks from thepyperformance benchmark suite.This reverts commit8ba0d7b.
@tom-pytel
Copy link
ContributorAuthor

The PR regressed free threading performance substantially on the scimark benchmarks from pyperformance, so I'm going to revert the PR for now. We'll need to figure out a way of makingarraymodule.c thread-safe without regressing perf.

Will play with it checking perf along the way to see if a lesser standard of thread-safe can work (ignore data integrity and just make sure nothing crashes).

@colesbury
Copy link
Contributor

colesbury commentedFeb 28, 2025
edited
Loading

I looked at it briefly under Linux'sperf record and noticed a few things:

  • The most important functions arearray_subscr,array_item, andarray_ass_subscr in that order. Probably nothing else matters much.
  • The critical section code is less efficient because it's in a shared library so there's a bunch of non-inlined calls to_PyThreadState_GetCurrent().

We can consider using a similar strategy asPyListObject where we avoid locking during most read-only single-element accesses. It definitely will complicate the implementation, though.

We can also consider movingarraymodule.c out of a shared library. I'm not really sure what the considerations are for that.

@tom-pytel
Copy link
ContributorAuthor

We can consider using a similar strategy asPyListObject where we avoid locking during most read-only single-element accesses. It definitely will complicate the implementation, though.

List deals with objects with their ownPyObject headers, array is just a blob of memory so not sure how much of that strategy translates.

colesbury added a commit that referenced this pull requestFeb 28, 2025
The change regressed performance on scimark benchmarks from thepyperformance benchmark suite.This reverts commit8ba0d7b.
@colesbury
Copy link
Contributor

I guess it'll be a bit simpler thanPyListObject then. The primary thread-safety hazard is attempting to read or write an entry inob_item while thearrayobject is concurrently resized andob_item is freed and re-allocated.

So if we adapt the list strategy it might look like:

  • Lock thearrayobject during resize and multi-element accesses (e.g.,array_subscr with slices)
  • Mark thearrayobject as shared if it's accessed by a non-owning thread (requires locking)
  • When resizing thearrayobject, use QSBR to delay the free ofob_item if thearrayobject is shared
  • Single-elementarray_subscr andarray_ass_subscr can generally avoid locking

@tom-pytel
Copy link
ContributorAuthor

I guess it'll be a bit simpler thanPyListObject then. The primary thread-safety hazard is attempting to read or write an entry inob_item while thearrayobject is concurrently resized andob_item is freed and re-allocated.

Simplest worst case scenario thing depending on performance could be a spinlock to make access toob_item andob_size shared atomic. Might lose integrity but keep speed and avoid crashes, put responsibility for the data validity on user in multithreaded scenarios.

In any case, got some crude numbers for impact of justarray_item,array_subscr andass_subscr by removing their locks to see rough best case scenario. For reference the order is: OLD array freethread main, NEW slow array freethread, NOLOCK slow array freethread without locking onarray_item,array_subscr orarray_ass_subscr:

### scimark_fft ###OLD:    Mean +- std dev: 238 ms +- 8 msNEW:    Mean +- std dev: 282 ms +- 11 msNOLOCK: Mean +- std dev: 247 ms +- 4 ms### scimark_lu ###OLD:    Mean +- std dev: 88.7 ms +- 3.9 msNEW:    Mean +- std dev: 95.0 ms +- 1.5 msNOLOCK: Mean +- std dev: 88.5 ms +- 1.8 ms### scimark_monte_carlo ###OLD:    Mean +- std dev: 53.4 ms +- 0.5 msNEW:    Mean +- std dev: 56.0 ms +- 0.5 msNOLOCK: Mean +- std dev: 53.9 ms +- 0.9 ms### scimark_sor ###OLD:    Mean +- std dev: 98.1 ms +- 1.7 msNEW:    Mean +- std dev: 100  ms +- 1 msNOLOCK: Mean +- std dev: 97.7 ms +- 1.3 ms### scimark_sparse_mat_mult ###OLD:    Mean +- std dev: 3.84 ms +- 0.14 msNEW:    Mean +- std dev: 4.71 ms +- 0.07 msNOLOCK: Mean +- std dev: 3.87 ms +- 0.13 ms

I have all weekend to play with this so will try various things and what you mentioned.

@tom-pytel
Copy link
ContributorAuthor

Continued in#130771.

seehwan pushed a commit to seehwan/cpython that referenced this pull requestApr 16, 2025
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
seehwan pushed a commit to seehwan/cpython that referenced this pull requestApr 16, 2025
…)" (python#130707)The change regressed performance on scimark benchmarks from thepyperformance benchmark suite.This reverts commit8ba0d7b.
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@colesburycolesburycolesbury left review comments

@picnixzpicnixzpicnixz left review comments

@erlend-aaslanderlend-aaslanderlend-aasland left review comments

@ZeroIntensityZeroIntensityZeroIntensity left review comments

@kumaraditya303kumaraditya303kumaraditya303 approved these changes

Assignees
No one assigned
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

6 participants
@tom-pytel@erlend-aasland@ZeroIntensity@kumaraditya303@colesbury@picnixz

[8]ページ先頭

©2009-2025 Movatter.jp