gh-132551: make BytesIO object free-thread safe#132616
Conversation
tom-pytel commented Apr 16, 2025
ZeroIntensity left a comment
The implementation looks fine, but I'm worried that concurrent use of BytesIO is an actual use case that we need to consider.
For example, I could see someone using a logger that writes to an IO-like object from multiple threads. If they wanted to log to a BytesIO buffer (e.g., for temporarily suppressing the output), then that would scale pretty terribly with critical sections. I'm speculating, though.
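To make the hypothesized workload concrete, here is a small sketch (the logger name, thread counts, and wiring are mine, not from the thread) of several threads logging into one BytesIO through a TextIOWrapper. Every record already funnels through the handler's own lock, and on a free-threaded build with this PR each `write` would additionally take the buffer's critical section:

```python
import logging
import threading
from io import BytesIO, TextIOWrapper

# Shared capture buffer; StreamHandler writes str, so wrap the bytes buffer.
buf = BytesIO()
stream = TextIOWrapper(buf, encoding="utf-8", write_through=True)

handler = logging.StreamHandler(stream)
logger = logging.getLogger("capture-demo")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)

def worker(n):
    # Each thread emits 100 records into the shared handler/buffer.
    for i in range(100):
        logger.info("thread %d message %d", n, i)

threads = [threading.Thread(target=worker, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

stream.flush()
lines = buf.getvalue().splitlines()
print(len(lines))  # 4 threads x 100 messages = 400
```

Because `logging.Handler.emit` is already serialized by the handler's lock, the per-object BytesIO lock added by this PR is a second serialization point on this path, which is the contention being worried about above.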
tom-pytel commented Apr 16, 2025
Well, that's why I asked about a specific better benchmark to test with, because:

```python
from io import BytesIO
from random import randint
import threading

def write(barrier, b):
    barrier.wait()
    b.write(b'0' * randint(100, 1000))

def check(funcs, *args):
    barrier = threading.Barrier(len(funcs))
    thrds = []
    for func in funcs:
        thrd = threading.Thread(target=func, args=(barrier, *args))
        thrds.append(thrd)
        thrd.start()
    for thrd in thrds:
        thrd.join()

if __name__ == "__main__":
    count = 0
    while True:
        print(count := count + 1)
        check([write] * 10, BytesIO())
```
ZeroIntensity commented Apr 16, 2025 • edited
Well, yeah, we need to fix the races. I'm just wondering how we should fix it. Would you mind benchmarking (with this PR) the case I brought up against different numbers of threads? I want to see how bad the contention gets (and compare it to a GILful build as well).
tom-pytel commented Apr 16, 2025
Ok, will do a bunch of tests tomorrow.
tom-pytel commented Apr 17, 2025 • edited
EDIT: I just realized that with a good bit of rewriting, lock-free reads might be possible. EDIT2: Nope, see below.

TL;DR: GIL build performance is the same as expected, while noGIL makes multithreaded writes possible and otherwise seems ok.

Detail: on noGIL, the PR has a minor hit to single-threaded read and a smaller hit to single-threaded write (command-line timeit). Multithreaded timings are unstable (especially write; run the script and you will get 2x variation for that), but read looks slower than main, while write is much faster than main with explicit locking (which is the only thing to compare it to, as otherwise writes crash). Times in ns, lower is better.

Ping @ZeroIntensity, timings as you requested.

```python
import sys
from io import BytesIO
from threading import Barrier, Lock, Thread
from time import time_ns

BESTOF = 5
NLINES = 50_000
LINE = b'.' * 99 + b'\n'
LLEN = len(LINE)
LOCK = Lock()

def read(barrier, out, b):
    barrier.wait()
    t0 = time_ns()
    while b.read(LLEN):
        pass
    out[0] = time_ns() - t0

def readline(barrier, out, b):
    barrier.wait()
    t0 = time_ns()
    while b.readline():
        pass
    out[0] = time_ns() - t0

def write_locked(barrier, out, b):
    barrier.wait()
    t0 = time_ns()
    for _ in range(NLINES):
        with LOCK:
            b.write(LINE)
    out[0] = time_ns() - t0

def write(barrier, out, b):
    barrier.wait()
    t0 = time_ns()
    for _ in range(NLINES):
        b.write(LINE)
    out[0] = time_ns() - t0

def check(func, nthrds, b, is_wr):
    # is_wr compensates for write total size *= nthrds while read total size is fixed
    barrier = Barrier(nthrds)
    outs = []
    thrds = []
    for _ in range(nthrds):
        out = [0]
        thrd = Thread(target=func, args=(barrier, out, b))
        outs.append(out)
        thrds.append(thrd)
        thrd.start()
    for thrd in thrds:
        thrd.join()
    return sum(o[0] for o in outs) / ((nthrds if is_wr else 1) * NLINES)

def checkbest(func, nthrds, setup, is_wr):
    best = float('inf')
    for _ in range(BESTOF):
        best = min(best, check(func, nthrds, setup(), is_wr))
    return best

if __name__ == "__main__":
    nthrds = int((sys.argv + ['1'])[1])
    print('read:         ', checkbest(read, nthrds, lambda: BytesIO(LINE * NLINES), False), 'ns')
    print('readline:     ', checkbest(readline, nthrds, lambda: BytesIO(LINE * NLINES), False), 'ns')
    print('write_locked: ', checkbest(write_locked, nthrds, lambda: BytesIO(), True), 'ns')
    print('write:        ', checkbest(write, nthrds, lambda: BytesIO(), True), 'ns')
```
serhiy-storchaka left a comment
Many functions are just wrapped in a critical section. I wonder, what if we add a special flag, METH_FT_SYNC, and when it is set, wrap the method-calling code in a critical section?
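The suggestion is about CPython's C method-dispatch machinery, but the idea can be sketched in Python (all names here, including the flag value and the classes, are hypothetical illustrations, not CPython code): the dispatcher checks a per-method flag and, if set, enters the object's critical section (modeled as a per-object lock) before invoking the method:

```python
import threading

METH_FT_SYNC = 0x1000  # hypothetical flag value, mirroring METH_* constants

def method(flags=0):
    """Attach METH_*-style flags to a method (illustrative only)."""
    def deco(fn):
        fn._flags = flags
        return fn
    return deco

class SyncDispatch:
    """Model of a call path where the *dispatcher*, not each method body,
    takes the object's critical section for flagged methods."""
    def __init__(self):
        self._cs = threading.RLock()  # stands in for the object's critical section

    def call(self, name, *args):
        fn = getattr(type(self), name)
        if getattr(fn, "_flags", 0) & METH_FT_SYNC:
            with self._cs:            # lock acquired generically by the dispatcher
                return fn(self, *args)
        return fn(self, *args)        # unflagged methods run without locking

class Buf(SyncDispatch):
    def __init__(self):
        super().__init__()
        self.data = bytearray()

    @method(METH_FT_SYNC)
    def write(self, chunk):
        self.data += chunk
        return len(chunk)

b = Buf()
threads = [
    threading.Thread(target=lambda: [b.call("write", b"x" * 10) for _ in range(1000)])
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(b.data))  # 4 threads * 1000 calls * 10 bytes = 40000
```

The appeal is that the method bodies need no per-function critical-section boilerplate; the drawback raised later in the thread is that only calls going through this dispatch path get the lock.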
tom-pytel commented Apr 17, 2025
If you're talking about stuff like
serhiy-storchaka commented Apr 17, 2025
I mean that this would work also for methods that do not use Argument Clinic. Was this idea discussed before?
tom-pytel commented Apr 17, 2025
Couldn't tell you that, I'm relatively new here (: But one drawback I can see is that it would only work when calling via the Python call mechanism; if you called directly from C you would still have to take the critical section on your own, as opposed to a clinic-generated function, which would have it, no?
ZeroIntensity commented Apr 17, 2025
Yeah, I've suggested it in the past. Apparently, it's been tried and failed before. I don't know much of the details.
tom-pytel commented Apr 17, 2025
Ok, had a look, and lock-free reads can almost certainly be done, but they would not have the data-integrity guarantees you would want from a file-like object without sync. They would work the same way lock-free multithreaded iteration works: you could get duplicated data, and the position might not sync correctly wrt writes, which is not what you expect from a "file". So scratch that idea.
5dd3a3a into python:main
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
ngoldbaum commented Jul 24, 2025
Just to clarify - this didn't make it into 3.14, right?
tom-pytel commented Jul 24, 2025
Looking at the code I don't see it, so no.
ngoldbaum commented Jul 24, 2025 • edited
OK - I think we'll make it so pytest-run-parallel marks tests that use
I happened to hit the issues fixed by this PR running the tests for
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
picnixz commented Sep 2, 2025
@kumaraditya303 should this be backported?
kumaraditya303 commented Sep 3, 2025
I didn't backport this, as the change is large and it only crashes when multiple threads modify it concurrently, which shouldn't be done.
tom-pytel commented Sep 5, 2025
It was at the end of the 3.14 cycle, so a backport shouldn't be very painful; if you want, I could have a look.
ngoldbaum commented Sep 5, 2025
A backport would be appreciated, I think it would help resolve this discussion and get a downstream release supporting 3.14 out: pycompression/python-isal#233 (comment)
Thanks @tom-pytel for the PR, and @kumaraditya303 for merging it 🌮🎉. I'm working now to backport this PR to: 3.14.
Sorry, @tom-pytel and @kumaraditya303, I could not cleanly backport this to
GH-138551 is a backport of this pull request to the 3.14 branch.
(cherry picked from commit 5dd3a3a)
Co-authored-by: Tomasz Pytel <tompytel@gmail.com>
Co-authored-by: Kumar Aditya <kumaraditya@python.org>
kumaraditya303 commented Sep 5, 2025
I created #138551 to backport it if it helps, although I think that the isal issue is about sharing
Full pyperformance suite times (can anyone suggest a better benchmark for BytesIO?):

Changes: * means the function had a critical section added, - means not. Atomics were added for the exports field.

Notes:
- buf and/or pos atomic everywhere.
- test_free_threading.test_io.TestBytesIO clean under --parallel-threads TSAN.
- bytesiobuf_getbuffer doesn't need critical sections because, as far as I see, it's only used for BytesIO in bytesio.c, where its use is protected by a critical section in _io_BytesIO_getbuffer_impl (it's also used for a random non-BytesIO-related test in _testcapimodule.c).
- Linked issue: BytesIO unshare_buffer in threads on a free-threaded build #132551
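The critical sections added by this PR are per-object, not global: each BytesIO serializes only its own methods. A rough Python model of that design point (the wrapper class here is hypothetical, not the C implementation) is a per-object lock, which, unlike the single shared `Lock` used in the `write_locked` benchmark above, lets threads writing to different buffers proceed without contending with each other:

```python
import threading
from io import BytesIO

class LockedBytesIO:
    """Rough model of per-object critical sections: each buffer carries
    its own lock, so only threads touching the *same* buffer contend."""
    def __init__(self, initial=b""):
        self._buf = BytesIO(initial)
        self._lock = threading.Lock()  # models this object's critical section

    def write(self, chunk):
        with self._lock:
            return self._buf.write(chunk)

    def getvalue(self):
        with self._lock:
            return self._buf.getvalue()

def hammer(buf, chunk, n):
    # Each thread appends n chunks to its assigned buffer.
    for _ in range(n):
        buf.write(chunk)

a, b = LockedBytesIO(), LockedBytesIO()
# Two threads per buffer; threads on buffer `a` never block threads on `b`.
threads = [threading.Thread(target=hammer, args=(buf, b"x" * 8, 500))
           for buf in (a, b) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(a.getvalue()), len(b.getvalue()))  # 2 threads * 500 * 8 = 8000 each
```

Under a global lock, all four threads would serialize against one another; with per-object locking, worst-case contention is bounded by the number of threads sharing a single buffer, which is the scaling concern discussed at the top of the thread.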