Description
Feature or enhancement
Proposal:
Code reading data in pure Python tends to make a buffer variable, call os.read(), which returns a separate newly allocated buffer of data, then copy/append that data onto the pre-allocated buffer [0]. That creates unnecessary extra buffer objects, as well as unnecessary copies. Provide os.readinto for directly filling a Buffer Protocol object.
os.readinto should closely mirror _Py_read, which underlies os.read, in order to get the same behaviors around retries as well as well-tested cross-platform support.
Move simple cases that use os.read (ex. [0]) to use the new API when it makes code simpler and more efficient. Potentially adding readinto to more readable/writeable file-like proxy objects or objects which transform the data (ex. Lib/_compression) is out of scope for this issue.
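The difference between the two patterns can be sketched as follows. This is a minimal illustration, not code from the proposal: it assumes `os.readinto(fd, buffer)` is available (Python 3.14+) and otherwise substitutes `os.readv`, which is Unix-only. The helper names `read_with_copy`, `read_in_place`, and `demo` are hypothetical.

```python
import os

# Assumption: os.readinto exists (added in Python 3.14). On older Unix
# Pythons this sketch substitutes os.readv, which also fills a buffer.
_readinto = getattr(os, "readinto", None) or (lambda fd, buf: os.readv(fd, [buf]))


def read_with_copy(fd, size):
    """Old pattern: os.read allocates a bytes object that is then copied."""
    buf = bytearray()
    buf += os.read(fd, size)  # extra allocation plus a copy into buf
    return buf


def read_in_place(fd, size):
    """New pattern: fill a caller-provided buffer directly."""
    buf = bytearray(size)
    n = _readinto(fd, buf)  # number of bytes written into buf
    del buf[n:]             # drop the unused tail
    return buf


def demo():
    r, w = os.pipe()
    try:
        os.write(w, b"hello")
        first = read_with_copy(r, 5)
        os.write(w, b"world")
        second = read_in_place(r, 16)
    finally:
        os.close(r)
        os.close(w)
    return bytes(first), bytes(second)
```

Both helpers return the same data; the second avoids the intermediate bytes object.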
[0]
Lines 1914 to 1921 in 298dda5

    # Wait for exec to fail or succeed; possibly raising an
    # exception (limited in size)
    errpipe_data = bytearray()
    while True:
        part = os.read(errpipe_read, 50000)
        errpipe_data += part
        if not part or len(errpipe_data) > 50000:
            break
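One possible shape for a size-limited loop like the one above, rewritten around a pre-allocated buffer. This is a sketch under assumptions, not the change made in the linked PR: `read_limited` is a hypothetical helper, `os.readinto` is assumed to exist (Python 3.14+, with an `os.readv` fallback on older Unix Pythons), and unlike the original loop it never collects more than `limit` bytes.

```python
import os

# Assumption: os.readinto exists (Python 3.14+); otherwise use os.readv (Unix).
_readinto = getattr(os, "readinto", None) or (lambda fd, buf: os.readv(fd, [buf]))


def read_limited(fd, limit=50000):
    """Read until EOF or until `limit` bytes have been collected."""
    data = bytearray(limit)  # single allocation, filled in place
    filled = 0
    with memoryview(data) as view:
        while filled < limit:
            # Slicing a memoryview does not copy; it exposes the unfilled tail.
            n = _readinto(fd, view[filled:])
            if n == 0:  # EOF
                break
            filled += n
    # The view is released here, so the bytearray can be resized.
    del data[filled:]
    return data
```

The memoryview has to be released before the final `del`, because a bytearray cannot be resized while a buffer export is active.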
cpython/Lib/multiprocessing/forkserver.py
Lines 384 to 392 in 298dda5

    def read_signed(fd):
        data = b''
        length = SIGNED_STRUCT.size
        while len(data) < length:
            s = os.read(fd, length - len(data))
            if not s:
                raise EOFError('unexpected EOF')
            data += s
        return SIGNED_STRUCT.unpack(data)[0]
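A fixed-length read like this maps naturally onto a pre-sized buffer. The following is a sketch of how it might look, not necessarily the code merged in the linked PR. It assumes `os.readinto` exists (Python 3.14+, with an `os.readv` fallback on older Unix Pythons), and the `'q'` format for `SIGNED_STRUCT` is an assumption made so the example is self-contained.

```python
import os
import struct

# Assumption: the struct format is a signed 64-bit integer.
SIGNED_STRUCT = struct.Struct('q')

# Assumption: os.readinto exists (Python 3.14+); otherwise use os.readv (Unix).
_readinto = getattr(os, "readinto", None) or (lambda fd, buf: os.readv(fd, [buf]))


def read_signed(fd):
    """Read exactly SIGNED_STRUCT.size bytes from fd and unpack them."""
    data = bytearray(SIGNED_STRUCT.size)
    unread = memoryview(data)  # window over the part not yet filled
    while unread:              # an empty memoryview is falsy
        count = _readinto(fd, unread)
        if count == 0:
            raise EOFError('unexpected EOF')
        unread = unread[count:]  # advance the window without copying
    return SIGNED_STRUCT.unpack(data)[0]
```

Short reads are handled by shrinking the window rather than concatenating bytes objects, so only one buffer is allocated per call.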
Lines 1695 to 1701 in 298dda5

    def readinto(self, b):
        """Same as RawIOBase.readinto()."""
        m = memoryview(b).cast('B')
        data = self.read(len(m))
        n = len(data)
        m[:n] = data
        return n
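With a buffer-filling primitive, a `readinto` method like the one above no longer needs to go through `read()` and copy the result. A minimal sketch, assuming `os.readinto` exists (Python 3.14+, with an `os.readv` fallback on older Unix Pythons): `FdReader` is a hypothetical stand-in for a raw file object and omits the closed-file and non-blocking handling a real implementation would need.

```python
import os

# Assumption: os.readinto exists (Python 3.14+); otherwise use os.readv (Unix).
_readinto = getattr(os, "readinto", None) or (lambda fd, buf: os.readv(fd, [buf]))


class FdReader:
    """Hypothetical minimal raw reader wrapping a file descriptor."""

    def __init__(self, fd):
        self._fd = fd

    def readinto(self, b):
        """Fill the caller's buffer directly; return the byte count."""
        m = memoryview(b).cast('B')  # view the buffer as plain bytes
        return _readinto(self._fd, m)
```

The caller's buffer is written once by the operating system read, with no intermediate bytes object.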
os.read loops to migrate
Well-contained os.read loops
- multiprocessing.forkserver read_signed - @cmaloney - gh-129205: Update multiprocessing.forkserver to use os.readinto #129425
- [x] subprocess Popen._execute_child - @cmaloney - gh-129205: Use os.readinto() in subprocess errpipe_read #129498
os.read loop interleaved with other code
- _pyio FileIO.read, FileIO.readall, FileIO.readinto; see Reduce copies when reading files in pyio, match behavior of _io #129005 -- @cmaloney
- _pyrepl.unix_console UnixConsole.input_buffer -- fixed-length underlying buffer with "pos" / window on top.
- pty _copy. Operates around a "high waterlevel" / attempt to have a fixed-ish size buffer. Wraps os.read with a _read function.
- subprocess Popen.communicate. Note, this feels like something non-contiguous Py_buffer would be really good for, particularly in self.text_mode, where currently all the bytes are "copied" into a contiguous bytes to then turn into text...
- tarfile _Stream._read and _Stream.__read. Note, builds _LowLevelFile around os.read, but other read methods are also available.
Has this already been discussed elsewhere?
No response given
Links to previous discussion of this feature:
Linked PRs
- gh-129205: Add os.readinto API for reading data into a caller provided buffer #129211
- gh-129205: Modernize test_eintr #129316
- gh-129205: Update multiprocessing.forkserver to use os.readinto #129425
- gh-129205: Use os.readinto() in subprocess errpipe_read #129498
- gh-129205: Experiment BytesIO._readfrom() #130098