gh-132108: Add Buffer Protocol support to int.from_bytes to improve performance #132109

Open
cmaloney wants to merge 4 commits into python:main from cmaloney:int_from_byteslike

Conversation

cmaloney (Contributor) commented Apr 5, 2025

Speed up conversion from `bytes-like` objects like `bytearray` while keeping conversion from `bytes` stable.

On a `--with-lto --enable-optimizations` build on my 64-bit Linux box:

new:

    from_bytes_flags: Mean +- std dev: 28.6 ns +- 0.5 ns
    bench_convert[bytes]: Mean +- std dev: 50.4 ns +- 1.4 ns
    bench_convert[bytearray]: Mean +- std dev: 51.3 ns +- 0.7 ns

old:

    from_bytes_flags: Mean +- std dev: 28.1 ns +- 1.1 ns
    bench_convert[bytes]: Mean +- std dev: 50.3 ns +- 4.3 ns
    bench_convert[bytearray]: Mean +- std dev: 64.7 ns +- 0.9 ns

Benchmark code:

```python
import pyperf
import time


def from_bytes_flags(loops):
    range_it = range(loops)

    t0 = time.perf_counter()
    for _ in range_it:
        int.from_bytes(b'\x00\x10', byteorder='big')
        int.from_bytes(b'\x00\x10', byteorder='little')
        int.from_bytes(b'\xfc\x00', byteorder='big', signed=True)
        int.from_bytes(b'\xfc\x00', byteorder='big', signed=False)
        int.from_bytes([255, 0, 0], byteorder='big')
    return time.perf_counter() - t0


sample_bytes = [
    b'',
    b'\x00',
    b'\x01',
    b'\x7f',
    b'\x80',
    b'\xff',
    b'\x01\x00',
    b'\x7f\xff',
    b'\x80\x00',
    b'\xff\xff',
    b'\x01\x00\x00',
]

sample_bytearray = [bytearray(v) for v in sample_bytes]


def bench_convert(loops, values):
    range_it = range(loops)

    t0 = time.perf_counter()
    for _ in range_it:
        for val in values:
            int.from_bytes(val)
    return time.perf_counter() - t0


runner = pyperf.Runner()
runner.bench_time_func('from_bytes_flags', from_bytes_flags, inner_loops=10)
runner.bench_time_func('bench_convert[bytes]', bench_convert, sample_bytes, inner_loops=10)
runner.bench_time_func('bench_convert[bytearray]', bench_convert, sample_bytearray, inner_loops=10)
```
skirpichev previously approved these changes Apr 5, 2025
picnixz previously approved these changes Apr 5, 2025

picnixz (Member) left a comment

Can we have benchmarks for very large bytes? Maybe you can also say how much we're gaining in the NEWS entry that way.

picnixz changed the title from "gh-132108: Add Buffer Protocol support to int.from_bytes" to "gh-132108: Add Buffer Protocol support to int.from_bytes to improve performance" Apr 5, 2025

picnixz (Member) commented Apr 5, 2025

Small question, but how do we cope with classes that *explicitly* define `.__bytes__()` and are buffer-like, such as custom `bytes` objects? (This is an edge case, but it can still be a breaking change.)

Note that `PyObject_Bytes` first calls `__bytes__`, then calls `PyBytes_FromObject` if there is no `__bytes__`, and only then are buffer-like objects considered, but not before. So `__bytes__` has a higher priority than the buffer-like interface.

Instead, we should restrict ourselves to exact buffer objects, namely exact bytes and bytearray objects.
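
The priority difference described here can be observed from pure Python. The following is a minimal sketch, assuming a hypothetical `bytes` subclass named `Tagged` (it does not appear in the PR): `bytes()` goes through `__bytes__`, while `memoryview()` reads the stored data through the buffer protocol.

```python
class Tagged(bytes):
    """Hypothetical bytes subclass whose __bytes__ differs from its storage."""

    def __bytes__(self):
        return b"X"


obj = Tagged(b"a")

# bytes() consults __bytes__ first for objects that are not exact bytes.
via_dunder = bytes(obj)

# memoryview() uses the buffer protocol and sees the underlying storage.
via_buffer = memoryview(obj).tobytes()

print(via_dunder, via_buffer)
```

Any code path that switches from the first lookup to the second can therefore return a different result for such a class.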

picnixz dismissed their stale review April 5, 2025 10:04

I want to check that the edge cases are not an issue.

cmaloney (Contributor, Author) commented Apr 5, 2025

Cases including classes which implement `__bytes__()` that return both valid (ex. `bytes`) and non-valid (ex. `str`) values are tested in `test_long`, `test_from_bytes`, so I don't think any critical behavior changes there.

As you point out, if code returns a different set of machine bytes when exporting buffer protocol vs `__bytes__()`, this will change behavior. `__bytes__()` will not be run, instead just the buffer export will be called. That same issue will come up in `PyObject_Bytes` vs. `PyBytes_FromObject` calls, as `PyObject_Bytes` checks `__bytes__()` first while `PyBytes_FromObject` does buffer protocol first and never checks `__bytes__()`. Code here uses `PyObject_Bytes()`. I don't think CPython strongly uses one or the other as "more correct".

Could match existing behavior by always checking for a `__bytes__` member and `!PyBytes_CheckExact()` (avoid the `__bytes__()` call for `bytes` as it changes performance and wasn't present before). To me that isn't as good of an implementation. It is slower (more branches), more complex code, and I prefer encouraging buffer protocol for best performance.

Could restrict to known CPython types (`bytes`, `bytearray`, `array`, `memoryview`), but that lowers the usefulness to me, as systems which implement buffer and `__bytes__` for efficiency can't use the newer and potentially more efficient buffer protocol here. It also requires more condition / type checks than `PyObject_CheckBuffer`.


Walking through common types passed to `int.from_bytes()` more explicitly:

  1. Exact `bytes`: the new code will get the data using a `Py_buffer` rather than incrementing the ref to the `bytes` (`PyBytes_CheckExact` case). Perf test shows performance is stable for that.
  2. "bytes-like" objects (subclasses of `bytes`, `bytearray`, `memoryview`, `array`): these used the buffer protocol to copy before and still use it now. Fewer calls/branches/checks getting to exporting the buffer. Removes a copy of that buffer into a `PyBytes`. Perf test shows it is faster for `bytearray`, and it likely is for other cases as well.
  3. `list`, `tuple`, iterable (other than `str`): `PyObject_CheckBuffer` will fail for these, so the code will call `PyObject_Bytes`, which will call `PyBytes_FromObject` to handle them, same as before.
  4. `str`: Doesn't export bytes. That fails / raises a `TypeError`. In `test_long`, `test_from_bytes` validates that behavior. Behavior is unchanged.
  5. Objects that implement `__bytes__()` but don't support buffer protocol: Tested in `test_long` `test_from_bytes` (ValidBytes, InvalidBytes, RaisingBytes). These behave as before. `PyObject_CheckBuffer` will fail for these, so the code will call `PyObject_Bytes`, which will call `PyBytes_FromObject` to handle them, same as before.
  6. Objects that implement `__bytes__()` and support buffer protocol: The `__bytes__()` function will no longer be called. If it broke the API contract by returning `str`, for instance, the code will now run using its buffer protocol to get the underlying machine bytes instead of throwing an exception.
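
The six cases above can be approximated in pure Python. This is only a rough model of the buffer-first lookup order, not the C implementation in the PR; `from_bytes_model`, `OnlyDunder`, and `Both` are hypothetical names introduced for illustration.

```python
def from_bytes_model(obj, byteorder="big", signed=False):
    """Rough Python model of the buffer-first order; not the C code."""
    try:
        view = memoryview(obj)  # succeeds only for buffer exporters
    except TypeError:
        # Cases 3-5: fall back to bytes(), which handles iterables of ints
        # and __bytes__, and raises TypeError for str.
        data = bytes(obj)
    else:
        # Cases 1, 2 and 6: read the exported bytes; __bytes__ is not called.
        with view:
            data = view.tobytes()
    return int.from_bytes(data, byteorder, signed=signed)


class OnlyDunder:
    """Case 5: defines __bytes__ but exports no buffer."""

    def __bytes__(self):
        return b"\x01\x00"


class Both(bytes):
    """Case 6: exports a buffer and also overrides __bytes__."""

    def __bytes__(self):
        return b"X"


print(from_bytes_model(b"\x00\x10"))             # case 1
print(from_bytes_model(bytearray(b"\x00\x10")))  # case 2
print(from_bytes_model([255, 0, 0]))             # case 3
print(from_bytes_model(OnlyDunder()))            # case 5
print(from_bytes_model(Both(b"a")))              # case 6: buffer wins
```

In this model only case 6 differs from a `__bytes__`-first order, which is the behavior change under discussion.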

sobolevn (Member) left a comment

This is a breaking change. Example.

Before:

    >>> class X(bytes):
    ...     def __bytes__(self):
    ...         return b'X'
    ...
    >>> int.from_bytes(X(b'a'))
    88

After:

    >>> class X(bytes):
    ...     def __bytes__(self):
    ...         return b'X'
    ...
    >>> int.from_bytes(X(b'a'))
    97

Comment on lines 1 to 2

    Speed up :meth:`int.from_bytes` when passed a bytes-like object such as a
    :class:`bytearray`.

Suggested change

    - Speed up :meth:`int.from_bytes` when passed a bytes-like object such as a
    - :class:`bytearray`.
    + Speed up :meth:`int.from_bytes` when passed a bytes-like object such as
    + :class:`bytes` and :class:`bytearray`.

I do not think that it affects `bytes`.

skirpichev (Member) commented

> This is a breaking change. Example.

The docs say:

> Called by `bytes` to compute a byte-string representation of an object. This should return a `bytes` object. The `object` class itself does not provide this method.

If `b'X'` is a byte-string representation of `b'a'`, you are probably correct. Otherwise it's just an example that you could break something by overriding dunder methods in subclasses. Say,

    >>> class int2(int):
    ...     def __float__(self):
    ...         return 3.14
    ...
    >>> float(int2(123))
    3.14

sobolevn (Member) commented

This is an example that the method resolution order changes. It now ignores a custom `__bytes__` method. I don't think that changing the `__bytes__` method on a `bytes` subclass is an artificial example. I am pretty sure that people use that in the wild.

The reverse logic is true: the PR's author must prove that it does not break things.

skirpichev (Member) commented

> It now ignores custom `__bytes__` method

I would say that if you expose something different via the buffer protocol and the `__bytes__` dunder, it's your fault, isn't it? (Though this constraint isn't documented explicitly.) Just as we could probably assume that `float(int2(123)) = float(int(int2(123)))`.
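
The undocumented constraint mentioned here can be written down as a small check. This is a sketch only: `views_agree` and `Mismatched` are hypothetical names, and nothing in CPython enforces this invariant.

```python
def views_agree(obj):
    """Return True if bytes(obj) matches what the buffer protocol exports."""
    with memoryview(obj) as view:
        return bytes(obj) == view.tobytes()


class Mismatched(bytes):
    """Hypothetical subclass whose __bytes__ disagrees with its buffer."""

    def __bytes__(self):
        return b"X"


print(views_agree(bytearray(b"ab")))  # built-in type: both views match
print(views_agree(Mismatched(b"a")))  # the two views differ
```

A type for which this check holds gives the same `int.from_bytes()` result whichever lookup order is used.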

python-cla-bot (bot) commented Apr 6, 2025

All commit authors signed the Contributor License Agreement.

cmaloney (Contributor, Author) commented

I'll see if I can make a largely performance-neutral version that checks `__bytes__` before using buffer protocol. The potential disconnect between `__bytes__()` and `__buffer__()` concerns me. It feels like a source of bugs that are easy to code and hard to detect until they show up somewhere that's a problem... Wondering if there's an efficient way to say something like "If `__bytes__()` is set, `__buffer__()` should be cleared (or defaulted to `memoryview(__bytes__())`)"?

    >>> class distinct_bytes_buffer(bytes):
    ...     def __bytes__(self):
    ...         return b'b'
    ...
    ...     def __buffer__(self, flags):
    ...         return memoryview(b'c')
    ...
    >>> class same_bytes_buffer(bytes):
    ...     def __bytes__(self):
    ...         return b'b'
    ...
    ...     def __buffer__(self, flags):
    ...         return memoryview(b'b')
    ...
    >>> int.from_bytes(distinct_bytes_buffer(b'a'))
    99
    >>> int.from_bytes(same_bytes_buffer(b'a'))
    98
    >>> int.from_bytes(b'a')
    97
    >>> int.from_bytes(b'b')
    98
    >>> int.from_bytes(b'c')
    99

cmaloney (Contributor, Author) commented

Some background for reference: `__bytes__` was added to `bytes()` in 3.11 (#68422) while type hints were being worked on. The buffer protocol came before that, in 3.0 (PEP 3118).

cmaloney (Contributor, Author) commented Apr 6, 2025

Another edge case around these: `__bytes__()` is only used once and must return an object that inherits from `bytes`, on which `PyBytes_AsString` is then used, which returns the `ob_sval` inline storage value. If the `bytes` internal storage `ob_sval`, `__buffer__()`, and `__bytes__()` vary, then all three do sometimes get returned. I think it would be interesting to normalize to a specific behavior (straw man: always `__buffer__()` first), but that definitely isn't the case today (and I suspect it would take a larger proposal / PEP to change?).

    >>> class my_bytes(bytes):
    ...     def __bytes__(self):
    ...         return b"bytes"
    ...
    ...     def __buffer__(self, flags):
    ...         return memoryview(b"buffer")
    ...
    >>> class distinct_bytes_buffer(bytes):
    ...     def __bytes__(self):
    ...         return my_bytes(b"ob_sval")
    ...
    ...     def __buffer__(self, flags):
    ...         return memoryview(b"distinct_buffer")
    ...
    >>> a = distinct_bytes_buffer(b"distinct_ob_sval")
    >>> bytes(a)
    b'ob_sval'

serhiy-storchaka (Member) left a comment

LGTM.

Comment on lines 1 to 2

    Speed up :meth:`int.from_bytes` when passed a bytes-like object such as a
    :class:`bytearray`.

I do not think that it affects `bytes`.

Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>
cmaloney (Contributor, Author) commented

Created a branch which matches the resolution order of `PyObject_Bytes`. It gives a small performance improvement (~2%, avoids touching the reference count) in the common exact-`bytes` case and keeps most of the improvement for `bytearray`.

Branch matching `PyObject_Bytes` order:

    from_bytes_flags: Mean +- std dev: 27.3 ns +- 0.7 ns
    bench_convert[bytes]: Mean +- std dev: 47.7 ns +- 0.4 ns
    bench_convert[bytearray]: Mean +- std dev: 54.1 ns +- 0.9 ns

So `bytearray` goes from 64.7 ns +- 0.9 ns (main) to 54.1 ns +- 0.9 ns with the change.
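
For scale, the two `bytearray` means quoted here work out to roughly a 1.2x speedup. A quick check of the arithmetic, using only the means and ignoring the reported standard deviations:

```python
main_ns = 64.7    # bench_convert[bytearray] mean on main
branch_ns = 54.1  # same benchmark on the PyObject_Bytes-order branch

speedup = main_ns / branch_ns
time_saved = (main_ns - branch_ns) / main_ns

print(f"{speedup:.2f}x faster, {time_saved:.1%} less time per iteration")
```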

@sobolevn's example now returns the same value both before and after:

    >>> class X(bytes):
    ...     def __bytes__(self):
    ...         return b'X'
    ...
    >>> int.from_bytes(X(b'a'))
    88

Should I incorporate here? (cc: @serhiy-storchaka, @sobolevn, @skirpichev)

full diff from main: https://github.com/python/cpython/compare/main...cmaloney:cpython:exp/bytes_first?collapse=1

diff from PR: cmaloney@189f219

skirpichev (Member) commented

Also, the docs say: "The argument bytes must either be a bytes-like object or an iterable producing bytes." Something is wrong: either the implementation (in main) or the docs.

serhiy-storchaka (Member) commented

It may be an iterable producing bytes (not the bytes objects, but integers in the range 0 to 255).
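
A short illustration of that reading of the docs, using only built-in behavior; the variable names are chosen here for the example:

```python
# A list of ints in range(256) is accepted.
value = int.from_bytes([255, 0, 0], byteorder="big")

# An iterable of bytes objects is not.
bytes_items_error = None
try:
    int.from_bytes([b"\xff", b"\x00"], byteorder="big")
except TypeError as exc:
    bytes_items_error = type(exc).__name__

# Ints outside 0..255 are rejected too.
out_of_range_error = None
try:
    int.from_bytes([256], byteorder="big")
except ValueError as exc:
    out_of_range_error = type(exc).__name__

print(value, bytes_items_error, out_of_range_error)
```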

skirpichev (Member) commented

> It may be an iterable producing bytes (not the bytes objects, but integers in the range 0 to 255).

Yes, this part of the sentence is at least unclear. But I meant the first part, which has a reference to the glossary term.

cmaloney (Contributor, Author) commented

Updated to use `PyBytes_CheckExact` first, as that case is common and it speeds up `bytes` relative to main. Also tested some bigger byte strings and added a speedup note around 128, 256, and 512 byte `bytearray` objects, which are ~1.2x faster. Thanks for the suggestion @picnixz

    from_bytes_flags: Mean +- std dev: [main] 28.3 ns +- 1.3 ns -> [exactbytes] 27.3 ns +- 0.3 ns: 1.04x faster
    bench_convert[bytearray]: Mean +- std dev: [main] 65.8 ns +- 3.3 ns -> [exactbytes] 53.1 ns +- 5.1 ns: 1.24x faster
    bench_convert_big[bytes]: Mean +- std dev: [main] 51.8 ns +- 0.6 ns -> [exactbytes] 50.3 ns +- 0.5 ns: 1.03x faster
    bench_convert_big[bytearray]: Mean +- std dev: [main] 65.8 ns +- 3.0 ns -> [exactbytes] 53.5 ns +- 5.3 ns: 1.23x faster

    Benchmark hidden because not significant (1): bench_convert[bytes]

Updated benchmark code

    import pyperf
    import time


    def from_bytes_flags(loops):
        range_it = range(loops)

        t0 = time.perf_counter()
        for _ in range_it:
            int.from_bytes(b'\x00\x10', byteorder='big')
            int.from_bytes(b'\x00\x10', byteorder='little')
            int.from_bytes(b'\xfc\x00', byteorder='big', signed=True)
            int.from_bytes(b'\xfc\x00', byteorder='big', signed=False)
            int.from_bytes([255, 0, 0], byteorder='big')
        return time.perf_counter() - t0


    sample_bytes = [
        b'',
        b'\x00',
        b'\x01',
        b'\x7f',
        b'\x80',
        b'\xff',
        b'\x01\x00',
        b'\x7f\xff',
        b'\x80\x00',
        b'\xff\xff',
        b'\x01\x00\x00',
    ]

    sample_bytearray = [bytearray(v) for v in sample_bytes]

    sample_big = [b'\xff' * 128, b'\xff' * 256, b'\xff' * 512]
    # NOTE: as posted, this wraps sample_bytes, not sample_big.
    sample_big_ba = [bytearray(v) for v in sample_bytes]


    def bench_convert(loops, values):
        range_it = range(loops)

        t0 = time.perf_counter()
        for _ in range_it:
            for val in values:
                int.from_bytes(val)
        return time.perf_counter() - t0


    runner = pyperf.Runner()
    # Validate base bytes w/ flags doesn't change perf.
    runner.bench_time_func('from_bytes_flags', from_bytes_flags, inner_loops=10)
    runner.bench_time_func('bench_convert[bytes]', bench_convert, sample_bytes, inner_loops=10)
    runner.bench_time_func('bench_convert[bytearray]', bench_convert, sample_bytearray, inner_loops=10)
    runner.bench_time_func('bench_convert_big[bytes]', bench_convert, sample_big, inner_loops=10)
    runner.bench_time_func('bench_convert_big[bytearray]', bench_convert, sample_big_ba, inner_loops=10)

gpshead self-assigned this May 21, 2025

Reviewers

skirpichev left review comments
sobolevn left review comments
serhiy-storchaka approved these changes
picnixz left review comments

Assignees

gpshead

6 participants
@cmaloney @picnixz @skirpichev @sobolevn @serhiy-storchaka @gpshead
