gh-132108: Add Buffer Protocol support to int.from_bytes to improve performance #132109
Speed up conversion from `bytes-like` objects like `bytearray` while keeping conversion from `bytes` stable.

On a `--with-lto --enable-optimizations` build on my 64 bit Linux box:

new:

    from_bytes_flags: Mean +- std dev: 28.6 ns +- 0.5 ns
    bench_convert[bytes]: Mean +- std dev: 50.4 ns +- 1.4 ns
    bench_convert[bytearray]: Mean +- std dev: 51.3 ns +- 0.7 ns

old:

    from_bytes_flags: Mean +- std dev: 28.1 ns +- 1.1 ns
    bench_convert[bytes]: Mean +- std dev: 50.3 ns +- 4.3 ns
    bench_convert[bytearray]: Mean +- std dev: 64.7 ns +- 0.9 ns

Benchmark code:

```python
import pyperf
import time


def from_bytes_flags(loops):
    range_it = range(loops)

    t0 = time.perf_counter()
    for _ in range_it:
        int.from_bytes(b'\x00\x10', byteorder='big')
        int.from_bytes(b'\x00\x10', byteorder='little')
        int.from_bytes(b'\xfc\x00', byteorder='big', signed=True)
        int.from_bytes(b'\xfc\x00', byteorder='big', signed=False)
        int.from_bytes([255, 0, 0], byteorder='big')
    return time.perf_counter() - t0


sample_bytes = [
    b'',
    b'\x00',
    b'\x01',
    b'\x7f',
    b'\x80',
    b'\xff',
    b'\x01\x00',
    b'\x7f\xff',
    b'\x80\x00',
    b'\xff\xff',
    b'\x01\x00\x00',
]

sample_bytearray = [bytearray(v) for v in sample_bytes]


def bench_convert(loops, values):
    range_it = range(loops)

    t0 = time.perf_counter()
    for _ in range_it:
        for val in values:
            int.from_bytes(val)
    return time.perf_counter() - t0


runner = pyperf.Runner()
runner.bench_time_func('from_bytes_flags', from_bytes_flags, inner_loops=10)
runner.bench_time_func('bench_convert[bytes]', bench_convert, sample_bytes, inner_loops=10)
runner.bench_time_func('bench_convert[bytearray]', bench_convert, sample_bytearray, inner_loops=10)
```
Can we have benchmarks for very large bytes? Maybe you can also say how much we're gaining in the NEWS entry that way.
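A rough way to probe larger inputs without pyperf is the standard-library `timeit` module. The sketch below is only an illustration: the helper name `time_from_bytes`, the chosen sizes, and the loop count are assumptions, and the numbers it prints are not comparable to the pyperf results quoted in this PR.

```python
import timeit


def time_from_bytes(value, number=10_000):
    """Return the mean seconds per int.from_bytes call for `value` (hypothetical helper)."""
    timer = timeit.Timer(lambda: int.from_bytes(value, "big"))
    return timer.timeit(number=number) / number


# Sizes are arbitrary picks for illustration, not the ones used in the PR.
sizes = [128, 256, 512, 4096]
results = {}
for size in sizes:
    payload = b"\xff" * size
    results[("bytes", size)] = time_from_bytes(payload)
    results[("bytearray", size)] = time_from_bytes(bytearray(payload))

for (kind, size), seconds in results.items():
    print(f"{kind:>9} {size:>5} bytes: {seconds * 1e9:8.1f} ns per call")
```

For numbers worth quoting in a NEWS entry, pyperf remains the better tool, since it handles warmups and process isolation.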
picnixz commented Apr 5, 2025 • edited
Small question but how do we cope with classes that explicitly define Note that Instead, we should restrict ourselves to exact buffer objects, namely exact bytes and bytearray objects.
I want to check that the edge cases are not an issue.
cmaloney commented Apr 5, 2025 • edited
Cases including classes which implement As you point out, if code returns a different set of machine bytes when exporting buffer protocol vs Could match existing behavior by always checking for a Could restrict to known CPython types ( Walking through common types passed to
This is a breaking change. Example.
Before:

    >>> class X(bytes):
    ...     def __bytes__(self):
    ...         return b'X'
    ...
    >>> int.from_bytes(X(b'a'))
    88

After:

    >>> class X(bytes):
    ...     def __bytes__(self):
    ...         return b'X'
    ...
    >>> int.from_bytes(X(b'a'))
    97
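The difference between the two results comes down to which representation is consulted first. The following is a pure-Python model of the "before" ordering, written only to illustrate the discussion: `resolve_to_bytes` is a hypothetical name, and the real logic lives in C inside CPython and may differ in detail.

```python
def resolve_to_bytes(obj):
    """Hypothetical model: exact bytes first, then __bytes__, then the buffer."""
    if type(obj) is bytes:
        # Exact bytes objects can be used directly.
        return obj
    dunder = getattr(type(obj), "__bytes__", None)
    if dunder is not None:
        # A custom __bytes__ wins over the raw buffer contents.
        return dunder(obj)
    try:
        # Objects exporting the buffer protocol, such as bytearray.
        return bytes(memoryview(obj))
    except TypeError:
        # Fall back to an iterable of integers in range(256).
        return bytes(obj)


class X(bytes):
    def __bytes__(self):
        return b"X"


print(int.from_bytes(resolve_to_bytes(X(b"a")), "big"))          # 88
print(int.from_bytes(resolve_to_bytes(bytearray(b"a")), "big"))  # 97
```

Reading the buffer before checking `__bytes__` would yield 97 for `X(b"a")` instead, which is the behavior change flagged above.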
Speed up :meth:`int.from_bytes` when passed a bytes-like object such as a
:class:`bytearray`.
Suggested change:

- Speed up :meth:`int.from_bytes` when passed a bytes-like object such as a
- :class:`bytearray`.
+ Speed up :meth:`int.from_bytes` when passed a bytes-like object such as
+ :class:`bytes` and :class:`bytearray`.
I do not think that it affects bytes.
Docs say:
If

    >>> class int2(int):
    ...     def __float__(self):
    ...         return 3.14
    ...
    >>> float(int2(123))
    3.14
This is an example that the method resolution order changes. It now ignores custom The reverse logic is true: PR's author must prove that it does not break things.
Misc/NEWS.d/next/Core_and_Builtins/2025-04-04-20-38-29.gh-issue-132108.UwZIQy.rst
I would say that if you expose something different via buffer protocol and
python-cla-bot bot commented Apr 6, 2025 • edited
I'll see if I can make a largely performance neutral version that checks

    >>> class distinct_bytes_buffer(bytes):
    ...     def __bytes__(self):
    ...         return b'b'
    ...
    ...     def __buffer__(self, flags):
    ...         return memoryview(b'c')
    ...
    ... class same_bytes_buffer(bytes):
    ...     def __bytes__(self):
    ...         return b'b'
    ...
    ...     def __buffer__(self, flags):
    ...         return memoryview(b'b')
    ...
    >>> int.from_bytes(distinct_bytes_buffer(b'a'))
    99
    >>> int.from_bytes(same_bytes_buffer(b'a'))
    98
    >>> int.from_bytes(b'a')
    97
    >>> int.from_bytes(b'b')
    98
    >>> int.from_bytes(b'c')
    99
cmaloney commented Apr 6, 2025 • edited
Another edge case around these,

    >>> class my_bytes(bytes):
    ...     def __bytes__(self):
    ...         return b"bytes"
    ...
    ...     def __buffer__(self, flags):
    ...         return memoryview(b"buffer")
    ...
    ... class distinct_bytes_buffer(bytes):
    ...     def __bytes__(self):
    ...         return my_bytes(b"ob_sval")
    ...
    ...     def __buffer__(self, flags):
    ...         return memoryview(b"distinct_buffer")
    ...
    ... a = distinct_bytes_buffer(b"distinct_ob_sval")
    ... bytes(a)
    ...
    b'ob_sval'
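The point of this edge case is that `__bytes__` is applied once, not recursively: when it returns an instance of another `bytes` subclass, that object's own `__bytes__` is not consulted. A reduced sketch of the same idea follows, leaving out `__buffer__` so it does not depend on a Python version that supports defining it in Python code; the class names are made up for illustration.

```python
class InnerBytes(bytes):
    def __bytes__(self):
        # Would only matter if conversion were applied a second time.
        return b"inner"


class OuterBytes(bytes):
    def __bytes__(self):
        # Returns a bytes subclass instance rather than an exact bytes object.
        return InnerBytes(b"payload")


converted = bytes(OuterBytes(b"raw"))
print(converted)
```

The raw contents of the returned object (`b"payload"`) are what callers see, not the value its own `__bytes__` would produce.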
LGTM.
Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>
Created a branch which matches the resolution order of PyObject_Bytes:
So @sobolevn's example now returns the same value both before and after:

    >>> class X(bytes):
    ...     def __bytes__(self):
    ...         return b'X'
    ...
    ... int.from_bytes(X(b'a'))
    ...
    88

Should I incorporate here? (cc: @serhiy-storchaka, @sobolevn, @skirpichev)

full diff from main: https://github.com/python/cpython/compare/main...cmaloney:cpython:exp/bytes_first?collapse=1

diff from PR: cmaloney@189f219
Also docs say: "The argument bytes must either be a bytes-like object or an iterable producing bytes." Something is wrong: either implementation (in the main) or docs.
It may be an iterable producing bytes (not the bytes objects, but integers in the range 0 to 255).
Yes, this part of the sentence might be at least not clear. But I meant the first part, which has a reference to the glossary term.
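To make the two accepted argument kinds concrete, here is a short illustration using only documented `int.from_bytes` parameters (`byteorder`, `signed`). The byte order is passed explicitly so the snippet does not rely on the default added in newer Python versions.

```python
# Bytes-like objects: bytes, bytearray, and memoryview all work.
assert int.from_bytes(b"\x00\x10", byteorder="big") == 16
assert int.from_bytes(bytearray(b"\x00\x10"), byteorder="little") == 4096
assert int.from_bytes(memoryview(b"\x01\x00"), byteorder="big") == 256

# The signed flag switches to two's complement interpretation.
assert int.from_bytes(b"\xfc\x00", byteorder="big", signed=True) == -1024
assert int.from_bytes(b"\xfc\x00", byteorder="big", signed=False) == 64512

# An iterable producing integers in range(256), not bytes objects.
assert int.from_bytes([255, 0, 0], byteorder="big") == 16711680

print("all conversions matched")
```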
    from_bytes_flags: Mean +- std dev: [main] 28.3 ns +- 1.3 ns -> [exactbytes] 27.3 ns +- 0.3 ns: 1.04x faster
    bench_convert[bytearray]: Mean +- std dev: [main] 65.8 ns +- 3.3 ns -> [exactbytes] 53.1 ns +- 5.1 ns: 1.24x faster
    bench_convert_big[bytes]: Mean +- std dev: [main] 51.8 ns +- 0.6 ns -> [exactbytes] 50.3 ns +- 0.5 ns: 1.03x faster
    bench_convert_big[bytearray]: Mean +- std dev: [main] 65.8 ns +- 3.0 ns -> [exactbytes] 53.5 ns +- 5.3 ns: 1.23x faster

    Benchmark hidden because not significant (1): bench_convert[bytes]
Updated to use
Updated benchmark code

    import pyperf
    import time


    def from_bytes_flags(loops):
        range_it = range(loops)

        t0 = time.perf_counter()
        for _ in range_it:
            int.from_bytes(b'\x00\x10', byteorder='big')
            int.from_bytes(b'\x00\x10', byteorder='little')
            int.from_bytes(b'\xfc\x00', byteorder='big', signed=True)
            int.from_bytes(b'\xfc\x00', byteorder='big', signed=False)
            int.from_bytes([255, 0, 0], byteorder='big')
        return time.perf_counter() - t0


    sample_bytes = [
        b'',
        b'\x00',
        b'\x01',
        b'\x7f',
        b'\x80',
        b'\xff',
        b'\x01\x00',
        b'\x7f\xff',
        b'\x80\x00',
        b'\xff\xff',
        b'\x01\x00\x00',
    ]

    sample_bytearray = [bytearray(v) for v in sample_bytes]

    sample_big = [b'\xff' * 128, b'\xff' * 256, b'\xff' * 512]
    # As posted, this list is built from sample_bytes rather than sample_big.
    sample_big_ba = [bytearray(v) for v in sample_bytes]


    def bench_convert(loops, values):
        range_it = range(loops)

        t0 = time.perf_counter()
        for _ in range_it:
            for val in values:
                int.from_bytes(val)
        return time.perf_counter() - t0


    runner = pyperf.Runner()
    # Validate base bytes w/ flags doesn't change perf.
    runner.bench_time_func('from_bytes_flags', from_bytes_flags, inner_loops=10)
    runner.bench_time_func('bench_convert[bytes]', bench_convert, sample_bytes, inner_loops=10)
    runner.bench_time_func('bench_convert[bytearray]', bench_convert, sample_bytearray, inner_loops=10)
    runner.bench_time_func('bench_convert_big[bytes]', bench_convert, sample_big, inner_loops=10)
    runner.bench_time_func('bench_convert_big[bytearray]', bench_convert, sample_big_ba, inner_loops=10)