[WIP] gh-129813, PEP 782: Add PyBytesWriter C API #131681

Closed
vstinner wants to merge 30 commits into python:main from vstinner:bytes_writer_size

@vstinner (Member) commented Mar 24, 2025 (edited by the bedevere-app bot):

Add functions:

  • PyBytesWriter_Create()
  • PyBytesWriter_Discard()
  • PyBytesWriter_Finish()
  • PyBytesWriter_FinishWithSize()
  • PyBytesWriter_FinishWithEndPointer()
  • PyBytesWriter_Data()
  • PyBytesWriter_Allocated()
  • PyBytesWriter_SetSize()
  • PyBytesWriter_Resize()
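
The life cycle implied by the function names above (create, write into the buffer, optionally resize, then finish or discard) can be pictured with a rough Python model. This is only a sketch: `BytesWriterModel` and its methods are hypothetical names, the behavior is assumed from the function names rather than taken from the C implementation, and the real buffer is not zero-filled the way a `bytearray` is.

```python
class BytesWriterModel:
    """Hypothetical Python model of the writer life cycle (not the C API)."""

    def __init__(self, size):
        # Models PyBytesWriter_Create(size): allocate a writable buffer.
        self._buf = bytearray(size)

    def data(self):
        # Models PyBytesWriter_Data(): a writable view of the buffer.
        return memoryview(self._buf)

    def resize(self, size):
        # Models PyBytesWriter_Resize(): shrink or grow the buffer.
        if size < len(self._buf):
            del self._buf[size:]
        else:
            self._buf.extend(bytes(size - len(self._buf)))

    def finish_with_size(self, size):
        # Models PyBytesWriter_FinishWithSize(): keep the first `size`
        # bytes, return an immutable bytes object, invalidate the writer.
        result = bytes(self._buf[:size])
        self._buf = None
        return result

    def finish(self):
        # Models PyBytesWriter_Finish(): finish with the current size.
        return self.finish_with_size(len(self._buf))

    def discard(self):
        # Models PyBytesWriter_Discard(): drop the buffer without a result.
        self._buf = None


# Over-allocate, write fewer bytes than requested, then truncate on finish.
writer = BytesWriterModel(8)
with writer.data() as view:   # release the view before finishing
    view[:5] = b'hello'
result = writer.finish_with_size(5)

# Growing a writer pads it with zero bytes in this model.
grown = BytesWriterModel(2)
grown.resize(4)
grown_result = grown.finish()
```

The point of the pattern is that the caller writes directly into a buffer it can later shrink or grow, and only the final call produces the `bytes` object.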

  • Commit: Add functions (the nine functions listed in the description).
  • vstinner changed the title from "[WIP] gh-129813: Add PyBytesWriter C API (with size flavor)" to "[WIP] gh-129813: Add PyBytesWriter C API (flavor with size)" on Mar 24, 2025.
  • Commit: Add PyBytesWriter_GetSize(). Rename PyBytesWriter_Data() => PyBytesWriter_GetData() and PyBytesWriter_Allocated() => PyBytesWriter_GetAllocated().
  • Commit: Convert _PyBytes_FromHex().
  • Commit: Replace PyBytes_FromStringAndSize(NULL, 0) with Py_GetConstant(Py_CONSTANT_EMPTY_BYTES).
  • vstinner changed the title from "[WIP] gh-129813: Add PyBytesWriter C API (flavor with size)" to "[WIP] gh-129813, PEP 782: Add PyBytesWriter C API" on Apr 2, 2025.

@vstinner (Member, Author) commented Apr 22, 2025 (edited):

This change has no impact on performance, even though the new public API allocates memory on the heap instead of on the stack. It uses a freelist to optimize PyBytesWriter_Create().

Microbenchmark on 3 functions, to compare the private _PyBytesWriter (ref) to the new public PyBytesWriter (change):

  • bytes(list)
  • bytes.fromhex(str)
  • binascii.b2a_uu(bytes)

```python
import pyperf
import binascii

runner = pyperf.Runner()
runner.bench_func('from list 100', bytes, list(b'x' * 100))
runner.bench_func('from list 1,000', bytes, list(b'x' * 1_000))
runner.bench_func('from hex 100', bytes.fromhex, bytes(range(100)).hex())
runner.bench_func('from hex 1,000', bytes.fromhex, (b'x' * 1_000).hex())
runner.bench_func('b2a_uu', binascii.b2a_uu, b'x' * 45)
```

Result:

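| Benchmark      | ref     | change                |
|----------------|---------|-----------------------|
| from list 100  | 631 ns  | 623 ns: 1.01x faster  |
| from hex 100   | 141 ns  | 145 ns: 1.03x slower  |
| from hex 1,000 | 1.03 us | 1.04 us: 1.00x slower |
| b2a_uu         | 112 ns  | 111 ns: 1.01x faster  |
| Geometric mean | (ref)   | 1.00x slower          |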

Benchmark hidden because not significant (1): from list 1,000
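
As a sanity check on how to read the table, the "faster"/"slower" factors and the geometric mean can be approximately recomputed from the rounded timings shown. This is a sketch under assumptions: `describe` is a hypothetical helper, the inputs are the rounded values from the table (so the last digit can differ from what pyperf prints from raw data), and the hidden benchmark is left out.

```python
from statistics import geometric_mean

# (ref, change) timings in nanoseconds, copied from the table above.
results_ns = {
    'from list 100': (631.0, 623.0),
    'from hex 100': (141.0, 145.0),
    'from hex 1,000': (1030.0, 1040.0),
    'b2a_uu': (112.0, 111.0),
}


def describe(ref, change):
    # Express the change as a factor relative to the reference timing.
    if change <= ref:
        return f'{ref / change:.2f}x faster'
    return f'{change / ref:.2f}x slower'


labels = {name: describe(ref, change)
          for name, (ref, change) in results_ns.items()}

# A mean ratio above 1.0 means the change is slower on average.
mean_ratio = geometric_mean(
    [change / ref for ref, change in results_ns.values()])
```

With these inputs the mean ratio comes out just above 1.0, consistent with the reported "1.00x slower".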

@vstinner (Member, Author) commented:

Benchmark comparing PyBytes_FromStringAndSize(NULL, length) (ref) to PyBytesWriter_Create() (change).

Benchmark:

```python
import pyperf

SIZES = (10, 100, 500)

runner = pyperf.Runner()
for size in SIZES:
    large_int = (2 ** (size * 8) - 1)
    runner.bench_func(f'to_bytes({size})', large_int.to_bytes, size)

for size in SIZES:
    mem = memoryview(b'x' * size)
    runner.bench_func(f'memoryview({size}).tobytes()', mem.tobytes)
```

Result:

| Benchmark                 | ref     | change                          |
|---------------------------|---------|---------------------------------|
| to_bytes(10)              | 56.3 ns | 66.4 ns: 1.18x slower (+10.1 ns) |
| to_bytes(100)             | 152 ns  | 162 ns: 1.06x slower (+10 ns)   |
| to_bytes(500)             | 563 ns  | 559 ns: 1.01x faster (-4 ns)    |
| memoryview(10).tobytes()  | 37.5 ns | 47.0 ns: 1.25x slower (+9.5 ns) |
| memoryview(100).tobytes() | 35.3 ns | 46.6 ns: 1.32x slower (+11.3 ns) |
| memoryview(500).tobytes() | 45.5 ns | 55.3 ns: 1.21x slower (+9.8 ns) |
| Geometric mean            | (ref)   | 1.16x slower                    |

It's hard to beat PyBytes_FromStringAndSize(NULL, length) performance, since PyBytesWriter_Create() is a wrapper built on top of PyBytes_FromStringAndSize(NULL, length).

There is an overhead of around 10 ns when using PyBytesWriter.
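
The "around 10 ns" figure can be read off the table by subtracting the reference timing from the changed timing. A small sketch using the rounded values from the table (the variable names are illustrative only):

```python
from statistics import median

# (ref, change) timings in nanoseconds, copied from the table above.
results_ns = {
    'to_bytes(10)': (56.3, 66.4),
    'to_bytes(100)': (152.0, 162.0),
    'to_bytes(500)': (563.0, 559.0),
    'memoryview(10).tobytes()': (37.5, 47.0),
    'memoryview(100).tobytes()': (35.3, 46.6),
    'memoryview(500).tobytes()': (45.5, 55.3),
}

# Absolute overhead per call, rounded to the table's precision.
deltas = {name: round(change - ref, 1)
          for name, (ref, change) in results_ns.items()}
median_delta = median(deltas.values())

# A fixed cost weighs more on fast calls: the same 10 ns is a quarter of
# a 40 ns call but under 2% of a 560 ns call.
relative_fast = 10 / 40
relative_slow = 10 / 560
```

This is why the relative slowdown looks large on the shortest benchmarks even though the absolute cost is roughly constant.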

@serhiy-storchaka (Member) commented:

Could you please benchmark the following?

  • ASCII, Latin1 and UTF-8 encoders. For ASCII-only and non-ASCII data.
  • The backslashreplace and xmlcharrefreplace error handlers (encoding).
  • PyBytes_FromFormat(). Especially with few % formats and large raw data between them.
  • PyBytes_DecodeEscape().

@vstinner (Member, Author) commented:

I wrote a big PR to show what PEP 782 would look like and how it would be used. But if PEP 782 is accepted, I will start by adding only the API, without using it. Then I will write separate changes to use the new API and run benchmarks on each change.

> ASCII, Latin1 and UTF-8 encoders. For ASCII-only and non-ASCII data.

I didn't modify these encoders, they still use the private _PyBytesWriter API.

> The backslashreplace and xmlcharrefreplace error handlers (encoding).

Same.

If I modify these encoders and error handlers later, I will run benchmarks to decide if it's acceptable to use the public API or not.

@vstinner (Member, Author) commented:

Microbenchmark on PyBytes_FromFormat() and PyBytes_DecodeEscape() functions.

```python
import pyperf
runner = pyperf.Runner()

import ctypes
from ctypes import pythonapi, py_object
from ctypes import (
    c_int, c_uint, c_long, c_ulong, c_size_t, c_ssize_t, c_char_p)

PyBytes_FromFormat = pythonapi.PyBytes_FromFormat
PyBytes_FromFormat.argtypes = (c_char_p,)
PyBytes_FromFormat.restype = py_object

PyBytes_DecodeEscape = pythonapi.PyBytes_DecodeEscape
PyBytes_DecodeEscape.argtypes = (c_char_p, c_size_t, c_char_p, c_size_t, c_char_p)
PyBytes_DecodeEscape.restype = py_object

runner.bench_func('Format hello world', PyBytes_FromFormat, b'Hello %s !', b'world')

fmt = (b'Hell%c' + b' ' * 1024 + b' %s')
runner.bench_func('Format long format', PyBytes_FromFormat, fmt, c_int(ord('o')), b'world')

s = b'abc\\ndef\\x40.'
runner.bench_func('Decode simple', PyBytes_DecodeEscape, s, len(s), None, 0, b'unused')

s = b'x' * 1024
runner.bench_func('Decode long copy', PyBytes_DecodeEscape, s, len(s), None, 0, b'unused')

s = b'\\x40' * 1024
runner.bench_func('Decode long \\x40', PyBytes_DecodeEscape, s, len(s), None, 0, b'unused')
```

Results:

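| Benchmark          | ref     | pep782                |
|--------------------|---------|-----------------------|
| Format long format | 1.06 us | 1.04 us: 1.02x faster |
| Decode simple      | 776 ns  | 743 ns: 1.04x faster  |
| Decode long copy   | 1.38 us | 1.34 us: 1.03x faster |
| Decode long \x40   | 2.70 us | 2.67 us: 1.01x faster |
| Geometric mean     | (ref)   | 1.02x faster          |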

Benchmark hidden because not significant (1): Format hello world

I'm not sure why PEP 782 is faster, but at least it's not slower :-)

I built Python with gcc -O3 (without PGO, LTO, or CPU isolation).
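
To make the three "Decode" benchmark inputs concrete, the sketch below calls PyBytes_DecodeEscape() through ctypes on shortened versions of the same inputs and records what they decode to. It assumes a CPython build that exports the symbol through `ctypes.pythonapi`; `decode` is a hypothetical helper, and the repeat count of 4 (instead of 1024) is chosen only to keep the results readable.

```python
from ctypes import pythonapi, py_object, c_char_p, c_ssize_t

decode_escape = pythonapi.PyBytes_DecodeEscape
# Arguments: s, len, errors, unicode, recode_encoding.
decode_escape.argtypes = (c_char_p, c_ssize_t, c_char_p, c_ssize_t, c_char_p)
decode_escape.restype = py_object


def decode(raw):
    # Pass NULL for errors and recode_encoding, and 0 for unicode.
    return decode_escape(raw, len(raw), None, 0, None)


# 'Decode simple': a mix of literal bytes and two escape sequences.
simple = decode(b'abc\\ndef\\x40.')

# 'Decode long copy': no backslash at all, so the input is copied as is.
plain = decode(b'x' * 4)

# 'Decode long \x40': every 4 input bytes collapse into a single b'@'.
escaped = decode(b'\\x40' * 4)
```

The "long copy" case exercises the plain copy path, while the "long \x40" case exercises escape processing on every item, which fits the higher timings reported for it.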


@vstinner (Member, Author) commented:

I started to split this huge PR into smaller PRs, see PRs attached to the issue #129813.

vstinner deleted the bytes_writer_size branch on December 3, 2025 15:37.