Python Enhancement Proposals

PEP 782 – Add PyBytesWriter C API

Author:
Victor Stinner <vstinner at python.org>
Discussions-To:
Discourse thread
Status:
Final
Type:
Standards Track
Created:
27-Mar-2025
Python-Version:
3.15
Post-History:
18-Feb-2025
Resolution:
11-Sep-2025

Important

This PEP is a historical document. The up-to-date, canonical documentation can now be found at the PyBytesWriter API.

See PEP 1 for how to propose changes.

Abstract

Add a new PyBytesWriter C API to create bytes objects.

Soft deprecate the PyBytes_FromStringAndSize(NULL, size) and _PyBytes_Resize() APIs. These APIs treat an immutable bytes object as a mutable object. They remain available and maintained and do not emit deprecation warnings, but are no longer recommended when writing new code.

Rationale

Disallow creation of incomplete/inconsistent objects

Creating a Python bytes object using PyBytes_FromStringAndSize(NULL, size) and _PyBytes_Resize() treats an immutable bytes object as mutable. It goes against the principle that bytes objects are immutable. It also creates an incomplete or “invalid” object since bytes are not initialized. In Python, a bytes object should always have its bytes fully initialized.

Inefficient allocation strategy

When creating a bytes string and the output size is unknown, one strategy is to allocate a short buffer and extend it (to the exact size) each time a larger write is needed.

This strategy is inefficient because it requires enlarging the buffer multiple times. It’s more efficient to overallocate the buffer the first time that a larger write is needed. It reduces the number of expensive realloc() operations which can imply a memory copy.
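
The difference between the two strategies can be illustrated with a small standalone model that does not use the Python C API. The helper name count_reallocs and the 50% growth factor are assumptions chosen for this sketch; they are not CPython's actual overallocation policy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count how many reallocations are needed to reach
 * `total` bytes when the buffer is extended `chunk` bytes at a time.
 * If `overallocate` is non-zero, each reallocation reserves 50% extra room
 * (an arbitrary factor used only for this illustration). */
int
count_reallocs(size_t total, size_t chunk, int overallocate)
{
    size_t size = 0;       /* bytes written so far */
    size_t capacity = 0;   /* bytes currently reserved */
    int reallocs = 0;

    assert(chunk > 0);
    while (size < total) {
        size_t needed = size + chunk;
        if (needed > capacity) {
            /* A real implementation would call realloc() here. */
            capacity = overallocate ? needed + needed / 2 : needed;
            reallocs++;
        }
        size = needed;
    }
    return reallocs;
}
```

In this model, writing 1000 bytes in 10-byte chunks takes 100 reallocations with exact-size growth and 10 with the 50% overallocation.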

Specification

API

type PyBytesWriter
A Python bytes writer instance created by PyBytesWriter_Create().

The instance must be destroyed by PyBytesWriter_Finish() or PyBytesWriter_Discard().

Create, Finish, Discard

PyBytesWriter *PyBytesWriter_Create(Py_ssize_t size)
Create a PyBytesWriter to write size bytes.

If size is greater than zero, allocate size bytes, and set the writer size to size. The caller is responsible for writing size bytes using PyBytesWriter_GetData().

On error, set an exception and return NULL.

size must be positive or zero.

PyObject *PyBytesWriter_Finish(PyBytesWriter *writer)
Finish a PyBytesWriter created by PyBytesWriter_Create().

On success, return a Python bytes object. On error, set an exception and return NULL.

The writer instance is invalid after the call in any case.

PyObject *PyBytesWriter_FinishWithSize(PyBytesWriter *writer, Py_ssize_t size)
Similar to PyBytesWriter_Finish(), but resize the writer to size bytes before creating the bytes object.

PyObject *PyBytesWriter_FinishWithPointer(PyBytesWriter *writer, void *buf)
Similar to PyBytesWriter_Finish(), but resize the writer using buf pointer before creating the bytes object.

Set an exception and return NULL if buf pointer is outside the internal buffer bounds.

Function pseudo-code:

    Py_ssize_t size = (char*)buf - (char*)PyBytesWriter_GetData(writer);
    return PyBytesWriter_FinishWithSize(writer, size);

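
The bounds check on buf is not spelled out in the pseudo-code. One way it could be expressed is sketched below in plain C, without the Python C API. The helper position_to_size and its parameters are hypothetical, and the reference implementation may perform the check differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the C API: convert a write position
 * `buf` into a byte count relative to `data`, the start of a buffer of
 * `allocated` bytes. Return -1 if `buf` lies outside
 * [data, data + allocated]. */
ptrdiff_t
position_to_size(const char *data, size_t allocated, const char *buf)
{
    assert(data != NULL && buf != NULL);

    /* Compare addresses as integers: relational operators on pointers
     * into different objects are undefined behaviour in C. */
    uintptr_t start = (uintptr_t)data;
    uintptr_t pos = (uintptr_t)buf;

    if (pos < start || pos - start > allocated) {
        return -1;
    }
    return (ptrdiff_t)(pos - start);
}
```

A position equal to data + allocated is accepted, since it means the whole buffer was written.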
void PyBytesWriter_Discard(PyBytesWriter *writer)
Discard a PyBytesWriter created by PyBytesWriter_Create().

Do nothing if writer is NULL.

The writer instance is invalid after the call.

High-level API

int PyBytesWriter_WriteBytes(PyBytesWriter *writer, const void *bytes, Py_ssize_t size)
Grow the writer internal buffer by size bytes, write size bytes of bytes at the writer end, and add size to the writer size.

If size is equal to -1, call strlen(bytes) to get the string length.

On success, return 0. On error, set an exception and return -1.

int PyBytesWriter_Format(PyBytesWriter *writer, const char *format, ...)
Similar to PyBytes_FromFormat(), but write the output directly at the writer end. Grow the writer internal buffer on demand. Then add the written size to the writer size.

On success, return 0. On error, set an exception and return -1.

Getters

Py_ssize_t PyBytesWriter_GetSize(PyBytesWriter *writer)
Get the writer size.

void *PyBytesWriter_GetData(PyBytesWriter *writer)
Get the writer data: start of the internal buffer.

The pointer is valid until PyBytesWriter_Finish() or PyBytesWriter_Discard() is called on writer.

Low-level API

int PyBytesWriter_Resize(PyBytesWriter *writer, Py_ssize_t size)
Resize the writer to size bytes. It can be used to enlarge or to shrink the writer.

Newly allocated bytes are left uninitialized.

On success, return 0. On error, set an exception and return -1.

size must be positive or zero.

int PyBytesWriter_Grow(PyBytesWriter *writer, Py_ssize_t grow)
Resize the writer by adding grow bytes to the current writer size.

Newly allocated bytes are left uninitialized.

On success, return 0. On error, set an exception and return -1.

grow can be negative to shrink the writer.

void *PyBytesWriter_GrowAndUpdatePointer(PyBytesWriter *writer, Py_ssize_t size, void *buf)
Similar to PyBytesWriter_Grow(), but also update the buf pointer.

The buf pointer is moved if the internal buffer is moved in memory. The buf relative position within the internal buffer is left unchanged.

On error, set an exception and return NULL.

buf must not be NULL.

Function pseudo-code:

    Py_ssize_t pos = (char*)buf - (char*)PyBytesWriter_GetData(writer);
    if (PyBytesWriter_Grow(writer, size) < 0) {
        return NULL;
    }
    return (char*)PyBytesWriter_GetData(writer) + pos;
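
The reason the offset is saved before growing is that a reallocation may move the buffer, leaving the old pointer dangling. The standalone sketch below reproduces the same pattern with realloc() instead of the Python C API. The toy_buffer type and toy_grow_and_update_pointer function are hypothetical names introduced for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a writer: a heap buffer and its size. */
typedef struct {
    char *data;
    size_t size;
} toy_buffer;

/* Grow `b` by `grow` bytes and return `pos` translated to the (possibly
 * moved) buffer. Return NULL on allocation failure, leaving `b` unchanged. */
char *
toy_grow_and_update_pointer(toy_buffer *b, size_t grow, char *pos)
{
    assert(pos != NULL);

    /* Save the offset first: after realloc(), the old pointer may dangle. */
    size_t offset = (size_t)(pos - b->data);

    char *data = realloc(b->data, b->size + grow);
    if (data == NULL) {
        return NULL;
    }
    b->data = data;
    b->size += grow;

    /* Same relative position, new base address. */
    return data + offset;
}
```

The caller keeps writing through the returned pointer and never reuses the pointer it passed in.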

Overallocation

PyBytesWriter_Resize() and PyBytesWriter_Grow() overallocate the internal buffer to reduce the number of realloc() calls and so reduce memory copies.

PyBytesWriter_Finish() trims overallocations: it shrinks the internal buffer to the exact size when creating the final bytes object.
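
The relationship between the requested size and the reserved size can be modelled in plain C as follows. This is a sketch under stated assumptions, not the CPython implementation: the toy_writer type, the toy_resize and toy_finish functions, and the 25% growth factor are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model: `size` is what the caller asked for,
 * `allocated` is what was actually reserved. */
typedef struct {
    char *data;
    size_t size;
    size_t allocated;
} toy_writer;

/* Resize to `size` bytes, reserving 25% extra room (arbitrary factor)
 * when the buffer must grow. Return 0 on success, -1 on failure. */
int
toy_resize(toy_writer *w, size_t size)
{
    if (size > w->allocated) {
        size_t allocated = size + size / 4;
        char *data = realloc(w->data, allocated);
        if (data == NULL) {
            return -1;
        }
        w->data = data;
        w->allocated = allocated;
    }
    w->size = size;
    return 0;
}

/* Trim the spare room and hand the buffer to the caller, who must free()
 * it. The writer is reset and must not be reused without a new resize. */
char *
toy_finish(toy_writer *w, size_t *size)
{
    char *data = w->data;
    if (w->size > 0 && w->size < w->allocated) {
        char *trimmed = realloc(data, w->size);
        if (trimmed != NULL) {
            data = trimmed;   /* on failure, keep the larger block */
        }
    }
    *size = w->size;
    w->data = NULL;
    w->size = 0;
    w->allocated = 0;
    return data;
}
```

A second toy_resize() call that fits within the reserved room touches only the size field, which is where the saving comes from.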

Thread safety

The API is not thread safe: a writer should only be used by a single thread at a time.

Soft deprecations

Soft deprecate the PyBytes_FromStringAndSize(NULL, size) and _PyBytes_Resize() APIs. These APIs treat an immutable bytes object as a mutable object. They remain available and maintained and do not emit deprecation warnings, but are no longer recommended when writing new code.

PyBytes_FromStringAndSize(str, size) is not soft deprecated. Only calls with NULL str are soft deprecated.

Examples

High-level API

Create the bytes string b"Hello World!":

    PyObject* hello_world(void)
    {
        PyBytesWriter *writer = PyBytesWriter_Create(0);
        if (writer == NULL) {
            goto error;
        }
        if (PyBytesWriter_WriteBytes(writer, "Hello", -1) < 0) {
            goto error;
        }
        if (PyBytesWriter_Format(writer, " %s!", "World") < 0) {
            goto error;
        }
        return PyBytesWriter_Finish(writer);

    error:
        PyBytesWriter_Discard(writer);
        return NULL;
    }

Create the bytes string “abc”

Example creating the bytes string b"abc", with a fixed size of 3 bytes:

    PyObject* create_abc(void)
    {
        PyBytesWriter *writer = PyBytesWriter_Create(3);
        if (writer == NULL) {
            return NULL;
        }

        char *str = PyBytesWriter_GetData(writer);
        memcpy(str, "abc", 3);
        return PyBytesWriter_Finish(writer);
    }

GrowAndUpdatePointer() example

Example using a pointer to write bytes and to track the written size.

Create the bytes string b"Hello World":

    PyObject* grow_example(void)
    {
        // Allocate 10 bytes
        PyBytesWriter *writer = PyBytesWriter_Create(10);
        if (writer == NULL) {
            return NULL;
        }

        // Write some bytes
        char *buf = PyBytesWriter_GetData(writer);
        memcpy(buf, "Hello ", strlen("Hello "));
        buf += strlen("Hello ");

        // Allocate 10 more bytes
        buf = PyBytesWriter_GrowAndUpdatePointer(writer, 10, buf);
        if (buf == NULL) {
            PyBytesWriter_Discard(writer);
            return NULL;
        }

        // Write more bytes
        memcpy(buf, "World", strlen("World"));
        buf += strlen("World");

        // Truncate the string at 'buf' position
        // and create a bytes object
        return PyBytesWriter_FinishWithPointer(writer, buf);
    }

Update PyBytes_FromStringAndSize() code

Example of code using the soft deprecated PyBytes_FromStringAndSize(NULL, size) API:

    PyObject *result = PyBytes_FromStringAndSize(NULL, num_bytes);
    if (result == NULL) {
        return NULL;
    }
    if (copy_bytes(PyBytes_AS_STRING(result), start, num_bytes) < 0) {
        Py_CLEAR(result);
    }
    return result;

It can now be updated to:

    PyBytesWriter *writer = PyBytesWriter_Create(num_bytes);
    if (writer == NULL) {
        return NULL;
    }
    if (copy_bytes(PyBytesWriter_GetData(writer), start, num_bytes) < 0) {
        PyBytesWriter_Discard(writer);
        return NULL;
    }
    return PyBytesWriter_Finish(writer);

Update _PyBytes_Resize() code

Example of code using the soft deprecated _PyBytes_Resize() API:

    PyObject *v = PyBytes_FromStringAndSize(NULL, size);
    if (v == NULL) {
        return NULL;
    }
    char *p = PyBytes_AS_STRING(v);

    // ... fill bytes into 'p' ...

    if (_PyBytes_Resize(&v, (p - PyBytes_AS_STRING(v)))) {
        return NULL;
    }
    return v;

It can now be updated to:

    PyBytesWriter *writer = PyBytesWriter_Create(size);
    if (writer == NULL) {
        return NULL;
    }
    char *p = PyBytesWriter_GetData(writer);

    // ... fill bytes into 'p' ...

    return PyBytesWriter_FinishWithPointer(writer, p);

Reference Implementation

Pull request gh-131681.

Notes on the CPython reference implementation which are not part of the Specification:

  • The implementation allocates internally a bytes object, so PyBytesWriter_Finish() just returns the object without having to copy memory.
  • For strings up to 256 bytes, a small internal raw buffer of bytes is used. It avoids having to resize a bytes object which is inefficient. At the end, PyBytesWriter_Finish() creates the bytes object from this small buffer.
  • A free list is used to reduce the cost of allocating a PyBytesWriter on the heap memory.

Backwards Compatibility

There is no impact on backward compatibility: only new APIs are added.

The PyBytes_FromStringAndSize(NULL, size) and _PyBytes_Resize() APIs are soft deprecated. No new warning is emitted when these functions are used and they are not planned for removal.

Prior Discussions

Copyright

This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.


Source: https://github.com/python/peps/blob/main/peps/pep-0782.rst

Last modified: 2025-09-18 13:28:58 GMT

