PEP 3118 – Revising the buffer protocol

Author: Travis Oliphant <oliphant at ee.byu.edu>, Carl Banks <pythondev at aerojockey.com>
Important

This PEP is a historical document. The up-to-date, canonical documentation can now be found at Buffer Protocol, PyBufferProcs, PyMemoryView_FromObject.

Not all features proposed here were implemented. Specifically:

PyObject_CopyToObject was not added.
Additions to the struct string-syntax were not added, except for ? (_Bool).
PyObject_GetMemoryView is named PyMemoryView_FromObject.

This PEP targets Python 3.0, which was released more than a decade ago. Any proposals to add missing functionality should be discussed as new features, not treated as finishing the implementation of this PEP.

See PEP 1 for how to propose changes.

Abstract

This PEP proposes re-designing the buffer interface (PyBufferProcs function pointers) to improve the way Python allows memory sharing in Python 3.0.

In particular, it is proposed that the character buffer portion of the API be eliminated and the multiple-segment portion be re-designed in conjunction with allowing for strided memory to be shared. In addition, the new buffer interface will allow the sharing of any multi-dimensional nature of the memory and what data-format the memory contains.

This interface will allow any extension module to either create objects that share memory or create algorithms that use and manipulate raw memory from arbitrary objects that export the interface.

Rationale

The Python 2.X buffer protocol allows different Python types to exchange a pointer to a sequence of internal buffers. This functionality is extremely useful for sharing large segments of memory between different high-level objects, but it is too limited and has issues:

There is the little used “sequence-of-segments” option (bf_getsegcount) that is not well motivated.
There is the apparently redundant character-buffer option (bf_getcharbuffer).
There is no way for a consumer to tell the buffer-API-exporting object it is “finished” with its view of the memory and therefore no way for the exporting object to be sure that it is safe to reallocate the pointer to the memory that it owns (for example, the array object reallocating its memory after sharing it with the buffer object which held the original pointer led to the infamous buffer-object problem).
Memory is just a pointer with a length. There is no way to describe what is “in” the memory (float, int, C-structure, etc.).
There is no shape information provided for the memory. But, several array-like Python types could make use of a standard way to describe the shape-interpretation of the memory (wxPython, GTK, pyQT, CVXOPT, PyVox, Audio and Video Libraries, ctypes, NumPy, data-base interfaces, etc.).
There is no way to share discontiguous memory (except through the sequence of segments notion).
There are two widely used libraries that use the concept of discontiguous memory: PIL and NumPy. Their view of discontiguous arrays is different, though. The proposed buffer interface allows sharing of either memory model. Exporters will typically use only one approach and consumers may choose to support discontiguous arrays of each type however they choose.
NumPy uses the notion of constant striding in each dimension as its basic concept of an array. With this concept, a simple sub-region of a larger array can be described without copying the data. Thus, stride information is the additional information that must be shared.
The PIL uses a more opaque memory representation. Sometimes an image is contained in a contiguous segment of memory, but sometimes it is contained in an array of pointers to the contiguous segments (usually lines) of the image. The PIL is where the idea of multiple buffer segments in the original buffer interface came from.
NumPy’s strided memory model is used more often in computational libraries and because it is so simple it makes sense to support memory sharing using this model. The PIL memory model is sometimes used in C-code where a 2-d array can then be accessed using double pointer indirection: e.g. image[i][j].
The buffer interface should allow the object to export either of these memory models. Consumers are free to either require contiguous memory or write code to handle one or both of these memory models.
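
As a rough illustration of the strided model, the sketch below uses the memoryview object as it exists in current Python 3 (an assumption that goes beyond the text of this proposal). It takes every second byte of a buffer without copying. The variable names are illustrative only.

```python
# Sketch: a strided view onto existing memory, with no copy involved.
data = bytearray(b"abcdefgh")
view = memoryview(data)

# Every second byte, starting at offset 1. Only the stride differs
# from the original view; the underlying memory is shared.
sub = view[1::2]
print(sub.shape, sub.strides)   # (4,) (2,)
print(sub.tobytes())            # b'bdfh'

# Writing through the strided view changes the original object.
sub[0] = ord("X")
print(data)                     # bytearray(b'aXcdefgh')
```

The sub-view describes a sub-region purely through its shape and stride, which is the extra information the strided model asks exporters to share.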

Proposal Overview

Eliminate the char-buffer and multiple-segment sections of the buffer-protocol.
Unify the read/write versions of getting the buffer.
Add a new function to the interface that should be called when the consumer object is “done” with the memory area.
Add a new variable to allow the interface to describe what is in memory (unifying what is currently done in struct and array).
Add a new variable to allow the protocol to share shape information.
Add a new variable for sharing stride information.
Add a new mechanism for sharing arrays that must be accessed using pointer indirection.
Fix all objects in the core and the standard library to conform to the new interface.
Extend the struct module to handle more format specifiers.
Extend the buffer object into a new memory object which places a Python veneer around the buffer interface.
Add a few functions to make it easy to copy contiguous data in and out of objects supporting the buffer interface.

Specification

While the new specification allows for complicated memory sharing, simple contiguous buffers of bytes can still be obtained from an object. In fact, the new protocol allows a standard mechanism for doing this even if the original object is not represented as a contiguous chunk of memory.

The easiest way to obtain a simple contiguous chunk of memory is to use the provided C-API to obtain a chunk of memory.

Change the PyBufferProcs structure to:

    typedef struct {
        getbufferproc bf_getbuffer;
        releasebufferproc bf_releasebuffer;
    } PyBufferProcs;

Both of these routines are optional for a type object.

    typedef int (*getbufferproc)(PyObject *obj, Py_buffer *view, int flags)

This function returns 0 on success and -1 on failure (and raises an error). The first argument is the “exporting” object. The second argument is the address of a bufferinfo structure. Both arguments must never be NULL.

The third argument indicates what kind of buffer the consumer is prepared to deal with and therefore what kind of buffer the exporter is allowed to return. The new buffer interface allows for much more complicated memory sharing possibilities. Some consumers may not be able to handle all the complexity but may want to see if the exporter will let them take a simpler view of its memory.

In addition, some exporters may not be able to share memory in every possible way and may need to raise errors to signal to some consumers that something is just not possible. These errors should be PyExc_BufferError unless there is another error that is actually causing the problem. The exporter can use flags information to simplify how much of the Py_buffer structure is filled in with non-default values and/or raise an error if the object can’t support a simpler view of its memory.

The exporter should always fill in all elements of the buffer structure (with defaults or NULLs if nothing else is requested). The PyBuffer_FillInfo function can be used for simple cases.

Access flags

Some flags are useful for requesting a specific kind of memory segment, while others indicate to the exporter what kind of information the consumer can deal with. If certain information is not asked for by the consumer, but the exporter cannot share its memory without that information, then a PyExc_BufferError should be raised.

PyBUF_SIMPLE

This is the default flag state (0). The returned buffer may or may not have writable memory. The format will be assumed to be unsigned bytes. This is a “stand-alone” flag constant. It never needs to be |’d to the others. The exporter will raise an error if it cannot provide such a contiguous buffer of bytes.

PyBUF_WRITABLE

The returned buffer must be writable. If it is not writable, then raise an error.

PyBUF_FORMAT

The returned buffer must have true format information if this flag is provided. This would be used when the consumer is going to be checking for what ‘kind’ of data is actually stored. An exporter should always be able to provide this information if requested. If format is not explicitly requested then the format must be returned as NULL (which means “B”, or unsigned bytes).

PyBUF_ND

The returned buffer must provide shape information. The memory will be assumed C-style contiguous (last dimension varies the fastest). The exporter may raise an error if it cannot provide this kind of contiguous buffer. If this is not given then shape will be NULL.

PyBUF_STRIDES (implies PyBUF_ND)

The returned buffer must provide strides information (i.e. the strides cannot be NULL). This would be used when the consumer can handle strided, discontiguous arrays. Handling strides automatically assumes you can handle shape. The exporter may raise an error if it cannot provide a strided-only representation of the data (i.e. without the suboffsets).

PyBUF_C_CONTIGUOUS

PyBUF_F_CONTIGUOUS

PyBUF_ANY_CONTIGUOUS

These flags indicate that the returned buffer must be, respectively, C-contiguous (last dimension varies the fastest), Fortran contiguous (first dimension varies the fastest) or either one. All of these flags imply PyBUF_STRIDES and guarantee that the strides buffer info structure will be filled in correctly.

PyBUF_INDIRECT (implies PyBUF_STRIDES)

The returned buffer must have suboffsets information (which can be NULL if no suboffsets are needed). This would be used when the consumer can handle indirect array referencing implied by these suboffsets.

Specialized combinations of flags for specific kinds of memory sharing.

Multi-dimensional (but contiguous)
    PyBUF_CONTIG (PyBUF_ND | PyBUF_WRITABLE)
    PyBUF_CONTIG_RO (PyBUF_ND)
Multi-dimensional using strides but aligned
    PyBUF_STRIDED (PyBUF_STRIDES | PyBUF_WRITABLE)
    PyBUF_STRIDED_RO (PyBUF_STRIDES)
Multi-dimensional using strides and not necessarily aligned
    PyBUF_RECORDS (PyBUF_STRIDES | PyBUF_WRITABLE | PyBUF_FORMAT)
    PyBUF_RECORDS_RO (PyBUF_STRIDES | PyBUF_FORMAT)
Multi-dimensional using sub-offsets
    PyBUF_FULL (PyBUF_INDIRECT | PyBUF_WRITABLE | PyBUF_FORMAT)
    PyBUF_FULL_RO (PyBUF_INDIRECT | PyBUF_FORMAT)

Thus, the consumer simply wanting a contiguous chunk of bytes from the object would use PyBUF_SIMPLE, while a consumer that understands how to make use of the most complicated cases could use PyBUF_FULL.
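
The way the flags imply one another can be modelled with plain bit masks. The following Python sketch is illustrative only: the numeric values are an assumption taken from CPython's headers rather than from this proposal, and the helper `requests` is a hypothetical name.

```python
# Assumed numeric values (as in CPython's headers, not defined by this text).
PyBUF_SIMPLE = 0
PyBUF_WRITABLE = 0x0001
PyBUF_FORMAT = 0x0004
PyBUF_ND = 0x0008
PyBUF_STRIDES = 0x0010 | PyBUF_ND          # strides imply shape
PyBUF_INDIRECT = 0x0100 | PyBUF_STRIDES    # suboffsets imply strides

# Combinations listed above.
PyBUF_CONTIG = PyBUF_ND | PyBUF_WRITABLE
PyBUF_RECORDS_RO = PyBUF_STRIDES | PyBUF_FORMAT
PyBUF_FULL = PyBUF_INDIRECT | PyBUF_WRITABLE | PyBUF_FORMAT


def requests(flags, flag):
    """Hypothetical helper: True if every bit of `flag` is set in `flags`."""
    return (flags & flag) == flag


print(requests(PyBUF_FULL, PyBUF_ND))          # True
print(requests(PyBUF_CONTIG, PyBUF_STRIDES))   # False
print(requests(PyBUF_RECORDS_RO, PyBUF_WRITABLE))  # False
```

An exporter could use a test of this shape to decide which members of the buffer structure it has to fill in and when it must raise an error instead.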

The format information is only guaranteed to be non-NULL if PyBUF_FORMAT is in the flag argument, otherwise it is expected the consumer will assume unsigned bytes.

There is a C-API that simple exporting objects can use to fill-in the buffer info structure correctly according to the provided flags if a contiguous chunk of “unsigned bytes” is all that can be exported.

The Py_buffer struct

The bufferinfo structure is:

    typedef struct bufferinfo {
        void *buf;
        Py_ssize_t len;
        int readonly;
        const char *format;
        int ndim;
        Py_ssize_t *shape;
        Py_ssize_t *strides;
        Py_ssize_t *suboffsets;
        Py_ssize_t itemsize;
        void *internal;
    } Py_buffer;

Before calling the bf_getbuffer function, the bufferinfo structure can be filled with whatever, but the buf field must be NULL when requesting a new buffer. Upon return from bf_getbuffer, the bufferinfo structure is filled in with relevant information about the buffer. This same bufferinfo structure must be passed to bf_releasebuffer (if available) when the consumer is done with the memory. The caller is responsible for keeping a reference to obj until releasebuffer is called (i.e. the call to bf_getbuffer does not alter the reference count of obj).

The members of the bufferinfo structure are:

buf

a pointer to the start of the memory for the object

len

the total bytes of memory the object uses. This should be the same as the product of the shape array multiplied by the number of bytes per item of memory.

readonly

an integer variable to hold whether or not the memory is readonly. 1 means the memory is readonly, zero means the memory is writable.

format

a NULL-terminated format-string (following the struct-style syntax including extensions) indicating what is in each element of memory. The number of elements is len / itemsize, where itemsize is the number of bytes implied by the format. This can be NULL which implies standard unsigned bytes (“B”).

ndim

a variable storing the number of dimensions the memory represents. Must be >=0. A value of 0 means that shape and strides and suboffsets must be NULL (i.e. the memory represents a scalar).

shape

an array of Py_ssize_t of length ndim indicating the shape of the memory as an N-D array. Note that (shape[0] * ... * shape[ndim-1]) * itemsize = len. If ndim is 0 (indicating a scalar), then this must be NULL.

strides

address of a Py_ssize_t* variable that will be filled with a pointer to an array of Py_ssize_t of length ndim (or NULL if ndim is 0), indicating the number of bytes to skip to get to the next element in each dimension. If this is not requested by the caller (PyBUF_STRIDES is not set), then this should be set to NULL which indicates a C-style contiguous array, or a PyExc_BufferError raised if this is not possible.

suboffsets

address of a Py_ssize_t* variable that will be filled with a pointer to an array of Py_ssize_t of length ndim. If these suboffset numbers are >=0, then the value stored along the indicated dimension is a pointer and the suboffset value dictates how many bytes to add to the pointer after de-referencing. A suboffset value that is negative indicates that no de-referencing should occur (striding in a contiguous memory block). If all suboffsets are negative (i.e. no de-referencing is needed), then this must be NULL (the default value). If this is not requested by the caller (PyBUF_INDIRECT is not set), then this should be set to NULL or a PyExc_BufferError raised if this is not possible.

For clarity, here is a function that returns a pointer to the element in an N-D array pointed to by an N-dimensional index when there are both non-NULL strides and suboffsets:

    void *get_item_pointer(int ndim, void *buf, Py_ssize_t *strides,
                           Py_ssize_t *suboffsets, Py_ssize_t *indices) {
        char *pointer = (char *)buf;
        int i;
        for (i = 0; i < ndim; i++) {
            pointer += strides[i] * indices[i];
            if (suboffsets[i] >= 0) {
                pointer = *((char **)pointer) + suboffsets[i];
            }
        }
        return (void *)pointer;
    }

Notice the suboffset is added “after” the dereferencing occurs. Thus slicing in the ith dimension would add to the suboffsets in the (i-1)st dimension. Slicing in the first dimension would change the location of the starting pointer directly (i.e. buf would be modified).

itemsize

This is a storage for the itemsize (in bytes) of each element of the shared memory. It is technically un-necessary as it can be obtained using PyBuffer_SizeFromFormat, however an exporter may know this information without parsing the format string and it is necessary to know the itemsize for proper interpretation of striding. Therefore, storing it is more convenient and faster.

internal

This is for use internally by the exporting object. For example, this might be re-cast as an integer by the exporter and used to store flags about whether or not the shape, strides, and suboffsets arrays must be freed when the buffer is released. The consumer should never alter this value.
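
The interplay of strides and suboffsets described for get_item_pointer can be tried out from Python with ctypes. The sketch below is an illustrative transcription, not part of the proposal: it builds a PIL-style table of row pointers and walks it with the same rule. The names `get_item_address`, `rows` and `table` are hypothetical.

```python
import ctypes


def get_item_address(ndim, buf, strides, suboffsets, indices):
    """Transcription of get_item_pointer; addresses are plain integers."""
    pointer = buf
    for i in range(ndim):
        pointer += strides[i] * indices[i]
        if suboffsets[i] >= 0:
            # Dereference: read the pointer stored at this address,
            # then add the suboffset to the result.
            pointer = ctypes.c_void_p.from_address(pointer).value + suboffsets[i]
    return pointer


# A 3x4 "image" stored as three separately allocated rows of bytes.
Row = ctypes.c_ubyte * 4
rows = [Row(*(10 * r + c for c in range(4))) for r in range(3)]

# The pointer table: one void* per row.
table = (ctypes.c_void_p * 3)(*(ctypes.addressof(r) for r in rows))

addr = get_item_address(
    ndim=2,
    buf=ctypes.addressof(table),
    strides=(ctypes.sizeof(ctypes.c_void_p), 1),
    suboffsets=(0, -1),   # dereference in dimension 0 only
    indices=(2, 3),
)
value = ctypes.c_ubyte.from_address(addr).value
print(value)  # 23
```

The first dimension strides over pointers and is dereferenced; the second strides over bytes inside one row and is not.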

The exporter is responsible for making sure that any memory pointed to by buf, format, shape, strides, and suboffsets is valid until releasebuffer is called. If the exporter wants to be able to change an object’s shape, strides, and/or suboffsets before releasebuffer is called then it should allocate those arrays when getbuffer is called (pointing to them in the buffer-info structure provided) and free them when releasebuffer is called.

Releasing the buffer

The same bufferinfo struct should be used in the release-buffer interface call. The caller is responsible for the memory of the Py_buffer structure itself.

    typedef void (*releasebufferproc)(PyObject *obj, Py_buffer *view)

Callers of getbufferproc must make sure that this function is called when memory previously acquired from the object is no longer needed. The exporter of the interface must make sure that any memory pointed to in the bufferinfo structure remains valid until releasebuffer is called.

If the bf_releasebuffer function is not provided (i.e. it is NULL), then it does not ever need to be called.

Exporters will need to define a bf_releasebuffer function if they can re-allocate their memory, strides, shape, suboffsets, or format variables which they might share through the struct bufferinfo. Several mechanisms could be used to keep track of how many getbuffer calls have been made and shared. Either a single variable could be used to keep track of how many “views” have been exported, or a linked-list of bufferinfo structures filled in could be maintained in each object.

All that is specifically required by the exporter, however, is to ensure that any memory shared through the bufferinfo structure remains valid until releasebuffer is called on the bufferinfo structure exporting that memory.
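
The effect of this rule is visible from Python in current CPython (an observation about today's implementation, not something this text specifies): a bytearray refuses to resize while a memoryview holds one of its buffers, and allows it again once the view is released.

```python
data = bytearray(b"abc")
view = memoryview(data)      # acquires a buffer from the exporter

try:
    data.extend(b"def")      # would reallocate memory the view points at
    resized = True
except BufferError:
    resized = False
print(resized)               # False

view.release()               # the consumer is done with the memory
data.extend(b"def")          # now the exporter is free to resize
print(data)                  # bytearray(b'abcdef')
```

Here the bytearray plays the role of an exporter that counts its outstanding views, which is one of the bookkeeping strategies mentioned above.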

New C-API calls are proposed

    int PyObject_CheckBuffer(PyObject *obj)

Return 1 if the getbuffer function is available, otherwise 0.

    int PyObject_GetBuffer(PyObject *obj, Py_buffer *view, int flags)

This is a C-API version of the getbuffer function call. It checks to make sure the object has the required function pointer and issues the call. Returns -1 and raises an error on failure and returns 0 on success.

    void PyBuffer_Release(PyObject *obj, Py_buffer *view)

This is a C-API version of the releasebuffer function call. It checks to make sure the object has the required function pointer and issues the call. This function always succeeds even if there is no releasebuffer function for the object.

    PyObject *PyObject_GetMemoryView(PyObject *obj)

Return a memory-view object from an object that defines the buffer interface.

A memory-view object is an extended buffer object that could replace the buffer object (but doesn’t have to as that could be kept as a simple 1-d memory-view object). Its C-structure is

    typedef struct {
        PyObject_HEAD
        PyObject *base;
        Py_buffer view;
    } PyMemoryViewObject;

This is functionally similar to the current buffer object except a reference to base is kept and the memory view is not re-grabbed. Thus, this memory view object holds on to the memory of base until it is deleted.

This memory-view object will support multi-dimensional slicing and be the first object provided with Python to do so. Slices of the memory-view object are other memory-view objects with the same base but with a different view of the base object.

When an “element” from the memory-view is returned it is always a bytes object whose format should be interpreted by the format attribute of the memoryview object. The struct module can be used to “decode” the bytes in Python if desired. Or the contents can be passed to a NumPy array or other object consuming the buffer protocol.

The Python name will be

__builtin__.memoryview

Methods:

__getitem__ (will support multi-dimensional slicing)

__setitem__ (will support multi-dimensional slicing)

tobytes (obtain a new bytes-object of a copy of the memory).

tolist (obtain a “nested” list of the memory. Everything is interpreted into standard Python objects as the struct module unpack would do; in fact it uses struct.unpack to accomplish it).

Attributes (taken from the memory of the base object):

format
itemsize
shape
strides
suboffsets
readonly
ndim
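
For orientation, the following sketch reads these attributes from the memoryview object in current Python 3. It reflects today's implementation rather than the text above; for instance, elements come back decoded according to the format instead of as raw bytes objects. The 2x3 shape is an arbitrary choice.

```python
import struct

itemsize = struct.calcsize("i")            # size of a native C int
raw = bytearray(6 * itemsize)
view = memoryview(raw).cast("i", (2, 3))   # reinterpret as a 2x3 array of ints

print(view.format, view.ndim, view.shape)         # i 2 (2, 3)
print(view.strides == (3 * itemsize, itemsize))   # True
print(view.nbytes == 2 * 3 * view.itemsize)       # True
print(view.readonly, view.suboffsets)             # False ()

view[1, 2] = 7                                    # multi-dimensional indexing
print(view.tolist())                              # [[0, 0, 0], [0, 0, 7]]
print(len(view.tobytes()) == len(raw))            # True
```

The relation between shape, itemsize and the total byte count matches the description of len in the buffer structure.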

    Py_ssize_t PyBuffer_SizeFromFormat(const char *)

Return the implied itemsize of the data-format area from a struct-style description.
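
At the Python level, the closest existing counterpart is struct.calcsize, which computes the size implied by a struct-style format string. A small sketch, using the '<' prefix so that the sizes do not depend on the platform:

```python
import struct

# Standard sizes (with the '<' prefix) are platform independent.
print(struct.calcsize("<i"))     # 4
print(struct.calcsize("<BBB"))   # 3
print(struct.calcsize("<d"))     # 8

# len / itemsize gives the number of elements in a buffer.
buf = bytes(24)
count = len(buf) // struct.calcsize("<d")
print(count)                     # 3
```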

    PyObject *PyMemoryView_GetContiguous(PyObject *obj, int buffertype, char fortran)

Return a memoryview object to a contiguous chunk of memory represented by obj. If a copy must be made (because the memory pointed to by obj is not contiguous), then a new bytes object will be created and become the base object for the returned memory view object.

The buffertype argument can be PyBUF_READ, PyBUF_WRITE, or PyBUF_UPDATEIFCOPY to determine whether the returned buffer should be readable, writable, or set to update the original buffer if a copy must be made. If buffertype is PyBUF_WRITE and the buffer is not contiguous an error will be raised. In this circumstance, the user can use PyBUF_UPDATEIFCOPY to ensure that a writable temporary contiguous buffer is returned. The contents of this contiguous buffer will be copied back into the original object after the memoryview object is deleted as long as the original object is writable. If this is not allowed by the original object, then a BufferError is raised.

If the object is multi-dimensional, then if fortran is ‘F’, the first dimension of the underlying array will vary the fastest in the buffer. If fortran is ‘C’, then the last dimension will vary the fastest (C-style contiguous). If fortran is ‘A’, then it does not matter and you will get whatever the object decides is more efficient. If a copy is made, then the memory must be freed by calling PyMem_Free.

You receive a new reference to the memoryview object.

    int PyObject_CopyToObject(PyObject *obj, void *buf, Py_ssize_t len, char fortran)

Copy len bytes of data pointed to by the contiguous chunk of memory pointed to by buf into the buffer exported by obj. Return 0 on success and return -1 and raise an error on failure. If the object does not have a writable buffer, then an error is raised. If fortran is ‘F’, then if the object is multi-dimensional, then the data will be copied into the array in Fortran-style (first dimension varies the fastest). If fortran is ‘C’, then the data will be copied into the array in C-style (last dimension varies the fastest). If fortran is ‘A’, then it does not matter and the copy will be made in whatever way is more efficient.

    int PyObject_CopyData(PyObject *dest, PyObject *src)

These last three C-API calls allow a standard way of getting data in and out of Python objects into contiguous memory areas no matter how it is actually stored. These calls use the extended buffer interface to perform their work.

    int PyBuffer_IsContiguous(Py_buffer *view, char fortran)

Return 1 if the memory defined by the view object is C-style (fortran = ‘C’) or Fortran-style (fortran = ‘F’) contiguous or either one (fortran = ‘A’). Return 0 otherwise.

    void PyBuffer_FillContiguousStrides(int ndim, Py_ssize_t *shape, Py_ssize_t *strides, Py_ssize_t itemsize, char fortran)

Fill the strides array with byte-strides of a contiguous (C-style if fortran is ‘C’ or Fortran-style if fortran is ‘F’) array of the given shape with the given number of bytes per element.
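
The stride computation and the contiguity test can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the C implementation: the function names are hypothetical, and the contiguity check ignores the special cases of dimensions of length 0 or 1.

```python
def contiguous_strides(shape, itemsize, order="C"):
    """Byte strides of a contiguous array; order is 'C' or 'F'."""
    strides = [0] * len(shape)
    step = itemsize
    # C order: the last dimension varies fastest; Fortran order: the first.
    dims = range(len(shape) - 1, -1, -1) if order == "C" else range(len(shape))
    for i in dims:
        strides[i] = step
        step *= shape[i]
    return tuple(strides)


def is_contiguous(shape, strides, itemsize, order="C"):
    """Simplified check: compare against the strides a contiguous array has."""
    orders = "CF" if order == "A" else order
    return any(tuple(strides) == contiguous_strides(shape, itemsize, o)
               for o in orders)


print(contiguous_strides((2, 3, 4), 8, "C"))   # (96, 32, 8)
print(contiguous_strides((2, 3, 4), 8, "F"))   # (8, 16, 48)

# Cross-check against the memoryview object of current Python 3.
view = memoryview(bytearray(2 * 3 * 4 * 8)).cast("d", (2, 3, 4))
print(view.strides == contiguous_strides(view.shape, view.itemsize))  # True
print(is_contiguous((2, 3, 4), (8, 16, 48), 8, "C"))                  # False
print(is_contiguous((2, 3, 4), (8, 16, 48), 8, "A"))                  # True
```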

    int PyBuffer_FillInfo(Py_buffer *view, void *buf, Py_ssize_t len, int readonly, int infoflags)

Fills in a buffer-info structure correctly for an exporter that can only share a contiguous chunk of memory of “unsigned bytes” of the given length. Returns 0 on success and -1 (raising an error) on error.

PyExc_BufferError

A new error object for returning buffer errors which arise because an exporter cannot provide the kind of buffer that a consumer expects. This will also be raised when a consumer requests a buffer from an object that does not provide the protocol.

Additions to the struct string-syntax

The struct string-syntax is missing some characters to fully implement data-format descriptions already available elsewhere (in ctypes and NumPy for example). The Python 2.5 specification is at http://docs.python.org/library/struct.html.

Here are the proposed additions:

Character	Description
‘t’	bit (number before states how many bits)
‘?’	platform _Bool type
‘g’	long double
‘c’	ucs-1 (latin-1) encoding
‘u’	ucs-2
‘w’	ucs-4
‘O’	pointer to Python Object
‘Z’	complex (whatever the next specifier is)
‘&’	specific pointer (prefix before another character)
‘T{}’	structure (detailed layout inside {})
‘(k1,k2,…,kn)’	multi-dimensional array of whatever follows
‘:name:’	optional name of the preceding element
‘X{}’	pointer to a function (optional function signature inside {} with any return value preceded by -> and placed at the end)

The struct module will be changed to understand these as well and return appropriate Python objects on unpacking. Unpacking a long-double will return a decimal object or a ctypes long-double. Unpacking ‘u’ or ‘w’ will return Python unicode. Unpacking a multi-dimensional array will return a list (of lists if >1d). Unpacking a pointer will return a ctypes pointer object. Unpacking a function pointer will return a ctypes call-object (perhaps). Unpacking a bit will return a Python Bool. White-space in the struct-string syntax will be ignored if it isn’t already. Unpacking a named-object will return some kind of named-tuple-like object that acts like a tuple but whose entries can also be accessed by name. Unpacking a nested structure will return a nested tuple.

Endian-specification (‘!’, ‘@’, ‘=’, ‘>’, ‘<’, ‘^’) is also allowed inside the string so that it can change if needed. The previously-specified endian string is in force until changed. The default endian is ‘@’ which means native data-types and alignment. If un-aligned, native data-types are requested, then the endian specification is ‘^’.

According to the struct-module, a number can precede a character code to specify how many of that type there are. The (k1,k2,...,kn) extension also allows specifying if the data is supposed to be viewed as a (C-style contiguous, last-dimension varies the fastest) multi-dimensional array of a particular format.
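
The parts of this syntax that the struct module does support can be tried directly. The sketch below sticks to features available in current Python 3: a leading byte-order character, a repeat count, and the ‘?’ code. It does not use the proposed extensions such as ‘^’ or a byte-order change in the middle of a string.

```python
import struct

# A leading '>' or '<' selects big- or little-endian layout.
print(struct.pack(">i", 1))   # b'\x00\x00\x00\x01'
print(struct.pack("<i", 1))   # b'\x01\x00\x00\x00'

# '?' is the _Bool code; '3B' means three unsigned bytes.
packed = struct.pack("<?3B", True, 1, 2, 3)
print(packed)                          # b'\x01\x01\x02\x03'
print(struct.unpack("<?3B", packed))   # (True, 1, 2, 3)
```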

Functions should be added to ctypes to create a ctypes object from a struct description, and add long-double, and ucs-2 to ctypes.

Examples of Data-Format Descriptions

Here are some examples of C-structures and how they would be represented using the struct-style syntax.

<named> is the constructor for a named-tuple (not-specified yet).

float

'd' <--> Python float

complex double

'Zd' <--> Python complex

RGB Pixel data

'BBB' <--> (int, int, int)
'B:r:B:g:B:b:' <--> <named>((int, int, int), (‘r’, ‘g’, ‘b’))

Mixed endian (weird but possible)

'>i:big:<i:little:' <--> <named>((int, int), (‘big’, ‘little’))

Nested structure

    struct {
        int ival;
        struct {
            unsigned short sval;
            unsigned char bval;
            unsigned char cval;
        } sub;
    }
    """i:ival:
       T{
          H:sval:
          B:bval:
          B:cval:
        }:sub:
    """

Nested array

    struct {
        int ival;
        double data[16*4];
    }
    """i:ival:
       (16,4)d:data:
    """

Note that in the last example, the C-structure compared against is intentionally a 1-d array and not a 2-d array data[16][4]. The reason for this is to avoid the confusions between static multi-dimensional arrays in C (which are laid out contiguously) and dynamic multi-dimensional arrays which use the same syntax to access elements, data[0][1], but whose memory is not necessarily contiguous. The struct-syntax always uses contiguous memory and the multi-dimensional character is information about the memory to be communicated by the exporter.

In other words, the struct-syntax description does not have to match the C-syntax exactly as long as it describes the same memory layout. The fact that a C-compiler would think of the memory as a 1-d array of doubles is irrelevant to the fact that the exporter wanted to communicate to the consumer that this field of the memory should be thought of as a 2-d array where a new dimension is considered after every 4 elements.
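
One place where such descriptions can be observed is ctypes in CPython, which, as far as the author of this sketch is aware, exports a T{...} style format string for plain structures. The classes `Sub` and `Outer` below are illustrative stand-ins for the nested structure shown earlier; the exact string printed depends on the platform, so none is quoted here.

```python
import ctypes


class Sub(ctypes.Structure):
    _fields_ = [
        ("sval", ctypes.c_ushort),
        ("bval", ctypes.c_ubyte),
        ("cval", ctypes.c_ubyte),
    ]


class Outer(ctypes.Structure):
    _fields_ = [
        ("ival", ctypes.c_int),
        ("sub", Sub),
    ]


view = memoryview(Outer())
fmt = view.format
print(fmt)                                            # platform dependent
print(view.itemsize == ctypes.sizeof(Outer))          # True
```

The field names appear between colons after each element, following the ‘:name:’ convention from the table of proposed additions.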

Code to be affected

All objects and modules in Python that export or consume the old buffer interface will be modified. Here is a partial list.

buffer object
bytes object
string object
unicode object
array module
struct module
mmap module
ctypes module

Anything else using the buffer API.

Issues and Details

It is intended that this PEP will be back-ported to Python 2.6 by adding the C-API and the two functions to the existing buffer protocol.

Previous versions of this PEP proposed a read/write locking scheme, but it was later perceived as a) too complicated for common simple use cases that do not require any locking and b) too simple for use cases that required concurrent read/write access to a buffer with changing, short-living locks. It is therefore left to users to implement their own specific locking scheme around buffer objects if they require consistent views across concurrent read/write access. A future PEP may be proposed which includes a separate locking API after some experience with these user-schemes is obtained.

The sharing of strided memory and suboffsets is new and can be seen as a modification of the multiple-segment interface. It is motivated by NumPy and the PIL. NumPy objects should be able to share their strided memory with code that understands how to manage strided memory because strided memory is very common when interfacing with compute libraries.

Also, with this approach it should be possible to write generic code that works with both kinds of memory without copying.

Memory management of the format string, the shape array, the strides array, and the suboffsets array in the bufferinfo structure is always the responsibility of the exporting object. The consumer should not set these pointers to any other memory or try to free them.

Several ideas were discussed and rejected:

Having a “releaser” object whose release-buffer was called. This was deemed unacceptable because it caused the protocol to be asymmetric (you called release on something different than you “got” the buffer from). It also complicated the protocol without providing a real benefit.
Passing all the struct variables separately into the function. This had the advantage that it allowed one to set NULL to variables that were not of interest, but it also made the function call more difficult. The flags variable allows the same ability of consumers to be “simple” in how they call the protocol.

Code

The authors of the PEP promise to contribute and maintain the code for this proposal but will welcome any help.

Examples

Ex. 1

This example shows how an image object that uses contiguous lines might expose its buffer:

    struct rgba {
        unsigned char r, g, b, a;
    };

    struct ImageObject {
        PyObject_HEAD
        ...
        struct rgba **lines;
        Py_ssize_t height;
        Py_ssize_t width;
        Py_ssize_t shape_array[2];
        Py_ssize_t stride_array[2];
        Py_ssize_t view_count;
    };

“lines” points to a malloced 1-D array of (struct rgba*). Each pointer in THAT block points to a separately malloced array of (struct rgba).

In order to access, say, the red value of the pixel at x=30, y=50, you’d use “lines[50][30].r”.

So what does ImageObject’s getbuffer do? Leaving error checking out:

    int Image_getbuffer(PyObject *obj, Py_buffer *view, int flags) {
        struct ImageObject *self = (struct ImageObject *)obj;
        /* Dereference along the first dimension only. */
        static Py_ssize_t suboffsets[2] = {0, -1};

        view->buf = self->lines;
        /* len is the product of the shape times the item size. */
        view->len = self->height * self->width * sizeof(struct rgba);
        view->readonly = 0;
        view->ndim = 2;
        view->itemsize = sizeof(struct rgba);
        self->shape_array[0] = self->height;
        self->shape_array[1] = self->width;
        view->shape = self->shape_array;
        self->stride_array[0] = sizeof(struct rgba *);
        self->stride_array[1] = sizeof(struct rgba);
        view->strides = self->stride_array;
        view->suboffsets = suboffsets;

        self->view_count++;

        return 0;
    }

    void Image_releasebuffer(PyObject *obj, Py_buffer *view) {
        struct ImageObject *self = (struct ImageObject *)obj;
        self->view_count--;
    }

Ex. 2

This example shows how an object that wants to expose a contiguous chunk of memory (which will never be re-allocated while the object is alive) would do that.

    int myobject_getbuffer(PyObject *self, Py_buffer *view, int flags) {
        void *buf;
        Py_ssize_t len;
        int readonly = 0;

        buf = /* Point to buffer */
        len = /* Set to size of buffer */
        readonly = /* Set to 1 if readonly */

        return PyBuffer_FillInfo(view, buf, len, readonly, flags);
    }

    /* No releasebuffer is necessary because the memory will never
       be re-allocated */

Ex. 3

A consumer that wants to only get a simple contiguous chunk of bytes from a Python object, obj, would do the following:

    Py_buffer view;

    if (PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE) < 0) {
        /* error return */
    }

    /* Now, view.buf is the pointer to memory
            view.len is the length
            view.readonly is whether or not the memory is read-only.
    */

    /* After using the information and you don't need it anymore */
    PyBuffer_Release(obj, &view);

Ex. 4

A consumer that wants to be able to use any object’s memory but is writing an algorithm that only handles contiguous memory could do the following:

    void *buf;
    Py_ssize_t len;
    char *format;
    int copy;

    copy = PyObject_GetContiguous(obj, &buf, &len, &format, 0, 'A');
    if (copy < 0) {
        /* error return */
    }

    /* process memory pointed to by buffer if format is correct */

    /* Optional: if, after processing, we want to copy data from buffer back
       into the object we could do */
    if (PyObject_CopyToObject(obj, buf, len, 'A') < 0) {
        /* error return */
    }

    /* Make sure that if a copy was made, the memory is freed */
    if (copy == 1) PyMem_Free(buf);

Copyright

This PEP is placed in the public domain.

Source: https://github.com/python/peps/blob/main/peps/pep-3118.rst

Last modified: 2025-02-14 08:06:03 GMT
