
PEP 574 – Pickle protocol 5 with out-of-band data

Author: Antoine Pitrou <solipsis at pitrou.net>
BDFL-Delegate: Alyssa Coghlan
Status: Final
Type: Standards Track
Created: 23-Mar-2018
Python-Version: 3.8
Post-History: 28-Mar-2018, 30-Apr-2019
Resolution: Python-Dev message


Abstract

This PEP proposes to standardize a new pickle protocol version, and accompanying APIs to take full advantage of it:

  1. A new pickle protocol version (5) to cover the extra metadata needed for out-of-band data buffers.
  2. A new PickleBuffer type for __reduce_ex__ implementations to return out-of-band data buffers.
  3. A new buffer_callback parameter when pickling, to handle out-of-band data buffers.
  4. A new buffers parameter when unpickling to provide out-of-band data buffers.

The PEP guarantees unchanged behaviour for anyone not using the new APIs.

Rationale

The pickle protocol was originally designed in 1995 for on-disk persistency of arbitrary Python objects. The performance of a 1995-era storage medium probably made it irrelevant to focus on performance metrics such as use of RAM bandwidth when copying temporary data before writing it to disk.

Nowadays the pickle protocol sees a growing use in applications where most of the data isn’t ever persisted to disk (or, when it is, it uses a portable format instead of a Python-specific one). Instead, pickle is being used to transmit data and commands from one process to another, either on the same machine or on multiple machines. Those applications will sometimes deal with very large data (such as Numpy arrays or Pandas dataframes) that need to be transferred around. For those applications, pickle is currently wasteful as it imposes spurious memory copies of the data being serialized.

As a matter of fact, the standard multiprocessing module uses pickle for serialization, and therefore also suffers from this problem when sending large data to another process.

Third-party Python libraries, such as Dask [1], PyArrow [4] and IPyParallel [3], have started implementing alternative serialization schemes with the explicit goal of avoiding copies on large data. Implementing a new serialization scheme is difficult and often leads to reduced generality (since many Python objects support pickle but not the new serialization scheme). Falling back on pickle for unsupported types is an option, but then you get back the spurious memory copies you wanted to avoid in the first place. For example, dask is able to avoid memory copies for Numpy arrays and built-in containers thereof (such as lists or dicts containing Numpy arrays), but if a large Numpy array is an attribute of a user-defined object, dask will serialize the user-defined object as a pickle stream, leading to memory copies.

The common theme of these third-party serialization efforts is to generate a stream of object metadata (which contains pickle-like information about the objects being serialized) and a separate stream of zero-copy buffer objects for the payloads of large objects. Note that, in this scheme, small objects such as ints, etc. can be dumped together with the metadata stream. Refinements can include opportunistic compression of large data depending on its type and layout, like dask does.

This PEP aims to make pickle usable in a way where large data is handled as a separate stream of zero-copy buffers, letting the application handle those buffers optimally.

Example

To keep the example simple and avoid requiring knowledge of third-party libraries, we will focus here on a bytearray object (but the issue is conceptually the same with more sophisticated objects such as Numpy arrays). Like most objects, the bytearray object isn’t immediately understood by the pickle module and must therefore specify its decomposition scheme.

Here is how a bytearray object currently decomposes for pickling:

>>> b.__reduce_ex__(4)
(<class 'bytearray'>, (b'abc',), None)

This is because the bytearray.__reduce_ex__ implementation reads morally as follows:

class bytearray:

    def __reduce_ex__(self, protocol):
        if protocol == 4:
            return type(self), (bytes(self),), None
        # Legacy code for earlier protocols omitted

In turn it produces the following pickle code:

>>> pickletools.dis(pickletools.optimize(pickle.dumps(b, protocol=4)))
    0: \x80 PROTO      4
    2: \x95 FRAME      30
   11: \x8c SHORT_BINUNICODE 'builtins'
   21: \x8c SHORT_BINUNICODE 'bytearray'
   32: \x93 STACK_GLOBAL
   33: C    SHORT_BINBYTES b'abc'
   38: \x85 TUPLE1
   39: R    REDUCE
   40: .    STOP

(the call to pickletools.optimize above is only meant to make the pickle stream more readable by removing the MEMOIZE opcodes)

We can notice several things about the bytearray’s payload (the sequence of bytes b'abc'):

  • bytearray.__reduce_ex__ produces a first copy by instantiating a new bytes object from the bytearray’s data.
  • pickle.dumps produces a second copy when inserting the contents of that bytes object into the pickle stream, after the SHORT_BINBYTES opcode.
  • Furthermore, when deserializing the pickle stream, a temporary bytes object is created when the SHORT_BINBYTES opcode is encountered (inducing a data copy).

What we really want is something like the following:

  • bytearray.__reduce_ex__ produces a view of the bytearray’s data.
  • pickle.dumps doesn’t try to copy that data into the pickle stream but instead passes the buffer view to its caller (which can decide on the most efficient handling of that buffer).
  • When deserializing, pickle.loads takes the pickle stream and the buffer view separately, and passes the buffer view directly to the bytearray constructor.

We see that several conditions are required for the above to work:

  • __reduce__ or __reduce_ex__ must be able to return something that indicates a serializable no-copy buffer view.
  • The pickle protocol must be able to represent references to such buffer views, instructing the unpickler that it may have to get the actual buffer out of band.
  • The pickle.Pickler API must provide its caller with a way to receive such buffer views while serializing.
  • The pickle.Unpickler API must similarly allow its caller to provide the buffer views required for deserialization.
  • For compatibility, the pickle protocol must also be able to contain direct serializations of such buffer views, such that current uses of the pickle API don’t have to be modified if they are not concerned with memory copies.

Producer API

We are introducing a new type pickle.PickleBuffer which can be instantiated from any buffer-supporting object, and is specifically meant to be returned from __reduce__ implementations:

class bytearray:

    def __reduce_ex__(self, protocol):
        if protocol >= 5:
            return type(self), (PickleBuffer(self),), None
        # Legacy code for earlier protocols omitted

PickleBuffer is a simple wrapper that doesn’t have all the memoryview semantics and functionality, but is specifically recognized by the pickle module if protocol 5 or higher is enabled. It is an error to try to serialize a PickleBuffer with pickle protocol version 4 or earlier.
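For illustration, here is a minimal sketch of that failure mode, assuming the CPython implementation (which raises pickle.PicklingError in this case):

import pickle
from pickle import PickleBuffer

try:
    # Protocol 4 has no opcode for out-of-band buffers, so this fails.
    pickle.dumps(PickleBuffer(b'abc'), protocol=4)
except pickle.PicklingError as exc:
    print(exc)  # e.g. "PickleBuffer can only be pickled with protocol >= 5"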

Only the raw data of the PickleBuffer will be considered by the pickle module. Any type-specific metadata (such as shapes or datatype) must be returned separately by the type’s __reduce__ implementation, as is already the case.

PickleBuffer objects

The PickleBuffer class supports a very simple Python API. Its constructor takes a single PEP 3118-compatible object. PickleBuffer objects themselves support the buffer protocol, so consumers can call memoryview(...) on them to get additional information about the underlying buffer (such as the original type, shape, etc.). In addition, PickleBuffer objects have the following methods:

raw()

Return a memoryview of the raw memory bytes underlying the PickleBuffer, erasing any shape, strides and format information. This is required to handle Fortran-contiguous buffers correctly in the pure Python pickle implementation.

release()

Release the PickleBuffer’s underlying buffer, making it unusable.
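As a short usage sketch (the behaviour shown assumes the CPython implementation of these methods):

>>> from pickle import PickleBuffer
>>> pb = PickleBuffer(bytearray(b'abc'))
>>> memoryview(pb).readonly      # the wrapped bytearray is writable
False
>>> pb.raw().format              # raw() erases shape/strides/format
'B'
>>> pb.release()                 # further access to pb now raises an error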

On the C side, a simple API will be provided to create and inspect PickleBuffer objects:

PyObject *PyPickleBuffer_FromObject(PyObject *obj)

Create a PickleBuffer object holding a view over the PEP 3118-compatible obj.

PyPickleBuffer_Check(PyObject *obj)

Return whether obj is a PickleBuffer instance.

const Py_buffer *PyPickleBuffer_GetBuffer(PyObject *picklebuf)

Return a pointer to the internal Py_buffer owned by the PickleBuffer instance. An exception is raised if the buffer is released.

int PyPickleBuffer_Release(PyObject *picklebuf)

Release the PickleBuffer instance’s underlying buffer.

Buffer requirements

PickleBuffer can wrap any kind of buffer, including non-contiguous buffers. However, it is required that __reduce__ only returns a contiguous PickleBuffer (contiguity here is meant in the PEP 3118 sense: either C-ordered or Fortran-ordered). Non-contiguous buffers will raise an error when pickled.

This restriction is primarily an ease-of-implementation issue for the pickle module, but also for other consumers of out-of-band buffers. The simplest solution on the provider side is to return a contiguous copy of a non-contiguous buffer; a sophisticated provider, though, may decide instead to return a sequence of contiguous sub-buffers.
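For illustration, a strided memoryview is non-contiguous in the PEP 3118 sense; wrapping it in a PickleBuffer is allowed, but pickling it raises an error (pickle.PicklingError in the CPython implementation):

import pickle
from pickle import PickleBuffer

data = bytearray(b'0123456789')
noncontig = memoryview(data)[::2]   # every other byte: non-contiguous
pb = PickleBuffer(noncontig)        # wrapping succeeds

try:
    pickle.dumps(pb, protocol=5)    # pickling does not
except pickle.PicklingError as exc:
    print(exc)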

Consumer API

pickle.Pickler.__init__ and pickle.dumps are augmented with an additional buffer_callback parameter:

class Pickler:
    def __init__(self, file, protocol=None, ..., buffer_callback=None):
        """
        If *buffer_callback* is None (the default), buffer views are
        serialized into *file* as part of the pickle stream.

        If *buffer_callback* is not None, then it can be called any number
        of times with a buffer view.  If the callback returns a false value
        (such as None), the given buffer is out-of-band; otherwise the
        buffer is serialized in-band, i.e. inside the pickle stream.

        The callback should arrange to store or transmit out-of-band buffers
        without changing their order.

        It is an error if *buffer_callback* is not None and *protocol* is
        None or smaller than 5.
        """

def pickle.dumps(obj, protocol=None, *, ..., buffer_callback=None):
    """
    See above for *buffer_callback*.
    """

pickle.Unpickler.__init__ and pickle.loads are augmented with an additional buffers parameter:

class Unpickler:
    def __init__(file, *, ..., buffers=None):
        """
        If *buffers* is not None, it should be an iterable of buffer-enabled
        objects that is consumed each time the pickle stream references
        an out-of-band buffer view.  Such buffers have been given in order
        to the *buffer_callback* of a Pickler object.

        If *buffers* is None (the default), then the buffers are taken
        from the pickle stream, assuming they are serialized there.

        It is an error for *buffers* to be None if the pickle stream
        was produced with a non-None *buffer_callback*.
        """

def pickle.loads(data, *, ..., buffers=None):
    """
    See above for *buffers*.
    """
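As a usage sketch tying the producer and consumer APIs together, consider a hypothetical container class (Payload is illustrative, not part of the PEP) whose __reduce_ex__ follows the producer pattern shown earlier:

import pickle
from pickle import PickleBuffer

class Payload:
    """Hypothetical example type holding a large bytearray."""

    def __init__(self, data):
        self.data = bytearray(data)

    def __reduce_ex__(self, protocol):
        if protocol >= 5:
            # Zero-copy view for protocol 5 and upwards.
            return type(self), (PickleBuffer(self.data),), None
        # Copying fallback for earlier protocols.
        return type(self), (bytes(self.data),), None

buffers = []
data = pickle.dumps(Payload(b'x' * 4096), protocol=5,
                    buffer_callback=buffers.append)
# The 4 KiB payload now travels out-of-band in `buffers`;
# `data` contains only the metadata stream.
obj = pickle.loads(data, buffers=buffers)

Since buffers.append returns None (a false value), every buffer is treated as out-of-band here.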

Protocol changes

Three new opcodes are introduced:

  • BYTEARRAY8 creates a bytearray from the data following it in the pickle stream and pushes it on the stack (just like BINBYTES8 does for bytes objects);
  • NEXT_BUFFER fetches a buffer from the buffers iterable and pushes it on the stack.
  • READONLY_BUFFER makes a readonly view of the top of the stack.

When pickling encounters a PickleBuffer, that buffer can be considered in-band or out-of-band depending on the following conditions:

  • if no buffer_callback is given, the buffer is in-band;
  • if a buffer_callback is given, it is called with the buffer. If the callback returns a true value, the buffer is in-band; if the callback returns a false value, the buffer is out-of-band.

An in-band buffer is serialized as follows:

  • If the buffer is writable, it is serialized into the pickle stream as if it were a bytearray object.
  • If the buffer is readonly, it is serialized into the pickle stream as if it were a bytes object.

An out-of-band buffer is serialized as follows:

  • If the buffer is writable, a NEXT_BUFFER opcode is appended to the pickle stream.
  • If the buffer is readonly, a NEXT_BUFFER opcode is appended to the pickle stream, followed by a READONLY_BUFFER opcode.

The distinction between readonly and writable buffers is motivated below(see “Mutability”).
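The in-band/out-of-band decision can be observed with a small sketch (the size threshold and names are illustrative, not part of the PEP):

import pickle, pickletools
from pickle import PickleBuffer

buffers = []

def buffer_callback(pb):
    # Keep small buffers in-band; ship large ones out-of-band.
    if memoryview(pb).nbytes < 1024:
        return True                  # in-band: copied into the stream
    buffers.append(pb)
    return False                     # out-of-band: NEXT_BUFFER is emitted

data = pickle.dumps(PickleBuffer(b'x' * 4096), protocol=5,
                    buffer_callback=buffer_callback)
pickletools.dis(data)
# The readonly buffer appears as NEXT_BUFFER followed by READONLY_BUFFER;
# the 4 KiB payload itself is absent from the pickle stream.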

Side effects

Improved in-band performance

Even in-band pickling can be improved by returning a PickleBuffer instance from __reduce_ex__, as one copy is avoided on the serialization path [10] [12].

Caveats

Mutability

PEP 3118 buffers can be readonly or writable. Some objects, such as Numpy arrays, need to be backed by a mutable buffer for full operation. Pickle consumers that use the buffer_callback and buffers arguments will have to be careful to recreate mutable buffers. When doing I/O, this implies using buffer-passing API variants such as readinto (which are also often preferable for performance).
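For example, a receiving side might allocate writable bytearrays and fill them with readinto-style calls before handing them to pickle.loads. This is a sketch under the assumption that buffer sizes are communicated by some framing protocol not shown here:

import socket

def recv_buffers(sock, sizes):
    # Receive each out-of-band buffer into a freshly allocated,
    # writable bytearray, so unpickled objects stay mutable.
    buffers = []
    for size in sizes:
        buf = bytearray(size)            # mutable backing storage
        view = memoryview(buf)
        while view:
            n = sock.recv_into(view)     # fills the buffer in place
            if n == 0:
                raise ConnectionError('socket closed mid-buffer')
            view = view[n:]
        buffers.append(buf)
    return buffers

The resulting list can then be passed as pickle.loads(data, buffers=...).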

Data sharing

If you pickle and then unpickle an object in the same process, passing out-of-band buffer views, then the unpickled object may be backed by the same buffer as the original pickled object.

For example, it might be reasonable to implement reduction of a Numpy array as follows (crucial metadata such as shapes is omitted for simplicity):

class ndarray:

    def __reduce_ex__(self, protocol):
        if protocol == 5:
            return numpy.frombuffer, (PickleBuffer(self), self.dtype)
        # Legacy code for earlier protocols omitted

Then simply passing the PickleBuffer around from dumps to loads will produce a new Numpy array sharing the same underlying memory as the original Numpy object (and, incidentally, keeping it alive):

>>> import numpy as np
>>> a = np.zeros(10)
>>> a[0]
0.0
>>> buffers = []
>>> data = pickle.dumps(a, protocol=5, buffer_callback=buffers.append)
>>> b = pickle.loads(data, buffers=buffers)
>>> b[0] = 42
>>> a[0]
42.0

This won’t happen with the traditional pickle API (i.e. without passing buffers and buffer_callback parameters), because then the buffer view is serialized inside the pickle stream with a copy.

Rejected alternatives

Using the existing persistent load interface

The pickle persistence interface is a way of storing references to designated objects in the pickle stream while handling their actual serialization out of band. For example, one might consider the following for zero-copy serialization of bytearrays:

class MyPickle(pickle.Pickler):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.buffers = []

    def persistent_id(self, obj):
        if type(obj) is not bytearray:
            return None
        else:
            index = len(self.buffers)
            self.buffers.append(obj)
            return ('bytearray', index)


class MyUnpickle(pickle.Unpickler):

    def __init__(self, *args, buffers, **kwargs):
        super().__init__(*args, **kwargs)
        self.buffers = buffers

    def persistent_load(self, pid):
        type_tag, index = pid
        if type_tag == 'bytearray':
            return self.buffers[index]
        else:
            assert 0  # unexpected type

This mechanism has two drawbacks:

  • Each pickle consumer must reimplement Pickler and Unpickler subclasses, with custom code for each type of interest. Essentially, N pickle consumers end up each implementing custom code for M producers. This is difficult (especially for sophisticated types such as Numpy arrays) and poorly scalable.
  • Each object encountered by the pickle module (even simple built-in objects such as ints and strings) triggers a call to the user’s persistent_id() method, leading to a possible performance drop compared to nominal.

    (the Python 2 cPickle module supported an undocumented inst_persistent_id() hook that was only called on non-built-in types; it was added in 1997 in order to alleviate the performance issue of calling persistent_id, presumably at ZODB’s request)

Passing a sequence of buffers in buffer_callback

By passing a sequence of buffers, rather than a single buffer, we would potentially save on function call overhead in case a large number of buffers are produced during serialization. This would need additional support in the Pickler to save buffers before calling the callback. However, it would also prevent the buffer callback from returning a boolean to indicate whether a buffer is to be serialized in-band or out-of-band.

We consider that having a large number of buffers to serialize is anunlikely case, and decided to pass a single buffer to the buffer callback.

Allow serializing a PickleBuffer in protocol 4 and earlier

If we were to allow serializing a PickleBuffer in protocols 4 and earlier, it would actually make a supplementary memory copy when the buffer is mutable. Indeed, a mutable PickleBuffer would serialize as a bytearray object in those protocols (that is a first copy), and serializing the bytearray object would call bytearray.__reduce_ex__ which returns a bytes object (that is a second copy).

To prevent __reduce__ implementors from introducing involuntary performance regressions, we decided to reject PickleBuffer when the protocol is smaller than 5. This forces implementors to switch to __reduce_ex__ and implement protocol-dependent serialization, taking advantage of the best path for each protocol (or at least treat protocol 5 and upwards separately from protocols 4 and downwards).

Implementation

The PEP was initially implemented in the author’s GitHub fork [6]. It was later merged into Python 3.8 [7].

A backport for Python 3.6 and 3.7 is downloadable from PyPI [8].

Support for pickle protocol 5 and out-of-band buffers was added to Numpy [11].

Support for pickle protocol 5 and out-of-band buffers was added to the Apache Arrow Python bindings [9].

Related work

Dask.distributed implements a custom zero-copy serialization with fallback to pickle [2].

PyArrow implements zero-copy component-based serialization for a few selected types [5].

PEP 554 proposes hosting multiple interpreters in a single process, with provisions for transferring buffers between interpreters as a communication scheme.

Acknowledgements

Thanks to the following people for early feedback: Alyssa Coghlan, Olivier Grisel, Stefan Krah, MinRK, Matt Rocklin, Eric Snow.

Thanks to Pierre Glaser and Olivier Grisel for experimenting with the implementation.

References

[1] Dask.distributed – A lightweight library for distributed computing in Python
    https://distributed.readthedocs.io/
[2] Dask.distributed custom serialization
    https://distributed.readthedocs.io/en/latest/serialization.html
[3] IPyParallel – Using IPython for parallel computing
    https://ipyparallel.readthedocs.io/
[4] PyArrow – A cross-language development platform for in-memory data
    https://arrow.apache.org/docs/python/
[5] PyArrow IPC and component-based serialization
    https://arrow.apache.org/docs/python/ipc.html#component-based-serialization
[6] pickle5 branch on GitHub
    https://github.com/pitrou/cpython/tree/pickle5
[7] PEP 574 Pull Request on GitHub
    https://github.com/python/cpython/pull/7076
[8] pickle5 project on PyPI
    https://pypi.org/project/pickle5/
[9] Pull request: Experimental zero-copy pickling in Apache Arrow
    https://github.com/apache/arrow/pull/2161
[10] Benchmark zero-copy pickling in Apache Arrow
    https://github.com/apache/arrow/pull/2161#issuecomment-407859213
[11] Pull request: Support pickle protocol 5 in Numpy
    https://github.com/numpy/numpy/pull/12011
[12] Benchmark pickling Numpy arrays with different pickle protocols
    https://github.com/numpy/numpy/issues/11161#issuecomment-424035962

Copyright

This document has been placed into the public domain.



