Python Enhancement Proposals

PEP 307 – Extensions to the pickle protocol

Author: Guido van Rossum, Tim Peters
Status: Final
Type: Standards Track
Created: 31-Jan-2003
Python-Version: 2.3
Post-History: 07-Feb-2003

Introduction

Pickling new-style objects in Python 2.2 is done somewhat clumsily and causes pickle size to bloat compared to classic class instances. This PEP documents a new pickle protocol in Python 2.3 that takes care of this and many other pickle issues.

There are two sides to specifying a new pickle protocol: the byte stream constituting pickled data must be specified, and the interface between objects and the pickling and unpickling engines must be specified. This PEP focuses on API issues, although it may occasionally touch on byte stream format details to motivate a choice. The pickle byte stream format is documented formally by the standard library module pickletools.py (already checked into CVS for Python 2.3).

This PEP attempts to fully document the interface between pickled objects and the pickling process, highlighting additions by specifying “new in this PEP”. (The interface to invoke pickling or unpickling is not covered fully, except for the changes to the API for specifying the pickling protocol to picklers.)

Motivation

Pickling new-style objects causes serious pickle bloat. For example:

    import pickle

    class C(object):  # Omit "(object)" for classic class
        pass

    x = C()
    x.foo = 42
    print len(pickle.dumps(x, 1))

The binary pickle for the classic object consumed 33 bytes, and for the new-style object 86 bytes.

The reasons for the bloat are complex, but are mostly caused by the fact that new-style objects use __reduce__ in order to be picklable at all. After ample consideration we’ve concluded that the only way to reduce pickle sizes for new-style objects is to add new opcodes to the pickle protocol. The net result is that with the new protocol, the pickle size in the above example is 35 (two extra bytes are used at the start to indicate the protocol version, although this isn’t strictly necessary).

Protocol versions

Previously, pickling (but not unpickling) distinguished between text mode and binary mode. By design, binary mode is a superset of text mode, and unpicklers don’t need to know in advance whether an incoming pickle uses text mode or binary mode. The virtual machine used for unpickling is the same regardless of the mode; certain opcodes simply aren’t used in text mode.

Retroactively, text mode is now called protocol 0, and binary mode protocol 1. The new protocol is called protocol 2. In the tradition of pickling protocols, protocol 2 is a superset of protocol 1. But just so that future pickling protocols aren’t required to be supersets of the oldest protocols, a new opcode is inserted at the start of a protocol 2 pickle indicating that it is using protocol 2. To date, each release of Python has been able to read pickles written by all previous releases. Of course pickles written under protocol N can’t be read by versions of Python earlier than the one that introduced protocol N.

Several functions, methods and constructors used for pickling used to take a positional argument named ‘bin’ which was a flag, defaulting to 0, indicating binary mode. This argument is renamed to ‘protocol’ and now gives the protocol number, still defaulting to 0.

It so happens that passing 2 for the ‘bin’ argument in previous Python versions had the same effect as passing 1. Nevertheless, a special case is added here: passing a negative number selects the highest protocol version supported by a particular implementation. This works in previous Python versions, too, and so can be used to select the highest protocol available in a way that’s both backward and forward compatible. In addition, a new module constant HIGHEST_PROTOCOL is supplied by both pickle and cPickle, equal to the highest protocol number the module can read. This is cleaner than passing -1, but cannot be used before Python 2.3.
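The two ways of asking for the highest protocol can be compared directly. The following is a minimal sketch written for Python 3 (an assumption; the PEP itself targets Python 2.3 and cPickle). It uses only pickle.dumps, pickle.loads and pickle.HIGHEST_PROTOCOL; the sample dictionary is made up for illustration.

```python
import pickle

data = {"answer": 42}

# Two spellings of "highest protocol this implementation supports".
via_negative = pickle.dumps(data, -1)
via_constant = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)

# A pickle at protocol 2 or higher begins with the protocol
# identification opcode (0x80) followed by one byte holding the
# protocol number.
declared = via_negative[1]

print(declared == pickle.HIGHEST_PROTOCOL)
print(via_negative == via_constant)
```

The unpickler needs no protocol argument: pickle.loads reads the protocol from the stream itself.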

The pickle.py module has supported passing the ‘bin’ value as a keyword argument rather than a positional argument. (This is not recommended, since cPickle only accepts positional arguments, but it works…) Passing ‘bin’ as a keyword argument is deprecated, and a PendingDeprecationWarning is issued in this case. You have to invoke the Python interpreter with -Wa or a variation on that to see PendingDeprecationWarning messages. In Python 2.4, the warning class may be upgraded to DeprecationWarning.

Security issues

In previous versions of Python, unpickling would do a “safety check” on certain operations, refusing to call functions or constructors that weren’t marked as “safe for unpickling” by either having an attribute __safe_for_unpickling__ set to 1, or by being registered in a global registry, copy_reg.safe_constructors.

This feature gives a false sense of security: nobody has ever done the necessary, extensive, code audit to prove that unpickling untrusted pickles cannot invoke unwanted code, and in fact bugs in the Python 2.2 pickle.py module make it easy to circumvent these security measures.

We firmly believe that, on the Internet, it is better to know that you are using an insecure protocol than to trust a protocol to be secure whose implementation hasn’t been thoroughly checked. Even high quality implementations of widely used protocols are routinely found flawed; Python’s pickle implementation simply cannot make such guarantees without a much larger time investment. Therefore, as of Python 2.3, all safety checks on unpickling are officially removed, and replaced with this warning:

Warning

Do not unpickle data received from an untrusted or unauthenticated source.

The same warning applies to previous Python versions, despite the presence of safety checks there.
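To see why the warning matters, consider that the pickle stream itself names the callable that is run at load time. The sketch below, in Python 3 syntax (an assumption), uses a hypothetical class Innocuous and the harmless built-in sorted as a stand-in for whatever callable a hostile pickle author might choose.

```python
import pickle

class Innocuous:
    """Hypothetical class whose pickle names an unrelated callable."""

    def __reduce__(self):
        # The pickle will say "call sorted([3, 1, 2])" on load.
        # sorted is a harmless stand-in; any importable callable works.
        return (sorted, ([3, 1, 2],))

payload = pickle.dumps(Innocuous())

# Loading runs the call chosen by whoever produced the payload.
result = pickle.loads(payload)
print(type(result).__name__, result)
```

The loaded value is not an Innocuous instance at all; it is whatever the named callable returned.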

Extended __reduce__ API

There are several APIs that a class can use to control pickling. Perhaps the most popular of these are __getstate__ and __setstate__; but the most powerful one is __reduce__. (There’s also __getinitargs__, and we’re adding __getnewargs__ below.)

There are several ways to provide __reduce__ functionality: a class can implement a __reduce__ method or a __reduce_ex__ method (see next section), or a reduce function can be declared in copy_reg (copy_reg.dispatch_table maps classes to functions). The return values are interpreted exactly the same, though, and we’ll refer to these collectively as __reduce__.

Important: pickling of classic class instances does not look for a __reduce__ or __reduce_ex__ method or a reduce function in the copy_reg dispatch table, so that a classic class cannot provide __reduce__ functionality in the sense intended here. A classic class must use __getinitargs__ and/or __getstate__ to customize pickling. These are described below.

__reduce__ must return either a string or a tuple. If it returns a string, this is an object whose state is not to be pickled, but instead a reference to an equivalent object referenced by name. Surprisingly, the string returned by __reduce__ should be the object’s local name (relative to its module); the pickle module searches the module namespace to determine the object’s module.
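The string form of __reduce__ suits singletons that should be pickled by reference. The following is a small sketch in Python 3 syntax (an assumption); the class _Sentinel and the module-level name MISSING are hypothetical, and the example assumes it runs as a script so that the name can be found in the defining module.

```python
import pickle

class _Sentinel:
    def __reduce__(self):
        # Return the module-level name under which this object lives.
        return "MISSING"

MISSING = _Sentinel()

# The pickle stores only a reference by name, so loading yields
# the very same object rather than a copy.
clone = pickle.loads(pickle.dumps(MISSING, 2))
print(clone is MISSING)
```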

The rest of this section is concerned with the tuple returned by __reduce__. It is a variable size tuple, of length 2 through 5. The first two items (function and arguments) are required. The remaining items are optional and may be left off from the end; giving None for the value of an optional item acts the same as leaving it off. The last two items are new in this PEP. The items are, in order:

function
    Required.

    A callable object (not necessarily a function) called to create the initial version of the object; state may be added to the object later to fully reconstruct the pickled state. This function must itself be picklable. See the section about __newobj__ for a special case (new in this PEP) here.

arguments
    Required.

    A tuple giving the argument list for the function. As a special case, designed for Zope 2’s ExtensionClass, this may be None; in that case, function should be a class or type, and function.__basicnew__() is called to create the initial version of the object. This exception is deprecated.

Unpickling invokes function(*arguments) to create an initial object, called obj below. If the remaining items are left off, that’s the end of unpickling for this object and obj is the result. Else obj is modified at unpickling time by each item specified, as follows.

state
    Optional.

    Additional state. If this is not None, the state is pickled, and obj.__setstate__(state) will be called when unpickling. If no __setstate__ method is defined, a default implementation is provided, which assumes that state is a dictionary mapping instance variable names to their values. The default implementation calls

        obj.__dict__.update(state)

    or, if the update() call fails,

        for k, v in state.items():
            setattr(obj, k, v)

listitems
    Optional, and new in this PEP.

    If this is not None, it should be an iterator (not a sequence!) yielding successive list items. These list items will be pickled, and appended to the object using either obj.append(item) or obj.extend(list_of_items). This is primarily used for list subclasses, but may be used by other classes as long as they have append() and extend() methods with the appropriate signature. (Whether append() or extend() is used depends on which pickle protocol version is used as well as the number of items to append, so both must be supported.)

dictitems
    Optional, and new in this PEP.

    If this is not None, it should be an iterator (not a sequence!) yielding successive dictionary items, which should be tuples of the form (key, value). These items will be pickled, and stored to the object using obj[key] = value. This is primarily used for dict subclasses, but may be used by other classes as long as they implement __setitem__.
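A full five-item return value might look like the following sketch. It is written in Python 3 syntax (an assumption), and the Inventory class with its owner, items and prices attributes is hypothetical. The class is neither a list nor a dict subclass; it only supplies the append(), extend() and __setitem__ methods that the listitems and dictitems items rely on.

```python
import pickle

class Inventory:
    """Hypothetical container with a list-like and a dict-like part."""

    def __init__(self, owner):
        self.owner = owner
        self.items = []
        self.prices = {}

    # Both methods are provided: the unpickler may use either one.
    def append(self, item):
        self.items.append(item)

    def extend(self, items):
        self.items.extend(items)

    def __setitem__(self, key, value):
        self.prices[key] = value

    def __reduce__(self):
        return (
            Inventory,                   # function
            (self.owner,),               # arguments
            None,                        # state: nothing extra to restore
            iter(self.items),            # listitems (an iterator)
            iter(self.prices.items()),   # dictitems: (key, value) pairs
        )

inv = Inventory("ann")
inv.extend(["apple", "pear"])
inv["apple"] = 3

clone = pickle.loads(pickle.dumps(inv, 2))
print(clone.owner, clone.items, clone.prices)
```

Because the state item is None, no __setstate__ call takes place; the object is rebuilt purely from the constructor call, the list items and the dictionary items.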

Note: in Python 2.2 and before, when using cPickle, state would be pickled if present even if it is None; the only safe way to avoid the __setstate__ call was to return a two-tuple from __reduce__. (But pickle.py would not pickle state if it was None.) In Python 2.3, __setstate__ will never be called at unpickling time when __reduce__ returns a state with value None at pickling time.

A __reduce__ implementation that needs to work both under Python 2.2 and under Python 2.3 could check the variable pickle.format_version to determine whether to use the listitems and dictitems features. If this value is >= "2.0" then they are supported. If not, any list or dict items should be incorporated somehow in the ‘state’ return value, and the __setstate__ method should be prepared to accept list or dict items as part of the state (how this is done is up to the application).

The __reduce_ex__ API

It is sometimes useful to know the protocol version when implementing __reduce__. This can be done by implementing a method named __reduce_ex__ instead of __reduce__. __reduce_ex__, when it exists, is called in preference over __reduce__ (you may still provide __reduce__ for backwards compatibility). The __reduce_ex__ method will be called with a single integer argument, the protocol version.

The ‘object’ class implements both __reduce__ and __reduce_ex__; however, if a subclass overrides __reduce__ but not __reduce_ex__, the __reduce_ex__ implementation detects this and calls __reduce__.
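The protocol argument can be observed with a small sketch such as the one below, written in Python 3 syntax (an assumption). The Point class and its seen_protocols list are hypothetical bookkeeping added only to show which value the pickler passes in.

```python
import pickle

class Point:
    seen_protocols = []   # hypothetical bookkeeping, for illustration only

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __reduce_ex__(self, protocol):
        # Called with the protocol number the pickler is using.
        Point.seen_protocols.append(protocol)
        return (Point, (self.x, self.y))

p = Point(1, 2)
for protocol in (0, 1, 2):
    clone = pickle.loads(pickle.dumps(p, protocol))

print(Point.seen_protocols)
```

A real implementation would typically branch on the protocol value, for instance returning a more compact tuple when the protocol is 2 or higher.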

Customizing pickling absent a __reduce__ implementation

If no __reduce__ implementation is available for a particular class, there are three cases that need to be considered separately, because they are handled differently:

  1. classic class instances, all protocols
  2. new-style class instances, protocols 0 and 1
  3. new-style class instances, protocol 2

Types implemented in C are considered new-style classes. However, except for the common built-in types, these need to provide a __reduce__ implementation in order to be picklable with protocols 0 or 1. Protocol 2 supports built-in types providing __getnewargs__, __getstate__ and __setstate__ as well.

Case 1: pickling classic class instances

This case is the same for all protocols, and is unchanged from Python 2.1.

For classic classes, __reduce__ is not used. Instead, classic classes can customize their pickling by providing methods named __getstate__, __setstate__ and __getinitargs__. Absent these, a default pickling strategy for classic class instances is implemented that works as long as all instance variables are picklable. This default strategy is documented in terms of default implementations of __getstate__ and __setstate__.

The primary way to customize pickling of classic class instances is by specifying __getstate__ and/or __setstate__ methods. It is fine if a class implements one of these but not the other, as long as it is compatible with the default version.

The __getstate__ method

The __getstate__ method should return a picklable value representing the object’s state without referencing the object itself. If no __getstate__ method exists, a default implementation is used that returns self.__dict__.

The __setstate__ method

The __setstate__ method should take one argument; it will be called with the value returned by __getstate__ (or its default implementation).

If no __setstate__ method exists, a default implementation is provided that assumes the state is a dictionary mapping instance variable names to values. The default implementation tries two things:

  • First, it tries to call self.__dict__.update(state).
  • If the update() call fails with a RuntimeError exception, it calls setattr(self, key, value) for each (key, value) pair in the state dictionary. This only happens when unpickling in restricted execution mode (see the rexec standard library module).
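The __getstate__ and __setstate__ pair described above can be sketched as follows. The example uses Python 3 syntax (an assumption; Python 3 has no classic classes, but the two hooks are honoured for ordinary classes in the same way). The Report class and its cache attribute are hypothetical.

```python
import pickle

class Report:
    def __init__(self, title):
        self.title = title
        self.cache = {"rendered": "<expensive>"}   # not worth pickling

    def __getstate__(self):
        # Start from the default state and drop what should not travel.
        state = self.__dict__.copy()
        del state["cache"]
        return state

    def __setstate__(self, state):
        # Same effect as the default, plus re-creating the cache.
        self.__dict__.update(state)
        self.cache = {}

clone = pickle.loads(pickle.dumps(Report("Q1")))
print(clone.title, clone.cache)
```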

The __getinitargs__ method

The __setstate__ method (or its default implementation) requires that a new object already exists so that its __setstate__ method can be called. The point is to create a new object that isn’t fully initialized; in particular, the class’s __init__ method should not be called if possible.

These are the possibilities:

  • Normally, the following trick is used: create an instance of a trivial classic class (one without any methods or instance variables) and then use __class__ assignment to change its class to the desired class. This creates an instance of the desired class with an empty __dict__ whose __init__ has not been called.
  • However, if the class has a method named __getinitargs__, the above trick is not used, and a class instance is created by using the tuple returned by __getinitargs__ as an argument list to the class constructor. This is done even if __getinitargs__ returns an empty tuple; a __getinitargs__ method that returns () is not equivalent to not having __getinitargs__ at all. __getinitargs__ must return a tuple.
  • In restricted execution mode, the trick from the first bullet doesn’t work; in this case, the class constructor is called with an empty argument list if no __getinitargs__ method exists. This means that in order for a classic class to be unpicklable in restricted execution mode, it must either implement __getinitargs__ or its constructor (i.e., its __init__ method) must be callable without arguments.
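The __class__ assignment trick from the first bullet can be imitated outside the pickle module. The sketch below uses Python 3 syntax and ordinary Python 3 classes (an assumption, since classic classes no longer exist); Account, _Empty and make_uninitialized are hypothetical names, and this is not the pickle module's own code.

```python
class Account:
    def __init__(self, owner):
        raise RuntimeError("__init__ should not run during unpickling")

class _Empty:
    """Trivial class with no methods or instance variables."""

def make_uninitialized(cls):
    obj = _Empty()          # cheap instance with an empty __dict__
    obj.__class__ = cls     # re-brand it; cls.__init__ is never called
    return obj

acct = make_uninitialized(Account)
acct.__dict__.update({"owner": "ann"})   # what the default __setstate__ does
print(type(acct).__name__, acct.owner)
```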

Case 2: pickling new-style class instances using protocols 0 or 1

This case is unchanged from Python 2.2. For better pickling of new-style class instances when backwards compatibility is not an issue, protocol 2 should be used; see case 3 below.

New-style classes, whether implemented in C or in Python, inherit a default __reduce__ implementation from the universal base class ‘object’.

This default __reduce__ implementation is not used for those built-in types for which the pickle module has built-in support. Here’s a full list of those types:

  • Concrete built-in types: NoneType, bool, int, float, complex, str, unicode, tuple, list, dict. (Complex is supported by virtue of a __reduce__ implementation registered in copy_reg.) In Jython, PyStringMap is also included in this list.
  • Classic instances.
  • Classic class objects, Python function objects, built-in function and method objects, and new-style type objects (== new-style class objects). These are pickled by name, not by value: at unpickling time, a reference to an object with the same name (the fully qualified module name plus the variable name in that module) is substituted.

The default __reduce__ implementation will fail at pickling time for built-in types not mentioned above, and for new-style classes implemented in C: if they want to be picklable, they must supply a custom __reduce__ implementation under protocols 0 and 1.

For new-style classes implemented in Python, the default __reduce__ implementation (copy_reg._reduce) works as follows:

Let D be the class of the object to be pickled. First, find the nearest base class that is implemented in C (either as a built-in type or as a type defined by an extension class). Call this base class B. Unless B is the class ‘object’, instances of class B must be picklable, either by having built-in support (as defined in the above three bullet points), or by having a non-default __reduce__ implementation. B must not be the same class as D (if it were, it would mean that D is not implemented in Python).

The callable produced by the default __reduce__ is copy_reg._reconstructor, and its arguments tuple is (D, B, basestate), where basestate is None if B is the builtin object class, and basestate is

    basestate = B(obj)

if B is not the builtin object class. This is geared toward pickling subclasses of builtin types, where, for example, list(some_list_subclass_instance) produces “the list part” of the list subclass instance.

The object is recreated at unpickling time by copy_reg._reconstructor, like so:

    obj = B.__new__(D, basestate)
    B.__init__(obj, basestate)
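The roles of D, B and basestate can be traced by hand for a list subclass. This is a sketch in Python 3 syntax (an assumption; in Python 3 the module is named copyreg rather than copy_reg). The Stack class is hypothetical, and the two reconstruction lines are repeated manually here rather than calling the private reconstructor.

```python
class Stack(list):
    """A Python-level subclass (D) of the C-implemented base (B = list)."""

original = Stack([1, 2, 3])

# What the default reduce implementation describes for protocols 0 and 1:
func, args = original.__reduce_ex__(1)[:2]
D, B, basestate = args

# What the reconstructor then does at unpickling time:
obj = B.__new__(D, basestate)
B.__init__(obj, basestate)

print(func.__name__, D.__name__, B.__name__, basestate, obj)
```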

Objects using the default __reduce__ implementation can customize it by defining __getstate__ and/or __setstate__ methods. These work almost the same as described for classic classes above, except that if __getstate__ returns an object (of any type) whose value is considered false (e.g. None, or a number that is zero, or an empty sequence or mapping), this state is not pickled and __setstate__ will not be called at all. If __getstate__ exists and returns a true value, that value becomes the third element of the tuple returned by the default __reduce__, and at unpickling time the value is passed to __setstate__. If __getstate__ does not exist, but obj.__dict__ exists, then obj.__dict__ becomes the third element of the tuple returned by __reduce__, and again at unpickling time the value is passed to obj.__setstate__. The default __setstate__ is the same as that for classic classes, described above.

Note that this strategy ignores slots. Instances of new-style classes that have slots but no __getstate__ method cannot be pickled by protocols 0 and 1; the code explicitly checks for this condition.

Note that pickling new-style class instances ignores __getinitargs__ if it exists (and under all protocols). __getinitargs__ is useful only for classic classes.

Case 3: pickling new-style class instances using protocol 2

Under protocol 2, the default __reduce__ implementation inherited from the ‘object’ base class is ignored. Instead, a different default implementation is used, which allows more efficient pickling of new-style class instances than possible with protocols 0 or 1, at the cost of backward incompatibility with Python 2.2 (meaning no more than that a protocol 2 pickle cannot be unpickled before Python 2.3).

The customization uses three special methods: __getstate__, __setstate__ and __getnewargs__ (note that __getinitargs__ is again ignored). It is fine if a class implements one or more but not all of these, as long as it is compatible with the default implementations.

The __getstate__ method

The __getstate__ method should return a picklable value representing the object’s state without referencing the object itself. If no __getstate__ method exists, a default implementation is used which is described below.

There’s a subtle difference between classic and new-style classes here: if a classic class’s __getstate__ returns None, self.__setstate__(None) will be called as part of unpickling. But if a new-style class’s __getstate__ returns None, its __setstate__ won’t be called at all as part of unpickling.

If no __getstate__ method exists, a default state is computed. There are several cases:

  • For a new-style class that has no instance __dict__ and no __slots__, the default state is None.
  • For a new-style class that has an instance __dict__ and no __slots__, the default state is self.__dict__.
  • For a new-style class that has an instance __dict__ and __slots__, the default state is a tuple consisting of two dictionaries: self.__dict__, and a dictionary mapping slot names to slot values. Only slots that have a value are included in the latter.
  • For a new-style class that has __slots__ and no instance __dict__, the default state is a tuple whose first item is None and whose second item is a dictionary mapping slot names to slot values, as described in the previous bullet.
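The default states listed above can be inspected through the third item of the protocol 2 reduce tuple. The sketch below uses Python 3 syntax (an assumption); the classes Plain, Slotted and Mixed and the helper default_state are hypothetical.

```python
class Plain:
    pass

class Slotted:
    __slots__ = ("x", "y")

class Mixed:
    __slots__ = ("x", "__dict__")

def default_state(obj):
    # The third item of the protocol 2 reduce tuple is the state.
    return obj.__reduce_ex__(2)[2]

plain = Plain()
plain.foo = 42

slotted = Slotted()
slotted.x = 1            # y is deliberately left unset

mixed = Mixed()
mixed.x = 1
mixed.foo = 42

print(default_state(plain))
print(default_state(slotted))
print(default_state(mixed))
```

Only the slot that has a value (x) appears in the slot dictionary; the unset slot y is omitted.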

The __setstate__ method

The __setstate__ method should take one argument; it will be called with the value returned by __getstate__ or with the default state described above if no __getstate__ method is defined.

If no __setstate__ method exists, a default implementation is provided that can handle the state returned by the default __getstate__, described above.

The __getnewargs__ method

Like for classic classes, the __setstate__ method (or its default implementation) requires that a new object already exists so that its __setstate__ method can be called.

In protocol 2, a new pickling opcode is used that causes a new object to be created as follows:

    obj = C.__new__(C, *args)

where C is the class of the pickled object, and args is either the empty tuple, or the tuple returned by the __getnewargs__ method, if defined. __getnewargs__ must return a tuple. The absence of a __getnewargs__ method is equivalent to the existence of one that returns ().
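A class whose __new__ takes arguments is the typical user of __getnewargs__. The following sketch is in Python 3 syntax (an assumption), and the Version class is hypothetical. It subclasses tuple, so the value must be supplied to __new__ and cannot be restored afterwards through __setstate__.

```python
import pickle

class Version(tuple):
    """Hypothetical immutable value whose __new__ needs two arguments."""

    def __new__(cls, major, minor):
        return super().__new__(cls, (major, minor))

    def __getnewargs__(self):
        # Passed as *args to Version.__new__(Version, *args) on load.
        return (self[0], self[1])

v = Version(2, 3)
clone = pickle.loads(pickle.dumps(v, 2))
print(type(clone).__name__, tuple(clone))
```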

The __newobj__ unpickling function

When the unpickling function returned by __reduce__ (the first item of the returned tuple) has the name __newobj__, something special happens for pickle protocol 2. An unpickling function named __newobj__ is assumed to have the following semantics:

    def __newobj__(cls, *args):
        return cls.__new__(cls, *args)

Pickle protocol 2 special-cases an unpickling function with this name, and emits a pickling opcode that, given ‘cls’ and ‘args’, will return cls.__new__(cls, *args) without also pickling a reference to __newobj__ (this is the same pickling opcode used by protocol 2 for a new-style class instance when no __reduce__ implementation exists). This is the main reason why protocol 2 pickles are much smaller than classic pickles. Of course, the pickling code cannot verify that a function named __newobj__ actually has the expected semantics. If you use an unpickling function named __newobj__ that returns something different, you deserve what you get.

It is safe to use this feature under Python 2.2; there’s nothing in the recommended implementation of __newobj__ that depends on Python 2.3.
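The effect of the special case can be seen by listing the opcodes of a pickle. The sketch below uses Python 3 syntax (an assumption; the module is copyreg in Python 3) together with pickletools.genops. The class C mirrors the example in the Motivation section, and opcode_names is a hypothetical helper. Exact byte counts differ from the Python 2.3 figures quoted earlier, so none are claimed here.

```python
import copyreg
import pickle
import pickletools

class C:
    pass

x = C()
x.foo = 42

def opcode_names(data):
    # genops yields (opcode, argument, position) triples.
    return [opcode.name for opcode, arg, pos in pickletools.genops(data)]

proto1 = pickle.dumps(x, 1)
proto2 = pickle.dumps(x, 2)

# The default reduce for protocol 2 names __newobj__ as its function...
print(x.__reduce_ex__(2)[0] is copyreg.__newobj__)
# ...but the pickle carries a dedicated opcode, not a reference to it.
print("NEWOBJ" in opcode_names(proto2), b"__newobj__" in proto2)
print(len(proto1), len(proto2))
```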

The extension registry

Protocol 2 supports a new mechanism to reduce the size of pickles.

When class instances (classic or new-style) are pickled, the full name of the class (module name including package name, and class name) is included in the pickle. Especially for applications that generate many small pickles, this is a lot of overhead that has to be repeated in each pickle. For large pickles, when using protocol 1, repeated references to the same class name are compressed using the “memo” feature; but each class name must be spelled in full at least once per pickle, and this causes a lot of overhead for small pickles.

The extension registry allows one to represent the most frequently used names by small integers, which are pickled very efficiently: an extension code in the range 1–255 requires only two bytes including the opcode, one in the range 256–65535 requires only three bytes including the opcode.

One of the design goals of the pickle protocol is to make pickles “context-free”: as long as you have installed the modules containing the classes referenced by a pickle, you can unpickle it, without needing to import any of those classes ahead of time.

Unbridled use of extension codes could jeopardize this desirable property of pickles. Therefore, the main use of extension codes is reserved for a set of codes to be standardized by some standard-setting body. This being Python, the standard-setting body is the PSF. From time to time, the PSF will decide on a table mapping extension codes to class names (or occasionally names of other global objects; functions are also eligible). This table will be incorporated in the next Python release(s).

However, for some applications, like Zope, context-free pickles are not a requirement, and waiting for the PSF to standardize some codes may not be practical. Two solutions are offered for such applications.

First, a few ranges of extension codes are reserved for private use. Any application can register codes in these ranges. Two applications exchanging pickles using codes in these ranges need to have some out-of-band mechanism to agree on the mapping between extension codes and names.

Second, some large Python projects (e.g. Zope) can be assigned a range of extension codes outside the “private use” range that they can assign as they see fit.

The extension registry is defined as a mapping between extension codes and names. When an extension code is unpickled, it ends up producing an object, but this object is gotten by interpreting the name as a module name followed by a class (or function) name. The mapping from names to objects is cached. It is quite possible that certain names cannot be imported; that should not be a problem as long as no pickle containing a reference to such names has to be unpickled. (The same issue already exists for direct references to such names in pickles that use protocols 0 or 1.)

Here is the proposed initial assignment of extension code ranges:

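=====  =====  =====  =================================================
First  Last   Count  Purpose
=====  =====  =====  =================================================
0      0      1      Reserved (will never be used)
1      127    127    Reserved for Python standard library
128    191    64     Reserved for Zope
192    239    48     Reserved for 3rd parties
240    255    16     Reserved for private use (will never be assigned)
256    MAX    MAX    Reserved for future assignment
=====  =====  =====  =================================================

MAX stands for 2147483647, or 2**31-1. This is a hard limitation of the protocol as currently defined.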

At the moment, no specific extension codes have been assigned yet.

Extension registry API

The extension registry is maintained as private global variables in the copy_reg module. The following three functions are defined in this module to manipulate the registry:

add_extension(module, name, code)
    Register an extension code. The module and name arguments must be strings; code must be an int in the inclusive range 1 through MAX. This must either register a new (module, name) pair to a new code, or be a redundant repeat of a previous call that was not canceled by a remove_extension() call; a (module, name) pair may not be mapped to more than one code, nor may a code be mapped to more than one (module, name) pair.
remove_extension(module, name, code)
    Arguments are as for add_extension(). Remove a previously registered mapping between (module, name) and code.
clear_extension_cache()
    The implementation of extension codes may use a cache to speed up loading objects that are named frequently. This cache can be emptied (removing references to cached objects) by calling this method.

Note that the API does not enforce the standard range assignments.It is up to applications to respect these.
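A registration round trip might look like the sketch below. It is written for Python 3 (an assumption), where the module is called copyreg. The choice of collections.OrderedDict as the registered name and of code 240 from the private-use range is arbitrary and purely illustrative.

```python
import collections
import copyreg
import pickle

MODULE, NAME, CODE = "collections", "OrderedDict", 240   # private-use code

# Without a registration, the class is pickled by its full name.
plain = pickle.dumps(collections.OrderedDict, 2)

copyreg.add_extension(MODULE, NAME, CODE)
try:
    # With the registration, protocol 2 emits the small integer code.
    compact = pickle.dumps(collections.OrderedDict, 2)
    restored = pickle.loads(compact)
finally:
    # Undo the registration so other pickles are unaffected.
    copyreg.remove_extension(MODULE, NAME, CODE)
    copyreg.clear_extension_cache()

print(len(plain), len(compact), restored is collections.OrderedDict)
```

The compact pickle can only be loaded by a process that has registered the same code for the same name, which is the out-of-band agreement the private-use range requires.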

The copy module

Traditionally, the copy module has supported an extended subset of the pickling APIs for customizing the copy() and deepcopy() operations.

In particular, besides checking for a __copy__ or __deepcopy__ method, copy() and deepcopy() have always looked for __reduce__, and for classic classes, have looked for __getinitargs__, __getstate__ and __setstate__.

In Python 2.2, the default __reduce__ inherited from ‘object’ made copying simple new-style classes possible, but slots and various other special cases were not covered.

In Python 2.3, several changes are made to the copy module:

  • __reduce_ex__ is supported (and always called with 2 as the protocol version argument).
  • The four- and five-argument return values of __reduce__ are supported.
  • Before looking for a __reduce__ method, the copy_reg.dispatch_table is consulted, just like for pickling.
  • When the __reduce__ method is inherited from object, it is (unconditionally) replaced by a better one that uses the same APIs as pickle protocol 2: __getnewargs__, __getstate__, and __setstate__, handling list and dict subclasses, and handling slots.

As a consequence of the latter change, certain new-style classes that were copyable under Python 2.2 are not copyable under Python 2.3. (These classes are also not picklable using pickle protocol 2.) A minimal example of such a class:

    class C(object):
        def __new__(cls, a):
            return object.__new__(cls)

The problem only occurs when __new__ is overridden and has at least one mandatory argument in addition to the class argument.

To fix this, a __getnewargs__ method should be added that returns the appropriate argument tuple (excluding the class).
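The failure and the fix can be reproduced with the copy module. The sketch below uses Python 3 syntax (an assumption). Broken follows the minimal example above but also stores its argument so that the fix has something to return; Fixed and the broken_copy_failed flag are hypothetical names.

```python
import copy

class Broken(object):
    def __new__(cls, a):
        self = object.__new__(cls)
        self.a = a
        return self

class Fixed(Broken):
    def __getnewargs__(self):
        return (self.a,)      # arguments for __new__, excluding the class

try:
    copy.copy(Broken(1))      # __new__ is called without its argument
    broken_copy_failed = False
except TypeError:
    broken_copy_failed = True

clone = copy.copy(Fixed(1))
print(broken_copy_failed, clone.a)
```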

Pickling Python longs

Pickling and unpickling Python longs takes time quadratic in the number of digits, in protocols 0 and 1. Under protocol 2, new opcodes support linear-time pickling and unpickling of longs.

Pickling bools

Protocol 2 introduces new opcodes for pickling True and False directly. Under protocols 0 and 1, bools are pickled as integers, using a trick in the representation of the integer in the pickle so that an unpickler can recognize that a bool was intended. That trick consumed 4 bytes per bool pickled. The new bool opcodes consume 1 byte per bool.

Pickling small tuples

Protocol 2 introduces new opcodes for more-compact pickling of tuples of lengths 1, 2 and 3. Protocol 1 previously introduced an opcode for more-compact pickling of empty tuples.

Protocol identification

Protocol 2 introduces a new opcode, with which all protocol 2 pickles begin, identifying that the pickle is protocol 2. Attempting to unpickle a protocol 2 pickle under older versions of Python will therefore raise an “unknown opcode” exception immediately.
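The opcodes introduced for longs, bools, small tuples and protocol identification can be listed with pickletools. The following is a sketch in Python 3 syntax (an assumption; Python 3 has a single int type, so a large int stands in for a Python 2 long). The helper opcode_names is hypothetical.

```python
import pickle
import pickletools

def opcode_names(obj, protocol):
    data = pickle.dumps(obj, protocol)
    # genops yields (opcode, argument, position) triples.
    return [opcode.name for opcode, arg, pos in pickletools.genops(data)]

# A large integer, a bool and a 2-tuple under protocols 1 and 2.
for sample in (2**100, True, (1, 2)):
    print(repr(sample), opcode_names(sample, 1), opcode_names(sample, 2))
```

Every protocol 2 listing starts with the PROTO opcode, which is what lets an older unpickler reject the pickle immediately.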

Pickling of large lists and dicts

Protocol 1 pickles large lists and dicts “in one piece”, which minimizes pickle size, but requires that unpickling create a temp object as large as the object being unpickled. Part of the protocol 2 changes break large lists and dicts into pieces of no more than 1000 elements each, so that unpickling needn’t create a temp object larger than needed to hold 1000 elements. This isn’t part of protocol 2, however: the opcodes produced are still part of protocol 1. __reduce__ implementations that return the optional new listitems or dictitems iterators also benefit from this unpickling temp-space optimization.
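The batching can be observed by counting the APPENDS opcodes in the pickle of a large list. This is a sketch in Python 3 syntax (an assumption); the list size of 2500 is arbitrary, chosen so that batches of at most 1000 elements would give three of them.

```python
import pickle
import pickletools

big = list(range(2500))
data = pickle.dumps(big, 2)

# Each APPENDS opcode closes one batch of list items.
names = [opcode.name for opcode, arg, pos in pickletools.genops(data)]
batches = names.count("APPENDS")
print(batches)
```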

Copyright

This document has been placed in the public domain.


Source: https://github.com/python/peps/blob/main/peps/pep-0307.rst

Last modified: 2025-02-01 08:59:27 GMT

