PEP 252 – Making Types Look More Like Classes

Author: Guido van Rossum <guido at python.org>
Status:

Abstract

This PEP proposes changes to the introspection API for types that make them look more like classes, and their instances more like class instances. For example, type(x) will be equivalent to x.__class__ for most built-in types. When C is x.__class__, x.meth(a) will generally be equivalent to C.meth(x, a), and C.__dict__ contains x’s methods and other attributes.

This PEP also introduces a new approach to specifying attributes, using attribute descriptors, or descriptors for short. Descriptors unify and generalize several different common mechanisms used for describing attributes: a descriptor can describe a method, a typed field in the object structure, or a generalized attribute represented by getter and setter functions.

Based on the generalized descriptor API, this PEP also introduces a way to declare class methods and static methods.

[Editor’s note: the ideas described in this PEP have been incorporated into Python. The PEP no longer accurately describes the implementation.]

Introduction

One of Python’s oldest language warts is the difference between classes and types. For example, you can’t directly subclass the dictionary type, and the introspection interface for finding out what methods and instance variables an object has is different for types and for classes.

Healing the class/type split is a big effort, because it affects many aspects of how Python is implemented. This PEP concerns itself with making the introspection API for types look the same as that for classes. Other PEPs will propose making classes look more like types, and subclassing from built-in types; these topics are not on the table for this PEP.

Introspection APIs

Introspection concerns itself with finding out what attributes an object has. Python’s very general getattr/setattr API makes it impossible to guarantee that there always is a way to get a list of all attributes supported by a specific object, but in practice two conventions have appeared that together work for almost all objects. I’ll call them the class-based introspection API and the type-based introspection API; class API and type API for short.

The class-based introspection API is used primarily for class instances; it is also used by Jim Fulton’s ExtensionClasses. It assumes that all data attributes of an object x are stored in the dictionary x.__dict__, and that all methods and class variables can be found by inspection of x’s class, written as x.__class__. Classes have a __dict__ attribute, which yields a dictionary containing methods and class variables defined by the class itself, and a __bases__ attribute, which is a tuple of base classes that must be inspected recursively. Some assumptions here are:

attributes defined in the instance dict override attributes defined by the object’s class;
attributes defined in a derived class override attributes defined in a base class;
attributes in an earlier base class (meaning occurring earlier in __bases__) override attributes in a later base class.

(The last two rules together are often summarized as the left-to-right, depth-first rule for attribute search. This is the classic Python attribute lookup rule. Note that PEP 253 will propose to change the attribute lookup order, and if accepted, this PEP will follow suit.)
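
The three override rules can be checked with a small sketch. The class and attribute names below (Base1, Base2, Derived, colour, size, shape) are made up for illustration, and the code is written for present-day Python 3 rather than the Python of this PEP; for a simple hierarchy like this one the result should match the classic rule.

```python
class Base1:
    colour = "red"
    size = 1


class Base2:
    colour = "blue"
    shape = "round"


class Derived(Base1, Base2):
    size = 2


obj = Derived()
obj.shape = "square"      # stored in the instance dict, obj.__dict__

print(obj.shape)          # instance dict overrides the class: square
print(obj.size)           # derived class overrides its base: 2
print(obj.colour)         # earlier base overrides later base: red
print(Derived.__bases__)  # the tuple that is searched left to right
```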

The type-based introspection API is supported in one form or another by most built-in objects. It uses two special attributes, __members__ and __methods__. The __methods__ attribute, if present, is a list of method names supported by the object. The __members__ attribute, if present, is a list of data attribute names supported by the object.

The type API is sometimes combined with a __dict__ that works the same as for instances (for example for function objects in Python 2.1, f.__dict__ contains f’s dynamic attributes, while f.__members__ lists the names of f’s statically defined attributes).

Some caution must be exercised: some objects don’t list their “intrinsic” attributes (like __dict__ and __doc__) in __members__, while others do; sometimes attribute names occur both in __members__ or __methods__ and as keys in __dict__, in which case it’s anybody’s guess whether the value found in __dict__ is used or not.

The type API has never been carefully specified. It is part of Python folklore, and most third party extensions support it because they follow examples that support it. Also, any type that uses Py_FindMethod() and/or PyMember_Get() in its tp_getattr handler supports it, because these two functions special-case the attribute names __methods__ and __members__, respectively.

Jim Fulton’s ExtensionClasses ignore the type API, and instead emulate the class API, which is more powerful. In this PEP, I propose to phase out the type API in favor of supporting the class API for all types.

One argument in favor of the class API is that it doesn’t require you to create an instance in order to find out which attributes a type supports; this in turn is useful for documentation processors. For example, the socket module exports the SocketType object, but this currently doesn’t tell us what methods are defined on socket objects. Using the class API, SocketType would show exactly what the methods for socket objects are, and we can even extract their docstrings, without creating a socket. (Since this is a C extension module, the source-scanning approach to docstring extraction isn’t feasible in this case.)

Specification of the class-based introspection API

Objects may have two kinds of attributes: static and dynamic. The names and sometimes other properties of static attributes are knowable by inspection of the object’s type or class, which is accessible through obj.__class__ or type(obj). (I’m using type and class interchangeably; a clumsy but descriptive term that fits both is “meta-object”.)

(XXX static and dynamic are not great terms to use here, because “static” attributes may actually behave quite dynamically, and because they have nothing to do with static class members in C++ or Java. Barry suggests using immutable and mutable instead, but those words already have precise and different meanings in slightly different contexts, so I think that would still be confusing.)

Examples of dynamic attributes are instance variables of class instances, module attributes, etc. Examples of static attributes are the methods of built-in objects like lists and dictionaries, and the attributes of frame and code objects (f.f_code, c.co_filename, etc.). When an object with dynamic attributes exposes these through its __dict__ attribute, __dict__ is a static attribute.

The names and values of dynamic properties are typically stored in a dictionary, and this dictionary is typically accessible as obj.__dict__. The rest of this specification is more concerned with discovering the names and properties of static attributes than with dynamic attributes; the latter are easily discovered by inspection of obj.__dict__.

In the discussion below, I distinguish two kinds of objects: regular objects (like lists, ints, functions) and meta-objects. Types and classes are meta-objects. Meta-objects are also regular objects, but we’re mostly interested in them because they are referenced by the __class__ attribute of regular objects (or by the __bases__ attribute of other meta-objects).

The class introspection API consists of the following elements:

the __class__ and __dict__ attributes on regular objects;
the __bases__ and __dict__ attributes on meta-objects;
precedence rules;
attribute descriptors.

Together, these not only tell us about all attributes defined by a meta-object, but they also help us calculate the value of a specific attribute of a given object.

The __dict__ attribute on regular objects
A regular object may have a __dict__ attribute. If it does, this should be a mapping (not necessarily a dictionary) supporting at least __getitem__(), keys(), and has_key(). This gives the dynamic attributes of the object. The keys in the mapping give attribute names, and the corresponding values give their values.
Typically, the value of an attribute with a given name is the same object as the value corresponding to that name as a key in the __dict__. In other words, obj.__dict__['spam'] is obj.spam. (But see the precedence rules below; a static attribute with the same name may override the dictionary item.)
The __class__ attribute on regular objects
A regular object usually has a __class__ attribute. If it does, this references a meta-object. A meta-object can define static attributes for the regular object whose __class__ it is. This is normally done through the following mechanism:
The __dict__ attribute on meta-objects
A meta-object may have a __dict__ attribute, of the same form as the __dict__ attribute for regular objects (a mapping but not necessarily a dictionary). If it does, the keys of the meta-object’s __dict__ are names of static attributes for the corresponding regular object. The values are attribute descriptors; we’ll explain these later. An unbound method is a special case of an attribute descriptor.
Because a meta-object is also a regular object, the items in a meta-object’s __dict__ correspond to attributes of the meta-object; however, some transformation may be applied, and bases (see below) may define additional dynamic attributes. In other words, mobj.spam is not always mobj.__dict__['spam']. (This rule contains a loophole because for classes, if C.__dict__['spam'] is a function, C.spam is an unbound method object.)
The __bases__ attribute on meta-objects
A meta-object may have a __bases__ attribute. If it does, this should be a sequence (not necessarily a tuple) of other meta-objects, the bases. An absent __bases__ is equivalent to an empty sequence of bases. There must never be a cycle in the relationship between meta-objects defined by __bases__ attributes; in other words, the __bases__ attributes define a directed acyclic graph, with arcs pointing from derived meta-objects to their base meta-objects. (It is not necessarily a tree, since multiple classes can have the same base class.) The __dict__ attributes of a meta-object in the inheritance graph supply attribute descriptors for the regular object whose __class__ attribute points to the root of the inheritance tree (which is not the same as the root of the inheritance hierarchy – rather more the opposite, at the bottom given how inheritance trees are typically drawn). Descriptors are first searched in the dictionary of the root meta-object, then in its bases, according to a precedence rule (see the next paragraph).
Precedence rules
When two meta-objects in the inheritance graph for a given regular object both define an attribute descriptor with the same name, the search order is up to the meta-object. This allows different meta-objects to define different search orders. In particular, classic classes use the old left-to-right depth-first rule, while new-style classes use a more advanced rule (see the section on method resolution order in PEP 253).
When a dynamic attribute (one defined in a regular object’s __dict__) has the same name as a static attribute (one defined by a meta-object in the inheritance graph rooted at the regular object’s __class__), the static attribute has precedence if it is a descriptor that defines a __set__ method (see below); otherwise (if there is no __set__ method) the dynamic attribute has precedence. In other words, for data attributes (those with a __set__ method), the static definition overrides the dynamic definition, but for other attributes, dynamic overrides static.
Rationale: we can’t have a simple rule like “static overrides dynamic” or “dynamic overrides static”, because some static attributes indeed override dynamic attributes; for example, a key ‘__class__’ in an instance’s __dict__ is ignored in favor of the statically defined __class__ pointer, but on the other hand most keys in inst.__dict__ override attributes defined in inst.__class__. Presence of a __set__ method on a descriptor indicates that this is a data descriptor. (Even read-only data descriptors have a __set__ method: it always raises an exception.) Absence of a __set__ method on a descriptor indicates that the descriptor isn’t interested in intercepting assignment, and then the classic rule applies: an instance variable with the same name as a method hides the method until it is deleted.
Attribute descriptors
This is where it gets interesting – and messy. Attribute descriptors (descriptors for short) are stored in the meta-object’s __dict__ (or in the __dict__ of one of its ancestors), and have two uses: a descriptor can be used to get or set the corresponding attribute value on the (regular, non-meta) object, and it has an additional interface that describes the attribute for documentation and introspection purposes.
There is little prior art in Python for designing the descriptor’s interface, neither for getting/setting the value nor for describing the attribute otherwise, except some trivial properties (it’s reasonable to assume that __name__ and __doc__ should be the attribute’s name and docstring). I will propose such an API below.
If an object found in the meta-object’s __dict__ is not an attribute descriptor, backward compatibility dictates certain minimal semantics. This basically means that if it is a Python function or an unbound method, the attribute is a method; otherwise, it is the default value for a dynamic data attribute. Backwards compatibility also dictates that (in the absence of a __setattr__ method) it is legal to assign to an attribute corresponding to a method, and that this creates a data attribute shadowing the method for this particular instance. However, these semantics are only required for backwards compatibility with regular classes.
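
The precedence rule for static versus dynamic attributes can be illustrated as follows. This is a minimal sketch in present-day Python 3; the descriptor classes DataDescr and NonDataDescr and the class C are hypothetical names introduced only for this example.

```python
class DataDescr:
    """Defines __set__, so it is treated as a data descriptor."""

    def __get__(self, obj, objtype=None):
        return "from data descriptor"

    def __set__(self, obj, value):
        # A read-only data descriptor still has __set__; it just refuses.
        raise AttributeError("read-only attribute")


class NonDataDescr:
    """Defines only __get__, so instance variables may hide it."""

    def __get__(self, obj, objtype=None):
        return "from non-data descriptor"


class C:
    data = DataDescr()
    nondata = NonDataDescr()


x = C()
# Write straight into the instance dict, bypassing __set__.
x.__dict__["data"] = "from instance dict"
x.__dict__["nondata"] = "from instance dict"

print(x.data)      # static overrides dynamic: from data descriptor
print(x.nondata)   # dynamic overrides static: from instance dict

del x.__dict__["nondata"]
print(x.nondata)   # the hidden descriptor is visible again
```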

The introspection API is a read-only API. We don’t define the effect of assignment to any of the special attributes (__dict__, __class__ and __bases__), nor the effect of assignment to the items of a __dict__. Generally, such assignments should be considered off-limits. A future PEP may define some semantics for some such assignments. (Especially because currently instances support assignment to __class__ and __dict__, and classes support assignment to __bases__ and __dict__.)

Specification of the attribute descriptor API

Attribute descriptors may have the following attributes. In the examples, x is an object, C is x.__class__, x.meth() is a method, and x.ivar is a data attribute or instance variable. All attributes are optional – a specific attribute may or may not be present on a given descriptor. An absent attribute means that the corresponding information is not available or the corresponding functionality is not implemented.

__name__: the attribute name. Because of aliasing and renaming, the attribute may (additionally or exclusively) be known under a different name, but this is the name under which it was born. Example: C.meth.__name__ == 'meth'.
__doc__: the attribute’s documentation string. This may be None.
__objclass__: the class that declared this attribute. The descriptor only applies to objects that are instances of this class (this includes instances of its subclasses). Example: C.meth.__objclass__ is C.
__get__(): a function callable with one or two arguments that retrieves the attribute value from an object. This is also referred to as a “binding” operation, because it may return a “bound method” object in the case of method descriptors. The first argument, X, is the object from which the attribute must be retrieved or to which it must be bound. When X is None, the optional second argument, T, should be a meta-object and the binding operation may return an unbound method restricted to instances of T. When both X and T are specified, X should be an instance of T. Exactly what is returned by the binding operation depends on the semantics of the descriptor; for example, static methods and class methods (see below) ignore the instance and bind to the type instead.
__set__(): a function of two arguments that sets the attribute value on the object. If the attribute is read-only, this method may raise a TypeError or AttributeError exception (both are allowed, because both are historically found for undefined or unsettable attributes). Example: C.ivar.__set__(x, y) is roughly equivalent to x.ivar = y.
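
The descriptor attributes listed above can be inspected on the descriptors that present-day Python 3 creates. The sketch below is illustrative only: the Point class is hypothetical, and it relies on the assumption that __slots__ entries are exposed as data descriptors in the class __dict__, which holds for CPython.

```python
# A method descriptor taken from a built-in type's __dict__.
meth_descr = list.__dict__["append"]
print(meth_descr.__name__)                 # append
print(meth_descr.__objclass__ is list)     # True

a = ["tic", "tac"]
bound = meth_descr.__get__(a, list)        # "binding" yields a bound method
bound("toe")
print(a)                                   # ['tic', 'tac', 'toe']


class Point:
    """Hypothetical class; each __slots__ entry becomes a data descriptor."""
    __slots__ = ("x",)


ivar_descr = Point.__dict__["x"]
p = Point()
ivar_descr.__set__(p, 42)                  # same effect as p.x = 42
print(ivar_descr.__get__(p, Point))        # 42
print(ivar_descr.__name__)                 # x
print(ivar_descr.__objclass__ is Point)    # True
```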

Static methods and class methods

The descriptor API makes it possible to add static methods and class methods. Static methods are easy to describe: they behave pretty much like static methods in C++ or Java. Here’s an example:

    class C:

        def foo(x, y):
            print "staticmethod", x, y
        foo = staticmethod(foo)

    C.foo(1, 2)
    c = C()
    c.foo(1, 2)

Both the call C.foo(1, 2) and the call c.foo(1, 2) call foo() with two arguments, and print “staticmethod 1 2”. No “self” is declared in the definition of foo(), and no instance is required in the call.

The line “foo = staticmethod(foo)” in the class statement is the crucial element: this makes foo() a static method. The built-in staticmethod() wraps its function argument in a special kind of descriptor whose __get__() method returns the original function unchanged. Without this, the __get__() method of standard function objects would have created a bound method object for ‘c.foo’ and an unbound method object for ‘C.foo’.
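
The claim that the staticmethod descriptor hands back the original function unchanged can be checked directly. This is a sketch for present-day Python 3, where unbound methods no longer exist, so only the bound-method half of the comparison is shown; the class D is a hypothetical counterpart added for contrast.

```python
def foo(x, y):
    return ("staticmethod", x, y)


class C:
    foo = staticmethod(foo)    # wrap the module-level function


class D:
    foo = foo                  # the same function, left unwrapped


descr = C.__dict__["foo"]
print(descr.__get__(None, C) is foo)   # True: access through the class
print(descr.__get__(C(), C) is foo)    # True: access through an instance
print(C.foo(1, 2))                     # ('staticmethod', 1, 2)

d = D()
print(d.foo.__func__ is foo)           # True: a bound method wrapping foo
print(d.foo.__self__ is d)             # True: the instance was bound
```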

(XXX Barry suggests using “sharedmethod” instead of “staticmethod”, because the word static is being overloaded in so many ways already. But I’m not sure if shared conveys the right meaning.)

Class methods use a similar pattern to declare methods that receive an implicit first argument that is the class for which they are invoked. This has no C++ or Java equivalent, and is not quite the same as what class methods are in Smalltalk, but may serve a similar purpose. According to Armin Rigo, they are similar to “virtual class methods” in Borland Pascal dialect Delphi. (Python also has real metaclasses, and perhaps methods defined in a metaclass have more right to the name “class method”; but I expect that most programmers won’t be using metaclasses.) Here’s an example:

    class C:

        def foo(cls, y):
            print "classmethod", cls, y
        foo = classmethod(foo)

    C.foo(1)
    c = C()
    c.foo(1)

Both the call C.foo(1) and the call c.foo(1) end up calling foo() with two arguments, and print “classmethod __main__.C 1”. The first argument of foo() is implied, and it is the class, even if the method was invoked via an instance. Now let’s continue the example:

    class D(C):
        pass

    D.foo(1)
    d = D()
    d.foo(1)

This prints “classmethod __main__.D 1” both times; in other words, the class passed as the first argument of foo() is the class involved in the call, not the class involved in the definition of foo().

But notice this:

    class E(C):

        def foo(cls, y): # override C.foo
            print "E.foo() called"
            C.foo(y)
        foo = classmethod(foo)

    E.foo(1)
    e = E()
    e.foo(1)

In this example, the call to C.foo() from E.foo() will see class C as its first argument, not class E. This is to be expected, since the call specifies the class C. But it stresses the difference between these class methods and methods defined in metaclasses, where an upcall to a metamethod would pass the target class as an explicit first argument. (If you don’t understand this, don’t worry, you’re not alone.) Note that calling cls.foo(y) would be a mistake – it would cause infinite recursion. Also note that you can’t specify an explicit ‘cls’ argument to a class method. If you want this (e.g. the __new__ method in PEP 253 requires this), use a static method with a class as its explicit first argument instead.
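
The last point, using a static method when the class must be passed explicitly, might look like the sketch below. It is written for present-day Python 3, returns tuples instead of printing, and the method name bar is a hypothetical addition; foo mirrors the class method from the examples above.

```python
class C:

    def foo(cls, y):
        return ("classmethod", cls.__name__, y)
    foo = classmethod(foo)

    def bar(cls, y):
        # cls is an ordinary argument here; the caller must supply it.
        return ("staticmethod", cls.__name__, y)
    bar = staticmethod(bar)


class E(C):
    pass


print(E.foo(1))       # the class is implied: ('classmethod', 'E', 1)
print(C.bar(E, 1))    # the class is explicit: ('staticmethod', 'E', 1)

try:
    C.foo(E, 1)       # an explicit class is one argument too many
except TypeError as exc:
    print("TypeError:", exc)
```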

C API

XXX The following is VERY rough text that I wrote with a different audience in mind; I’ll have to go through this to edit it more. XXX It also doesn’t go into enough detail for the C API.

A built-in type can declare special data attributes in two ways: using a struct memberlist (defined in structmember.h) or a struct getsetlist (defined in descrobject.h). The struct memberlist is an old mechanism put to new use: each attribute has a descriptor record including its name, an enum giving its type (various C types are supported as well as PyObject *), an offset from the start of the instance, and a read-only flag.

The struct getsetlist mechanism is new, and intended for cases that don’t fit in that mold, because they either require additional checking, or are plain calculated attributes. Each attribute here has a name, a getter C function pointer, a setter C function pointer, and a context pointer. The function pointers are optional, so that for example setting the setter function pointer to NULL makes a read-only attribute. The context pointer is intended to pass auxiliary information to generic getter/setter functions, but I haven’t found a need for this yet.

Note that there is also a similar mechanism to declare built-in methods: these are PyMethodDef structures, which contain a name and a C function pointer (and some flags for the calling convention).

Traditionally, built-in types have had to define their own tp_getattro and tp_setattro slot functions to make these attribute definitions work (PyMethodDef and struct memberlist are quite old). There are convenience functions that take an array of PyMethodDef or memberlist structures, an object, and an attribute name, and return or set the attribute if found in the list, or raise an exception if not found. But these convenience functions had to be explicitly called by the tp_getattro or tp_setattro method of the specific type, and they did a linear search of the array using strcmp() to find the array element describing the requested attribute.

I now have a brand spanking new generic mechanism that improves this situation substantially.

Pointers to arrays of PyMethodDef, memberlist, getsetlist structures are part of the new type object (tp_methods, tp_members, tp_getset).
At type initialization time (in PyType_InitDict()), for each entry in those three arrays, a descriptor object is created and placed in a dictionary that belongs to the type (tp_dict).
Descriptors are very lean objects that mostly point to the corresponding structure. An implementation detail is that all descriptors share the same object type, and a discriminator field tells what kind of descriptor it is (method, member, or getset).
As explained in PEP 252, descriptors have a get() method that takes an object argument and returns that object’s attribute; descriptors for writable attributes also have a set() method that takes an object and a value and sets that object’s attribute. Note that the get() method also serves as a bind() operation for methods, binding the unbound method implementation to the object.
Instead of providing their own tp_getattro and tp_setattro implementation, almost all built-in objects now place PyObject_GenericGetAttr and (if they have any writable attributes) PyObject_GenericSetAttr in their tp_getattro and tp_setattro slots. (Or, they can leave these NULL, and inherit them from the default base object, if they arrange for an explicit call to PyType_InitDict() for the type before the first instance is created.)
In the simplest case, PyObject_GenericGetAttr() does exactly one dictionary lookup: it looks up the attribute name in the type’s dictionary (obj->ob_type->tp_dict). Upon success, there are two possibilities: the descriptor has a get method, or it doesn’t. For speed, the get and set methods are type slots: tp_descr_get and tp_descr_set. If the tp_descr_get slot is non-NULL, it is called, passing the object as its only argument, and the return value from this call is the result of the getattr operation. If the tp_descr_get slot is NULL, as a fallback the descriptor itself is returned (compare class attributes that are not methods but simple values).
PyObject_GenericSetAttr() works very similarly but uses the tp_descr_set slot and calls it with the object and the new attribute value; if the tp_descr_set slot is NULL, an AttributeError is raised.
But now for a more complicated case. The approach described above is suitable for most built-in objects such as lists, strings, numbers. However, some object types have a dictionary in each instance that can store arbitrary attributes. In fact, when you use a class statement to subtype an existing built-in type, you automatically get such a dictionary (unless you explicitly turn it off, using another advanced feature, __slots__). Let’s call this the instance dict, to distinguish it from the type dict.
In the more complicated case, there’s a conflict between names stored in the instance dict and names stored in the type dict. If both dicts have an entry with the same key, which one should we return? Looking at classic Python for guidance, I find conflicting rules: for class instances, the instance dict overrides the class dict, except for the special attributes (like __dict__ and __class__), which have priority over the instance dict.
I resolved this with the following set of rules, implemented in PyObject_GenericGetAttr():
1. Look in the type dict. If you find a data descriptor, use its get() method to produce the result. This takes care of special attributes like __dict__ and __class__.
2. Look in the instance dict. If you find anything, that’s it. (This takes care of the requirement that normally the instance dict overrides the class dict.)
3. Look in the type dict again (in reality this uses the saved result from step 1, of course). If you find a descriptor, use its get() method; if you find something else, that’s it; if it’s not there, raise AttributeError.
This requires a classification of descriptors as data and non-data descriptors. The current implementation quite sensibly classifies member and getset descriptors as data (even if they are read-only!) and method descriptors as non-data. Non-descriptors (like function pointers or plain values) are also classified as non-data (!).
This scheme has one drawback: in what I assume to be the most common case, referencing an instance variable stored in the instance dict, it does two dictionary lookups, whereas the classic scheme did a quick test for attributes starting with two underscores plus a single dictionary lookup. (Although the implementation is sadly structured as instance_getattr() calling instance_getattr1() calling instance_getattr2() which finally calls PyDict_GetItem(), and the underscore test calls PyString_AsString() rather than inlining this. I wonder if optimizing the snot out of this might not be a good idea to speed up Python 2.2, if we weren’t going to rip it all out. :-)
A benchmark verifies that in fact this is as fast as classic instance variable lookup, so I’m no longer worried.
Modification for dynamic types: steps 1 and 3 look in the dictionary of the type and all its base classes (in MRO sequence, of course).
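
The three lookup steps can be sketched in pure Python. This is an illustration of the rules as stated above, not the C code of PyObject_GenericGetAttr(); the function name generic_getattr and the sentinel _MISSING are hypothetical, a data descriptor is taken to be anything whose type defines __set__ (the definition used in this PEP), and the code targets present-day Python 3.

```python
_MISSING = object()   # hypothetical sentinel for "not found in any type dict"


def generic_getattr(obj, name):
    tp = type(obj)

    # Steps 1 and 3 share one search of the type and its bases, in MRO order.
    descr = _MISSING
    for klass in tp.__mro__:
        if name in vars(klass):
            descr = vars(klass)[name]
            break

    getter = None
    is_data = False
    if descr is not _MISSING:
        getter = getattr(type(descr), "__get__", None)
        is_data = hasattr(type(descr), "__set__")

    # Step 1: a data descriptor in the type dict wins outright.
    if getter is not None and is_data:
        return getter(descr, obj, tp)

    # Step 2: otherwise the instance dict, if the object has one.
    inst_dict = getattr(obj, "__dict__", None)
    if inst_dict is not None and name in inst_dict:
        return inst_dict[name]

    # Step 3: fall back to the saved result of the type dict search.
    if getter is not None:
        return getter(descr, obj, tp)
    if descr is not _MISSING:
        return descr
    raise AttributeError(name)


class C:
    greeting = "hello"

    def meth(self):
        return "meth called"


x = C()
x.__dict__["greeting"] = "hi"
x.__dict__["__class__"] = "bogus"

print(generic_getattr(x, "greeting"))        # instance dict wins: hi
print(generic_getattr(x, "meth")())          # bound through __get__
print(generic_getattr(x, "__class__") is C)  # data descriptor wins: True
```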

Discussion

XXX

Examples

Let’s look at lists. In classic Python, the method names of lists were available as the __methods__ attribute of list objects:

    >>> [].__methods__
    ['append', 'count', 'extend', 'index', 'insert', 'pop',
    'remove', 'reverse', 'sort']
    >>>

Under the new proposal, the __methods__ attribute no longer exists:

    >>> [].__methods__
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    AttributeError: 'list' object has no attribute '__methods__'
    >>>

Instead, you can get the same information from the list type:

    >>> T = [].__class__
    >>> T
    <type 'list'>
    >>> dir(T)                # like T.__dict__.keys(), but sorted
    ['__add__', '__class__', '__contains__', '__eq__', '__ge__',
    '__getattr__', '__getitem__', '__getslice__', '__gt__',
    '__iadd__', '__imul__', '__init__', '__le__', '__len__',
    '__lt__', '__mul__', '__ne__', '__new__', '__radd__',
    '__repr__', '__rmul__', '__setitem__', '__setslice__', 'append',
    'count', 'extend', 'index', 'insert', 'pop', 'remove',
    'reverse', 'sort']
    >>>

The new introspection API gives more information than the old one: in addition to the regular methods, it also shows the methods that are normally invoked through special notations, e.g. __iadd__ (+=), __len__ (len), __ne__ (!=). You can invoke any method from this list directly:

    >>> a = ['tic', 'tac']
    >>> T.__len__(a)          # same as len(a)
    2
    >>> T.append(a, 'toe')    # same as a.append('toe')
    >>> a
    ['tic', 'tac', 'toe']
    >>>

This is just like it is for user-defined classes.
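
For readers on a present-day interpreter, the same checks can be written as a short Python 3 script. This is a sketch rather than a transcript: the exact contents of dir(T) vary between versions, so only a few names are tested instead of listing the whole result.

```python
T = [].__class__
print(T is list, T is type([]))        # True True

# The old type API is gone; the class API replaces it.
print(hasattr([], "__methods__"))      # False
print("append" in T.__dict__)          # True
print("__len__" in dir(T))             # True

a = ["tic", "tac"]
print(T.__len__(a))                    # same as len(a): 2
T.append(a, "toe")                     # same as a.append('toe')
print(a)                               # ['tic', 'tac', 'toe']
```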

Notice a familiar yet surprising name in the list: __init__. This is the domain of PEP 253.

Backwards compatibility

XXX

Warnings and Errors

XXX

Implementation

A partial implementation of this PEP is available from CVS as a branch named “descr-branch”. To experiment with this implementation, proceed to check out Python from CVS according to the instructions at http://sourceforge.net/cvs/?group_id=5470 but add the arguments “-r descr-branch” to the cvs checkout command. (You can also start with an existing checkout and do “cvs update -r descr-branch”.) For some examples of the features described here, see the file Lib/test/test_descr.py.

Note: the code in this branch goes way beyond this PEP; it is also the experimentation area for PEP 253 (Subtyping Built-in Types).

References

XXX

Copyright

This document has been placed in the public domain.

Source: https://github.com/python/peps/blob/main/peps/pep-0252.rst

Last modified: 2025-02-01 08:55:40 GMT
