Extension Types
Introduction
Note
This page uses two different syntax variants:
- Cython specific cdef syntax, which was designed to make type declarations concise and easily readable from a C/C++ perspective.
- Pure Python syntax which allows static Cython type declarations in pure Python code, following PEP-484 type hints and PEP 526 variable annotations.
To make use of C data types in Python syntax, you need to import the special cython module in the Python module that you want to compile, e.g.

    import cython

If you use the pure Python syntax we strongly recommend you use a recent Cython 3 release, since significant improvements have been made here compared to the 0.29.x releases.
As well as creating normal user-defined classes with the Python class statement, Cython also lets you create new built-in Python types, known as extension types. You define an extension type using the cdef class statement or decorating the class with the @cclass decorator. Here’s an example:

Pure Python:

    @cython.cclass
    class Shrubbery:
        width: cython.int
        height: cython.int

        def __init__(self, w, h):
            self.width = w
            self.height = h

        def describe(self):
            print("This shrubbery is", self.width,
                  "by", self.height, "cubits.")

Cython:

    cdef class Shrubbery:
        cdef int width
        cdef int height

        def __init__(self, w, h):
            self.width = w
            self.height = h

        def describe(self):
            print("This shrubbery is", self.width,
                  "by", self.height, "cubits.")

As you can see, a Cython extension type definition looks a lot like a Python class definition. Within it, you use the def statement to define methods that can be called from Python code. You can even define many of the special methods such as __init__() as you would in Python.
The main difference is that you can define attributes using
- the cdef statement,
- the cython.declare() function or
- the annotation of an attribute name.

Pure Python:

    @cython.cclass
    class Shrubbery:
        width = cython.declare(cython.int)
        height: cython.int

Cython:

    cdef class Shrubbery:
        cdef int width
        cdef int height

The attributes may be Python objects (either generic or of a particular extension type), or they may be of any C data type. So you can use extension types to wrap arbitrary C data structures and provide a Python-like interface to them.
Static Attributes
Attributes of an extension type are stored directly in the object’s C struct. The set of attributes is fixed at compile time; you can’t add attributes to an extension type instance at run time simply by assigning to them, as you could with a Python class instance. However, you can explicitly enable support for dynamically assigned attributes, or subclass the extension type with a normal Python class, which then supports arbitrary attribute assignments. See Dynamic Attributes.
There are two ways that attributes of an extension type can be accessed: by Python attribute lookup, or by direct access to the C struct from Cython code. Python code is only able to access attributes of an extension type by the first method, but Cython code can use either method.
By default, extension type attributes are only accessible by direct access, not Python access, which means that they are not accessible from Python code. To make them accessible from Python code, you need to declare them as public or readonly. For example:

Pure Python:

    import cython

    @cython.cclass
    class Shrubbery:
        width = cython.declare(cython.int, visibility='public')
        height = cython.declare(cython.int, visibility='public')
        depth = cython.declare(cython.float, visibility='readonly')

Cython:

    cdef class Shrubbery:
        cdef public int width, height
        cdef readonly float depth

makes the width and height attributes readable and writable from Python code, and the depth attribute readable but not writable.
Note
You can only expose simple C types, such as ints, floats, and strings, for Python access. You can also expose Python-valued attributes.
Dynamic Attributes
It is not possible to add attributes to an extension type at runtime by default. There are two ways of avoiding this limitation. Both add overhead when a method is called from Python code, especially when calling hybrid methods declared with cpdef in .pyx files or with the @ccall decorator.
The first approach is to create a Python subclass:

Pure Python:

    @cython.cclass
    class Animal:

        number_of_legs: cython.int

        def __cinit__(self, number_of_legs: cython.int):
            self.number_of_legs = number_of_legs


    class ExtendableAnimal(Animal):  # Note that we use class, not cdef class
        pass


    dog = ExtendableAnimal(4)
    dog.has_tail = True

Cython:

    cdef class Animal:

        cdef int number_of_legs

        def __init__(self, int number_of_legs):
            self.number_of_legs = number_of_legs


    class ExtendableAnimal(Animal):  # Note that we use class, not cdef class
        pass


    dog = ExtendableAnimal(4)
    dog.has_tail = True

Declaring a __dict__ attribute is the second way of enabling dynamic attributes:

Pure Python:

    @cython.cclass
    class Animal:

        number_of_legs: cython.int
        __dict__: dict

        def __cinit__(self, number_of_legs: cython.int):
            self.number_of_legs = number_of_legs


    dog = Animal(4)
    dog.has_tail = True

Cython:

    cdef class Animal:

        cdef int number_of_legs
        cdef dict __dict__

        def __init__(self, int number_of_legs):
            self.number_of_legs = number_of_legs


    dog = Animal(4)
    dog.has_tail = True

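The behaviour described in this section can be tried out without compiling anything by using a rough plain-Python analogy. The sketch below is not Cython code and does not show how extension types are implemented: it uses __slots__ to get a fixed attribute set, lists __dict__ among the slots to re-enable dynamic attributes, and uses an ordinary subclass as the other escape hatch. The class names and the helper can_add_attribute are made up for illustration.

```python
class FixedAnimal:
    # Only this attribute exists; there is no per-instance __dict__.
    __slots__ = ("number_of_legs",)

    def __init__(self, number_of_legs):
        self.number_of_legs = number_of_legs


class FlexibleAnimal:
    # Listing "__dict__" among the slots allows arbitrary attributes again.
    __slots__ = ("number_of_legs", "__dict__")

    def __init__(self, number_of_legs):
        self.number_of_legs = number_of_legs


class ExtendableAnimal(FixedAnimal):
    # A subclass that does not define __slots__ gets a __dict__ automatically.
    pass


def can_add_attribute(animal):
    """Return True if a new attribute can be assigned at run time."""
    try:
        animal.has_tail = True
    except AttributeError:
        return False
    return True


results = {
    "fixed": can_add_attribute(FixedAnimal(4)),
    "flexible": can_add_attribute(FlexibleAnimal(4)),
    "subclass": can_add_attribute(ExtendableAnimal(4)),
}
print(results)  # {'fixed': False, 'flexible': True, 'subclass': True}
```

As with extension types, the flexible variants pay for the convenience with an extra dictionary per instance.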
Type declarations
Before you can directly access the attributes of an extension type, the Cython compiler must know that you have an instance of that type, and not just a generic Python object. It knows this already in the case of the self parameter of the methods of that type, but in other cases you will have to use a type declaration.
For example, in the following function:

Pure Python:

    @cython.cfunc
    def widen_shrubbery(sh, extra_width): # BAD
        sh.width = sh.width + extra_width

Cython:

    cdef widen_shrubbery(sh, extra_width): # BAD
        sh.width = sh.width + extra_width

because the sh parameter hasn’t been given a type, the width attribute will be accessed by a Python attribute lookup. If the attribute has been declared public or readonly then this will work, but it will be very inefficient. If the attribute is private, it will not work at all: the code will compile, but an attribute error will be raised at run time.
The solution is to declare sh as being of type Shrubbery, as follows:

Pure Python:

    import cython
    from cython.cimports.my_module import Shrubbery

    @cython.cfunc
    def widen_shrubbery(sh: Shrubbery, extra_width):
        sh.width = sh.width + extra_width

Cython:

    from my_module cimport Shrubbery

    cdef widen_shrubbery(Shrubbery sh, extra_width):
        sh.width = sh.width + extra_width

Now the Cython compiler knows that sh has a C attribute called width and will generate code to access it directly and efficiently. The same consideration applies to local variables, for example:

Pure Python:

    import cython
    from cython.cimports.my_module import Shrubbery

    @cython.cfunc
    def another_shrubbery(sh1: Shrubbery) -> Shrubbery:
        sh2: Shrubbery
        sh2 = Shrubbery()
        sh2.width = sh1.width
        sh2.height = sh1.height
        return sh2

Cython:

    from my_module cimport Shrubbery

    cdef Shrubbery another_shrubbery(Shrubbery sh1):
        cdef Shrubbery sh2
        sh2 = Shrubbery()
        sh2.width = sh1.width
        sh2.height = sh1.height
        return sh2

Note
Here, we cimport the class Shrubbery (using the cimport statement or importing from the special cython.cimports package), and this is necessary to declare the type at compile time. To be able to cimport an extension type, we split the class definition into two parts, one in a definition file and the other in the corresponding implementation file. You should read Sharing Extension Types to learn to do that.
Type Testing and Casting
Suppose I have a method quest() which returns an object of type Shrubbery. To access its width I could write:

Pure Python:

    sh: Shrubbery = quest()
    print(sh.width)

Cython:

    cdef Shrubbery sh = quest()
    print(sh.width)

which requires the use of a local variable and performs a type test on assignment. If you know the return value of quest() will be of type Shrubbery you can use a cast to write:

Pure Python:

    print(cython.cast(Shrubbery, quest()).width)

Cython:

    print((<Shrubbery>quest()).width)

This may be dangerous if quest() is not actually a Shrubbery, as it will try to access width as a C struct member which may not exist. At the C level, rather than raising an AttributeError, either a nonsensical result will be returned (interpreting whatever data is at that address as an int) or a segfault may result from trying to access invalid memory. Instead, one can write:

Pure Python:

    print(cython.cast(Shrubbery, quest(), typecheck=True).width)

Cython:

    print((<Shrubbery?>quest()).width)

which performs a type check (possibly raising a TypeError) before making the cast and allowing the code to proceed.
To explicitly test the type of an object, use the isinstance() builtin function. For known builtin or extension types, Cython translates these into a fast and safe type check that ignores changes to the object’s __class__ attribute etc., so that after a successful isinstance() test, code can rely on the expected C structure of the extension type and its C-level attributes (stored in the object’s C struct) and cdef/@cfunc methods.
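To see why a check that ignores __class__ matters, here is a small plain-Python sketch (interpreted, not compiled with Cython). It shows that the ordinary isinstance() can be fooled by an object that overrides __class__, whereas the object's real type is unchanged. The Impostor class is hypothetical, and the Shrubbery here is an ordinary Python stand-in for the extension type.

```python
class Shrubbery:
    def __init__(self, width):
        self.width = width


class Impostor:
    """Claims to be a Shrubbery by overriding __class__."""

    @property
    def __class__(self):
        return Shrubbery


obj = Impostor()

# Interpreted isinstance() falls back to obj.__class__ when the real type
# does not match, so the impostor passes the test.
claims_to_be_shrubbery = isinstance(obj, Shrubbery)

# The real type of the object, which determines its C layout, is unaffected.
really_is_shrubbery = issubclass(type(obj), Shrubbery)

print(claims_to_be_shrubbery, really_is_shrubbery)  # True False
```

A check based on the real type is the only kind that makes direct C struct access safe, which is the guarantee the paragraph above describes for compiled code.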
Extension types and None
Cython handles None values differently in C-like type declarations and when Python annotations are used.
In cdef declarations and C-like function argument declarations (func(list x)), when you declare an argument or C variable as having an extension or Python builtin type, Cython will allow it to take on the value None as well as values of its declared type. This is analogous to the way a C pointer can take on the value NULL, and you need to exercise the same caution because of it. There is no problem as long as you are performing Python operations on it, because full dynamic type checking will be applied. However, when you access C attributes of an extension type (as in the widen_shrubbery function above), it’s up to you to make sure the reference you’re using is not None; in the interests of efficiency, Cython does not check this.
With the C-like declaration syntax, you need to be particularly careful when exposing Python functions which take extension types as arguments:

    def widen_shrubbery(Shrubbery sh, extra_width): # This is
        sh.width = sh.width + extra_width           # dangerous!

The users of our module could crash it by passing None for the sh parameter.
As in Python, whenever it is unclear whether a variable can be None, but the code requires a non-None value, an explicit check can help:

    def widen_shrubbery(Shrubbery sh, extra_width):
        if sh is None:
            raise TypeError
        sh.width = sh.width + extra_width

but since this is anticipated to be such a frequent requirement, the Cython language provides a more convenient way. Parameters of a Python function declared as an extension type can have a not None clause:

    def widen_shrubbery(Shrubbery sh not None, extra_width):
        sh.width = sh.width + extra_width

Now the function will automatically check that sh is not None along with checking that it has the right type.
When annotations are used, the behaviour follows the Python typing semantics of PEP-484 instead. The value None is not allowed when a variable is annotated only with its plain type:

    def widen_shrubbery(sh: Shrubbery, extra_width): # TypeError is raised
        sh.width = sh.width + extra_width            # when sh is None

To also allow None, typing.Optional[] must be used explicitly. For function arguments, this is also automatically allowed when they have a default argument of None, e.g. func(x: list = None) does not require typing.Optional:

    import typing

    def widen_shrubbery(sh: typing.Optional[Shrubbery], extra_width):
        if sh is None:
            # We want to raise a custom exception in case of a None value.
            raise ValueError
        sh.width = sh.width + extra_width

The upside of using annotations here is that they are safe by default because you need to explicitly allow None values for them.
Note
The not None clause and typing.Optional can only be used in Python functions (defined with def and without the @cython.cfunc decorator) and not in C functions (defined with cdef or decorated using @cython.cfunc). If you need to check whether a parameter to a C function is None, you will need to do it yourself.
Note
Some more things:
- The self parameter of a method of an extension type is guaranteed never to be None.
- When comparing a value with None, keep in mind that, if x is a Python object, x is None and x is not None are very efficient because they translate directly to C pointer comparisons, whereas x == None and x != None, or simply using x as a boolean value (as in if x: ...) will invoke Python operations and therefore be much slower.
- typing.Union[tp, None] and tp | None can be used as alternatives to typing.Optional
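The difference between the identity test and the equality test is visible even in plain Python. In the sketch below, AlwaysEqual is a hypothetical class whose __eq__ counts its calls: the identity test never reaches it, while the equality test does, and can even give a misleading answer.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    calls = 0

    def __eq__(self, other):
        AlwaysEqual.calls += 1
        return True


x = AlwaysEqual()

identity_result = x is None   # identity test, no method call involved
calls_after_is = AlwaysEqual.calls

equality_result = x == None   # dispatches to AlwaysEqual.__eq__
calls_after_eq = AlwaysEqual.calls

print(identity_result, calls_after_is)  # False 0
print(equality_result, calls_after_eq)  # True 1
```

So x is None is not only the faster test but also the only one that cannot be overridden by the object being tested.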
Special methods
Although the principles are similar, there are substantial differences between many of the __xxx__() special methods of extension types and their Python counterparts. There is a separate page devoted to this subject, and you should read it carefully before attempting to use any special methods in your extension types.
Properties
You can declare properties in an extension class using the same syntax as in ordinary Python code:

Pure Python:

    @cython.cclass
    class Spam:
        @property
        def cheese(self):
            # This is called when the property is read.
            ...

        @cheese.setter
        def cheese(self, value):
            # This is called when the property is written.
            ...

        @cheese.deleter
        def cheese(self):
            # This is called when the property is deleted.
            ...

Cython:

    cdef class Spam:
        @property
        def cheese(self):
            # This is called when the property is read.
            ...

        @cheese.setter
        def cheese(self, value):
            # This is called when the property is written.
            ...

        @cheese.deleter
        def cheese(self):
            # This is called when the property is deleted.
            ...

There is also a special (deprecated) legacy syntax for defining properties in an extension class:

    cdef class Spam:

        property cheese:

            "A doc string can go here."

            def __get__(self):
                # This is called when the property is read.
                ...

            def __set__(self, value):
                # This is called when the property is written.
                ...

            def __del__(self):
                # This is called when the property is deleted.
                ...

The __get__(), __set__() and __del__() methods are all optional; if they are omitted, an exception will be raised when the corresponding operation is attempted.
Here’s a complete example. It defines a property which adds to a list each time it is written to, returns the list when it is read, and empties the list when it is deleted:

Pure Python:

    import cython

    @cython.cclass
    class CheeseShop:

        cheeses: object

        def __cinit__(self):
            self.cheeses = []

        @property
        def cheese(self):
            return f"We don't have: {self.cheeses}"

        @cheese.setter
        def cheese(self, value):
            self.cheeses.append(value)

        @cheese.deleter
        def cheese(self):
            del self.cheeses[:]

    # Test input
    from cheesy import CheeseShop

    shop = CheeseShop()
    print(shop.cheese)

    shop.cheese = "camembert"
    print(shop.cheese)

    shop.cheese = "cheddar"
    print(shop.cheese)

    del shop.cheese
    print(shop.cheese)

Cython:

    cdef class CheeseShop:

        cdef object cheeses

        def __cinit__(self):
            self.cheeses = []

        @property
        def cheese(self):
            return f"We don't have: {self.cheeses}"

        @cheese.setter
        def cheese(self, value):
            self.cheeses.append(value)

        @cheese.deleter
        def cheese(self):
            del self.cheeses[:]

    # Test input
    from cheesy import CheeseShop

    shop = CheeseShop()
    print(shop.cheese)

    shop.cheese = "camembert"
    print(shop.cheese)

    shop.cheese = "cheddar"
    print(shop.cheese)

    del shop.cheese
    print(shop.cheese)

Output:

    # Test output
    We don't have: []
    We don't have: ['camembert']
    We don't have: ['camembert', 'cheddar']
    We don't have: []

C methods
Extension types can have C methods as well as Python methods. Like C functions, C methods are declared using cdef or the @cfunc decorator, and hybrid methods using cpdef or the @ccall decorator.
C methods are “virtual”, and may be overridden in derived extension types. In addition, cpdef/@ccall methods can even be overridden by Python methods when called as C method. This adds a little to their calling overhead compared to a cdef/@cfunc method:

Pure Python:

    import cython

    @cython.cclass
    class Parrot:

        @cython.cfunc
        def describe(self) -> cython.void:
            print("This parrot is resting.")

    @cython.cclass
    class Norwegian(Parrot):

        @cython.cfunc
        def describe(self) -> cython.void:
            Parrot.describe(self)
            print("Lovely plumage!")

    cython.declare(p1=Parrot, p2=Parrot)
    p1 = Parrot()
    p2 = Norwegian()
    print("p1:")
    p1.describe()
    print("p2:")
    p2.describe()

Cython:

    cdef class Parrot:

        cdef void describe(self):
            print("This parrot is resting.")

    cdef class Norwegian(Parrot):

        cdef void describe(self):
            Parrot.describe(self)
            print("Lovely plumage!")

    cdef Parrot p1, p2
    p1 = Parrot()
    p2 = Norwegian()
    print("p1:")
    p1.describe()
    print("p2:")
    p2.describe()

Output:

    # Output
    p1:
    This parrot is resting.
    p2:
    This parrot is resting.
    Lovely plumage!

The above example also illustrates that a C method can call an inherited C method using the usual Python technique, i.e.:

    Parrot.describe(self)

cdef/@ccall methods can be declared static by using the @staticmethod decorator. This can be especially useful for constructing classes that take non-Python compatible types:

Pure Python:

    import cython
    from cython.cimports.libc.stdlib import free

    @cython.cclass
    class OwnedPointer:
        ptr: cython.p_void

        def __dealloc__(self):
            if self.ptr is not cython.NULL:
                free(self.ptr)

        @staticmethod
        @cython.cfunc
        def create(ptr: cython.p_void):
            p = OwnedPointer()
            p.ptr = ptr
            return p

Cython:

    from libc.stdlib cimport free

    cdef class OwnedPointer:
        cdef void* ptr

        def __dealloc__(self):
            if self.ptr is not NULL:
                free(self.ptr)

        @staticmethod
        cdef create(void* ptr):
            p = OwnedPointer()
            p.ptr = ptr
            return p

Note
Cython currently does not support decorating cdef/@ccall methods with the @classmethod decorator.
Subclassing
If an extension type inherits from other types, the first base class must be a built-in type or another extension type:

Pure Python:

    @cython.cclass
    class Parrot:
        ...

    @cython.cclass
    class Norwegian(Parrot):
        ...

Cython:

    cdef class Parrot:
        ...

    cdef class Norwegian(Parrot):
        ...

A complete definition of the base type must be available to Cython, so if the base type is a built-in type, it must have been previously declared as an extern extension type. If the base type is defined in another Cython module, it must either be declared as an extern extension type or imported using the cimport statement or by importing from the special cython.cimports package.
Multiple inheritance is supported, however the second and subsequent base classes must be ordinary Python classes (not extension types or built-in types).
Cython extension types can also be subclassed in Python. A Python class can inherit from multiple extension types provided that the usual Python rules for multiple inheritance are followed (i.e. the C layouts of all the base classes must be compatible).
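The layout rule is the one CPython already applies to its own built-in types, so it can be illustrated without Cython. The following is a minimal sketch using list and dict as stand-ins for two extension types with incompatible C layouts; try_subclass and Mixin are hypothetical names used only for this example.

```python
def try_subclass(*bases):
    """Try to create a class from the given bases.

    Returns the error message on failure, or None on success.
    """
    try:
        type("Combined", bases, {})
    except TypeError as exc:
        return str(exc)
    return None


class Mixin:
    """An ordinary Python class with no C-level layout of its own."""


# list and dict each define their own C struct, so they cannot be combined.
conflict = try_subclass(list, dict)

# A built-in type plus an ordinary Python class is fine.
compatible = try_subclass(list, Mixin)

print(conflict)
print(compatible)
```

The first call reports a layout conflict and the second succeeds, which mirrors the rule for Python classes that inherit from several extension types.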
There is a way to prevent extension types from being subtyped in Python. This is done via the final directive, usually set on an extension type or C method using a decorator:

Pure Python:

    import cython

    @cython.final
    @cython.cclass
    class Parrot:
        def describe(self): pass

    @cython.cclass
    class Lizard:

        @cython.final
        @cython.cfunc
        def done(self): pass

Cython:

    cimport cython

    @cython.final
    cdef class Parrot:
        def describe(self): pass

    cdef class Lizard:

        @cython.final
        cdef done(self): pass

Trying to create a Python subclass from a final type or overriding a final method will raise a TypeError at runtime. Cython will also prevent subtyping a final type or overriding a final method inside of the same module, i.e. creating an extension type that uses a final type as its base type will fail at compile time. Note, however, that this restriction does not currently propagate to other extension modules, so Cython is unable to prevent final extension types from being subtyped at the C level by foreign code.
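The run-time TypeError can be observed with one of CPython's own non-subclassable built-in types. The sketch below uses bool as a stand-in, on the assumption that a final extension type looks the same from Python code; can_subclass is a hypothetical helper.

```python
def can_subclass(base):
    """Return True if a Python class can be derived from ``base``."""
    try:
        type("Sub", (base,), {})
    except TypeError:
        return False
    return True


int_ok = can_subclass(int)    # int allows subclassing
bool_ok = can_subclass(bool)  # bool is marked as not subclassable

print(int_ok, bool_ok)  # True False
```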
Forward-declaring extension types
Extension types can be forward-declared, like struct and union types. This is usually not necessary and violates the DRY principle (Don’t Repeat Yourself).
If you are forward-declaring an extension type that has a base class, you must specify the base class in both the forward declaration and its subsequent definition, for example:

    cdef class A(B)

    ...

    cdef class A(B):
        # attributes and methods

Fast instantiation
Cython provides two ways to speed up the instantiation of extension types. The first one is a direct call to the __new__() special static method, as known from Python. For an extension type Penguin, you could use the following code:

Pure Python:

    import cython

    @cython.cclass
    class Penguin:
        food: object

        def __cinit__(self, food):
            self.food = food

        def __init__(self, food):
            print("eating!")

    normal_penguin = Penguin('fish')
    fast_penguin = Penguin.__new__(Penguin, 'wheat')  # note: not calling __init__() !

Cython:

    cdef class Penguin:
        cdef object food

        def __cinit__(self, food):
            self.food = food

        def __init__(self, food):
            print("eating!")

    normal_penguin = Penguin('fish')
    fast_penguin = Penguin.__new__(Penguin, 'wheat')  # note: not calling __init__() !

Note that the path through __new__() will not call the type’s __init__() method (again, as known from Python). Thus, in the example above, the first instantiation will print eating!, but the second will not. This is only one of the reasons why the __cinit__() method is safer than the normal __init__() method for initialising extension types and bringing them into a correct and safe state. See the Initialisation Methods Section about the differences.
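Because this behaviour is "as known from Python", it can be checked in the interpreter without compiling anything. In the plain-Python sketch below, a user-defined __new__ stands in for __cinit__ (an assumption made for illustration, since __cinit__ exists only in compiled extension types), and a counter replaces the print call.

```python
class Penguin:
    init_calls = 0

    def __new__(cls, food):
        # Stands in for __cinit__: runs on every instantiation path.
        self = super().__new__(cls)
        self.food = food
        return self

    def __init__(self, food):
        Penguin.init_calls += 1


normal_penguin = Penguin("fish")                  # __new__ and __init__
fast_penguin = Penguin.__new__(Penguin, "wheat")  # __new__ only

print(Penguin.init_calls)                     # 1
print(normal_penguin.food, fast_penguin.food)  # fish wheat
```

Both objects are fully initialised by the allocation step, but only the first one went through __init__(), which is why state that must always be valid belongs in __cinit__().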
The second performance improvement applies to types that are often created and deleted in a row, so that they can benefit from a freelist. Cython provides the decorator @cython.freelist(N) for this, which creates a statically sized freelist of N instances for a given type. Example:

Pure Python:

    import cython

    @cython.freelist(8)
    @cython.cclass
    class Penguin:
        food: object

        def __cinit__(self, food):
            self.food = food

    penguin = Penguin('fish 1')
    penguin = None
    penguin = Penguin('fish 2')  # does not need to allocate memory!

Cython:

    cimport cython

    @cython.freelist(8)
    cdef class Penguin:
        cdef object food

        def __cinit__(self, food):
            self.food = food

    penguin = Penguin('fish 1')
    penguin = None
    penguin = Penguin('fish 2')  # does not need to allocate memory!

Instantiation from existing C/C++ pointers
It is quite common to want to instantiate an extension class from an existing (pointer to a) data structure, often as returned by external C/C++ functions.
As extension classes can only accept Python objects as arguments in their constructors, this necessitates the use of factory functions or factory methods. For example:

Pure Python:

    import cython
    from cython.cimports.libc.stdlib import malloc, free

    # Example C struct
    my_c_struct = cython.struct(
        a = cython.int,
        b = cython.int,
    )

    @cython.cclass
    class WrapperClass:
        """A wrapper class for a C/C++ data structure"""
        _ptr: cython.pointer[my_c_struct]
        ptr_owner: cython.bint

        def __cinit__(self):
            self.ptr_owner = False

        def __dealloc__(self):
            # De-allocate if not null and flag is set
            if self._ptr is not cython.NULL and self.ptr_owner is True:
                free(self._ptr)
                self._ptr = cython.NULL

        def __init__(self):
            # Prevent accidental instantiation from normal Python code
            # since we cannot pass a struct pointer into a Python constructor.
            raise TypeError("This class cannot be instantiated directly.")

        # Extension class properties
        @property
        def a(self):
            return self._ptr.a if self._ptr is not cython.NULL else None

        @property
        def b(self):
            return self._ptr.b if self._ptr is not cython.NULL else None

        @staticmethod
        @cython.cfunc
        def from_ptr(_ptr: cython.pointer[my_c_struct], owner: cython.bint=False) -> WrapperClass:
            """Factory function to create WrapperClass objects from
            given my_c_struct pointer.

            Setting ``owner`` flag to ``True`` causes
            the extension type to ``free`` the structure pointed to by ``_ptr``
            when the wrapper object is deallocated."""
            # Fast call to __new__() that bypasses the __init__() constructor.
            wrapper: WrapperClass = WrapperClass.__new__(WrapperClass)
            wrapper._ptr = _ptr
            wrapper.ptr_owner = owner
            return wrapper

        @staticmethod
        @cython.cfunc
        def new_struct() -> WrapperClass:
            """Factory function to create WrapperClass objects with
            newly allocated my_c_struct"""
            _ptr: cython.pointer[my_c_struct] = cython.cast(
                    cython.pointer[my_c_struct], malloc(cython.sizeof(my_c_struct)))
            if _ptr is cython.NULL:
                raise MemoryError
            _ptr.a = 0
            _ptr.b = 0
            return WrapperClass.from_ptr(_ptr, owner=True)

Cython:

    from libc.stdlib cimport malloc, free

    # Example C struct
    ctypedef struct my_c_struct:
        int a
        int b


    cdef class WrapperClass:
        """A wrapper class for a C/C++ data structure"""
        cdef my_c_struct *_ptr
        cdef bint ptr_owner

        def __cinit__(self):
            self.ptr_owner = False

        def __dealloc__(self):
            # De-allocate if not null and flag is set
            if self._ptr is not NULL and self.ptr_owner is True:
                free(self._ptr)
                self._ptr = NULL

        def __init__(self):
            # Prevent accidental instantiation from normal Python code
            # since we cannot pass a struct pointer into a Python constructor.
            raise TypeError("This class cannot be instantiated directly.")

        # Extension class properties
        @property
        def a(self):
            return self._ptr.a if self._ptr is not NULL else None

        @property
        def b(self):
            return self._ptr.b if self._ptr is not NULL else None

        @staticmethod
        cdef WrapperClass from_ptr(my_c_struct *_ptr, bint owner=False):
            """Factory function to create WrapperClass objects from
            given my_c_struct pointer.

            Setting ``owner`` flag to ``True`` causes
            the extension type to ``free`` the structure pointed to by ``_ptr``
            when the wrapper object is deallocated."""
            # Fast call to __new__() that bypasses the __init__() constructor.
            cdef WrapperClass wrapper = WrapperClass.__new__(WrapperClass)
            wrapper._ptr = _ptr
            wrapper.ptr_owner = owner
            return wrapper

        @staticmethod
        cdef WrapperClass new_struct():
            """Factory function to create WrapperClass objects with
            newly allocated my_c_struct"""
            cdef my_c_struct *_ptr = <my_c_struct *>malloc(sizeof(my_c_struct))
            if _ptr is NULL:
                raise MemoryError
            _ptr.a = 0
            _ptr.b = 0
            return WrapperClass.from_ptr(_ptr, owner=True)

To then create a WrapperClass object from an existing my_c_struct pointer, WrapperClass.from_ptr(ptr) can be used in Cython code. To allocate a new structure and wrap it at the same time, WrapperClass.new_struct can be used instead.
It is possible to create multiple Python objects all from the same pointer which point to the same in-memory data, if that is wanted, though care must be taken when de-allocating as can be seen above. Additionally, the ptr_owner flag can be used to control which WrapperClass object owns the pointer and is responsible for de-allocation. It is set to False by default in the example and can be enabled by calling from_ptr(ptr, owner=True).
The GIL must not be released in __dealloc__ either (or, if it is, another lock must be used), otherwise race conditions can occur with multiple de-allocations.
Being a part of the object constructor, the __cinit__ method has a Python signature, which makes it unable to accept a my_c_struct pointer as an argument.
Attempts to use pointers in a Python signature will result in errors like:

    Cannot convert 'my_c_struct *' to Python object

This is because Cython cannot automatically convert a pointer to a Python object, unlike with native types like int.
Note that for native types, Cython will copy the value and create a new Python object while in the above case, data is not copied and deallocating memory is a responsibility of the extension class.
Making extension types weak-referenceable
By default, extension types do not support having weak references made to them. You can enable weak referencing by declaring a C attribute of type object called __weakref__. For example:

Pure Python:

    @cython.cclass
    class ExplodingAnimal:
        """This animal will self-destruct when it is
        no longer strongly referenced."""

        __weakref__: object

Cython:

    cdef class ExplodingAnimal:
        """This animal will self-destruct when it is
        no longer strongly referenced."""

        cdef object __weakref__

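Plain Python classes that use __slots__ follow a closely related rule, which makes for a quick interpreted illustration. This is an analogy only, not Cython code: PlainAnimal and the supports_weakref helper are hypothetical, and the __weakref__ slot plays the role of the C attribute declared above.

```python
import weakref


class PlainAnimal:
    __slots__ = ("name",)                # no __weakref__ slot


class ExplodingAnimal:
    __slots__ = ("name", "__weakref__")  # weak references enabled


def supports_weakref(obj):
    """Return True if a weak reference to ``obj`` can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


plain_ok = supports_weakref(PlainAnimal())
exploding_ok = supports_weakref(ExplodingAnimal())

print(plain_ok, exploding_ok)  # False True
```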
Controlling deallocation and garbage collection in CPython
Note
This section only applies to the usual CPython implementation of Python. Other implementations like PyPy work differently.
Introduction
First of all, it is good to understand that there are two ways to trigger deallocation of Python objects in CPython: CPython uses reference counting for all objects and any object with a reference count of zero is immediately deallocated. This is the most common way of deallocating an object. For example, consider

    >>> x = "foo"
    >>> x = "bar"

After executing the second line, the string "foo" is no longer referenced, so it is deallocated. This is done using the PyTypeObject.tp_dealloc slot, which can be customized in Cython by implementing __dealloc__.
The second mechanism is the cyclic garbage collector. This is meant to resolve reference cycles such as

    >>> class Object:
    ...     pass
    >>> def make_cycle():
    ...     x = Object()
    ...     y = [x]
    ...     x.attr = y

When calling make_cycle, a reference cycle is created since x references y and vice versa. Even though neither x nor y is accessible after make_cycle returns, both have a reference count of 1, so they are not immediately deallocated. At regular times, the garbage collector runs, which will notice the reference cycle (using the PyTypeObject.tp_traverse slot) and break it. Breaking a reference cycle means taking an object in the cycle and removing all references from it to other Python objects (we call this clearing an object). Clearing is almost the same as deallocating, except that the actual object is not yet freed. For x in the example above, the attributes of x would be removed from x.
Note that it suffices to clear just one object in the reference cycle, since there is no longer a cycle after clearing one object. Once the cycle is broken, the usual refcount-based deallocation will actually remove the objects from memory. Clearing is implemented in the PyTypeObject.tp_clear slot. As we just explained, it is sufficient that one object in the cycle implements PyTypeObject.tp_clear.
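Both mechanisms can be observed from plain Python on CPython. The sketch below uses weak references as probes: a weak reference returns None once its target has been deallocated. Node is a hypothetical class, and automatic collection is switched off temporarily so that the cyclic collector only runs when called explicitly.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only to observe deallocation."""


gc.disable()  # keep the cyclic collector from running on its own

# 1. Reference counting: the object is freed as soon as the last
#    reference goes away.
a = Node()
a_probe = weakref.ref(a)
del a
freed_by_refcount = a_probe() is None

# 2. A reference cycle keeps both objects alive after the names are gone.
x = Node()
y = [x]
x.attr = y
x_probe = weakref.ref(x)
del x, y
survives_without_gc = x_probe() is not None

# Running the collector finds the cycle and breaks it.
gc.collect()
freed_by_gc = x_probe() is None

gc.enable()
print(freed_by_refcount, survives_without_gc, freed_by_gc)  # True True True
```

As the note at the top of this section says, this is CPython behaviour; other implementations may free objects at different times.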
Enabling the deallocation trashcan
In CPython, it is possible to create deeply recursive objects. For example:

    >>> L = None
    >>> for i in range(2**20):
    ...     L = [L]

Now imagine that we delete the final L. Then L deallocates L[0], which deallocates L[0][0] and so on until we reach a recursion depth of 2**20. This deallocation is done in C and such a deep recursion will likely overflow the C call stack, crashing Python.
CPython invented a mechanism for this called the trashcan. It limits the recursion depth of deallocations by delaying some deallocations.
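The built-in list type already uses the trashcan, so its effect can be checked in plain Python on CPython. The sketch below builds a deeply nested list and deletes it; the helper name and the chosen depth are arbitrary, and the depth is smaller than in the example above only to keep memory use modest.

```python
def build_nested_list(depth):
    """Return a list nested ``depth`` levels deep: [[[...[None]...]]]."""
    nested = None
    for _ in range(depth):
        nested = [nested]
    return nested


depth = 200_000
L = build_nested_list(depth)

# Dropping the only reference deallocates all levels. Because list
# deallocation goes through the trashcan, the work is done in bounded
# C stack depth instead of recursing once per level.
del L
deleted_ok = True

print(deleted_ok)
```

An extension type that holds references to further instances of itself can form the same kind of chain, which is the case the trashcan directive is meant for.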
By default, Cython extension types do not use the trashcan but it can be enabled by setting the trashcan directive to True. For example:

Pure Python:

    import cython

    @cython.trashcan(True)
    @cython.cclass
    class Object:
        __dict__: dict

Cython:

    cimport cython

    @cython.trashcan(True)
    cdef class Object:
        cdef dict __dict__

Trashcan usage is inherited by subclasses (unless explicitly disabled by @cython.trashcan(False)). Some builtin types like list use the trashcan, so subclasses of it use the trashcan by default.
Disabling cycle breaking (tp_clear)
By default, each extension type will support the cyclic garbage collector of CPython. If any Python objects can be referenced, Cython will automatically generate the PyTypeObject.tp_traverse and PyTypeObject.tp_clear slots. This is usually what you want.
There is at least one reason why this might not be what you want: if you need to clean up some external resources in the __dealloc__ special function and your object happened to be in a reference cycle, the garbage collector may have triggered a call to PyTypeObject.tp_clear to clear the object (see Introduction).
In that case, any object references have vanished by the time __dealloc__ is called, so your cleanup code has lost access to the objects it has to clean up. To fix this, you can disable clearing instances of a specific class by using the no_gc_clear directive:

Pure Python:

    @cython.no_gc_clear
    @cython.cclass
    class DBCursor:
        conn: DBConnection
        raw_cursor: cython.pointer[DBAPI_Cursor]
        # ...
        def __dealloc__(self):
            DBAPI_close_cursor(self.conn.raw_conn, self.raw_cursor)

Cython:

    @cython.no_gc_clear
    cdef class DBCursor:
        cdef DBConnection conn
        cdef DBAPI_Cursor *raw_cursor
        # ...
        def __dealloc__(self):
            DBAPI_close_cursor(self.conn.raw_conn, self.raw_cursor)

This example tries to close a cursor via a database connection when the Python object is destroyed. The DBConnection object is kept alive by the reference from DBCursor. But if a cursor happens to be in a reference cycle, the garbage collector may delete the database connection reference, which makes it impossible to clean up the cursor.
If you use no_gc_clear, it is important that any given reference cycle contains at least one object without no_gc_clear. Otherwise, the cycle cannot be broken, which is a memory leak.
Disabling cyclic garbage collection
In rare cases, extension types can be guaranteed not to participate in cycles, but the compiler won’t be able to prove this. This would be the case if the class can never reference itself, even indirectly. In that case, you can manually disable cycle collection by using the no_gc directive, but beware that doing so when in fact the extension type can participate in cycles could cause memory leaks:

Pure Python:

    @cython.no_gc
    @cython.cclass
    class UserInfo:
        name: str
        addresses: tuple

Cython:

    @cython.no_gc
    cdef class UserInfo:
        cdef str name
        cdef tuple addresses

If you can be sure addresses will contain only references to strings, the above would be safe, and it may yield a significant speedup, depending on your usage pattern.
Controlling pickling¶
By default, Cython will generate a__reduce__()
method to allow picklingan extension type if and only if each of its members are convertible to Pythonand it has no__cinit__
method.To require this behavior (i.e. throw an error at compile time if a classcannot be pickled) decorate the class with@cython.auto_pickle(True)
.One can also annotate with@cython.auto_pickle(False)
to get the oldbehavior of not generating a__reduce__
method in any case.
Manually implementing a __reduce__ or __reduce_ex__ method will also disable this auto-generation and can be used to support pickling of more complicated types.
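
As an illustration of the manual route, the following sketch shows the shape of a hand-written __reduce__ using a plain Python class. The same method body could be placed in an extension type, but that is an assumption here and the code below does not involve Cython at all. The Shrubbery class simply mirrors the example from the introduction.

```python
import pickle


class Shrubbery:
    """Plain Python stand-in for an extension type with two attributes."""

    def __init__(self, width, height):
        self.width = width
        self.height = height

    def __reduce__(self):
        # Tell pickle how to rebuild the object: a callable plus its arguments.
        return (Shrubbery, (self.width, self.height))


original = Shrubbery(3, 4)
clone = pickle.loads(pickle.dumps(original))
print(clone.width, clone.height)
```

Returning a (callable, args) pair is the simplest form that the pickle protocol accepts. Types that hold C pointers or other non-Python state would need to convert that state into picklable arguments first.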
Public and external extension types¶
Extension types can be declared extern or public. An extern extension type declaration makes an extension type defined in external C code available to a Cython module. A public extension type declaration makes an extension type defined in a Cython module available to external C code.
Note
Cython currently does not support extension types declared as extern or public in Pure Python mode. This is not considered an issue since public/extern extension types are most commonly declared in .pxd files and not in .py files.
External extension types¶
An extern extension type allows you to gain access to the internals of Python objects defined in the Python core or in a non-Cython extension module.
Note
In previous versions of Pyrex, extern extension types were also used to reference extension types defined in another Pyrex module. While you can still do that, Cython provides a better mechanism for this. See Sharing Declarations Between Cython Modules.
Here is an example which will let you get at the C-level members of the built-in complex object:

    cdef extern from "complexobject.h":

        struct Py_complex:
            double real
            double imag

        ctypedef class __builtin__.complex [object PyComplexObject]:
            cdef Py_complex cval

    # A function which uses the above type
    def spam(complex c):
        print("Real:", c.cval.real)
        print("Imag:", c.cval.imag)

Note
Some important things:

1. In this example, ctypedef class has been used. This is because, in the Python header files, the PyComplexObject struct is declared with:

       typedef struct { ... } PyComplexObject;

   At runtime, a check will be performed when importing the Cython C extension module that __builtin__.complex’s PyTypeObject.tp_basicsize matches sizeof(PyComplexObject). This check can fail if the Cython C extension module was compiled with one version of the complexobject.h header but imported into a Python with a changed header. This check can be tweaked by using check_size in the name specification clause.
2. As well as the name of the extension type, the module in which its type object can be found is also specified. See the implicit importing section below.
3. When declaring an external extension type, you don’t declare any methods. Declaration of methods is not required in order to call them, because the calls are Python method calls. Also, as with struct and union, if your extension class declaration is inside a cdef extern from block, you only need to declare those C members which you wish to access.
Name specification clause¶
The part of the class declaration in square brackets is a special feature only available for extern or public extension types. The full form of this clause is:

    [object object_struct_name, type type_object_name, check_size cs_option]

Where:
- object_struct_name is the name to assume for the type’s C struct.
- type_object_name is the name to assume for the type’s statically declared type object.
- cs_option is warn (the default), error, or ignore and is only used for external extension types. If error, the sizeof(object_struct) that was found at compile time must match the type’s runtime PyTypeObject.tp_basicsize exactly, otherwise the module import will fail with an error. If warn or ignore, the object_struct is allowed to be smaller than the type’s PyTypeObject.tp_basicsize, which indicates the runtime type may be part of an updated module, and that the external module’s developers extended the object in a backward-compatible fashion (only adding new fields to the end of the object). If warn, a warning will be emitted in this case.
The clauses can be written in any order.
If the extension type declaration is inside a cdef extern from block, the object clause is required, because Cython must be able to generate code that is compatible with the declarations in the header file. Otherwise, for extern extension types, the object clause is optional.
For public extension types, the object and type clauses are both required, because Cython must be able to generate code that is compatible with external C code.
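
The three check_size options described above can be summarised as a small decision procedure. The following Python function is only a model of the rules as stated in this section, not Cython's actual import-time code. The name check_basicsize, its parameters and the message texts are hypothetical. It also assumes that a runtime object smaller than the compile-time struct is always rejected, which the text implies but does not state outright.

```python
import warnings


def check_basicsize(compiled_size, runtime_size, cs_option="warn"):
    """Model of the size check; returns True when the import would proceed."""
    if cs_option not in ("warn", "error", "ignore"):
        raise ValueError(f"unknown check_size option: {cs_option!r}")

    # Assumption: a runtime object smaller than the struct Cython was
    # compiled against cannot be used safely, whatever the option.
    if runtime_size < compiled_size:
        raise ValueError("runtime object is smaller than the compile-time struct")

    if runtime_size > compiled_size:
        if cs_option == "error":
            # 'error' demands an exact match.
            raise ValueError("runtime object size differs from the compile-time struct")
        if cs_option == "warn":
            # 'warn' accepts a larger runtime object but reports it.
            warnings.warn("runtime object is larger than expected", RuntimeWarning)
        # 'ignore' accepts a larger runtime object silently.

    return True


print(check_basicsize(32, 32, "error"))
print(check_basicsize(32, 48, "ignore"))
```

Both calls above succeed: the first because the sizes match exactly, the second because ignore tolerates a struct that was extended at the end.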
Attribute name matching and aliasing¶
Sometimes the type’s C struct as specified in object_struct_name may use different labels for the fields than those in the PyTypeObject. This can easily happen in hand-coded C extensions where the PyTypeObject_Foo has a getter method, but the name does not match the name in the PyFooObject. In NumPy, for instance, Python-level dtype.itemsize is a getter for the C struct field elsize. Cython supports aliasing field names so that one can write dtype.itemsize in Cython code which will be compiled into direct access of the C struct field, without going through a C-API equivalent of dtype.__getattr__('itemsize').
For example, we may have an extension module foo_extension:

    cdef class Foo:
        cdef public int field0, field1, field2

        def __init__(self, f0, f1, f2):
            self.field0 = f0
            self.field1 = f1
            self.field2 = f2

but a C struct in a file foo_nominal.h:

    typedef struct {
        PyObject_HEAD
        int f0;
        int f1;
        int f2;
    } FooStructNominal;

Note that the struct uses f0, f1, f2 but they are field0, field1, and field2 in Foo. We are given this situation, including a header file with that struct, and we wish to write a function to sum the values. If we write an extension module wrapper:

    cdef extern from "foo_nominal.h":

        ctypedef class foo_extension.Foo [object FooStructNominal]:
            cdef:
                int field0
                int field1
                int field2

    def sum(Foo f):
        return f.field0 + f.field1 + f.field2

then wrapper.sum(f) (where f = foo_extension.Foo(1, 2, 3)) will still use the C-API equivalent of:

    return f.__getattr__('field0') + f.__getattr__('field1') + f.__getattr__('field2')

instead of the desired C equivalent of return f->f0 + f->f1 + f->f2. We can alias the fields by using:

    cdef extern from "foo_nominal.h":

        ctypedef class foo_extension.Foo [object FooStructNominal]:
            cdef:
                int field0 "f0"
                int field1 "f1"
                int field2 "f2"

    def sum(Foo f):
        return f.field0 + f.field1 + f.field2

and now Cython will replace the slow __getattr__ with direct C access to the FooStructNominal fields. This is useful when directly processing Python code. No changes to Python need be made to achieve significant speedups, even though the field names in Python and C are different. Of course, one should make sure the fields are equivalent.
C inline properties¶
Similar to Python property attributes, Cython provides a way to declare C-level properties on external extension types. This is often used to shadow Python attributes through faster C level data access, but can also be used to add certain functionality to existing types when using them from Cython. The declarations must use cdef inline.
For example, the above complex type could also be declared like this:

    cdef extern from "complexobject.h":

        struct Py_complex:
            double real
            double imag

        ctypedef class __builtin__.complex [object PyComplexObject]:
            cdef Py_complex cval

            @property
            cdef inline double real(self):
                return self.cval.real

            @property
            cdef inline double imag(self):
                return self.cval.imag

    def cprint(complex c):
        print(f"{c.real :.4f}{c.imag :+.4f}j")  # uses C calls to the above property methods.

Implicit importing¶
Cython requires you to include a module name in an extern extension class declaration, for example:

    cdef extern class MyModule.Spam:
        ...

The type object will be implicitly imported from the specified module and bound to the corresponding name in this module. In other words, in this example an implicit:

    from MyModule import Spam

statement will be executed at module load time.
The module name can be a dotted name to refer to a module inside a package hierarchy, for example:

    cdef extern class My.Nested.Package.Spam:
        ...

You can also specify an alternative name under which to import the type using an as clause, for example:

    cdef extern class My.Nested.Package.Spam as Yummy:
        ...

which corresponds to the implicit import statement:

    from My.Nested.Package import Spam as Yummy

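
The effect of such an implicit import can be imitated in plain Python. The sketch below is not what Cython generates. The implicit_import helper is a hypothetical name, and collections.abc.Mapping merely stands in for My.Nested.Package.Spam so that the example runs with the standard library alone.

```python
import importlib


def implicit_import(module_name, type_name):
    """Look up a type object in a (possibly dotted) module, as an import would."""
    module = importlib.import_module(module_name)
    return getattr(module, type_name)


# Roughly what `from collections.abc import Mapping as Yummy` does.
Yummy = implicit_import("collections.abc", "Mapping")
print(Yummy.__module__, Yummy.__name__)
```

The name on the left-hand side plays the role of the as clause: it is the local name bound to the type object, independent of the name the type has in its home module.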
Type names vs. constructor names¶
Inside a Cython module, the name of an extension type serves two distinct purposes. When used in an expression, it refers to a module-level global variable holding the type’s constructor (i.e. its type-object). However, it can also be used as a C type name to declare variables, arguments and return values of that type.

When you declare:

    cdef extern class MyModule.Spam:
        ...

the name Spam serves both these roles. There may be other names by which you can refer to the constructor, but only Spam can be used as a type name. For example, if you were to explicitly import MyModule, you could use MyModule.Spam() to create a Spam instance, but you wouldn’t be able to use MyModule.Spam as a type name.

When an as clause is used, the name specified in the as clause also takes over both roles. So if you declare:

    cdef extern class MyModule.Spam as Yummy:
        ...

then Yummy becomes both the type name and a name for the constructor. Again, there are other ways that you could get hold of the constructor, but only Yummy is usable as a type name.
Public extension types¶
An extension type can be declared public, in which case a .h file is generated containing declarations for its object struct and type object. By including the .h file in external C code that you write, that code can access the attributes of the extension type.
Dataclass extension types¶
Cython supports extension types that behave like the dataclasses defined in the Python 3.7+ standard library. The main benefit of using a dataclass is that it can auto-generate simple __init__, __repr__ and comparison functions. The Cython implementation behaves as much like the Python standard library implementation as possible and therefore the documentation here only briefly outlines the differences - if you plan on using them then please read the documentation for the standard library module.
Dataclasses can be declared using the @dataclasses.dataclass decorator on a Cython extension type (types marked cdef or created with the cython.cclass decorator). Alternatively the @cython.dataclasses.dataclass decorator can be applied to any class to both turn it into an extension type and a dataclass. If you need to define special properties on a field then use dataclasses.field (cython.dataclasses.field will work too).

Pure Python:

    import cython
    try:
        import typing
        import dataclasses
    except ImportError:
        pass  # The modules don't actually have to exist for Cython to use them as annotations

    @dataclasses.dataclass
    @cython.cclass
    class MyDataclass:
        # fields can be declared using annotations
        a: cython.int = 0
        b: cython.double = dataclasses.field(default_factory=lambda: 10, repr=False)
        c: str = 'hello'

        # typing.InitVar and typing.ClassVar also work
        d: dataclasses.InitVar[cython.double] = 5
        e: typing.ClassVar[list] = []


Cython:

    cimport cython
    try:
        import typing
        import dataclasses
    except ImportError:
        pass  # The modules don't actually have to exist for Cython to use them as annotations

    @dataclasses.dataclass
    cdef class MyDataclass:
        # fields can be declared using annotations
        a: cython.int = 0
        b: cython.double = dataclasses.field(default_factory=lambda: 10, repr=False)

        # fields can also be declared using `cdef`:
        cdef str c   # add `readonly` or `public` if `c` needs to be accessible from Python
        c = "hello"  # assignment of default value on a separate line
        # note: `@dataclass(frozen)` is not enforced on `cdef` attributes

        # typing.InitVar and typing.ClassVar also work
        d: dataclasses.InitVar[cython.double] = 5
        e: typing.ClassVar[list] = []

You may use C-level types such as structs, pointers, or C++ classes. However, you may find these types are not compatible with the auto-generated special methods - for example if they cannot be converted from a Python type they cannot be passed to a constructor, and so you must use a default_factory to initialize them. Like with the Python implementation, you can also control which special functions an attribute is used in using field().
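
Since the Cython implementation aims to follow the standard library, the behaviour of field(), InitVar and ClassVar can be explored with the standard library alone. The sketch below uses plain Python types (int and float in place of cython.int and cython.double) and adds a __post_init__ method, which is not part of the example above, to show where an InitVar value ends up. Whether every detail carries over unchanged to a compiled extension type is an assumption.

```python
import dataclasses
import typing


@dataclasses.dataclass
class MyDataclass:
    a: int = 0
    # default_factory builds the default per instance; repr=False hides the field
    b: float = dataclasses.field(default_factory=lambda: 10.0, repr=False)
    c: str = "hello"
    # InitVar: accepted by __init__ and passed to __post_init__, but not stored
    d: dataclasses.InitVar[float] = 5.0
    # ClassVar: shared class attribute, not a field at all
    e: typing.ClassVar[list] = []

    def __post_init__(self, d):
        self.b += d


obj = MyDataclass(a=1)
print(repr(obj))                   # b is excluded from the repr, d and e are not fields
print(obj.b)                       # default_factory value plus the InitVar default
print(obj == MyDataclass(a=1))     # the auto-generated comparison uses a, b and c
```

Setting repr=False or compare=False on a field is the usual way to keep a member out of the generated methods, which is the same control the paragraph above describes for C-level types.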