Porting Extension Modules to Python 3
- author: Benjamin Peterson
Abstract
Although changing the C-API was not one of Python 3’s objectives, the many Python-level changes made leaving Python 2’s API intact impossible. In fact, some changes such as int() and long() unification are more obvious on the C level. This document endeavors to document incompatibilities and how they can be worked around.
Conditional compilation
The easiest way to compile only some code for Python 3 is to check if PY_MAJOR_VERSION is greater than or equal to 3.

    #if PY_MAJOR_VERSION >= 3
    #define IS_PY3K
    #endif

API functions that are not present can be aliased to their equivalents within conditional blocks.
Changes to Object APIs
Python 3 merged together some types with similar functions while cleanly separating others.
str/unicode Unification
Python 3’s str() type is equivalent to Python 2’s unicode(); the C functions are called PyUnicode_* for both. The old 8-bit string type has become bytes(), with C functions called PyBytes_*. Python 2.6 and later provide a compatibility header, bytesobject.h, mapping PyBytes names to PyString ones. For best compatibility with Python 3, PyUnicode should be used for textual data and PyBytes for binary data. It’s also important to remember that PyBytes and PyUnicode in Python 3 are not interchangeable like PyString and PyUnicode are in Python 2. The following example shows best practices with regards to PyUnicode, PyString, and PyBytes.

    #include "stdlib.h"
    #include "Python.h"
    #include "bytesobject.h"

    /* text example */
    static PyObject *
    say_hello(PyObject *self, PyObject *args) {
        PyObject *name, *result;

        if (!PyArg_ParseTuple(args, "U:say_hello", &name))
            return NULL;

        result = PyUnicode_FromFormat("Hello, %S!", name);
        return result;
    }

    /* just a forward */
    static char * do_encode(PyObject *);

    /* bytes example */
    static PyObject *
    encode_object(PyObject *self, PyObject *args) {
        char *encoded;
        PyObject *result, *myobj;

        if (!PyArg_ParseTuple(args, "O:encode_object", &myobj))
            return NULL;

        encoded = do_encode(myobj);
        if (encoded == NULL)
            return NULL;
        result = PyBytes_FromString(encoded);
        free(encoded);
        return result;
    }

long/int Unification
Python 3 has only one integer type, int(). It actually corresponds to Python 2’s long() type; the int() type used in Python 2 was removed. In the C-API, PyInt_* functions are replaced by their PyLong_* equivalents.
Module initialization and state
Python 3 has a revamped extension module initialization system. (See PEP 3121.) Instead of being stored in globals, module state should be stored in an interpreter-specific structure. Creating modules that act correctly in both Python 2 and Python 3 is tricky. The following simple example demonstrates how.

    #include "Python.h"

    struct module_state {
        PyObject *error;
    };

    #if PY_MAJOR_VERSION >= 3
    #define GETSTATE(m) ((struct module_state*)PyModule_GetState(m))
    #else
    #define GETSTATE(m) (&_state)
    static struct module_state _state;
    #endif

    static PyObject *
    error_out(PyObject *m) {
        struct module_state *st = GETSTATE(m);
        PyErr_SetString(st->error, "something bad happened");
        return NULL;
    }

    static PyMethodDef myextension_methods[] = {
        {"error_out", (PyCFunction)error_out, METH_NOARGS, NULL},
        {NULL, NULL}
    };

    #if PY_MAJOR_VERSION >= 3

    static int myextension_traverse(PyObject *m, visitproc visit, void *arg) {
        Py_VISIT(GETSTATE(m)->error);
        return 0;
    }

    static int myextension_clear(PyObject *m) {
        Py_CLEAR(GETSTATE(m)->error);
        return 0;
    }

    static struct PyModuleDef moduledef = {
        PyModuleDef_HEAD_INIT,
        "myextension",
        NULL,
        sizeof(struct module_state),
        myextension_methods,
        NULL,
        myextension_traverse,
        myextension_clear,
        NULL
    };

    #define INITERROR return NULL

    PyMODINIT_FUNC
    PyInit_myextension(void)

    #else
    #define INITERROR return

    void
    initmyextension(void)
    #endif
    {
    #if PY_MAJOR_VERSION >= 3
        PyObject *module = PyModule_Create(&moduledef);
    #else
        PyObject *module = Py_InitModule("myextension", myextension_methods);
    #endif

        if (module == NULL)
            INITERROR;
        struct module_state *st = GETSTATE(module);

        st->error = PyErr_NewException("myextension.Error", NULL, NULL);
        if (st->error == NULL) {
            Py_DECREF(module);
            INITERROR;
        }

    #if PY_MAJOR_VERSION >= 3
        return module;
    #endif
    }

CObject replaced with Capsule
The Capsule object was introduced in Python 3.1 and 2.7 to replace CObject. CObjects were useful, but the CObject API was problematic: it didn’t permit distinguishing between valid CObjects, which allowed mismatched CObjects to crash the interpreter, and some of its APIs relied on undefined behavior in C. (For further reading on the rationale behind Capsules, please see bpo-5630.)
If you’re currently using CObjects, and you want to migrate to 3.1 or newer, you’ll need to switch to Capsules. CObject was deprecated in 3.1 and 2.7 and completely removed in Python 3.2. If you only support 2.7, or 3.1 and above, you can simply switch to Capsule. If you need to support Python 3.0, or versions of Python earlier than 2.7, you’ll have to support both CObjects and Capsules. (Note that Python 3.0 is no longer supported, and it is not recommended for production use.)
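The support matrix above can be expressed as a small predicate over a `PY_VERSION_HEX`-style value. The sketch below is illustrative only: the function `needs_cobject_fallback` and the three constant names are hypothetical, and the thresholds simply restate the version boundaries given in the text (Capsules exist from 2.7 on the 2.x line and from 3.1 on the 3.x line).

```c
/* Version boundaries encoded the same way as PY_VERSION_HEX
 * (major in the top byte, minor in the next byte). */
#define FIRST_2X_WITH_CAPSULES 0x02070000UL  /* Python 2.7 */
#define FIRST_3X_RELEASE       0x03000000UL  /* Python 3.0 */
#define FIRST_3X_WITH_CAPSULES 0x03010000UL  /* Python 3.1 */

/* Return 1 when the given version has no Capsule API, so CObjects
 * must be used instead; return 0 when Capsules are available. */
static int
needs_cobject_fallback(unsigned long version_hex)
{
    if (version_hex < FIRST_2X_WITH_CAPSULES)
        return 1;   /* anything before 2.7 */
    if (version_hex >= FIRST_3X_RELEASE &&
        version_hex < FIRST_3X_WITH_CAPSULES)
        return 1;   /* the 3.0 series */
    return 0;
}
```

In a real extension the same test would normally be made at compile time with `#if` on `PY_VERSION_HEX` rather than at run time.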
The following example header file capsulethunk.h may solve the problem for you. Simply write your code against the Capsule API and include this header file after Python.h. Your code will automatically use Capsules in versions of Python with Capsules, and switch to CObjects when Capsules are unavailable.
capsulethunk.h simulates Capsules using CObjects. However, CObject provides no place to store the capsule’s “name”. As a result the simulated Capsule objects created by capsulethunk.h behave slightly differently from real Capsules. Specifically:
- The name parameter passed in to PyCapsule_New() is ignored.
- The name parameter passed in to PyCapsule_IsValid() and PyCapsule_GetPointer() is ignored, and no error checking of the name is performed.
- PyCapsule_GetName() always returns NULL.
- PyCapsule_SetName() always raises an exception and returns failure. (Since there’s no way to store a name in a CObject, noisy failure of PyCapsule_SetName() was deemed preferable to silent failure here. If this is inconvenient, feel free to modify your local copy as you see fit.)
You can find capsulethunk.h in the Python source distribution as Doc/includes/capsulethunk.h. We also include it here for your convenience:

    #ifndef __CAPSULETHUNK_H
    #define __CAPSULETHUNK_H

    #if (    (PY_VERSION_HEX <  0x02070000) \
         || ((PY_VERSION_HEX >= 0x03000000) \
          && (PY_VERSION_HEX <  0x03010000)) )

    #define __PyCapsule_GetField(capsule, field, default_value) \
        ( PyCapsule_CheckExact(capsule) \
            ? (((PyCObject *)capsule)->field) \
            : (default_value) \
        ) \

    #define __PyCapsule_SetField(capsule, field, value) \
        ( PyCapsule_CheckExact(capsule) \
            ? (((PyCObject *)capsule)->field = value), 1 \
            : 0 \
        ) \

    #define PyCapsule_Type PyCObject_Type

    #define PyCapsule_CheckExact(capsule) (PyCObject_Check(capsule))
    #define PyCapsule_IsValid(capsule, name) (PyCObject_Check(capsule))

    #define PyCapsule_New(pointer, name, destructor) \
        (PyCObject_FromVoidPtr(pointer, destructor))

    #define PyCapsule_GetPointer(capsule, name) \
        (PyCObject_AsVoidPtr(capsule))

    /* Don't call PyCObject_SetPointer here, it fails if there's a destructor */
    #define PyCapsule_SetPointer(capsule, pointer) \
        __PyCapsule_SetField(capsule, cobject, pointer)

    #define PyCapsule_GetDestructor(capsule) \
        __PyCapsule_GetField(capsule, destructor)

    #define PyCapsule_SetDestructor(capsule, dtor) \
        __PyCapsule_SetField(capsule, destructor, dtor)

    /*
     * Sorry, there's simply no place
     * to store a Capsule "name" in a CObject.
     */
    #define PyCapsule_GetName(capsule) NULL

    static int
    PyCapsule_SetName(PyObject *capsule, const char *unused)
    {
        unused = unused;
        PyErr_SetString(PyExc_NotImplementedError,
            "can't use PyCapsule_SetName with CObjects");
        return 1;
    }

    #define PyCapsule_GetContext(capsule) \
        __PyCapsule_GetField(capsule, descr)

    #define PyCapsule_SetContext(capsule, context) \
        __PyCapsule_SetField(capsule, descr, context)

    static void *
    PyCapsule_Import(const char *name, int no_block)
    {
        PyObject *object = NULL;
        void *return_value = NULL;
        char *trace;
        size_t name_length = (strlen(name) + 1) * sizeof(char);
        char *name_dup = (char *)PyMem_MALLOC(name_length);

        if (!name_dup) {
            return NULL;
        }

        memcpy(name_dup, name, name_length);

        trace = name_dup;
        while (trace) {
            char *dot = strchr(trace, '.');
            if (dot) {
                *dot++ = '\0';
            }

            if (object == NULL) {
                if (no_block) {
                    object = PyImport_ImportModuleNoBlock(trace);
                } else {
                    object = PyImport_ImportModule(trace);
                    if (!object) {
                        PyErr_Format(PyExc_ImportError,
                            "PyCapsule_Import could not "
                            "import module \"%s\"", trace);
                    }
                }
            } else {
                PyObject *object2 = PyObject_GetAttrString(object, trace);
                Py_DECREF(object);
                object = object2;
            }
            if (!object) {
                goto EXIT;
            }

            trace = dot;
        }

        if (PyCObject_Check(object)) {
            PyCObject *cobject = (PyCObject *)object;
            return_value = cobject->cobject;
        } else {
            PyErr_Format(PyExc_AttributeError,
                "PyCapsule_Import \"%s\" is not valid",
                name);
        }

    EXIT:
        Py_XDECREF(object);
        if (name_dup) {
            PyMem_FREE(name_dup);
        }
        return return_value;
    }

    #endif /* #if PY_VERSION_HEX < 0x02070000 */

    #endif /* __CAPSULETHUNK_H */

Other options
If you are writing a new extension module, you might consider Cython. It translates a Python-like language to C. The extension modules it creates are compatible with Python 3 and Python 2.
