Introduction

The Application Programmer’s Interface to Python gives C and C++ programmers access to the Python interpreter at a variety of levels. The API is equally usable from C++, but for brevity it is generally referred to as the Python/C API. There are two fundamentally different reasons for using the Python/C API. The first reason is to write extension modules for specific purposes; these are C modules that extend the Python interpreter. This is probably the most common use. The second reason is to use Python as a component in a larger application; this technique is generally referred to as embedding Python in an application.

Writing an extension module is a relatively well-understood process, where a “cookbook” approach works well. There are several tools that automate the process to some extent. While people have embedded Python in other applications since its early existence, the process of embedding Python is less straightforward than writing an extension.

Many API functions are useful independent of whether you’re embedding or extending Python; moreover, most applications that embed Python will need to provide a custom extension as well, so it’s probably a good idea to become familiar with writing an extension before attempting to embed Python in a real application.

Coding standards

If you’re writing C code for inclusion in CPython, you must follow the guidelines and standards defined in PEP 7. These guidelines apply regardless of the version of Python you are contributing to. Following these conventions is not necessary for your own third party extension modules, unless you eventually expect to contribute them to Python.

Include Files

All function, type and macro definitions needed to use the Python/C API are included in your code by the following lines:

    #define PY_SSIZE_T_CLEAN
    #include <Python.h>

This implies inclusion of the following standard headers: <stdio.h>, <string.h>, <errno.h>, <limits.h>, <assert.h> and <stdlib.h> (if available).

Note

Since Python may define some pre-processor definitions which affect the standard headers on some systems, you must include Python.h before any standard headers are included.

It is recommended to always define PY_SSIZE_T_CLEAN before including Python.h. See Parsing arguments and building values for a description of this macro.

All user visible names defined by Python.h (except those defined by the included standard headers) have one of the prefixes Py or _Py. Names beginning with _Py are for internal use by the Python implementation and should not be used by extension writers. Structure member names do not have a reserved prefix.

Note

User code should never define names that begin with Py or _Py. This confuses the reader, and jeopardizes the portability of the user code to future Python versions, which may define additional names beginning with one of these prefixes.

The header files are typically installed with Python. On Unix, these are located in the directories prefix/include/pythonversion/ and exec_prefix/include/pythonversion/, where prefix and exec_prefix are defined by the corresponding parameters to Python’s configure script and version is '%d.%d' % sys.version_info[:2]. On Windows, the headers are installed in prefix/include, where prefix is the installation directory specified to the installer.

To include the headers, place both directories (if different) on your compiler’s search path for includes. Do not place the parent directories on the search path and then use #include <pythonX.Y/Python.h>; this will break on multi-platform builds since the platform independent headers under prefix include the platform specific headers from exec_prefix.

C++ users should note that although the API is defined entirely using C, the header files properly declare the entry points to be extern "C". As a result, there is no need to do anything special to use the API from C++.

Useful macros

Several useful macros are defined in the Python header files. Many are defined closer to where they are useful (e.g. Py_RETURN_NONE). Others of a more general utility are defined here. This is not necessarily a complete listing.

PyMODINIT_FUNC

Declare an extension module PyInit initialization function. The function return type is PyObject*. The macro declares any special linkage declarations required by the platform, and for C++ declares the function as extern "C".

The initialization function must be named PyInit_name, where name is the name of the module, and should be the only non-static item defined in the module file. Example:

    static struct PyModuleDef spam_module = {
        PyModuleDef_HEAD_INIT,
        .m_name = "spam",
        ...
    };

    PyMODINIT_FUNC
    PyInit_spam(void)
    {
        return PyModule_Create(&spam_module);
    }

Py_ABS(x)

Return the absolute value of x.

Added in version 3.3.

Py_ALWAYS_INLINE

Ask the compiler to always inline a static inline function. The compiler can ignore it and decide to not inline the function.

It can be used to inline performance critical static inline functions when building Python in debug mode with function inlining disabled. For example, MSC disables function inlining when building in debug mode.

Blindly marking a static inline function with Py_ALWAYS_INLINE can result in worse performance (due to increased code size, for example). The compiler is usually smarter than the developer for the cost/benefit analysis.

If Python is built in debug mode (if the Py_DEBUG macro is defined), the Py_ALWAYS_INLINE macro does nothing.

It must be specified before the function return type. Usage:

    static inline Py_ALWAYS_INLINE int random(void) { return 4; }

Added in version 3.11.

Py_CHARMASK(c)

Argument must be a character or an integer in the range [-128, 127] or [0, 255]. This macro returns c cast to an unsigned char.

Py_DEPRECATED(version)

Use this for deprecated declarations. The macro must be placed before the symbol name.

Example:

    Py_DEPRECATED(3.8) PyAPI_FUNC(int) Py_OldFunction(void);

Changed in version 3.8: MSVC support was added.

Py_GETENV(s)

Like getenv(s), but returns NULL if -E was passed on the command line (see PyConfig.use_environment).

Py_MAX(x,y)

Return the maximum value between x and y.

Added in version 3.3.

Py_MEMBER_SIZE(type,member)

Return the size of a structure (type) member in bytes.

Added in version 3.6.

Py_MIN(x,y)

Return the minimum value between x and y.

Added in version 3.3.

Py_NO_INLINE

Disable inlining on a function. For example, it reduces the C stack consumption: useful on LTO+PGO builds which heavily inline code (see bpo-33720).

Usage:

    Py_NO_INLINE static int random(void) { return 4; }

Added in version 3.11.

Py_STRINGIFY(x)

Convert x to a C string. E.g. Py_STRINGIFY(123) returns "123".

Added in version 3.4.

Py_UNREACHABLE()

Use this when you have a code path that cannot be reached by design. For example, in the default: clause in a switch statement for which all possible values are covered in case statements. Use this in places where you might be tempted to put an assert(0) or abort() call.

In release mode, the macro helps the compiler to optimize the code, and avoids a warning about unreachable code. For example, the macro is implemented with __builtin_unreachable() on GCC in release mode.

A use for Py_UNREACHABLE() is following a call to a function that never returns but that is not declared _Py_NO_RETURN.

If a code path is very unlikely but can be reached under exceptional circumstances, this macro must not be used. For example, under low memory conditions or if a system call returns a value out of the expected range. In this case, it’s better to report the error to the caller. If the error cannot be reported to the caller, Py_FatalError() can be used.

Added in version 3.7.

Py_UNUSED(arg)

Use this for unused arguments in a function definition to silence compiler warnings. Example: int func(int a, int Py_UNUSED(b)) { return a; }.

Added in version 3.4.

PyDoc_STRVAR(name,str)

Creates a variable with name name that can be used in docstrings. If Python is built without docstrings, the value will be empty.

Use PyDoc_STRVAR for docstrings to support building Python without docstrings, as specified in PEP 7.

Example:

    PyDoc_STRVAR(pop_doc, "Remove and return the rightmost element.");

    static PyMethodDef deque_methods[] = {
        // ...
        {"pop", (PyCFunction)deque_pop, METH_NOARGS, pop_doc},
        // ...
    };

PyDoc_STR(str)

Creates a docstring for the given input string or an empty string if docstrings are disabled.

Use PyDoc_STR in specifying docstrings to support building Python without docstrings, as specified in PEP 7.

Example:

    static PyMethodDef pysqlite_row_methods[] = {
        {"keys", (PyCFunction)pysqlite_row_keys, METH_NOARGS,
            PyDoc_STR("Returns the keys of the row.")},
        {NULL, NULL}
    };

Objects, Types and Reference Counts

Most Python/C API functions have one or more arguments as well as a return value of type PyObject*. This type is a pointer to an opaque data type representing an arbitrary Python object. Since all Python object types are treated the same way by the Python language in most situations (e.g., assignments, scope rules, and argument passing), it is only fitting that they should be represented by a single C type. Almost all Python objects live on the heap: you never declare an automatic or static variable of type PyObject; only pointer variables of type PyObject* can be declared. The sole exception is the type objects; since these must never be deallocated, they are typically static PyTypeObject objects.

All Python objects (even Python integers) have a type and a reference count. An object’s type determines what kind of object it is (e.g., an integer, a list, or a user-defined function; there are many more as explained in The standard type hierarchy). For each of the well-known types there is a macro to check whether an object is of that type; for instance, PyList_Check(a) is true if (and only if) the object pointed to by a is a Python list.

Reference Counts

The reference count is important because today’s computers have a finite (and often severely limited) memory size; it counts how many different places there are that have a strong reference to an object. Such a place could be another object, or a global (or static) C variable, or a local variable in some C function. When the last strong reference to an object is released (i.e. its reference count becomes zero), the object is deallocated. If it contains references to other objects, those references are released. Those other objects may be deallocated in turn, if there are no more references to them, and so on. (There’s an obvious problem with objects that reference each other here; for now, the solution is “don’t do that.”)

Reference counts are always manipulated explicitly. The normal way is to use the macro Py_INCREF() to take a new reference to an object (i.e. increment its reference count by one), and Py_DECREF() to release that reference (i.e. decrement the reference count by one). The Py_DECREF() macro is considerably more complex than the incref one, since it must check whether the reference count becomes zero and then cause the object’s deallocator to be called. The deallocator is a function pointer contained in the object’s type structure. The type-specific deallocator takes care of releasing references for other objects contained in the object if this is a compound object type, such as a list, as well as performing any additional finalization that’s needed. There’s no chance that the reference count can overflow; at least as many bits are used to hold the reference count as there are distinct memory locations in virtual memory (assuming sizeof(Py_ssize_t) >= sizeof(void*)). Thus, the reference count increment is a simple operation.

It is not necessary to hold a strong reference (i.e. increment the reference count) for every local variable that contains a pointer to an object. In theory, the object’s reference count goes up by one when the variable is made to point to it and it goes down by one when the variable goes out of scope. However, these two cancel each other out, so at the end the reference count hasn’t changed. The only real reason to use the reference count is to prevent the object from being deallocated as long as our variable is pointing to it. If we know that there is at least one other reference to the object that lives at least as long as our variable, there is no need to take a new strong reference (i.e. increment the reference count) temporarily. An important situation where this arises is in objects that are passed as arguments to C functions in an extension module that are called from Python; the call mechanism guarantees to hold a reference to every argument for the duration of the call.

However, a common pitfall is to extract an object from a list and hold on to it for a while without taking a new reference. Some other operation might conceivably remove the object from the list, releasing that reference, and possibly deallocating it. The real danger is that innocent-looking operations may invoke arbitrary Python code which could do this; there is a code path which allows control to flow back to the user from a Py_DECREF(), so almost any operation is potentially dangerous.

A safe approach is to always use the generic operations (functions whose name begins with PyObject_, PyNumber_, PySequence_ or PyMapping_). These operations always create a new strong reference (i.e. increment the reference count) of the object they return. This leaves the caller with the responsibility to call Py_DECREF() when they are done with the result; this soon becomes second nature.

Reference Count Details

The reference count behavior of functions in the Python/C API is best explained in terms of ownership of references. Ownership pertains to references, never to objects (objects are not owned: they are always shared). “Owning a reference” means being responsible for calling Py_DECREF on it when the reference is no longer needed. Ownership can also be transferred, meaning that the code that receives ownership of the reference then becomes responsible for eventually releasing it by calling Py_DECREF() or Py_XDECREF() when it’s no longer needed, or passing on this responsibility (usually to its caller). When a function passes ownership of a reference on to its caller, the caller is said to receive a new reference. When no ownership is transferred, the caller is said to borrow the reference. Nothing needs to be done for a borrowed reference.

Conversely, when a calling function passes in a reference to an object, there are two possibilities: the function steals a reference to the object, or it does not. Stealing a reference means that when you pass a reference to a function, that function assumes that it now owns that reference, and you are not responsible for it any longer.

Few functions steal references; the two notable exceptions are PyList_SetItem() and PyTuple_SetItem(), which steal a reference to the item (but not to the tuple or list into which the item is put!). These functions were designed to steal a reference because of a common idiom for populating a tuple or list with newly created objects; for example, the code to create the tuple (1, 2, "three") could look like this (forgetting about error handling for the moment; a better way to code this is shown below):

    PyObject *t;

    t = PyTuple_New(3);
    PyTuple_SetItem(t, 0, PyLong_FromLong(1L));
    PyTuple_SetItem(t, 1, PyLong_FromLong(2L));
    PyTuple_SetItem(t, 2, PyUnicode_FromString("three"));

Here, PyLong_FromLong() returns a new reference which is immediately stolen by PyTuple_SetItem(). When you want to keep using an object although the reference to it will be stolen, use Py_INCREF() to grab another reference before calling the reference-stealing function.

Incidentally, PyTuple_SetItem() is the only way to set tuple items; PySequence_SetItem() and PyObject_SetItem() refuse to do this since tuples are an immutable data type. You should only use PyTuple_SetItem() for tuples that you are creating yourself.

Equivalent code for populating a list can be written using PyList_New() and PyList_SetItem().

However, in practice, you will rarely use these ways of creating and populating a tuple or list. There’s a generic function, Py_BuildValue(), that can create most common objects from C values, directed by a format string. For example, the above two blocks of code could be replaced by the following (which also takes care of the error checking):

    PyObject *tuple, *list;

    tuple = Py_BuildValue("(iis)", 1, 2, "three");
    list = Py_BuildValue("[iis]", 1, 2, "three");

It is much more common to use PyObject_SetItem() and friends with items whose references you are only borrowing, like arguments that were passed in to the function you are writing. In that case, their behaviour regarding references is much saner, since you don’t have to take a new reference just so you can give that reference away (“have it be stolen”). For example, this function sets all items of a list (actually, any mutable sequence) to a given item:

    int
    set_all(PyObject *target, PyObject *item)
    {
        Py_ssize_t i, n;

        n = PyObject_Length(target);
        if (n < 0)
            return -1;
        for (i = 0; i < n; i++) {
            PyObject *index = PyLong_FromSsize_t(i);
            if (!index)
                return -1;
            if (PyObject_SetItem(target, index, item) < 0) {
                Py_DECREF(index);
                return -1;
            }
            Py_DECREF(index);
        }
        return 0;
    }

The situation is slightly different for function return values. While passing a reference to most functions does not change your ownership responsibilities for that reference, many functions that return a reference to an object give you ownership of the reference. The reason is simple: in many cases, the returned object is created on the fly, and the reference you get is the only reference to the object. Therefore, the generic functions that return object references, like PyObject_GetItem() and PySequence_GetItem(), always return a new reference (the caller becomes the owner of the reference).

It is important to realize that whether you own a reference returned by a function depends on which function you call only: the plumage (the type of the object passed as an argument to the function) doesn’t enter into it! Thus, if you extract an item from a list using PyList_GetItem(), you don’t own the reference, but if you obtain the same item from the same list using PySequence_GetItem() (which happens to take exactly the same arguments), you do own a reference to the returned object.

Here is an example of how you could write a function that computes the sum of the items in a list of integers; once using PyList_GetItem(), and once using PySequence_GetItem().

    long
    sum_list(PyObject *list)
    {
        Py_ssize_t i, n;
        long total = 0, value;
        PyObject *item;

        n = PyList_Size(list);
        if (n < 0)
            return -1; /* Not a list */
        for (i = 0; i < n; i++) {
            item = PyList_GetItem(list, i); /* Can't fail */
            if (!PyLong_Check(item)) continue; /* Skip non-integers */
            value = PyLong_AsLong(item);
            if (value == -1 && PyErr_Occurred())
                /* Integer too big to fit in a C long, bail out */
                return -1;
            total += value;
        }
        return total;
    }

    long
    sum_sequence(PyObject *sequence)
    {
        Py_ssize_t i, n;
        long total = 0, value;
        PyObject *item;

        n = PySequence_Length(sequence);
        if (n < 0)
            return -1; /* Has no length */
        for (i = 0; i < n; i++) {
            item = PySequence_GetItem(sequence, i);
            if (item == NULL)
                return -1; /* Not a sequence, or other failure */
            if (PyLong_Check(item)) {
                value = PyLong_AsLong(item);
                Py_DECREF(item);
                if (value == -1 && PyErr_Occurred())
                    /* Integer too big to fit in a C long, bail out */
                    return -1;
                total += value;
            }
            else {
                Py_DECREF(item); /* Discard reference ownership */
            }
        }
        return total;
    }

Types

There are few other data types that play a significant role in the Python/C API; most are simple C types such as int, long, double and char*. A few structure types are used to describe static tables used to list the functions exported by a module or the data attributes of a new object type, and another is used to describe the value of a complex number. These will be discussed together with the functions that use them.

type Py_ssize_t
Part of the Stable ABI.

A signed integral type such that sizeof(Py_ssize_t) == sizeof(size_t). C99 doesn’t define such a thing directly (size_t is an unsigned integral type). See PEP 353 for details. PY_SSIZE_T_MAX is the largest positive value of type Py_ssize_t.

Exceptions

The Python programmer only needs to deal with exceptions if specific error handling is required; unhandled exceptions are automatically propagated to the caller, then to the caller’s caller, and so on, until they reach the top-level interpreter, where they are reported to the user accompanied by a stack traceback.

For C programmers, however, error checking always has to be explicit. All functions in the Python/C API can raise exceptions, unless an explicit claim is made otherwise in a function’s documentation. In general, when a function encounters an error, it sets an exception, discards any object references that it owns, and returns an error indicator. If not documented otherwise, this indicator is either NULL or -1, depending on the function’s return type. A few functions return a Boolean true/false result, with false indicating an error. Very few functions return no explicit error indicator or have an ambiguous return value, and require explicit testing for errors with PyErr_Occurred(). These exceptions are always explicitly documented.

Exception state is maintained in per-thread storage (this is equivalent to using global storage in an unthreaded application). A thread can be in one of two states: an exception has occurred, or not. The function PyErr_Occurred() can be used to check for this: it returns a borrowed reference to the exception type object when an exception has occurred, and NULL otherwise. There are a number of functions to set the exception state: PyErr_SetString() is the most common (though not the most general) function to set the exception state, and PyErr_Clear() clears the exception state.

The full exception state consists of three objects (all of which can be NULL): the exception type, the corresponding exception value, and the traceback. These have the same meanings as the Python result of sys.exc_info(); however, they are not the same: the Python objects represent the last exception being handled by a Python try ... except statement, while the C level exception state only exists while an exception is being passed on between C functions until it reaches the Python bytecode interpreter’s main loop, which takes care of transferring it to sys.exc_info() and friends.

Note that starting with Python 1.5, the preferred, thread-safe way to access the exception state from Python code is to call the function sys.exc_info(), which returns the per-thread exception state for Python code. Also, the semantics of both ways to access the exception state have changed so that a function which catches an exception will save and restore its thread’s exception state so as to preserve the exception state of its caller. This prevents common bugs in exception handling code caused by an innocent-looking function overwriting the exception being handled; it also reduces the often unwanted lifetime extension for objects that are referenced by the stack frames in the traceback.

As a general principle, a function that calls another function to perform some task should check whether the called function raised an exception, and if so, pass the exception state on to its caller. It should discard any object references that it owns, and return an error indicator, but it should not set another exception, since that would overwrite the exception that was just raised and lose important information about the exact cause of the error.

A simple example of detecting exceptions and passing them on is shown in the sum_sequence() example above. It so happens that this example doesn’t need to clean up any owned references when it detects an error. The following example function shows some error cleanup. First, to remind you why you like Python, we show the equivalent Python code:

    def incr_item(dict, key):
        try:
            item = dict[key]
        except KeyError:
            item = 0
        dict[key] = item + 1

Here is the corresponding C code, in all its glory:

    int
    incr_item(PyObject *dict, PyObject *key)
    {
        /* Objects all initialized to NULL for Py_XDECREF */
        PyObject *item = NULL, *const_one = NULL, *incremented_item = NULL;
        int rv = -1; /* Return value initialized to -1 (failure) */

        item = PyObject_GetItem(dict, key);
        if (item == NULL) {
            /* Handle KeyError only: */
            if (!PyErr_ExceptionMatches(PyExc_KeyError))
                goto error;

            /* Clear the error and use zero: */
            PyErr_Clear();
            item = PyLong_FromLong(0L);
            if (item == NULL)
                goto error;
        }
        const_one = PyLong_FromLong(1L);
        if (const_one == NULL)
            goto error;

        incremented_item = PyNumber_Add(item, const_one);
        if (incremented_item == NULL)
            goto error;

        if (PyObject_SetItem(dict, key, incremented_item) < 0)
            goto error;
        rv = 0; /* Success */
        /* Continue with cleanup code */

     error:
        /* Cleanup code, shared by success and failure path */

        /* Use Py_XDECREF() to ignore NULL references */
        Py_XDECREF(item);
        Py_XDECREF(const_one);
        Py_XDECREF(incremented_item);

        return rv; /* -1 for error, 0 for success */
    }

This example represents an endorsed use of the goto statement in C! It illustrates the use of PyErr_ExceptionMatches() and PyErr_Clear() to handle specific exceptions, and the use of Py_XDECREF() to dispose of owned references that may be NULL (note the 'X' in the name; Py_DECREF() would crash when confronted with a NULL reference). It is important that the variables used to hold owned references are initialized to NULL for this to work; likewise, the proposed return value is initialized to -1 (failure) and only set to success after the final call made is successful.

Embedding Python

The one important task that only embedders (as opposed to extension writers) of the Python interpreter have to worry about is the initialization, and possibly the finalization, of the Python interpreter. Most functionality of the interpreter can only be used after the interpreter has been initialized.

The basic initialization function is Py_Initialize(). This initializes the table of loaded modules, and creates the fundamental modules builtins, __main__, and sys. It also initializes the module search path (sys.path).

Py_Initialize() does not set the “script argument list” (sys.argv). If this variable is needed by Python code that will be executed later, PyConfig.argv and PyConfig.parse_argv must be set: see Python Initialization Configuration.

On most systems (in particular, on Unix and Windows, although the details are slightly different), Py_Initialize() calculates the module search path based upon its best guess for the location of the standard Python interpreter executable, assuming that the Python library is found in a fixed location relative to the Python interpreter executable. In particular, it looks for a directory named lib/pythonX.Y relative to the parent directory where the executable named python is found on the shell command search path (the environment variable PATH).

For instance, if the Python executable is found in /usr/local/bin/python, it will assume that the libraries are in /usr/local/lib/pythonX.Y. (In fact, this particular path is also the “fallback” location, used when no executable file named python is found along PATH.) The user can override this behavior by setting the environment variable PYTHONHOME, or insert additional directories in front of the standard path by setting PYTHONPATH.

The embedding application can steer the search by setting PyConfig.program_name before calling Py_InitializeFromConfig(). Note that PYTHONHOME still overrides this and PYTHONPATH is still inserted in front of the standard path. An application that requires total control has to provide its own implementation of Py_GetPath(), Py_GetPrefix(), Py_GetExecPrefix(), and Py_GetProgramFullPath() (all defined in Modules/getpath.c).

Sometimes, it is desirable to “uninitialize” Python. For instance, the application may want to start over (make another call to Py_Initialize()) or the application is simply done with its use of Python and wants to free memory allocated by Python. This can be accomplished by calling Py_FinalizeEx(). The function Py_IsInitialized() returns true if Python is currently in the initialized state. More information about these functions is given in a later chapter. Notice that Py_FinalizeEx() does not free all memory allocated by the Python interpreter, e.g. memory allocated by extension modules currently cannot be released.

Debugging Builds

Python can be built with several macros to enable extra checks of the interpreter and extension modules. These checks tend to add a large amount of overhead to the runtime so they are not enabled by default.

A full list of the various types of debugging builds is in the file Misc/SpecialBuilds.txt in the Python source distribution. Builds are available that support tracing of reference counts, debugging the memory allocator, or low-level profiling of the main interpreter loop. Only the most frequently used builds will be described in the remainder of this section.

Py_DEBUG

Compiling the interpreter with the Py_DEBUG macro defined produces what is generally meant by a debug build of Python. Py_DEBUG is enabled in the Unix build by adding --with-pydebug to the ./configure command. It is also implied by the presence of the not-Python-specific _DEBUG macro. When Py_DEBUG is enabled in the Unix build, compiler optimization is disabled.

In addition to the reference count debugging described below, extra checks are performed; see Python Debug Build.

Defining Py_TRACE_REFS enables reference tracing (see the configure --with-trace-refs option). When defined, a circular doubly linked list of active objects is maintained by adding two extra fields to every PyObject. Total allocations are tracked as well. Upon exit, all existing references are printed. (In interactive mode this happens after every statement run by the interpreter.)

Please refer to Misc/SpecialBuilds.txt in the Python source distribution for more detailed information.