Python Packaging User Guide
Packaging binary extensions

Page Status: Incomplete

Last Reviewed: 2013-12-08

One of the features of the CPython reference interpreter is that, in addition to allowing the execution of Python code, it also exposes a rich C API for use by other software. One of the most common uses of this C API is to create importable C extensions that allow things which aren’t always easy to achieve in pure Python code.

An overview of binary extensions

Use cases

The typical use cases for binary extensions break down into just three conventional categories:

  • accelerator modules: these modules are completely self-contained, and are created solely to run faster than the equivalent pure Python code runs in CPython. Ideally, accelerator modules will always have a pure Python equivalent to use as a fallback if the accelerated version isn’t available on a given system. The CPython standard library makes extensive use of accelerator modules. Example: When importing datetime, Python falls back to the datetime.py module if the C implementation (_datetimemodule.c) is not available.

  • wrapper modules: these modules are created to expose existing C interfaces to Python code. They may either expose the underlying C interface directly, or else expose a more “Pythonic” API that makes use of Python language features to make the API easier to use. The CPython standard library makes extensive use of wrapper modules. Example: functools.py is a Python module wrapper for _functoolsmodule.c.

  • low-level system access: these modules are created to access lower level features of the CPython runtime, the operating system, or the underlying hardware. Through platform specific code, extension modules may achieve things that aren’t possible in pure Python code. A number of CPython standard library modules are written in C in order to access interpreter internals that aren’t exposed at the language level. Example: sys, which comes from sysmodule.c.

    One particularly notable feature of C extensions is that, when they don’t need to call back into the interpreter runtime, they can release CPython’s global interpreter lock around long-running operations (regardless of whether those operations are CPU or IO bound).

Not all extension modules will fit neatly into the above categories. The extension modules included with NumPy, for example, span all three use cases - they move inner loops to C for speed reasons, wrap external libraries written in C, FORTRAN and other languages, and use low level system interfaces for both CPython and the underlying operating system to support concurrent execution of vectorised operations and to tightly control the exact memory layout of created objects.

Disadvantages

The main disadvantage of using binary extensions is the fact that it makes subsequent distribution of the software more difficult. One of the advantages of using Python is that it is largely cross platform, and the languages used to write extension modules (typically C or C++, but really any language that can bind to the CPython C API) typically require that custom binaries be created for different platforms.

This means that binary extensions:

  • require that end users be able to either build them from source, or else that someone publish pre-built binaries for common platforms

  • may not be compatible with different builds of the CPython reference interpreter

  • often will not work correctly with alternative interpreters such as PyPy, IronPython or Jython

  • if handcoded, make maintenance more difficult by requiring that maintainers be familiar not only with Python, but also with the language used to create the binary extension, as well as with the details of the CPython C API.

  • if a pure Python fallback implementation is provided, make maintenance more difficult by requiring that changes be implemented in two places, and introducing additional complexity in the test suite to ensure both versions are always executed.
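
The extra test-suite complexity mentioned in the last point can be sketched like this. It is one possible arrangement, not a prescribed one: `py_clamp`, `c_clamp` and the module `_example_speedups` are hypothetical names, and the shared test cases live in a mixin so that each implementation gets its own `unittest.TestCase` subclass.

```python
import unittest


def py_clamp(value, low, high):
    """Pure Python implementation."""
    return max(low, min(value, high))


try:
    # Hypothetical accelerated implementation; not a real module.
    from _example_speedups import clamp as c_clamp
except ImportError:
    c_clamp = None


class ClampTests:
    """Shared test cases; subclasses bind `clamp` to one implementation."""

    def test_inside_range(self):
        self.assertEqual(self.clamp(5, 0, 10), 5)

    def test_outside_range(self):
        self.assertEqual(self.clamp(-3, 0, 10), 0)
        self.assertEqual(self.clamp(42, 0, 10), 10)


class PyClampTests(ClampTests, unittest.TestCase):
    # staticmethod stops the function from being bound to the test instance.
    clamp = staticmethod(py_clamp)


@unittest.skipIf(c_clamp is None, "accelerated implementation not built")
class CClampTests(ClampTests, unittest.TestCase):
    clamp = staticmethod(c_clamp) if c_clamp is not None else None
```

Running the file with `python -m unittest` exercises the pure Python tests everywhere and reports the accelerated tests as skipped when the extension has not been built, which makes a silently missing accelerator visible in the test output.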

Another disadvantage of relying on binary extensions is that alternative import mechanisms (such as the ability to import modules directly from zipfiles) often won’t work for extension modules (as the dynamic loading mechanisms on most platforms can only load libraries from disk).

Alternatives to handcoded accelerator modules

When extension modules are just being used to make code run faster (after profiling has identified the code where the speed increase is worth additional maintenance effort), a number of other alternatives should also be considered:

  • look for existing optimised alternatives. The CPython standard library includes a number of optimised data structures and algorithms (especially in the builtins and the collections and itertools modules). The Python Package Index also offers additional alternatives. Sometimes, the appropriate choice of standard library or third party module can avoid the need to create your own accelerator module.

  • for long running applications, the JIT compiled PyPy interpreter may offer a suitable alternative to the standard CPython runtime. The main barrier to adopting PyPy is typically reliance on other binary extension modules - while PyPy does emulate the CPython C API, modules that rely on that cause problems for the PyPy JIT, and the emulation layer can often expose latent defects in extension modules that CPython currently tolerates (frequently around reference counting errors - an object having one live reference instead of two often won’t break anything, but no references instead of one is a major problem).

  • Cython is a mature static compiler that can compile most Python code to C extension modules. The initial compilation provides some speed increases (by bypassing the CPython interpreter layer), and Cython’s optional static typing features can offer additional opportunities for speed increases. Using Cython still carries the disadvantages associated with using binary extensions, but has the benefit of having a reduced barrier to entry for Python programmers (relative to other languages like C or C++).

  • Numba is a newer tool, created by members of the scientific Python community, that aims to leverage LLVM to allow selective compilation of pieces of a Python application to native machine code at runtime. It requires that LLVM be available on the system where the code is running, but can provide significant speed increases, especially for operations that are amenable to vectorisation.
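
As a small illustration of the first alternative (looking for existing optimised tools before writing an accelerator), the sketch below replaces two hand-written loops with `collections.Counter` and `collections.deque`. The sample data is invented for the example, and whether the standard library version is fast enough for a given workload is something only profiling can confirm.

```python
from collections import Counter, deque

words = "the quick brown fox jumps over the lazy dog the end".split()

# Hand-written counting loop.
manual = {}
for word in words:
    manual[word] = manual.get(word, 0) + 1

# Counter produces the same mapping and adds helpers such as most_common().
counted = Counter(words)
assert counted == manual
print(counted.most_common(1))  # [('the', 3)]

# A bounded deque keeps only the most recent items, with no manual trimming.
recent = deque(maxlen=3)
for value in range(10):
    recent.append(value)
print(list(recent))  # [7, 8, 9]
```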

Alternatives to handcoded wrapper modules

The C ABI (Application Binary Interface) is a common standard for sharing functionality between multiple applications. One of the strengths of the CPython C API (Application Programming Interface) is allowing Python users to tap into that functionality. However, wrapping modules by hand is quite tedious, so a number of other alternative approaches should be considered.

The approaches described below don’t simplify the distribution case at all, but they can significantly reduce the maintenance burden of keeping wrapper modules up to date.

  • In addition to being useful for the creation of accelerator modules, Cython is also widely used for creating wrapper modules for C or C++ APIs. It involves wrapping the interfaces by hand, which gives a wide range of freedom in designing and optimising the wrapper code, but may not be a good choice for wrapping very large APIs quickly. See the list of third-party tools for automatic wrapping with Cython. It also supports performance-oriented Python implementations that provide a CPython-like C-API, such as PyPy and Pyston.

  • pybind11 is a pure C++11 library that provides a clean C++ interface to the CPython (and PyPy) C API. It does not require a pre-processing step; it is written entirely in templated C++. Helpers are included for Setuptools or CMake builds. It was based on Boost.Python, but doesn’t require the Boost libraries or BJam.

  • cffi is a project created by some of the PyPy developers to make it straightforward for developers that already know both Python and C to expose their C modules to Python applications. It also makes it relatively straightforward to wrap a C module based on its header files, even if you don’t know C yourself.

    One of the key advantages of cffi is that it is compatible with the PyPy JIT, allowing CFFI wrapper modules to participate fully in PyPy’s tracing JIT optimisations.

  • SWIG is a wrapper interface generator that allows a variety of programming languages, including Python, to interface with C and C++ code.

  • The standard library’s ctypes module, while useful for getting access to C level interfaces when header information isn’t available, suffers from the fact that it operates solely at the C ABI level, and thus has no automatic consistency checking between the interface actually being exported by the library and the one declared in the Python code. By contrast, the above alternatives are all able to operate at the C API level, using C header files to ensure consistency between the interface exported by the library being wrapped and the one expected by the Python wrapper module. While cffi can operate directly at the C ABI level, it suffers from the same interface inconsistency problems as ctypes when it is used that way.
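
The ABI-level limitation of ctypes described above can be seen in a short sketch that wraps the C library function `strlen`. The library lookup in `load_c_library` is an assumption made for illustration (a hypothetical helper, and the right strategy varies by platform); the point is that the argument and return types are declared by hand in Python and nothing checks them against a header file.

```python
import ctypes
import ctypes.util
import sys


def load_c_library():
    """Load the platform C runtime (lookup strategy is an assumption)."""
    if sys.platform == "win32":
        return ctypes.CDLL("msvcrt")
    # find_library may return None; on POSIX systems CDLL(None) then exposes
    # the symbols already loaded into the current process.
    return ctypes.CDLL(ctypes.util.find_library("c"))


libc = load_c_library()

# These declarations are written by hand. ctypes cannot compare them with
# the real prototype, size_t strlen(const char *), so a mistake here would
# only surface at runtime.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print(libc.strlen(b"binary"))  # 6
```

Tools that work from the C header files generate or verify declarations like these, which is the consistency check that ctypes lacks.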

Alternatives for low level system access

For applications that need low level system access (regardless of the reason), a binary extension module often is the best way to go about it. This is particularly true for low level access to the CPython runtime itself, since some operations (like releasing the Global Interpreter Lock) are simply invalid when the interpreter is running code, even if a module like ctypes or cffi is used to obtain access to the relevant C API interfaces.

For cases where the extension module is manipulating the underlying operating system or hardware (rather than the CPython runtime), it may sometimes be better to just write an ordinary C library (or a library in another systems programming language like C++ or Rust that can export a C compatible ABI), and then use one of the wrapping techniques described above to make the interface available as an importable Python module.

Implementing binary extensions

The CPython Extending and Embedding guide includes an introduction to writing a custom extension module in C.

FIXME: Elaborate that all this is one of the reasons why you probably don’t want to handcode your extension modules :)

Extension module lifecycle

FIXME: This section needs to be fleshed out.

Implications of shared static state and subinterpreters

FIXME: This section needs to be fleshed out.

Implications of the GIL

FIXME: This section needs to be fleshed out.

Memory allocation APIs

FIXME: This section needs to be fleshed out.

ABI Compatibility

The CPython C API does not guarantee ABI stability between minor releases (3.2, 3.3, 3.4, etc.). This means that, typically, if you build an extension module against one version of Python, it is only guaranteed to work with the same minor version of Python and not with any other minor versions.

Python 3.2 introduced the Limited API, which is a well-defined subset of Python’s C API. The symbols needed for the Limited API form the “Stable ABI” which is guaranteed to be compatible across all Python 3.x versions. Wheels containing extensions built against the stable ABI use the abi3 ABI tag, to reflect that they’re compatible with all Python 3.x versions.

CPython’s C API stability page provides detailed information about the API / ABI stability guarantees, how to use the Limited API and the exact contents of the “Limited API”.
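
To make the abi3 tag concrete, the sketch below pulls the Python, ABI and platform tags out of a wheel filename and reports whether the wheel targets the Stable ABI. The helper names `wheel_tags` and `uses_stable_abi` and the example filenames are hypothetical; real tooling should rely on a maintained library for tag handling rather than string splitting.

```python
def wheel_tags(filename: str) -> tuple[str, str, str]:
    """Return (python_tag, abi_tag, platform_tag) from a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    stem = filename[: -len(".whl")]
    # The last three dash-separated fields are the tags; splitting from the
    # right keeps this working when an optional build tag is present.
    _, python_tag, abi_tag, platform_tag = stem.rsplit("-", 3)
    return python_tag, abi_tag, platform_tag


def uses_stable_abi(filename: str) -> bool:
    """True when the wheel's ABI tag set includes abi3."""
    # Compressed tag sets separate alternatives with dots.
    return "abi3" in wheel_tags(filename)[1].split(".")


# Hypothetical filenames, for illustration only.
print(uses_stable_abi("example-1.0-cp38-abi3-manylinux_2_17_x86_64.whl"))
print(uses_stable_abi("example-1.0-cp312-cp312-win_amd64.whl"))
```

The first filename carries the abi3 tag, so one such wheel per platform covers multiple Python 3.x minor versions; the second is tied to a single minor version.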

Building binary extensions

FIXME: Cover the build-backends available for building extensions.

Building extensions for multiple platforms

If you plan to distribute your extension, you should provide wheels for all the platforms you intend to support. These are usually built on continuous integration (CI) systems. There are tools to help you build highly redistributable binaries from CI; these include cibuildwheel and multibuild.

For most extensions, you will need to build wheels for all the platforms you intend to support. This means that the number of wheels you need to build is the product of:

count(Python minor versions) * count(OS) * count(architectures)

Using CPython’s Stable ABI can help significantly reduce the number of wheels you need to provide, since a single wheel on a platform can be used with all Python minor versions, eliminating one dimension of the matrix. It also removes the need to generate new wheels for each new minor version of Python.
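
The effect on the build matrix can be worked through with a few lines of arithmetic. The version, operating system and architecture lists below are arbitrary examples chosen for the illustration, not a recommended support policy.

```python
from itertools import product

# Example support matrix (assumed values, for illustration only).
python_versions = ["3.9", "3.10", "3.11", "3.12", "3.13"]
operating_systems = ["linux", "macos", "windows"]
architectures = ["x86_64", "arm64"]

# One wheel per (Python minor version, OS, architecture) combination.
full_matrix = list(product(python_versions, operating_systems, architectures))

# With the Stable ABI, the Python version dimension drops out.
abi3_matrix = list(product(operating_systems, architectures))

print(len(full_matrix))  # 30
print(len(abi3_matrix))  # 6
```

In this example the Stable ABI reduces 30 wheels to 6, and the smaller number stays the same when a new Python minor version is released.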

Binary extensions for Windows

Before it is possible to build a binary extension, it is necessary to ensure that you have a suitable compiler available. On Windows, Visual C is used to build the official CPython interpreter, and should be used to build compatible binary extensions. To set up a build environment for binary extensions, install Visual Studio Community Edition - any recent version is fine.

One caveat: if you use Visual Studio 2019 or later, your extension will depend on an “extra” file, VCRUNTIME140_1.dll, in addition to the VCRUNTIME140.dll that all previous versions back to 2015 depend on. This will add an extra requirement to using your extension on versions of CPython that do not include this extra file. To avoid this, you can add the compile-time argument /d2FH4-. Recent versions of Python may include this file.

Building for Python prior to 3.5 is discouraged, because older versions of Visual Studio are no longer available from Microsoft. If you do need to build for older versions, you can set DISTUTILS_USE_SDK=1 and MSSdk=1 to force the currently activated version of MSVC to be found, and you should exercise care when designing your extension not to malloc/free memory across different libraries, avoid relying on changed data structures, and so on. Tools for generating extension modules usually avoid these things for you.

Binary extensions for Linux

Linux binaries must use a sufficiently old glibc to be compatible with older distributions. The manylinux Docker images provide a build environment with a glibc old enough to support most current Linux distributions on common architectures.
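
The glibc constraint can be illustrated by comparing the version encoded in a `manylinux_X_Y` platform tag with the glibc that the running interpreter is linked against. This is a simplified sketch: the function names are hypothetical, legacy tag spellings such as `manylinux2014` are not handled, and the architecture part of the tag is ignored.

```python
import platform
import re


def manylinux_glibc(platform_tag):
    """Return the (major, minor) glibc version named by a manylinux_X_Y tag."""
    match = re.match(r"manylinux_(\d+)_(\d+)_", platform_tag)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def system_glibc():
    """Return the (major, minor) glibc version in use, or None if unknown."""
    lib, version = platform.libc_ver()
    match = re.match(r"(\d+)\.(\d+)", version)
    if lib != "glibc" or match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def glibc_is_new_enough(platform_tag):
    """True when the system glibc is at least the version the tag requires."""
    required = manylinux_glibc(platform_tag)
    available = system_glibc()
    if required is None or available is None:
        return False
    return available >= required


print(manylinux_glibc("manylinux_2_17_x86_64"))  # (2, 17)
print(system_glibc())
print(glibc_is_new_enough("manylinux_2_17_x86_64"))
```

Building against an older glibc lowers the required version in the tag, which is why wheels produced in the manylinux images install on a wide range of distributions.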

Binary extensions for macOS

Binary compatibility on macOS is determined by the target minimum deployment system, e.g. 10.9, which is often specified with the MACOSX_DEPLOYMENT_TARGET environment variable when building binaries on macOS. When building with setuptools / distutils, the deployment target is specified with the flag --plat-name, e.g. macosx-10.9-x86_64. For common deployment targets for macOS Python distributions, see the MacPython Spinning Wheels wiki.
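
To see which deployment target applies in a given environment, the values can be inspected from Python. This is a small diagnostic sketch with a hypothetical helper name; on platforms other than macOS the deployment target entries are expected to be `None`.

```python
import os
import sysconfig


def deployment_info():
    """Collect the platform string and macOS deployment target settings."""
    return {
        # Platform string for this interpreter, for example of the form
        # macosx-<target>-<arch> on macOS builds.
        "platform": sysconfig.get_platform(),
        # Target the interpreter itself was built for (None when not set).
        "built_for": sysconfig.get_config_var("MACOSX_DEPLOYMENT_TARGET"),
        # Override from the environment, if one has been exported.
        "environment": os.environ.get("MACOSX_DEPLOYMENT_TARGET"),
    }


for key, value in deployment_info().items():
    print(f"{key}: {value}")
```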

Publishing binary extensions

Publishing binary extensions through PyPI uses the same upload mechanisms as publishing pure Python packages. You build a wheel file for your extension using the build-backend and upload it to PyPI using twine.

Avoid binary-only releases

It is strongly recommended that you publish your binary extensions as well as the source code that was used to build them. This allows users to build the extension from source if they need to. Notably, this is required for certain Linux distributions that build from source within their own build systems for the distro package repositories.

Weak linking

FIXME: This section needs to be fleshed out.

Additional resources

Cross-platform development and distribution of extension modules is a complex topic, so this guide focuses primarily on providing pointers to various tools that automate dealing with the underlying technical challenges. The additional resources in this section are instead intended for developers looking to understand more about the underlying binary interfaces that those systems rely on at runtime.

Cross-platform wheel generation with scikit-build

The scikit-build package helps abstract cross-platform build operations and provides additional capabilities when creating binary extension packages. Additional documentation is also available on the C runtime, compiler, and build system generator for Python binary extension modules.

Introduction to C/C++ extension modules

For a more in depth explanation of how extension modules are used by CPython on a Debian system, see the following articles:

Additional considerations for binary wheels

The pypackaging-native website has additional coverage of packaging Python packages with native code. It aims to provide an overview of the most important packaging issues for such projects, with in-depth explanations and references.

Examples of topics covered are non-Python compiled dependencies (“native dependencies”), the importance of the ABI (Application Binary Interface) of native code, dependency on SIMD code and cross compilation.
