numpy.distutils user guide#

Warning

numpy.distutils is deprecated, and will be removed for Python >= 3.12. For more details, see Status of numpy.distutils and migration advice.

SciPy structure#

Currently the SciPy project consists of two packages:

  • NumPy — it provides packages like:

    • numpy.distutils - extension to Python distutils

    • numpy.f2py - a tool to bind Fortran/C codes to Python

    • numpy._core - future replacement of Numeric and numarray packages

    • numpy.lib - extra utility functions

    • numpy.testing - numpy-style tools for unit testing

    • etc

  • SciPy — a collection of scientific tools for Python.

The aim of this document is to describe how to add new tools to SciPy.

Requirements for SciPy packages#

SciPy consists of Python packages, called SciPy packages, that are available to Python users via the scipy namespace. Each SciPy package may contain other SciPy packages. And so on. Therefore, the SciPy directory tree is a tree of packages with arbitrary depth and width. Any SciPy package may depend on NumPy packages but the dependence on other SciPy packages should be kept minimal or zero.

A SciPy package contains, in addition to its sources, the following files and directories:

  • setup.py — building script

  • __init__.py — package initializer

  • tests/ — directory of unittests

Their contents are described below.
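As a quick illustration of the layout listed above, the following sketch checks a directory for the three required entries. It is not part of numpy.distutils; the helper name missing_package_entries and the demo function are hypothetical and exist only for this example.

```python
import os
import tempfile

# Entries the guide lists for every SciPy package.
REQUIRED_FILES = ("setup.py", "__init__.py")
REQUIRED_DIRS = ("tests",)


def missing_package_entries(package_dir):
    """Return the required entries that are absent from package_dir."""
    missing = [name for name in REQUIRED_FILES
               if not os.path.isfile(os.path.join(package_dir, name))]
    missing += [name + "/" for name in REQUIRED_DIRS
                if not os.path.isdir(os.path.join(package_dir, name))]
    return missing


def demo():
    # Build a throwaway package skeleton to show the check in action.
    with tempfile.TemporaryDirectory() as tmp:
        open(os.path.join(tmp, "__init__.py"), "w").close()
        before = missing_package_entries(tmp)
        open(os.path.join(tmp, "setup.py"), "w").close()
        os.mkdir(os.path.join(tmp, "tests"))
        after = missing_package_entries(tmp)
    return before, after
```

Calling demo() first reports the entries that are still missing and then an empty list once the skeleton is complete.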

The setup.py file#

In order to add a Python package to SciPy, its build script (setup.py) must meet certain requirements. The most important requirement is that the package define a configuration(parent_package='',top_path=None) function which returns a dictionary suitable for passing to numpy.distutils.core.setup(..). To simplify the construction of this dictionary, numpy.distutils.misc_util provides the Configuration class, described below.

SciPy pure Python package example#

Below is an example of a minimal setup.py file for a pure SciPy package:

#!/usr/bin/env python3
def configuration(parent_package='', top_path=None):
    from numpy.distutils.misc_util import Configuration
    config = Configuration('mypackage', parent_package, top_path)
    return config

if __name__ == "__main__":
    from numpy.distutils.core import setup
    #setup(**configuration(top_path='').todict())
    setup(configuration=configuration)

The arguments of the configuration function specify the name of the parent SciPy package (parent_package) and the directory location of the main setup.py script (top_path). These arguments, along with the name of the current package, should be passed to the Configuration constructor.

The Configuration constructor has a fourth optional argument, package_path, that can be used when package files are located in a different location than the directory of the setup.py file.

The remaining Configuration arguments are all keyword arguments that will be used to initialize attributes of the Configuration instance. Usually, these keywords are the same as the ones that the setup(..) function would expect, for example, packages, ext_modules, data_files, include_dirs, libraries, headers, scripts, package_dir, etc. However, the direct specification of these keywords is not recommended, as the content of these keyword arguments will not be processed or checked for consistency with the SciPy build system.

Finally, Configuration has a .todict() method that returns all the configuration data as a dictionary suitable for passing on to the setup(..) function.
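The recommendation above, to populate the configuration through its methods rather than through raw setup(..) keywords, might look like the following sketch. The package, subpackage, and file names ('mypackage', 'core', 'data/defaults.cfg', 'tests') are hypothetical, and the function can only be called where numpy.distutils is still importable and those paths exist; the import is kept inside the function for that reason.

```python
def configuration(parent_package='', top_path=None):
    # Imported lazily so that merely loading this file does not require
    # numpy.distutils, which is unavailable for Python >= 3.12.
    from numpy.distutils.misc_util import Configuration

    # 'mypackage', 'core' and the data paths are hypothetical names.
    config = Configuration('mypackage', parent_package, top_path)
    config.add_subpackage('core')               # rather than packages=[...]
    config.add_data_files('data/defaults.cfg')  # rather than data_files=[...]
    config.add_data_dir('tests')
    return config


def build_setup_kwargs():
    # .todict() yields the keyword arguments for numpy.distutils.core.setup.
    return configuration(top_path='').todict()
```

Going through the methods lets numpy.distutils resolve paths relative to the package directory, which direct keyword arguments would bypass.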

Configuration instance attributes#

In addition to attributes that can be specified via keyword arguments to the Configuration constructor, a Configuration instance (let us denote it as config) has the following attributes that can be useful in writing setup scripts:

  • config.name - full name of the current package. The names of parent packages can be extracted as config.name.split('.').

  • config.local_path - path to the location of the current setup.py file.

  • config.top_path - path to the location of the main setup.py file.
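For instance, splitting a dotted package name as suggested for config.name works as follows. The value "scipy.linalg.blas" is only an assumed example of what config.name could hold for a nested package.

```python
# Hypothetical value of config.name for a package nested two levels deep.
full_name = "scipy.linalg.blas"

parts = full_name.split(".")
parent_packages = parts[:-1]   # every component except the last one
package = parts[-1]            # the current package itself
```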

Configuration instance methods#

  • config.todict() - returns a configuration dictionary suitable for passing to the numpy.distutils.core.setup(..) function.

  • config.paths(*paths) - applies glob.glob(..) to items of paths if necessary. Fixes any paths item that is relative to config.local_path.

  • config.get_subpackage(subpackage_name,subpackage_path=None) - returns a list of subpackage configurations. The subpackage is looked up in the current directory under the name subpackage_name, but the path can also be specified via the optional subpackage_path argument. If subpackage_name is specified as None then the subpackage name will be taken as the basename of subpackage_path. Any * used in subpackage names is expanded as a wildcard.

  • config.add_subpackage(subpackage_name,subpackage_path=None) - add a SciPy subpackage configuration to the current one. The meaning and usage of the arguments is explained above, see the config.get_subpackage() method.

  • config.add_data_files(*files) - prepend files to the data_files list. If a files item is a tuple then its first element defines the suffix of where data files are copied relative to the package installation directory and the second element specifies the path to data files. By default data files are copied under the package installation directory. For example,

    config.add_data_files('foo.dat',
                          ('fun', ['gun.dat', 'nun/pun.dat', '/tmp/sun.dat']),
                          'bar/car.dat',
                          '/full/path/to/can.dat',
                          )

    will install data files to the following locations

    <installation path of config.name package>/
      foo.dat
      fun/
        gun.dat
        pun.dat
        sun.dat
      bar/
        car.dat
      can.dat

    The path to data files can be a function taking no arguments and returning path(s) to data files; this is useful when data files are generated while building the package. (XXX: explain the step when this function is called exactly)

  • config.add_data_dir(data_path) - add the directory data_path recursively to data_files. The whole directory tree starting at data_path will be copied under the package installation directory. If data_path is a tuple then its first element defines the suffix of where data files are copied relative to the package installation directory and the second element specifies the path to the data directory. By default, the data directory is copied under the package installation directory under the basename of data_path. For example,

    config.add_data_dir('fun')  # fun/ contains foo.dat bar/car.dat
    config.add_data_dir(('sun', 'fun'))
    config.add_data_dir(('gun', '/full/path/to/fun'))

    will install data files to the following locations

    <installation path of config.name package>/
      fun/
        foo.dat
        bar/
          car.dat
      sun/
        foo.dat
        bar/
          car.dat
      gun/
        foo.dat
        bar/
          car.dat

  • config.add_include_dirs(*paths) - prepend paths to the include_dirs list. This list will be visible to all extension modules of the current package.

  • config.add_headers(*files) - prepend files to the headers list. By default, headers will be installed under the <prefix>/include/pythonX.X/<config.name.replace('.','/')>/ directory. If a files item is a tuple then its first argument specifies the installation suffix relative to the <prefix>/include/pythonX.X/ path. This is a Python distutils method; its use is discouraged for NumPy and SciPy in favour of config.add_data_files(*files).

  • config.add_scripts(*files) - prepend files to the scripts list. Scripts will be installed under the <prefix>/bin/ directory.

  • config.add_extension(name,sources,**kw) - create and add an Extension instance to the ext_modules list. The first argument name defines the name of the extension module that will be installed under the config.name package. The second argument is a list of sources. The add_extension method also takes keyword arguments that are passed on to the Extension constructor. The list of allowed keywords is the following: include_dirs, define_macros, undef_macros, library_dirs, libraries, runtime_library_dirs, extra_objects, extra_compile_args, extra_link_args, export_symbols, swig_opts, depends, language, f2py_options, module_dirs, extra_info, extra_f77_compile_args, extra_f90_compile_args.

    Note that the config.paths method is applied to all lists that may contain paths. extra_info is a dictionary or a list of dictionaries whose content will be appended to the keyword arguments. The list depends contains paths to files or directories that the sources of the extension module depend on. If any path in the depends list is newer than the extension module, then the module will be rebuilt.

    The list of sources may contain functions (‘source generators’) with the pattern def <funcname>(ext, build_dir): return <source(s) or None>. If funcname returns None, no sources are generated. And if the Extension instance has no sources after processing all source generators, no extension module will be built. This is the recommended way to conditionally define extension modules. Source generator functions are called by the build_src sub-command of numpy.distutils.

    For example, here is a typical source generator function:

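    def generate_source(ext, build_dir):
        import os
        from distutils.dep_util import newer
        target = os.path.join(build_dir, 'somesource.c')
        # newer(source, target) is true when source is more recent than
        # target, or when target does not exist yet
        if newer(__file__, target):
            # create target file (placeholder content)
            with open(target, 'w') as f:
                f.write('/* generated source */\n')
        return target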

    The first argument contains the Extension instance, which can be useful for accessing its attributes such as the depends and sources lists and modifying them during the build process. The second argument gives the path to a build directory that must be used when writing files to disk.

  • config.add_library(name,sources,**build_info) - add a library to the libraries list. Allowed keyword arguments are depends, macros, include_dirs, extra_compiler_args, f2py_options, extra_f77_compile_args, extra_f90_compile_args. See the .add_extension() method for more information on arguments.

  • config.have_f77c() - return True if a Fortran 77 compiler is available (read: a simple Fortran 77 code compiled successfully).

  • config.have_f90c() - return True if a Fortran 90 compiler is available (read: a simple Fortran 90 code compiled successfully).

  • config.get_version() - return the version string of the current package, None if version information could not be detected. This method scans the files __version__.py, <packagename>_version.py, version.py, __svn_version__.py for the string variables version, __version__, <packagename>_version.

  • config.make_svn_version_py() - appends a data function to the data_files list that will generate the __svn_version__.py file in the current package directory. The file will be removed from the source directory when Python exits.

  • config.get_build_temp_dir() - return a path to a temporary directory. This is the place where one should build temporary files.

  • config.get_distribution() - return the distutils Distribution instance.

  • config.get_config_cmd() - returns the numpy.distutils config command instance.

  • config.get_info(*names)
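The source-generator convention described under config.add_extension above can be modelled in plain Python. This is a simplified sketch of the documented behaviour, not the build_src implementation; resolve_sources, generate_config_c and skip_on_purpose are hypothetical names introduced for the example.

```python
import os
import tempfile


def resolve_sources(sources, ext, build_dir):
    """Replace source generators in `sources` by the files they return."""
    resolved = []
    for src in sources:
        if not callable(src):
            resolved.append(src)          # ordinary source path
            continue
        result = src(ext, build_dir)      # generator: (ext, build_dir)
        if result is None:
            continue                      # generator produced nothing
        if isinstance(result, (list, tuple)):
            resolved.extend(result)
        else:
            resolved.append(result)
    return resolved


def generate_config_c(ext, build_dir):
    # Writes a placeholder file into build_dir and reports its path.
    target = os.path.join(build_dir, 'config.c')
    with open(target, 'w') as f:
        f.write('/* generated */\n')
    return target


def skip_on_purpose(ext, build_dir):
    # Returning None means "no source"; used to switch a module off.
    return None


with tempfile.TemporaryDirectory() as build_dir:
    mixed = resolve_sources(['a.c', generate_config_c, skip_on_purpose],
                            None, build_dir)
    expected = ['a.c', os.path.join(build_dir, 'config.c')]
    empty = resolve_sources([skip_on_purpose], None, build_dir)
```

In this model an extension whose resolved source list ends up empty, like the second call, corresponds to the case where no extension module is built.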

Conversion of .src files using templates#

NumPy distutils supports automatic conversion of source files named <somefile>.src. This facility can be used to maintain very similar code blocks requiring only simple changes between blocks. During the build phase of setup, if a template file named <somefile>.src is encountered, a new file named <somefile> is constructed from the template and placed in the build directory to be used instead. Two forms of template conversion are supported. The first form occurs for files named <file>.ext.src where ext is a recognized Fortran extension (f, f90, f95, f77, for, ftn, pyf). The second form is used for all other cases.
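The choice between the two conversion forms can be sketched as a small filename check. This is only an illustration of the rule stated above; classify_template is a hypothetical helper, and matching the extension case-sensitively is an assumption of this sketch.

```python
import os

# Fortran extensions named in the text above.
FORTRAN_EXTENSIONS = {"f", "f90", "f95", "f77", "for", "ftn", "pyf"}


def classify_template(filename):
    """Return (form, generated_name) for a .src file, or None otherwise."""
    target, suffix = os.path.splitext(filename)
    if suffix != ".src":
        return None                       # not a template file
    inner_ext = os.path.splitext(target)[1].lstrip(".")
    form = "fortran" if inner_ext in FORTRAN_EXTENSIONS else "other"
    return form, target
```

For example, classify_template("blas.pyf.src") selects the Fortran form and names blas.pyf as the generated file, while "loops.c.src" falls into the second form.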

Fortran files#

This template converter will replicate all function and subroutine blocks in the file with names that contain ‘<…>’ according to the rules in ‘<…>’. The number of comma-separated words in ‘<…>’ determines the number of times the block is repeated. The words themselves indicate what that repeat rule, ‘<…>’, should be replaced with in each block. All of the repeat rules in a block must contain the same number of comma-separated words indicating the number of times that block should be repeated. If a word in the repeat rule needs a comma, leftarrow, or rightarrow, then prepend it with a backslash ‘\’. If a word in the repeat rule matches ‘\<index>’ then it will be replaced with the <index>-th word in the same repeat specification. There are two forms for the repeat rule: named and short.

Named repeat rule#

A named repeat rule is useful when the same set of repeats must be used several times in a block. It is specified using <rule1=item1, item2, item3,…, itemN>, where N is the number of times the block should be repeated. On each repeat of the block, the entire expression, ‘<…>’ will be replaced first with item1, and then with item2, and so forth until N repeats are accomplished. Once a named repeat specification has been introduced, the same repeat rule may be used in the current block by referring only to the name (i.e. <rule1>).

Short repeat rule#

A short repeat rule looks like <item1, item2, item3, …, itemN>. The rule specifies that the entire expression, ‘<…>’ should be replaced first with item1, and then with item2, and so forth until N repeats are accomplished.
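To make the named and short rules concrete, here is a simplified Python model of how one block could be expanded. It is a sketch under stated assumptions, not the converter shipped with numpy.distutils: expand_block is a hypothetical name, the whole input string is treated as a single block, and backslash escapes and ‘\<index>’ references are not handled.

```python
import re

RULE = re.compile(r"<([^<>]*)>")  # matches one <...> repeat rule


def expand_block(block):
    """Return the list of copies produced from one template block."""
    named = {}   # rule name -> list of items, filled as definitions appear
    rules = []   # item list for every <...> occurrence, in order

    for match in RULE.finditer(block):
        body = match.group(1)
        if "=" in body:                       # named rule: <name=a,b,c>
            name, _, items = body.partition("=")
            named[name.strip()] = [w.strip() for w in items.split(",")]
            rules.append(named[name.strip()])
        elif body.strip() in named:           # reference: <name>
            rules.append(named[body.strip()])
        else:                                 # short rule: <a,b,c>
            rules.append([w.strip() for w in body.split(",")])

    if not rules:
        return [block]
    counts = {len(items) for items in rules}
    if len(counts) != 1:
        raise ValueError("all repeat rules in a block need the same length")

    copies = []
    for i in range(counts.pop()):
        pending = iter(rules)  # one item list per match, in match order
        copies.append(RULE.sub(lambda m: next(pending)[i], block))
    return copies


template = (
    "subroutine <p=s,d>swap(x, y)\n"
    "  <real, double precision> x, y\n"
    "end subroutine <p>swap\n"
)
single, double = expand_block(template)
```

Here the named rule p is defined once and reused by name, while the type is given by a short rule; both carry two items, so the block is emitted twice.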

Pre-defined names#

The following predefined named repeat rules are available:

  • <prefix=s,d,c,z>

  • <_c=s,d,c,z>

  • <_t=real, double precision, complex, double complex>

  • <ftype=real, double precision, complex, double complex>

  • <ctype=float, double, complex_float, complex_double>

  • <ftypereal=float, double precision, \0, \1>

  • <ctypereal=float, double, \0, \1>

Other files#

Non-Fortran files use a separate syntax for defining template blocks that should be repeated using a variable expansion similar to the named repeat rules of the Fortran-specific repeats.

NumPy Distutils preprocesses C source files (extension: .c.src) written in a custom templating language to generate C code. The @ symbol is used to wrap macro-style variables to empower a string substitution mechanism that might describe (for instance) a set of data types.

The template language blocks are delimited by /**begin repeat and /**end repeat**/ lines, which may also be nested using consecutively numbered delimiting lines such as /**begin repeat1 and /**end repeat1**/:

  1. /**begin repeat on a line by itself marks the beginning of a segment that should be repeated.

  2. Named variable expansions are defined using #name=item1, item2, item3, ..., itemN# and placed on successive lines. These variables are replaced in each repeat block with the corresponding word. All named variables in the same repeat block must define the same number of words.

  3. In specifying the repeat rule for a named variable, item*N is short-hand for item, item, ..., item repeated N times. In addition, parentheses in combination with *N can be used for grouping several items that should be repeated. Thus, #name=(item1, item2)*4# is equivalent to #name=item1, item2, item1, item2, item1, item2, item1, item2#.

  4. */ on a line by itself marks the end of the variable expansion naming. The next line is the first line that will be repeated using the named rules.

  5. Inside the block to be repeated, the variables that should be expanded are specified as @name@.

  6. /**end repeat**/ on a line by itself marks the previous line as the last line of the block to be repeated.

  7. A loop in the NumPy C source code may have a @TYPE@ variable, targeted for string substitution, which is preprocessed to a number of otherwise identical loops with several strings such as INT, LONG, UINT, ULONG. The @TYPE@ style syntax thus reduces code duplication and maintenance burden by mimicking languages that have generic type support.

The above rules may be clearer in the following template source example:

/* TIMEDELTA to non-float types */

/**begin repeat
 *
 * #TOTYPE = BYTE, UBYTE, SHORT, USHORT, INT, UINT, LONG, ULONG,
 *           LONGLONG, ULONGLONG, DATETIME,
 *           TIMEDELTA#
 * #totype = npy_byte, npy_ubyte, npy_short, npy_ushort, npy_int, npy_uint,
 *           npy_long, npy_ulong, npy_longlong, npy_ulonglong,
 *           npy_datetime, npy_timedelta#
 */

/**begin repeat1
 *
 * #FROMTYPE = TIMEDELTA#
 * #fromtype = npy_timedelta#
 */
static void
@FROMTYPE@_to_@TOTYPE@(void *input, void *output, npy_intp n,
        void *NPY_UNUSED(aip), void *NPY_UNUSED(aop))
{
    const @fromtype@ *ip = input;
    @totype@ *op = output;

    while (n--) {
        *op++ = (@totype@)*ip++;
    }
}
/**end repeat1**/

/**end repeat**/
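The effect of a single, non-nested repeat block can be imitated with a few lines of Python. This is a loose model of the @name@ substitution described in the rules above, not the actual preprocessor; expand_repeat is a hypothetical helper and the variable definitions are passed in as an already-parsed dictionary.

```python
def expand_repeat(variables, body):
    """Repeat `body` once per item, substituting every @name@ marker."""
    lengths = {len(items) for items in variables.values()}
    if len(lengths) != 1:
        raise ValueError("all named variables must define the same number of words")

    pieces = []
    for i in range(lengths.pop()):
        text = body
        for name, items in variables.items():
            text = text.replace("@" + name + "@", items[i])
        pieces.append(text)
    return "".join(pieces)


generated = expand_repeat(
    {"TYPE": ["INT", "LONG"], "type": ["npy_int", "npy_long"]},
    "static void @TYPE@_copy(@type@ *p);\n",
)
```

Variable names are case-sensitive, which is why TYPE and type can be used side by side, as in the TOTYPE/totype pair of the example above.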

The preprocessing of generically-typed C source files (whether in NumPy proper or in any third party package using NumPy Distutils) is performed by conv_template.py. The type-specific C files generated (extension: .c) by these modules during the build process are ready to be compiled. This form of generic typing is also supported for C header files (preprocessed to produce .h files).
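The *N shorthand from rule 3 above can be illustrated in the same spirit. The sketch below is an assumption-laden approximation rather than the logic in conv_template.py: expand_shorthand is a hypothetical name and only single items and one level of parentheses are handled.

```python
import re

GROUP = re.compile(r"\(([^()]*)\)\*(\d+)")    # (item1,item2)*N
SINGLE = re.compile(r"([^,()*\s]+)\*(\d+)")   # item*N


def _repeat(match):
    # Repeat the captured text N times, separated by commas.
    return ",".join([match.group(1)] * int(match.group(2)))


def expand_shorthand(spec):
    """Expand *N shorthands in a variable definition into a word list."""
    spec = GROUP.sub(_repeat, spec)    # parenthesised groups first
    spec = SINGLE.sub(_repeat, spec)   # then single repeated items
    return [word.strip() for word in spec.split(",")]
```

With this model, expand_shorthand("(item1,item2)*4") yields the eight-word list given in rule 3.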

Useful functions in numpy.distutils.misc_util#

  • get_numpy_include_dirs() - return a list of NumPy base include directories. NumPy base include directories contain header files such as numpy/arrayobject.h, numpy/funcobject.h etc. For installed NumPy the returned list has length 1 but when building NumPy the list may contain more directories, for example, a path to the config.h file that the numpy/base/setup.py file generates and that is used by numpy header files.

  • append_path(prefix,path) - smart append path to prefix.

  • gpaths(paths,local_path='') - apply glob to paths and prepend local_path if needed.

  • njoin(*path) - join pathname components, convert a /-separated path to an os.sep-separated path, and resolve .. and . in paths. Ex. njoin('a',['b','./c'],'..','g') -> os.path.join('a','b','g').

  • minrelpath(path) - resolves dots in path.

  • rel_path(path,parent_path) - return path relative to parent_path.

  • get_cmd(cmdname,_cache={}) - returns a numpy.distutils command instance.

  • all_strings(lst)

  • has_f_sources(sources)

  • has_cxx_sources(sources)

  • filter_sources(sources) - return c_sources, cxx_sources, f_sources, fmodule_sources

  • get_dependencies(sources)

  • is_local_src_dir(directory)

  • get_ext_source_files(ext)

  • get_script_files(scripts)

  • get_lib_source_files(lib)

  • get_data_files(data)

  • dot_join(*args) — join non-zero arguments with a dot.

  • get_frame(level=0) — return frame object from call stack with given level.

  • cyg2win32(path)

  • mingw32() - return True when using the mingw32 environment.

  • terminal_has_colors(), red_text(s), green_text(s), yellow_text(s), blue_text(s), cyan_text(s)

  • get_path(mod_name,parent_path=None) - return the path of a module relative to parent_path when given. Also handles the __main__ and __builtin__ modules.

  • allpath(name) - replaces / with os.sep in name.

  • cxx_ext_match, fortran_ext_match, f90_ext_match, f90_module_name_match
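The documented behaviour of njoin and dot_join can be approximated with the standard library alone, which may help when reading or porting old setup scripts. These are stand-ins written from the descriptions above, not the numpy.distutils implementations; the _sketch suffix marks them as hypothetical.

```python
import os


def njoin_sketch(*parts):
    """Join components (strings or sequences of strings) and normalise."""
    flat = []
    for part in parts:
        if isinstance(part, (list, tuple)):
            flat.extend(part)       # flatten nested sequences one level
        else:
            flat.append(part)
    # normpath resolves '.' and '..' and uses os.sep as the separator.
    return os.path.normpath(os.path.join(*flat))


def dot_join_sketch(*args):
    """Join the non-empty arguments with a dot."""
    return ".".join(a for a in args if a)
```

The first helper reproduces the example given for njoin above, and the second shows how an empty parent package name simply drops out of a dotted package name.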

numpy.distutils.system_info module#

  • get_info(name,notfound_action=0)

  • combine_paths(*args,**kws)

  • show_all()

numpy.distutils.cpuinfo module#

  • cpuinfo

numpy.distutils.log module#

  • set_verbosity(v)

numpy.distutils.exec_command module#

  • get_pythonexe()

  • find_executable(exe,path=None)

  • exec_command(command,execute_in='',use_shell=None,use_tee=None,**env)

The __init__.py file#

The header of a typical SciPy __init__.py is:

"""
Package docstring, typically with a brief description and function listing.
"""

# import functions into module namespace
from .subpackage import *
...

__all__ = [s for s in dir() if not s.startswith('_')]

from numpy.testing import Tester
test = Tester().test
bench = Tester().bench

Extra features in NumPy Distutils#

Specifying config_fc options for libraries in setup.py script#

It is possible to specify config_fc options in setup.py scripts. For example, using:

config.add_library('library', sources=[...], config_fc={'noopt': (__file__, 1)})

will compile the library sources without optimization flags.

It’s recommended to specify in this way only those config_fc options that are compiler independent.

Getting extra Fortran 77 compiler options from source#

Some old Fortran codes need special compiler options in order to work correctly. In order to specify compiler options per source file, the numpy.distutils Fortran compiler looks for the following pattern:

CF77FLAGS(<fcompiler type>) = <fcompiler f77flags>

in the first 20 lines of the source and uses the f77flags for the specified type of the fcompiler (the first character C is optional).
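A scan of this kind might be sketched as follows. The regular expression and the helper scan_f77flags are assumptions made for illustration and are not taken from numpy.distutils; 'gnu' and '-fno-automatic' are merely example values.

```python
import re

# Optional leading C, then F77FLAGS(<type>) = <flags>; case-insensitive
# matching is an assumption of this sketch.
F77FLAGS = re.compile(
    r"^\s*c?f77flags\s*\(\s*(\w+)\s*\)\s*=\s*(.*)$", re.IGNORECASE
)


def scan_f77flags(source_text, max_lines=20):
    """Map compiler type -> list of flags found in the leading lines."""
    flags = {}
    for line in source_text.splitlines()[:max_lines]:
        match = F77FLAGS.match(line)
        if match:
            flags[match.group(1)] = match.group(2).split()
    return flags


sample = (
    "CF77FLAGS(gnu) = -fno-automatic\n"
    "      subroutine foo\n"
    "      end\n"
)
found = scan_f77flags(sample)
too_late = scan_f77flags("\n" * 20 + "CF77FLAGS(gnu) = -fno-automatic\n")
```

The second call shows that a marker placed after the first 20 lines is ignored in this model.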

TODO: This feature can be easily extended for Fortran 90 codes as well. Let us know if you would need such a feature.