numpy.distutils user guide#
Warning
numpy.distutils is deprecated, and will be removed for Python >= 3.12. For more details, see Status of numpy.distutils and migration advice
SciPy structure#
Currently the SciPy project consists of two packages:
NumPy — it provides packages like:
numpy.distutils - extension to Python distutils
numpy.f2py - a tool to bind Fortran/C codes to Python
numpy._core - future replacement of Numeric and numarray packages
numpy.lib - extra utility functions
numpy.testing - numpy-style tools for unit testing
etc
SciPy — a collection of scientific tools for Python.
The aim of this document is to describe how to add new tools to SciPy.
Requirements for SciPy packages#
SciPy consists of Python packages, called SciPy packages, that are available to Python users via the scipy namespace. Each SciPy package may contain other SciPy packages, and so on. Therefore, the SciPy directory tree is a tree of packages with arbitrary depth and width. Any SciPy package may depend on NumPy packages, but the dependence on other SciPy packages should be kept minimal or zero.
A SciPy package contains, in addition to its sources, the following files and directories:
setup.py - building script
__init__.py - package initializer
tests/ - directory of unittests
Their contents are described below.
The setup.py file#
In order to add a Python package to SciPy, its build script (setup.py) must meet certain requirements. The most important requirement is that the package define a configuration(parent_package='', top_path=None) function which returns a dictionary suitable for passing to numpy.distutils.core.setup(..). To simplify the construction of this dictionary, numpy.distutils.misc_util provides the Configuration class, described below.
SciPy pure Python package example#
Below is an example of a minimal setup.py file for a pure SciPy package:

    #!/usr/bin/env python3
    def configuration(parent_package='', top_path=None):
        from numpy.distutils.misc_util import Configuration
        config = Configuration('mypackage', parent_package, top_path)
        return config

    if __name__ == "__main__":
        from numpy.distutils.core import setup
        #setup(**configuration(top_path='').todict())
        setup(configuration=configuration)

The arguments of the configuration function specify the name of the parent SciPy package (parent_package) and the directory location of the main setup.py script (top_path). These arguments, along with the name of the current package, should be passed to the Configuration constructor.
The Configuration constructor has a fourth optional argument, package_path, that can be used when package files are located in a different location than the directory of the setup.py file.
The remaining Configuration arguments are all keyword arguments that will be used to initialize attributes of the Configuration instance. Usually, these keywords are the same as the ones that the setup(..) function would expect, for example, packages, ext_modules, data_files, include_dirs, libraries, headers, scripts, package_dir, etc. However, specifying these keywords directly is not recommended, because their content will not be processed or checked for consistency by the SciPy build system.
Finally, Configuration has a .todict() method that returns all the configuration data as a dictionary suitable for passing on to the setup(..) function.
Configuration instance attributes#
In addition to attributes that can be specified via keyword arguments to the Configuration constructor, a Configuration instance (let us denote it as config) has the following attributes that can be useful in writing setup scripts:
config.name - full name of the current package. The names of parent packages can be extracted as config.name.split('.').
config.local_path - path to the location of the current setup.py file.
config.top_path - path to the location of the main setup.py file.
Configuration instance methods#
config.todict() - returns a configuration dictionary suitable for passing to the numpy.distutils.core.setup(..) function.
config.paths(*paths) - applies glob.glob(..) to items of paths if necessary. Fixes any paths item that is relative to config.local_path.
config.get_subpackage(subpackage_name, subpackage_path=None) - returns a list of subpackage configurations. The subpackage is looked for in the current directory under the name subpackage_name, but the path can also be specified via the optional subpackage_path argument. If subpackage_name is specified as None then the subpackage name will be taken as the basename of subpackage_path. Any * used in subpackage names is expanded as a wildcard.
config.add_subpackage(subpackage_name, subpackage_path=None) - adds a SciPy subpackage configuration to the current one. The meaning and usage of the arguments are explained above, see the config.get_subpackage() method.
config.add_data_files(*files) - prepends files to the data_files list. If a files item is a tuple then its first element defines the suffix of where data files are copied relative to the package installation directory and the second element specifies the path to the data files. By default data files are copied under the package installation directory. For example,

    config.add_data_files('foo.dat',
                          ('fun', ['gun.dat', 'nun/pun.dat', '/tmp/sun.dat']),
                          'bar/car.dat',
                          '/full/path/to/can.dat',
                          )

will install data files to the following locations

    <installation path of config.name package>/
      foo.dat
      fun/
        gun.dat
        pun.dat
        sun.dat
      bar/
        car.dat
      can.dat

The path to data files can be a function taking no arguments and returning path(s) to data files; this is useful when data files are generated while building the package. (XXX: explain the step when this function is called exactly)
config.add_data_dir(data_path) - adds the directory data_path recursively to data_files. The whole directory tree starting at data_path will be copied under the package installation directory. If data_path is a tuple then its first element defines the suffix of where data files are copied relative to the package installation directory and the second element specifies the path to the data directory. By default, the data directory is copied under the package installation directory under the basename of data_path. For example,

    config.add_data_dir('fun')  # fun/ contains foo.dat bar/car.dat
    config.add_data_dir(('sun', 'fun'))
    config.add_data_dir(('gun', '/full/path/to/fun'))

will install data files to the following locations

    <installation path of config.name package>/
      fun/
        foo.dat
        bar/
          car.dat
      sun/
        foo.dat
        bar/
          car.dat
      gun/
        foo.dat
        bar/
          car.dat

config.add_include_dirs(*paths) - prepends paths to the include_dirs list. This list will be visible to all extension modules of the current package.
config.add_headers(*files) - prepends files to the headers list. By default, headers will be installed under the <prefix>/include/pythonX.X/<config.name.replace('.','/')>/ directory. If a files item is a tuple then its first argument specifies the installation suffix relative to the <prefix>/include/pythonX.X/ path. This is a Python distutils method; its use is discouraged for NumPy and SciPy in favour of config.add_data_files(*files).
config.add_scripts(*files) - prepends files to the scripts list. Scripts will be installed under the <prefix>/bin/ directory.
config.add_extension(name, sources, **kw) - creates and adds an Extension instance to the ext_modules list. The first argument name defines the name of the extension module that will be installed under the config.name package. The second argument is a list of sources. The add_extension method also takes keyword arguments that are passed on to the Extension constructor. The list of allowed keywords is the following: include_dirs, define_macros, undef_macros, library_dirs, libraries, runtime_library_dirs, extra_objects, extra_compile_args, extra_link_args, export_symbols, swig_opts, depends, language, f2py_options, module_dirs, extra_info, extra_f77_compile_args, extra_f90_compile_args.
Note that the config.paths method is applied to all lists that may contain paths. extra_info is a dictionary or a list of dictionaries whose content will be appended to the keyword arguments. The list depends contains paths to files or directories that the sources of the extension module depend on. If any path in the depends list is newer than the extension module, then the module will be rebuilt.
The list of sources may contain functions ('source generators') with the pattern def <funcname>(ext, build_dir): return <source(s) or None>. If funcname returns None, no sources are generated. If the Extension instance has no sources after processing all source generators, no extension module will be built. This is the recommended way to conditionally define extension modules. Source generator functions are called by the build_src sub-command of numpy.distutils.
For example, here is a typical source generator function:

    def generate_source(ext, build_dir):
        import os
        from distutils.dep_util import newer
        target = os.path.join(build_dir, 'somesource.c')
        # newer(source, target) is true when target is missing or older than source
        if newer(__file__, target):
            # create target file
            pass
        return target

The first argument contains the Extension instance, which can be useful for accessing its attributes such as the depends, sources, etc. lists and modifying them during the build process. The second argument gives the path to a build directory that must be used when writing files to disk.
config.add_library(name, sources, **build_info) - adds a library to the libraries list. Allowed keyword arguments are depends, macros, include_dirs, extra_compiler_args, f2py_options, extra_f77_compile_args, extra_f90_compile_args. See the .add_extension() method for more information on arguments.
config.have_f77c() - returns True if a Fortran 77 compiler is available (read: a simple Fortran 77 code compiled successfully).
config.have_f90c() - returns True if a Fortran 90 compiler is available (read: a simple Fortran 90 code compiled successfully).
config.get_version() - returns the version string of the current package, or None if version information could not be detected. This method scans the files __version__.py, <packagename>_version.py, version.py, __svn_version__.py for the string variables version, __version__, <packagename>_version.
config.make_svn_version_py() - appends a data function to the data_files list that will generate the __svn_version__.py file in the current package directory. The file will be removed from the source directory when Python exits.
config.get_build_temp_dir() - returns a path to a temporary directory. This is the place where one should build temporary files.
config.get_distribution() - returns the distutils Distribution instance.
config.get_config_cmd() - returns the numpy.distutils config command instance.
config.get_info(*names) -
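The methods above can be combined in a single configuration function. The following is a minimal sketch rather than a definitive setup script: the package name mypackage, the extension name _generated, the file foo.dat and the tests directory are all hypothetical, and the generator only checks whether the target file exists instead of comparing timestamps. It uses only the Configuration methods described in this section.

```python
import os

# Hypothetical C source written by the generator: an empty extension module.
_C_SOURCE = (
    '#include <Python.h>\n'
    'static struct PyModuleDef moddef = {\n'
    '    PyModuleDef_HEAD_INIT, "_generated", NULL, -1, NULL\n'
    '};\n'
    'PyMODINIT_FUNC PyInit__generated(void)\n'
    '{\n'
    '    return PyModule_Create(&moddef);\n'
    '}\n'
)


def generate_source(ext, build_dir):
    """Source generator: create build_dir/somesource.c if it is missing."""
    target = os.path.join(build_dir, 'somesource.c')
    if not os.path.exists(target):
        os.makedirs(build_dir, exist_ok=True)
        with open(target, 'w') as f:
            f.write(_C_SOURCE)
    return target


def configuration(parent_package='', top_path=None):
    # Imported lazily so that this module can be loaded without numpy.distutils.
    from numpy.distutils.misc_util import Configuration
    config = Configuration('mypackage', parent_package, top_path)
    config.add_data_files('foo.dat')   # assumes foo.dat sits next to setup.py
    config.add_data_dir('tests')       # assumes a tests/ directory exists
    config.add_extension('_generated', sources=[generate_source])
    return config
```

A real setup.py would finish with the same `if __name__ == "__main__":` block as the pure Python example earlier in this guide.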
Conversion of .src files using templates#
NumPy distutils supports automatic conversion of source files named <somefile>.src. This facility can be used to maintain very similar code blocks requiring only simple changes between blocks. During the build phase of setup, if a template file named <somefile>.src is encountered, a new file named <somefile> is constructed from the template and placed in the build directory to be used instead. Two forms of template conversion are supported. The first form occurs for files named <file>.ext.src where ext is a recognized Fortran extension (f, f90, f95, f77, for, ftn, pyf). The second form is used for all other cases.
Fortran files#
This template converter will replicate all function and subroutine blocks in the file with names that contain ‘<…>’ according to the rules in ‘<…>’. The number of comma-separated words in ‘<…>’ determines the number of times the block is repeated. The words themselves indicate what the repeat rule, ‘<…>’, should be replaced with in each block. All of the repeat rules in a block must contain the same number of comma-separated words, indicating the number of times that block should be repeated. If a word in the repeat rule needs a comma, leftarrow (<), or rightarrow (>), then prepend it with a backslash ‘\’. If a word in the repeat rule matches ‘\<index>’ then it will be replaced with the <index>-th word in the same repeat specification. There are two forms for the repeat rule: named and short.
Named repeat rule#
A named repeat rule is useful when the same set of repeats must be used several times in a block. It is specified using <rule1=item1, item2, item3, …, itemN>, where N is the number of times the block should be repeated. On each repeat of the block, the entire expression, ‘<…>’, will be replaced first with item1, and then with item2, and so forth until N repeats are accomplished. Once a named repeat specification has been introduced, the same repeat rule may be used in the current block by referring only to the name (i.e. <rule1>).
Short repeat rule#
A short repeat rule looks like <item1, item2, item3, …, itemN>. The rule specifies that the entire expression, ‘<…>’, should be replaced first with item1, and then with item2, and so forth until N repeats are accomplished.
Pre-defined names#
The following predefined named repeat rules are available:
<prefix=s,d,c,z>
<_c=s,d,c,z>
<_t=real, double precision, complex, double complex>
<ftype=real, double precision, complex, double complex>
<ctype=float, double, complex_float, complex_double>
<ftypereal=real, double precision, \0, \1>
<ctypereal=float, double, \0, \1>
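To make the named and short repeat rules concrete, here is a simplified sketch of the expansion step. It is an illustration written for this guide, not the converter that numpy.distutils ships: the function name expand_block and the PREDEFINED table are hypothetical, only two of the predefined names are included, and backslash escapes and ‘\<index>’ references are not handled.

```python
import re

# Hypothetical subset of the predefined named repeat rules.
PREDEFINED = {
    'prefix': ['s', 'd', 'c', 'z'],
    'ftype': ['real', 'double precision', 'complex', 'double complex'],
}

RULE_RE = re.compile(r'<([^<>]*)>')


def expand_block(block, predefined=PREDEFINED):
    """Return the list of copies produced by the repeat rules in one block."""
    names = dict(predefined)
    # First pass: record named rules of the form <name=item1,item2,...>.
    for body in RULE_RE.findall(block):
        if '=' in body:
            name, items = body.split('=', 1)
            names[name.strip()] = [w.strip() for w in items.split(',')]

    def words_for(body):
        """Replacement words for a rule body, or None if it is not a rule."""
        if '=' in body:                      # named rule definition
            return names[body.split('=', 1)[0].strip()]
        if ',' in body:                      # short rule
            return [w.strip() for w in body.split(',')]
        return names.get(body.strip())       # reference to a named rule

    rules = [b for b in RULE_RE.findall(block) if words_for(b) is not None]
    counts = {len(words_for(b)) for b in rules}
    if len(counts) > 1:
        raise ValueError('all repeat rules in a block must have the same length')
    n = counts.pop() if counts else 1

    def substitute(match, i):
        words = words_for(match.group(1))
        return match.group(0) if words is None else words[i]

    return [RULE_RE.sub(lambda m: substitute(m, i), block) for i in range(n)]


template = (
    "subroutine <prefix>axpy(n, a, x, y)\n"
    "  <ftype> a, x(n), y(n)\n"
    "end subroutine <prefix>axpy\n"
)
copies = expand_block(template)
```

With this template the sketch produces four copies of the block, one per word of the prefix and ftype rules, so the first copy declares subroutine saxpy with real arguments.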
Other files#
Non-Fortran files use a separate syntax for defining template blocks that should be repeated using a variable expansion similar to the named repeat rules of the Fortran-specific repeats.
NumPy Distutils preprocesses C source files (extension: .c.src) written in a custom templating language to generate C code. The @ symbol is used to wrap macro-style variables to empower a string substitution mechanism that might describe (for instance) a set of data types.
The template language blocks are delimited by /**begin repeat and /**end repeat**/ lines, which may also be nested using consecutively numbered delimiting lines such as /**begin repeat1 and /**end repeat1**/:
/**begin repeat on a line by itself marks the beginning of a segment that should be repeated.
Named variable expansions are defined using #name=item1, item2, item3, ..., itemN# and placed on successive lines. These variables are replaced in each repeat block with the corresponding word. All named variables in the same repeat block must define the same number of words.
In specifying the repeat rule for a named variable, item*N is short-hand for item, item, ..., item repeated N times. In addition, parentheses in combination with *N can be used for grouping several items that should be repeated. Thus, #name=(item1, item2)*4# is equivalent to #name=item1, item2, item1, item2, item1, item2, item1, item2#.
*/ on a line by itself marks the end of the variable expansion naming. The next line is the first line that will be repeated using the named rules.
Inside the block to be repeated, the variables that should be expanded are specified as @name@.
/**end repeat**/ on a line by itself marks the previous line as the last line of the block to be repeated.
A loop in the NumPy C source code may have a @TYPE@ variable, targeted for string substitution, which is preprocessed to a number of otherwise identical loops with several strings such as INT, LONG, UINT, ULONG. The @TYPE@ style syntax thus reduces code duplication and maintenance burden by mimicking languages that have generic type support.
The above rules may be clearer in the following template source example:

    /* TIMEDELTA to non-float types */

    /**begin repeat
     *
     * #TOTYPE = BYTE, UBYTE, SHORT, USHORT, INT, UINT, LONG, ULONG,
     *           LONGLONG, ULONGLONG, DATETIME,
     *           TIMEDELTA#
     * #totype = npy_byte, npy_ubyte, npy_short, npy_ushort, npy_int, npy_uint,
     *           npy_long, npy_ulong, npy_longlong, npy_ulonglong,
     *           npy_datetime, npy_timedelta#
     */

    /**begin repeat1
     *
     * #FROMTYPE = TIMEDELTA#
     * #fromtype = npy_timedelta#
     */
    static void
    @FROMTYPE@_to_@TOTYPE@(void *input, void *output, npy_intp n,
            void *NPY_UNUSED(aip), void *NPY_UNUSED(aop))
    {
        const @fromtype@ *ip = input;
        @totype@ *op = output;

        while (n--) {
            *op++ = (@totype@)*ip++;
        }
    }
    /**end repeat1**/

    /**end repeat**/

The preprocessing of generically-typed C source files (whether in NumPy proper or in any third party package using NumPy Distutils) is performed by conv_template.py. The type-specific C files generated (extension: .c) by these modules during the build process are ready to be compiled. This form of generic typing is also supported for C header files (preprocessed to produce .h files).
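The substitution rules for non-Fortran templates can be sketched in a few lines of Python. This is a simplified illustration, not conv_template.py itself: the function names expand_rule and repeat_block are hypothetical, the variables are passed in as a dictionary instead of being parsed from the /**begin repeat comment, and nested repeat blocks are not handled.

```python
import re


def expand_rule(rule):
    """Expand 'item*N' and '(a,b)*N' shorthand into a flat list of words."""
    def repeat_group(match):
        # "(a,b)*2" becomes "a,b,a,b"
        return ','.join([match.group(1)] * int(match.group(2)))

    rule = re.sub(r'\(([^()]*)\)\*(\d+)', repeat_group, rule)
    words = []
    for word in rule.split(','):
        word = word.strip()
        match = re.fullmatch(r'(.+?)\*(\d+)', word)
        if match:
            # "item*3" becomes item, item, item
            words.extend([match.group(1).strip()] * int(match.group(2)))
        else:
            words.append(word)
    return words


def repeat_block(body, variables):
    """Repeat body once per word, replacing every @name@ in each copy."""
    expanded = {name: expand_rule(rule) for name, rule in variables.items()}
    lengths = {len(words) for words in expanded.values()}
    if len(lengths) != 1:
        raise ValueError('all named variables must define the same number of words')
    copies = []
    for i in range(lengths.pop()):
        text = body
        for name, words in expanded.items():
            text = text.replace('@%s@' % name, words[i])
        copies.append(text)
    return ''.join(copies)


generated = repeat_block(
    "static void @TYPE@_copy(@type@ *dst, const @type@ *src);\n",
    {'TYPE': 'INT, LONG', 'type': 'npy_int, npy_long'},
)
```

Here generated holds two prototypes, INT_copy taking npy_int pointers and LONG_copy taking npy_long pointers, which mirrors how the two variable lists in the TIMEDELTA example above are consumed in lockstep.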
Useful functions in numpy.distutils.misc_util#
get_numpy_include_dirs() - return a list of NumPy base include directories. NumPy base include directories contain header files such as numpy/arrayobject.h, numpy/funcobject.h etc. For installed NumPy the returned list has length 1, but when building NumPy the list may contain more directories, for example, a path to the config.h file that the numpy/base/setup.py file generates and that is used by numpy header files.
append_path(prefix, path) - smart append path to prefix.
gpaths(paths, local_path='') - apply glob to paths and prepend local_path if needed.
njoin(*path) - join pathname components, convert a /-separated path to an os.sep-separated path, and resolve .. and . in paths. Ex. njoin('a', ['b', './c'], '..', 'g') -> os.path.join('a', 'b', 'g').
minrelpath(path) - resolves dots in path.
rel_path(path, parent_path) - return path relative to parent_path.
get_cmd(cmdname, _cache={}) - returns a numpy.distutils command instance.
all_strings(lst)
has_f_sources(sources)
has_cxx_sources(sources)
filter_sources(sources) - return c_sources, cxx_sources, f_sources, fmodule_sources
get_dependencies(sources)
is_local_src_dir(directory)
get_ext_source_files(ext)
get_script_files(scripts)
get_lib_source_files(lib)
get_data_files(data)
dot_join(*args) - join non-zero arguments with a dot.
get_frame(level=0) - return frame object from call stack with given level.
cyg2win32(path)
mingw32() - return True when using mingw32 environment.
terminal_has_colors(), red_text(s), green_text(s), yellow_text(s), blue_text(s), cyan_text(s)
get_path(mod_name, parent_path=None) - return path of a module relative to parent_path when given. Handles also __main__ and __builtin__ modules.
allpath(name) - replaces / with os.sep in name.
cxx_ext_match, fortran_ext_match, f90_ext_match, f90_module_name_match
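The behaviour of a few of the path helpers can be approximated with the standard library alone. The functions below are sketches written from the one-line descriptions above, not the numpy.distutils implementations; the _sketch suffixes are there to mark them as hypothetical stand-ins, and edge cases such as absolute paths or empty arguments may differ from the real helpers.

```python
import os


def dot_join_sketch(*args):
    """Join the non-empty arguments with a dot."""
    return '.'.join([a for a in args if a])


def allpath_sketch(name):
    """Replace / with os.sep in name."""
    return os.path.join(*name.split('/'))


def njoin_sketch(*parts):
    """Join components (strings or lists of strings) and resolve . and .."""
    flat = []
    for part in parts:
        if isinstance(part, (list, tuple)):
            flat.extend(part)
        else:
            flat.append(part)
    joined = '/'.join(flat)
    return os.path.normpath(joined.replace('/', os.sep))


package_name = dot_join_sketch('scipy', '', 'linalg')
joined_path = njoin_sketch('a', ['b', './c'], '..', 'g')
```

With these definitions package_name is 'scipy.linalg', and joined_path equals os.path.join('a', 'b', 'g'), matching the njoin example given in the list.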
numpy.distutils.system_info module#
get_info(name, notfound_action=0)
combine_paths(*args, **kws)
show_all()
numpy.distutils.cpuinfo module#
cpuinfo
numpy.distutils.log module#
set_verbosity(v)
numpy.distutils.exec_command module#
get_pythonexe()
find_executable(exe, path=None)
exec_command(command, execute_in='', use_shell=None, use_tee=None, **env)
The __init__.py file#
The header of a typical SciPy __init__.py is:

    """
    Package docstring, typically with a brief description and function listing.
    """

    # import functions into module namespace
    from .subpackage import *
    ...

    __all__ = [s for s in dir() if not s.startswith('_')]

    from numpy.testing import Tester
    test = Tester().test
    bench = Tester().bench

Extra features in NumPy Distutils#
Specifying config_fc options for libraries in setup.py script#
It is possible to specify config_fc options in setup.py scripts. For example, using:

    config.add_library('library',
                       sources=[...],
                       config_fc={'noopt': (__file__, 1)})

will compile the library sources without optimization flags.
It is recommended to specify in this way only those config_fc options that are compiler independent.
Getting extra Fortran 77 compiler options from source#
Some old Fortran codes need special compiler options in order to work correctly. In order to specify compiler options per source file, the numpy.distutils Fortran compiler looks for the following pattern:

    CF77FLAGS(<fcompiler type>) = <fcompiler f77flags>

in the first 20 lines of the source and uses the f77flags for the specified type of the fcompiler (the first character C is optional).
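The lookup described above can be sketched as a small scanner. The regular expression and the function name get_f77flags_sketch are assumptions made for this illustration and are not copied from numpy.distutils; the sketch only shows the idea of matching the pattern within the first 20 lines and collecting the flags per compiler type.

```python
import re

# Assumed pattern, written from the description above: an optional leading C,
# the compiler type in parentheses, then the flags after the equals sign.
F77FLAGS_RE = re.compile(
    r'^C?F77FLAGS\s*\(\s*(?P<fcname>\w+)\s*\)\s*=\s*(?P<fflags>.*)$',
    re.IGNORECASE,
)


def get_f77flags_sketch(source_text, max_lines=20):
    """Map compiler type to its list of flags, scanning only the first lines."""
    flags = {}
    for line in source_text.splitlines()[:max_lines]:
        match = F77FLAGS_RE.match(line.strip())
        if match:
            flags[match.group('fcname')] = match.group('fflags').split()
    return flags


example_source = (
    "CF77FLAGS(gnu) = -fno-automatic -O1\n"
    "      subroutine foo\n"
    "      end\n"
)
example_flags = get_f77flags_sketch(example_source)
```

For example_source the sketch returns a dictionary with the single key 'gnu' mapped to the two flags on the first line; a matching line placed after line 20 would be ignored.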
TODO: This feature can be easily extended for Fortran 90 codes aswell. Let us know if you would need such a feature.