nvmath-python Bindings#
Overview#
Warning
All Python bindings documented in this section are experimental and subject to future changes. Use them at your own risk.
Low-level Python bindings for C APIs from NVIDIA Math Libraries are exposed under the corresponding modules in nvmath.bindings. To access the Python bindings, use the modules for the corresponding libraries. Under the hood, nvmath-python handles the run-time linking to the libraries for you lazily.
The currently supported libraries along with the corresponding module names are listed as follows:
| Library name | Python access |
|---|---|
| cuBLAS | nvmath.bindings.cublas |
| cuBLASLt | nvmath.bindings.cublaslt |
| cuBLASMp | nvmath.bindings.cublasMp |
| cuDSS | nvmath.bindings.cudss |
| cuFFT | nvmath.bindings.cufft |
| cuRAND | nvmath.bindings.curand |
| cuSOLVER | nvmath.bindings.cusolver |
| cuSOLVERDn | nvmath.bindings.cusolverDn |
| cuSPARSE | nvmath.bindings.cusparse |
| NVPL BLAS | nvmath.bindings.nvpl.blas |
| NVPL FFT | nvmath.bindings.nvpl.fft |
Support for more libraries will be added in the future.
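As a minimal sketch, accessing a set of bindings is an ordinary module import; per the lazy run-time linking described above, the underlying shared library is only loaded when one of its APIs is first called:

```python
# Import the low-level cuFFT bindings; the cuFFT shared library itself is
# linked lazily, i.e. only when one of its functions is first invoked.
from nvmath.bindings import cufft

# The module exposes the C APIs under Pythonic names (see the next section).
print([name for name in dir(cufft) if not name.startswith("_")][:10])
```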
Naming & Calling Convention#
Inside each of the modules, all public APIs of the corresponding NVIDIA Math library are exposed following the PEP 8 style guide, along with the following changes:
- All library name prefixes are stripped
- The function names are broken by words and follow the snake case
- The first letter in each word in the enum names is capitalized
- Each enum's name prefix is stripped from its values' names
- Whenever applicable, the outputs are stripped away from the function arguments and returned directly as Python objects
- Pointers are passed as Python int
- Exceptions are raised instead of returning the C error code
Below is a non-exhaustive list of examples of such C-to-Python mappings:
- Function: cublasDgemm -> cublas.dgemm()
- Function: curandSetGeneratorOrdering -> curand.set_generator_ordering()
- Enum type: cublasLtMatmulTile_t -> cublasLt.MatmulTile
- Enum type: cufftXtSubFormat -> cufft.XtSubFormat
- Enum value name: CUSOLVER_EIG_MODE_NOVECTOR -> cusolver.EigMode.NOVECTOR
- Enum value name: CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED -> cusparse.Status.MATRIX_TYPE_NOT_SUPPORTED
- Returns: The outputs of cusolverDnXpotrf_bufferSize are the workspace sizes on device and host, which are wrapped as a 2-tuple in the corresponding cusolverDn.xpotrf_buffer_size() Python API.
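To illustrate these rules at the Python level, the renamed enums listed above can be referenced directly from the corresponding binding modules (a small sketch based on the mappings shown):

```python
from nvmath.bindings import cusolver, cusparse

# CUSOLVER_EIG_MODE_NOVECTOR becomes cusolver.EigMode.NOVECTOR
mode = cusolver.EigMode.NOVECTOR

# CUSPARSE_STATUS_MATRIX_TYPE_NOT_SUPPORTED becomes
# cusparse.Status.MATRIX_TYPE_NOT_SUPPORTED
status = cusparse.Status.MATRIX_TYPE_NOT_SUPPORTED

print(mode, status)
```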
There may be exceptions to the above rules, but they would be self-evident and will be properly documented. In the next section we discuss pointer passing in Python.
Memory management#
Pointer and data lifetime#
Unlike in C/C++, Python does not provide low-level primitives to allocate/deallocate host memory (nor device memory). In order to make the C APIs work with Python, it is important that memory management is properly done through Python proxy objects. In nvmath-python, we ask users to address such needs using NumPy (for host memory) and CuPy (for device memory).
Note
It is also possible to use array.array (plus memoryview as needed) to manage host memory. However, it is more laborious compared to using numpy.ndarray, especially when it comes to array manipulation and computation.
Note
It is also possible to use CUDA Python to manage device memory, but as of CUDA 11 there is no simple, pythonic way to modify the contents stored on the GPU, which requires custom kernels. CuPy is a lightweight, NumPy-compatible array library that addresses this need.
To pass data from Python to C, using pointer addresses (as Python int) of various objects is required. We illustrate this using NumPy/CuPy arrays as follows:
```python
import numpy
import cupy

# create a host buffer to hold 5 int
buf = numpy.empty((5,), dtype=numpy.int32)
# pass buf's pointer to the wrapper
# buf could get modified in-place if the function writes to it
my_func(..., buf.ctypes.data, ...)
# examine/use buf's data
print(buf)

# create a device buffer to hold 10 double
buf = cupy.empty((10,), dtype=cupy.float64)
# pass buf's pointer to the wrapper
# buf could get modified in-place if the function writes to it
my_func(..., buf.data.ptr, ...)
# examine/use buf's data
print(buf)

# create an untyped device buffer of 128 bytes
buf = cupy.cuda.alloc(128)
# pass buf's pointer to the wrapper
# buf could get modified in-place if the function writes to it
my_func(..., buf.ptr, ...)
# buf is automatically destroyed when going out of scope
```
The underlying assumption is that the arrays must be contiguous in memory (unless the C interface allows for specifying the array strides).
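If an array may be non-contiguous (for example, a strided view), it can be made contiguous before its pointer is passed. A minimal sketch using NumPy follows; the same idea applies to CuPy via cupy.ascontiguousarray:

```python
import numpy

a = numpy.arange(12, dtype=numpy.float64).reshape(3, 4)
view = a[:, ::2]                      # a strided, non-contiguous view
print(view.flags["C_CONTIGUOUS"])     # False

buf = numpy.ascontiguousarray(view)   # copy into a contiguous buffer
print(buf.flags["C_CONTIGUOUS"])      # True
# buf.ctypes.data is now safe to pass to a C API expecting contiguous data
```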
As a consequence, all C structs in NVIDIA Math libraries (including handles and descriptors) are not exposed as Python classes; that is, they do not have their own types and are simply cast to plain Python int for passing around. Any downstream consumer should create a wrapper class to hold the pointer address if so desired. In other words, users have full control (and responsibility) for managing the pointer lifetime.
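For instance, a downstream library might wrap a cuBLAS handle in a small class that owns the pointer and releases it deterministically. The sketch below is illustrative only; it assumes the handle-management functions follow the naming convention above (cublasCreate/cublasDestroy exposed as cublas.create()/cublas.destroy()):

```python
from nvmath.bindings import cublas


class CublasHandle:
    """Illustrative wrapper owning a cuBLAS handle (a plain Python int)."""

    def __init__(self):
        # Assumed mapping: cublasCreate -> cublas.create(), returning the
        # handle as an int (per the convention described above).
        self.handle = cublas.create()

    def close(self):
        if self.handle is not None:
            # Assumed mapping: cublasDestroy -> cublas.destroy(handle).
            cublas.destroy(self.handle)
            self.handle = None

    def __enter__(self):
        return self.handle

    def __exit__(self, *exc):
        self.close()


# Usage: the raw int handle is what gets passed to other cublas.* calls.
with CublasHandle() as handle:
    print(hex(handle))
```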
However, in certain cases we are able to convert Python objects for users (if readonly, host arrays are needed) so as to alleviate users' burden. For example, in functions that require a sequence or a nested sequence, the following operations are equivalent:
```python
import numpy

# passing a host buffer of int type can be done like this
buf = numpy.array([0, 1, 3, 5, 6], dtype=numpy.int32)
my_func(..., buf.ctypes.data, ...)

# or just this
buf = [0, 1, 3, 5, 6]
my_func(..., buf, ...)  # the underlying data type is determined by the C API
```
which is particularly useful when users need to pass multiple sequences or nested sequences to C (for example, nvmath.).
Note
Some functions require their arguments to be in device memory. You need to pass device memory (for example, cupy.ndarray) to such arguments. nvmath-python neither validates the memory pointers nor implicitly transfers the data. Passing host memory where device memory is expected (and vice versa) results in undefined behavior.
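Because no implicit transfer happens, host data must be copied to the device explicitly before its pointer is handed to such an argument. A minimal sketch using CuPy; my_device_func is a hypothetical placeholder for whatever binding expects a device pointer:

```python
import numpy
import cupy

h_buf = numpy.arange(8, dtype=numpy.float64)   # host data
d_buf = cupy.asarray(h_buf)                    # explicit host -> device copy

# Pass the device pointer to a function that expects device memory.
# my_device_func is hypothetical; substitute the actual binding you call.
my_device_func(..., d_buf.data.ptr, ...)

# Copy the (possibly modified) result back to the host if needed.
h_out = cupy.asnumpy(d_buf)
```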
API Reference#
This reference describes all of nvmath-python's math primitives.
- cuBLAS (nvmath.bindings.cublas)
- cuBLASLt (nvmath.bindings.cublaslt)
- cuBLASMp (nvmath.bindings.cublasMp)
- cuDSS (nvmath.bindings.cudss)
- cuFFT (nvmath.bindings.cufft)
- cuSOLVER (nvmath.bindings.cusolver)
- cuSOLVERDn (nvmath.bindings.cusolverDn)
- cuSPARSE (nvmath.bindings.cusparse)
- cuRAND (nvmath.bindings.curand)
- NVPL BLAS (nvmath.bindings.nvpl.blas)
- NVPL FFT (nvmath.bindings.nvpl.fft)