Typed Memoryviews

Note

This page uses two different syntax variants:

  • Cython specific cdef syntax, which was designed to make type declarations concise and easily readable from a C/C++ perspective.

  • Pure Python syntax which allows static Cython type declarations in pure Python code, following PEP-484 type hints and PEP 526 variable annotations.

    To make use of C data types in Python syntax, you need to import the special cython module in the Python module that you want to compile, e.g.

    import cython

    If you use the pure Python syntax we strongly recommend you use a recent Cython 3 release, since significant improvements have been made here compared to the 0.29.x releases.

Typed memoryviews allow efficient access to memory buffers, such as those underlying NumPy arrays, without incurring any Python overhead. Memoryviews are similar to the current NumPy array buffer support (np.ndarray[np.float64_t, ndim=2]), but they have more features and cleaner syntax.

Memoryviews are more general than the old NumPy array buffer support, because they can handle a wider variety of sources of array data. For example, they can handle C arrays and the Cython array type (Cython arrays).

A memoryview can be used in any context (function parameters, module-level, cdef class attribute, etc.) and can be obtained from nearly any object that exposes a writable buffer through the PEP 3118 buffer interface.
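The buffer interface itself can be explored from plain Python. The sketch below is not Cython code: it uses Python's built-in memoryview, which is a different type from Cython's typed memoryview and serves here only as an analogue, to show that unrelated objects export the same PEP 3118 interface. The variable names are illustrative.

```python
from array import array

# Three unrelated standard-library objects that export the PEP 3118 buffer interface.
exporters = [bytearray(b"abcd"), array("i", [1, 2, 3]), b"xyz"]

# Collect (type name, item format, shape, read-only flag) for each exporter.
described = []
for obj in exporters:
    view = memoryview(obj)
    described.append((type(obj).__name__, view.format, view.shape, view.readonly))

for row in described:
    print(row)
```

The bytes object reports a read-only buffer, which is the case that the Read-only views section below addresses.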

Quickstart

If you are used to working with NumPy, the following examples should get you started with Cython memory views.

    from cython.cimports.cython.view import array as cvarray
    import numpy as np

    # Memoryview on a NumPy array
    narr = np.arange(27, dtype=np.dtype("i")).reshape((3, 3, 3))
    narr_view = cython.declare(cython.int[:, :, :], narr)

    # Memoryview on a C array
    carr = cython.declare(cython.int[3][3][3])
    carr_view = cython.declare(cython.int[:, :, :], carr)

    # Memoryview on a Cython array
    cyarr = cvarray(shape=(3, 3, 3), itemsize=cython.sizeof(cython.int), format="i")
    cyarr_view = cython.declare(cython.int[:, :, :], cyarr)

    # Show the sum of all the arrays before altering it
    print(f"NumPy sum of the NumPy array before assignments: {narr.sum()}")

    # We can copy the values from one memoryview into another using a single
    # statement, by either indexing with ... or (NumPy-style) with a colon.
    carr_view[...] = narr_view
    cyarr_view[:] = narr_view
    # NumPy-style syntax for assigning a single value to all elements.
    narr_view[:, :, :] = 3

    # Just to distinguish the arrays
    carr_view[0, 0, 0] = 100
    cyarr_view[0, 0, 0] = 1000

    # Assigning into the memoryview on the NumPy array alters the latter
    print(f"NumPy sum of NumPy array after assignments: {narr.sum()}")

    # A function using a memoryview does not usually need the GIL
    @cython.nogil
    @cython.ccall
    def sum3d(arr: cython.int[:, :, :]) -> cython.int:
        i: cython.size_t
        j: cython.size_t
        k: cython.size_t
        I: cython.size_t
        J: cython.size_t
        K: cython.size_t
        total: cython.int = 0
        I = arr.shape[0]
        J = arr.shape[1]
        K = arr.shape[2]
        for i in range(I):
            for j in range(J):
                for k in range(K):
                    total += arr[i, j, k]
        return total

    # A function accepting a memoryview knows how to use a NumPy array,
    # a C array, a Cython array...
    print(f"Memoryview sum of NumPy array is {sum3d(narr)}")
    print(f"Memoryview sum of C array is {sum3d(carr)}")
    print(f"Memoryview sum of Cython array is {sum3d(cyarr)}")
    # ... and of course, a memoryview.
    print(f"Memoryview sum of C memoryview is {sum3d(carr_view)}")

This code should give the following output:

    NumPy sum of the NumPy array before assignments: 351
    NumPy sum of NumPy array after assignments: 81
    Memoryview sum of NumPy array is 81
    Memoryview sum of C array is 451
    Memoryview sum of Cython array is 1351
    Memoryview sum of C memoryview is 451

Using memoryviews

Syntax

Memory views use Python slicing syntax in a similar way as NumPy.

To create a complete view on a one-dimensional int buffer:

    view1D: cython.int[:] = exporting_object

A complete 3D view:

    view3D: cython.int[:, :, :] = exporting_object

They also work conveniently as function arguments:

    def process_3d_buffer(view: cython.int[:, :, :]):
        ...

The Cython not None declaration for the argument automatically rejects None values as input, which would otherwise be allowed. The reason why None is allowed by default is that it is conveniently used for return arguments. On the other hand, when pure Python mode is used, a None value is rejected by default. It is allowed only when the type is declared as Optional:

    import numpy as np
    import typing

    def process_buffer(input_view: cython.int[:, :],
                       output_view: typing.Optional[cython.int[:, :]] = None):

        if output_view is None:
            # Creating a default view, e.g.
            output_view = np.empty_like(input_view)

        # process 'input_view' into 'output_view'
        return output_view

    process_buffer(None, None)

Cython will reject incompatible buffers automatically, e.g. passing a three-dimensional buffer into a function that requires a two-dimensional buffer will raise a ValueError.

To use a memory view on a numpy array with a custom dtype, you’ll need to declare an equivalent packed struct that mimics the dtype:

    import numpy as np

    CUSTOM_DTYPE = np.dtype([
        ('x', np.uint8),
        ('y', np.float32),
    ])

    a = np.zeros(100, dtype=CUSTOM_DTYPE)

    cdef packed struct custom_dtype_struct:
        # The struct needs to be packed since by default numpy dtypes aren't
        # aligned
        unsigned char x
        float y

    def sum(custom_dtype_struct[:] a):

        cdef:
            unsigned char sum_x = 0
            float sum_y = 0.

        for i in range(a.shape[0]):
            sum_x += a[i].x
            sum_y += a[i].y

        return sum_x, sum_y

Note

Pure Python mode currently does not support packed structs.

Indexing

In Cython, index access on memory views is automatically translated into memory addresses. The following code requests a two-dimensional memory view of C int typed items and indexes into it:

    buf: cython.int[:, :] = exporting_object

    print(buf[1, 2])

Negative indices work as well, counting from the end of the respective dimension:

    print(buf[-1, -2])
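As a rough illustration of that translation (the arithmetic for a direct, strided buffer, not the C code Cython actually generates), the byte offset of an element is the sum of index times stride over all dimensions, after negative indices have been wrapped. The helper names element_offset and item are hypothetical.

```python
import struct
from array import array

def element_offset(index, shape, strides):
    """Byte offset of one element in a direct, strided buffer."""
    offset = 0
    for i, n, stride in zip(index, shape, strides):
        if i < 0:
            i += n                      # negative indices count from the end
        if not 0 <= i < n:
            raise IndexError("index out of bounds")
        offset += i * stride
    return offset

data = array("i", range(6))             # values 0..5, treated as 2 rows x 3 columns
shape = (2, 3)
strides = (3 * data.itemsize, data.itemsize)   # C contiguous layout
raw = data.tobytes()

def item(index):
    # Read one C int straight from the raw bytes at the computed offset.
    return struct.unpack_from("i", raw, element_offset(index, shape, strides))[0]

print(item((1, 2)))     # 5
print(item((-1, -2)))   # 4
```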

The following function loops over each dimension of a 2D array and adds 1 to each item:

    import numpy as np

    def add_one(buf: cython.int[:, :]):
        for x in range(buf.shape[0]):
            for y in range(buf.shape[1]):
                buf[x, y] += 1

    # exporting_object must be a Python object
    # implementing the buffer interface, e.g. a numpy array.
    exporting_object = np.zeros((10, 20), dtype=np.intc)

    add_one(exporting_object)

Indexing and slicing can be done with or without the GIL. It basically works like NumPy. If indices are specified for every dimension you will get an element of the base type (e.g. int). Otherwise, you will get a new view. An Ellipsis means you get consecutive slices for every unspecified dimension:

    import numpy as np

    def main():
        exporting_object = np.arange(0, 15 * 10 * 20, dtype=np.intc).reshape((15, 10, 20))

        my_view: cython.int[:, :, :] = exporting_object

        # These are all equivalent
        my_view[10]
        my_view[10, :, :]
        my_view[10, ...]

Copying

Memory views can be copied in place:

    import numpy as np

    def main():
        to_view: cython.int[:, :, :] = np.empty((20, 15, 30), dtype=np.intc)
        from_view: cython.int[:, :, :] = np.ones((20, 15, 30), dtype=np.intc)

        # copy the elements in from_view to to_view
        to_view[...] = from_view
        # or
        to_view[:] = from_view
        # or
        to_view[:, :, :] = from_view
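The same element-wise copy semantics can be observed without compiling anything. This is a sketch using Python's built-in memoryview as a stand-in for a typed memoryview, with illustrative names; slice assignment copies the items, and the two buffers remain independent afterwards.

```python
from array import array

source = array("i", [1, 2, 3, 4])
target = array("i", [0, 0, 0, 0])

# Slice assignment copies the elements; shapes and item formats must match.
memoryview(target)[:] = memoryview(source)

# The buffers are independent: changing the source does not touch the copy.
source[0] = 99
print(target.tolist())   # [1, 2, 3, 4]
```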

They can also be copied with the copy() and copy_fortran() methods; see C and Fortran contiguous copies.

Transposing

In most cases (see below), the memoryview can be transposed in the same way that NumPy slices can be transposed:

    import numpy as np

    def main():
        array = np.arange(20, dtype=np.intc).reshape((2, 10))

        c_contig: cython.int[:, ::1] = array
        f_contig: cython.int[::1, :] = c_contig.T

This gives a new, transposed, view on the data.

Transposing requires that all dimensions of the memoryview have a direct access memory layout (i.e., there are no indirections through pointers). See Specifying more general memory layouts for details.
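For a direct, strided layout, a transposed view can be modelled as the same memory described with reversed shape and strides. The following is a simplified sketch of that idea, not Cython's implementation; the helper names and the 4-byte item size are assumptions.

```python
def transposed(shape, strides):
    """Reverse the axis order; the underlying data is not moved."""
    return shape[::-1], strides[::-1]

def offset(index, strides):
    """Byte offset of an element in a direct, strided layout."""
    return sum(i * s for i, s in zip(index, strides))

itemsize = 4                                  # assumed size of a C int in bytes
shape = (2, 10)
strides = (10 * itemsize, itemsize)           # C contiguous 2 x 10 array

t_shape, t_strides = transposed(shape, strides)
print(t_shape, t_strides)                     # (10, 2) (4, 40)

# Element [j, i] of the transposed view is element [i, j] of the original.
same_memory = all(
    offset((i, j), strides) == offset((j, i), t_strides)
    for i in range(shape[0])
    for j in range(shape[1])
)
print(same_memory)                            # True
```

The first stride of the transposed layout equals the item size, which is why the transpose of a C contiguous array can be assigned to a Fortran contiguous view.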

Newaxis

As for NumPy, new axes can be introduced by indexing an array with None:

    myslice: cython.double[:] = np.linspace(0, 10, num=50)

    # 2D array with shape (1, 50)
    myslice[None]  # or
    myslice[None, :]

    # 2D array with shape (50, 1)
    myslice[:, None]

    # 3D array with shape (1, 10, 1)
    myslice[None, 10:-20:2, None]

One may mix new axis indexing with all other forms of indexing and slicing. See also an example.
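The shapes quoted in the comments above can be checked with a little slice arithmetic. This sketch handles only None and slice entries, and the helper name indexed_shape is hypothetical.

```python
def indexed_shape(index, shape):
    """Shape that results from indexing with None (new axis) and slice entries."""
    result = []
    dims = iter(shape)
    for entry in index:
        if entry is None:
            result.append(1)                            # new axis of length 1
        else:
            n = next(dims)
            result.append(len(range(*entry.indices(n))))  # length of the slice
    result.extend(dims)                                 # untouched trailing dimensions
    return tuple(result)

print(indexed_shape((None,), (50,)))                          # (1, 50)
print(indexed_shape((slice(None), None), (50,)))              # (50, 1)
print(indexed_shape((None, slice(10, -20, 2), None), (50,)))  # (1, 10, 1)
```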

Read-only views

Note

Pure Python mode currently does not support read-only views.

Since Cython 0.28, the memoryview item type can be declared as const to support read-only buffers as input:

    import numpy as np

    cdef const double[:] myslice   # const item type => read-only view

    a = np.linspace(0, 10, num=50)
    a.setflags(write=False)
    myslice = a

Using a non-const memoryview with a binary Python string produces a runtime error. You can solve this issue with a const memoryview:

    cdef bint is_y_in(const unsigned char[:] string_view):
        cdef int i
        for i in range(string_view.shape[0]):
            if string_view[i] == b'y':
                return True
        return False

    print(is_y_in(b'hello world'))   # False
    print(is_y_in(b'hello Cython'))  # True

Note that this does not require the input buffer to be read-only:

    a = np.linspace(0, 10, num=50)
    myslice = a   # read-only view of a writable buffer

Writable buffers are still accepted by const views, but read-only buffers are not accepted for non-const, writable views:

    cdef double[:] myslice   # a normal read/write memory view

    a = np.linspace(0, 10, num=50)
    a.setflags(write=False)
    myslice = a   # ERROR: requesting writable memory view from read-only buffer!
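Python's built-in memoryview makes the same distinction between the buffer and the view, which may help as an analogy. The sketch below uses the standard-library toreadonly() method (Python 3.8 or later) rather than any Cython feature, and its names are illustrative: a read-only view of a writable buffer rejects writes, while the buffer stays writable through other views.

```python
writable = bytearray(b"hello")

rw_view = memoryview(writable)
ro_view = rw_view.toreadonly()      # read-only view of a writable buffer

rw_view[0] = ord("j")               # allowed: the view is writable

error = None
try:
    ro_view[0] = ord("h")           # rejected: the view is read-only
except TypeError as exc:
    error = exc

print(bytes(ro_view))               # b'jello'
print(type(error).__name__)         # TypeError
```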

Comparison to the old buffer support

You will probably prefer memoryviews to the older syntax because:

  • The syntax is cleaner

  • Memoryviews do not usually need the GIL (see Memoryviews and the GIL)

  • Memoryviews are considerably faster

For example, this is the old syntax equivalent of the sum3d function above:

    cpdef int old_sum3d(object[int, ndim=3, mode='strided'] arr):
        cdef int I, J, K, total = 0
        I = arr.shape[0]
        J = arr.shape[1]
        K = arr.shape[2]
        for i in range(I):
            for j in range(J):
                for k in range(K):
                    total += arr[i, j, k]
        return total

Note that we can’t use nogil for the buffer version of the function as we could for the memoryview version of sum3d above, because buffer objects are Python objects. However, even if we don’t use nogil with the memoryview, it is significantly faster. This is the output from an IPython session after importing both versions:

    In [2]: import numpy as np

    In [3]: arr = np.zeros((40, 40, 40), dtype=int)

    In [4]: timeit -r15 old_sum3d(arr)
    1000 loops, best of 15: 298 us per loop

    In [5]: timeit -r15 sum3d(arr)
    1000 loops, best of 15: 219 us per loop

Python buffer support

Cython memoryviews support nearly all objects exporting the interface of Python new style buffers. This is the buffer interface described in PEP 3118. NumPy arrays support this interface, as do Cython arrays. The “nearly all” is because the Python buffer interface allows the elements in the data array to themselves be pointers; Cython memoryviews do not yet support this.

Memory layout

The buffer interface allows objects to identify the underlying memory in a variety of ways. With the exception of pointers for data elements, Cython memoryviews support all Python new-type buffer layouts. It can be useful to know or specify memory layout if the memory has to be in a particular format for an external routine, or for code optimization.

Background

The concepts are as follows: there is data access and data packing. Data access means either direct (no pointer) or indirect (pointer). Data packing means your data may be contiguous or not contiguous in memory, and may use strides to identify the jumps in memory consecutive indices need to take for each dimension.

NumPy arrays provide a good model of strided direct data access, so we’ll use them for a refresher on the concepts of C and Fortran contiguous arrays, and data strides.

Brief recap on C, Fortran and strided memory layouts

The simplest data layout might be a C contiguous array. This is the default layout in NumPy and Cython arrays. C contiguous means that the array data is continuous in memory (see below) and that neighboring elements in the first dimension of the array are furthest apart in memory, whereas neighboring elements in the last dimension are closest together. For example, in NumPy:

    In [2]: arr = np.array([['0', '1', '2'], ['3', '4', '5']], dtype='S1')

Here, arr[0, 0] and arr[0, 1] are one byte apart in memory, whereas arr[0, 0] and arr[1, 0] are 3 bytes apart. This leads us to the idea of strides. Each axis of the array has a stride length, which is the number of bytes needed to go from one element on this axis to the next element. In the case above, the strides for axes 0 and 1 will obviously be:

    In [3]: arr.strides
    Out[3]: (3, 1)

For a 3D C contiguous array:

    In [5]: c_contig = np.arange(24, dtype=np.int8).reshape((2, 3, 4))

    In [6]: c_contig.strides
    Out[6]: (12, 4, 1)

A Fortran contiguous array has the opposite memory ordering, with the elements on the first axis closest together in memory:

    In [7]: f_contig = np.array(c_contig, order='F')

    In [8]: np.all(f_contig == c_contig)
    Out[8]: True

    In [9]: f_contig.strides
    Out[9]: (1, 2, 6)

A contiguous array is one for which a single continuous block of memory contains all the data for the elements of the array, and therefore the memory block length is the product of the number of elements in the array and the size of the elements in bytes. In the example above, the memory block is 2 * 3 * 4 * 1 bytes long, where 1 is the length of an np.int8.
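The stride tuples shown in this recap follow directly from the shape and the item size. A minimal sketch of that calculation, with hypothetical helper names c_strides and f_strides:

```python
def c_strides(shape, itemsize):
    """Strides in bytes for a C contiguous array: the last axis varies fastest."""
    strides = []
    step = itemsize
    for n in reversed(shape):
        strides.append(step)
        step *= n
    return tuple(reversed(strides))

def f_strides(shape, itemsize):
    """Strides in bytes for a Fortran contiguous array: the first axis varies fastest."""
    strides = []
    step = itemsize
    for n in shape:
        strides.append(step)
        step *= n
    return tuple(strides)

print(c_strides((2, 3), 1))       # (3, 1)
print(c_strides((2, 3, 4), 1))    # (12, 4, 1)
print(f_strides((2, 3, 4), 1))    # (1, 2, 6)

# Length of the memory block for the contiguous 2 x 3 x 4 array of 1-byte items.
block_length = 2 * 3 * 4 * 1
print(block_length)               # 24
```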

An array can be contiguous without being C or Fortran order:

    In [10]: c_contig.transpose((1, 0, 2)).strides
    Out[10]: (4, 12, 1)

Slicing a NumPy array can easily make it not contiguous:

    In [11]: sliced = c_contig[:, 1, :]

    In [12]: sliced.strides
    Out[12]: (12, 1)

    In [13]: sliced.flags
    Out[13]:
      C_CONTIGUOUS : False
      F_CONTIGUOUS : False
      OWNDATA : False
      WRITEABLE : True
      ALIGNED : True
      UPDATEIFCOPY : False
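The effect of slicing on contiguity can also be seen with Python's built-in memoryview, used here only as a standard-library analogue of the NumPy session above. Taking every other element doubles the stride and clears the contiguity flag.

```python
buffer = bytearray(range(12))

full = memoryview(buffer)
every_other = full[::2]             # a strided view, no data is copied

print(full.strides, full.c_contiguous)                  # (1,) True
print(every_other.strides, every_other.c_contiguous)    # (2,) False
print(every_other.tolist())                             # [0, 2, 4, 6, 8, 10]
```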

Default behavior for memoryview layouts

As you’ll see in Specifying more general memory layouts, you can specify memory layout for any dimension of a memoryview. For any dimension for which you don’t specify a layout, the data access is assumed to be direct, and the data packing assumed to be strided. For example, that will be the assumption for memoryviews like:

    my_memoryview: cython.int[:, :, :] = obj

C and Fortran contiguous memoryviews

You can specify C and Fortran contiguous layouts for the memoryview by using the ::1 step syntax at definition. For example, if you know for sure your memoryview will be on top of a 3D C contiguous layout, you could write:

    c_contiguous: cython.int[:, :, ::1] = c_contig

where c_contig could be a C contiguous NumPy array. The ::1 at the 3rd position means that the elements in this 3rd dimension will be one element apart in memory. If you know you will have a 3D Fortran contiguous array:

    f_contiguous: cython.int[::1, :, :] = f_contig

If you pass a non-contiguous buffer, for example:

    # This array is C contiguous
    c_contig = np.arange(24).reshape((2, 3, 4))
    c_contiguous: cython.int[:, :, ::1] = c_contig

    # But this isn't
    c_contiguous = np.array(c_contig, order='F')

you will get a ValueError at runtime:

    /Users/mb312/dev_trees/minimal-cython/mincy.pyx in init mincy (mincy.c:17267)()
         69
         70 # But this isn't
    ---> 71 c_contiguous = np.array(c_contig, order='F')
         72
         73 # Show the sum of all the arrays before altering it

    /Users/mb312/dev_trees/minimal-cython/stringsource in View.MemoryView.memoryview_cwrapper (mincy.c:9995)()

    /Users/mb312/dev_trees/minimal-cython/stringsource in View.MemoryView.memoryview.__cinit__ (mincy.c:6799)()

    ValueError: ndarray is not C-contiguous

Thus the ::1 in the slice type specification indicates in which dimension the data is contiguous. It can only be used to specify full C or Fortran contiguity.

C and Fortran contiguous copies

Copies can be made C or Fortran contiguous using the .copy() and .copy_fortran() methods:

    # This view is C contiguous
    c_contiguous: cython.int[:, :, ::1] = myview.copy()

    # This view is Fortran contiguous
    f_contiguous_slice: cython.int[::1, :] = myview.copy_fortran()

Specifying more general memory layouts

Data layout can be specified using the previously seen ::1 slice syntax, or by using any of the constants in cython.view. If no specifier is given in any dimension, then the data access is assumed to be direct, and the data packing assumed to be strided. If you don’t know whether a dimension will be direct or indirect (because you’re getting an object with a buffer interface from some library perhaps), then you can specify the generic flag, in which case it will be determined at runtime.

The flags are as follows:

  • generic - strided and direct or indirect

  • strided - strided and direct (this is the default)

  • indirect - strided and indirect

  • contiguous - contiguous and direct

  • indirect_contiguous - the list of pointers is contiguous

and they can be used like this:

    from cython.cimports.cython import view

    def main():
        # direct access in both dimensions, strided in the first dimension, contiguous in the last
        a: cython.int[:, ::view.contiguous]

        # contiguous list of pointers to contiguous lists of ints
        b: cython.int[::view.indirect_contiguous, ::1]

        # direct or indirect in the first dimension, direct in the second dimension
        # strided in both dimensions
        c: cython.int[::view.generic, :]

Only the first, last or the dimension following an indirect dimension may be specified contiguous:

    from cython.cimports.cython import view

    def main():
        # VALID
        a: cython.int[::view.indirect, ::1, :]
        b: cython.int[::view.indirect, :, ::1]
        c: cython.int[::view.indirect_contiguous, ::1, :]

        # INVALID
        d: cython.int[::view.contiguous, ::view.indirect, :]
        e: cython.int[::1, ::view.indirect, :]

The difference between the contiguous flag and the ::1 specifier is that the former specifies contiguity for only one dimension, whereas the latter specifies contiguity for all following (Fortran) or preceding (C) dimensions:

    c_contig: cython.int[:, ::1] = ...

    # VALID
    myslice: cython.int[:, ::view.contiguous] = c_contig[::2]

    # INVALID
    myslice: cython.int[:, ::1] = c_contig[::2]

The former case is valid because the last dimension remains contiguous, but the first dimension does not “follow” the last one anymore (meaning, it was strided already, but it is not C or Fortran contiguous any longer), since it was sliced.
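The two requirements can be contrasted with a small stride check. This is a simplified model written for illustration, not the validation Cython performs; the helper names, the 4 x 3 shape and the 4-byte item size are all assumptions.

```python
def last_dim_contiguous(strides, itemsize):
    """Roughly what a contiguous flag on the last dimension asks for."""
    return strides[-1] == itemsize

def fully_c_contiguous(shape, strides, itemsize):
    """Roughly what ::1 on the last dimension asks for: every axis packed."""
    expected = itemsize
    for n, stride in zip(reversed(shape), reversed(strides)):
        if stride != expected:
            return False
        expected *= n
    return True

itemsize = 4                                       # assumed size of a C int in bytes
full_shape, full_strides = (4, 3), (12, 4)         # a C contiguous 4 x 3 array
sliced_shape, sliced_strides = (2, 3), (24, 4)     # the same array sliced with [::2]

print(last_dim_contiguous(sliced_strides, itemsize))               # True
print(fully_c_contiguous(sliced_shape, sliced_strides, itemsize))  # False
print(fully_c_contiguous(full_shape, full_strides, itemsize))      # True
```

After slicing, the rows are still packed internally, but the step between rows no longer equals the row length, so only the weaker per-dimension requirement holds.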

Memoryviews and the GIL

As you will see from the Quickstart section, memoryviews often do not need the GIL:

    @cython.nogil
    @cython.ccall
    def sum3d(arr: cython.int[:, :, :]) -> cython.int:
        ...

In particular, you do not need the GIL for memoryview indexing, slicing or transposing. Memoryviews require the GIL for the copy methods (C and Fortran contiguous copies), or when the dtype is object and an object element is read or written.

Memoryview Objects and Cython Arrays

These typed memoryviews can be converted to Python memoryview objects (cython.view.memoryview). These Python objects are indexable, sliceable and transposable in the same way that the original memoryviews are. They can also be converted back to Cython-space memoryviews at any time.

They have the following attributes:

  • shape: size in each dimension, as a tuple.

  • strides: stride along each dimension, in bytes.

  • suboffsets

  • ndim: number of dimensions.

  • size: total number of items in the view (product of the shape).

  • itemsize: size, in bytes, of the items in the view.

  • nbytes: equal to size times itemsize.

  • base

And of course the aforementioned T attribute (Transposing). These attributes have the same semantics as in NumPy. For instance, to retrieve the original object:

    import numpy
    from cython.cimports.numpy import int32_t

    def main():
        a: int32_t[:] = numpy.arange(10, dtype=numpy.int32)
        a = a[::2]

        print(a)
        print(numpy.asarray(a))
        print(a.base)

    # this prints:
    #    <MemoryView of 'ndarray' object>
    #    [0 2 4 6 8]
    #    [0 1 2 3 4 5 6 7 8 9]

Note that this example returns the original object from which the view was obtained, and that the view was resliced in the meantime.
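Python's built-in memoryview exposes a similar set of attributes, so their meaning can be checked interactively. Treat the following as an analogy rather than a description of cython.view.memoryview: the built-in type names the exporting object obj instead of base, and has no size attribute.

```python
import math
from array import array

original = array("i", range(10))
view = memoryview(original)[::2]        # resliced view on the same memory

print(view.tolist())                    # [0, 2, 4, 6, 8]
print(view.shape, view.ndim)            # (5,) 1
print(view.obj is original)             # True

# nbytes is the number of items times the item size, as in the list above.
item_count = math.prod(view.shape)
print(view.nbytes == item_count * view.itemsize)   # True
```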

Cython arrays

Whenever a Cython memoryview is copied (using any of the copy() or copy_fortran() methods), you get a new memoryview slice of a newly created cython.view.array object. This array can also be used manually, and will automatically allocate a block of data. It can later be assigned to a C or Fortran contiguous slice (or a strided slice). It can be used like:

    from cython.cimports.cython import view

    my_array = view.array(shape=(10, 2), itemsize=cython.sizeof(cython.int), format="i")
    my_slice: cython.int[:, :] = my_array

It also takes an optional argument mode (‘c’ or ‘fortran’) and a boolean allocate_buffer, that indicates whether a buffer should be allocated and freed when it goes out of scope:

    my_array: view.array = view.array(..., mode="fortran", allocate_buffer=False)
    my_array.data = cython.cast(cython.p_char, my_data_pointer)

    # define a function that can deallocate the data (if needed)
    my_array.callback_free_data = free

You can also cast pointers to array, or C arrays to arrays:

    my_array: view.array = cython.cast(cython.int[:10, :2], my_data_pointer)
    my_array: view.array = cython.cast(cython.int[:, :], my_c_array)

Of course, you can also immediately assign a cython.view.array to a typed memoryview slice. A C array may be assigned directly to a memoryview slice:

    myslice: cython.int[:, ::1] = my_2d_c_array

The arrays are indexable and sliceable from Python space just like memoryview objects, and have the same attributes as memoryview objects.

CPython array module

An alternative to cython.view.array is the array module in the Python standard library. In Python 3, the array.array type supports the buffer interface natively, so memoryviews work on top of it without additional setup.

Starting with Cython 0.17, however, it is possible to use these arrays as buffer providers also in Python 2. This is done through explicitly cimporting the cpython.array module as follows:

    def sum_array(view: cython.int[:]):
        """
        >>> from array import array
        >>> sum_array( array('i', [1,2,3]) )
        6
        """
        total: cython.int = 0
        for i in range(view.shape[0]):
            total += view[i]
        return total

Note that the cimport also enables the old buffer syntax for the array type. Therefore, the following also works:

    from cython.cimports.cpython import array

    def sum_array(arr: array.array[cython.int]):  # using old buffer syntax
        ...

Coercion to NumPy

Memoryview (and array) objects can be coerced to a NumPy ndarray, without having to copy the data. You can e.g. do:

    from cython.cimports.numpy import int32_t
    import numpy as np

    numpy_array = np.asarray(cython.cast(int32_t[:10, :10], my_pointer))

Of course, you are not restricted to using NumPy’s type (such as cnp.int32_t here), you can use any usable type.

None Slices

Although memoryview slices are not objects they can be set to None and they can be checked for being None as well:

    def func(myarray: typing.Optional[cython.double[:]] = None):
        print(myarray is None)

If the function requires real memory views as input, it is therefore best to reject None input straight away in the signature:

    def func(myarray: cython.double[:]):
        ...

Unlike object attributes of extension classes, memoryview slices are not initialized to None.

Pass data from a C function via pointer

Since use of pointers in C is ubiquitous, here we give a quick example of how to call C functions whose arguments contain pointers. Let’s suppose you want to manage an array (allocate and deallocate) with NumPy (it can also be Python arrays, or anything that supports the buffer interface), but you want to perform computation on this array with an external C function implemented in C_func_file.c:

C_func_file.c

    #include "C_func_file.h"

    void multiply_by_10_in_C(double arr[], unsigned int n)
    {
        unsigned int i;
        for (i = 0; i < n; i++) {
            arr[i] *= 10;
        }
    }

This file comes with a header file called C_func_file.h containing:

C_func_file.h

    #ifndef C_FUNC_FILE_H
    #define C_FUNC_FILE_H

    void multiply_by_10_in_C(double arr[], unsigned int n);

    #endif

where arr points to the array and n is its size.

You can call the function in a Cython file in the following way:

memview_to_c.pxd

    cdef extern from "C_func_file.c":
        # The C file is included directly so that it doesn't need to be compiled separately.
        pass

    cdef extern from "C_func_file.h":
        void multiply_by_10_in_C(double *, unsigned int)

memview_to_c.py

    import numpy as np

    def multiply_by_10(arr):  # 'arr' is a one-dimensional numpy array

        if not arr.flags['C_CONTIGUOUS']:
            arr = np.ascontiguousarray(arr)  # Makes a contiguous copy of the numpy array.

        arr_memview: cython.double[::1] = arr

        multiply_by_10_in_C(cython.address(arr_memview[0]), arr_memview.shape[0])

        return arr


    a = np.ones(5, dtype=np.double)
    print(multiply_by_10(a))

    b = np.ones(10, dtype=np.double)
    b = b[::2]  # b is not contiguous.

    print(multiply_by_10(b))  # but our function still works as expected.

Several things to note:

  • ::1 requests a C contiguous view, and fails if the buffer is not C contiguous. See C and Fortran contiguous memoryviews.

  • &arr_memview[0] and cython.address(arr_memview[0]) can be understood as ‘the address of the first element of the memoryview’. For contiguous arrays, this is equivalent to the start address of the flat memory buffer.

  • arr_memview.shape[0] could have been replaced by arr_memview.size, arr.shape[0] or arr.size. But arr_memview.shape[0] is more efficient because it doesn’t require any Python interaction.

  • multiply_by_10 will perform computation in-place if the array passed is contiguous, and will return a new numpy array if arr is not contiguous.

  • If you are using Python arrays instead of numpy arrays, you don’t need to check if the data is stored contiguously as this is always the case. See Working with Python arrays.

This way, you can call the C function similar to a normal Python function, and leave all the memory management and cleanup to NumPy arrays and Python’s object handling. For the details of how to compile and call functions in C files, see Using C libraries.
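The idea that the address of the first element is the start of the flat buffer can be checked from plain Python with ctypes. This sketch involves neither Cython nor the C file above: the loop is only a stand-in for multiply_by_10_in_C, and the variable names are illustrative.

```python
import ctypes
from array import array

values = array("d", [1.0, 1.0, 1.0, 1.0, 1.0])

# Wrap the array's buffer as a C double[5] without copying it.
c_values = (ctypes.c_double * len(values)).from_buffer(values)

# The address of the first element is the start of the flat buffer.
first_element_address = ctypes.addressof(c_values)
buffer_address, length = values.buffer_info()
print(first_element_address == buffer_address)   # True

# Stand-in for the C routine: scale in place through the shared memory.
for i in range(length):
    c_values[i] *= 10

print(values.tolist())   # [10.0, 10.0, 10.0, 10.0, 10.0]
```

Because the memory is shared, the original array sees the result without any copy, which mirrors the in-place behaviour described for contiguous input.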

Performance: Disabling initialization checks

Every time the memoryview is accessed, Cython adds a check to make sure that it has been initialized.

If you are looking for performance, you can disable these checks by setting the initializedcheck directive to False. See Compiler directives for more information about this directive.