Memory management in NumPy#
The numpy.ndarray is a Python class. It requires additional memory allocations to hold numpy.ndarray.strides, numpy.ndarray.shape and numpy.ndarray.data attributes. These attributes are specially allocated after creating the Python object in __new__. The strides and shape are stored in a piece of memory allocated internally.
The data allocation used to store the actual array values (which could be pointers in the case of object arrays) can be very large, so NumPy has provided interfaces to manage its allocation and release. This document details how those interfaces work.
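The three attributes can be inspected from Python. The following is a minimal sketch; the array shape and dtype are arbitrary choices made for illustration.

```python
import numpy as np

# A 3x4 array of 8-byte floats, stored in C (row-major) order.
a = np.zeros((3, 4), dtype=np.float64)

print(a.shape)    # dimensions of the array
print(a.strides)  # bytes to step along each dimension
print(a.nbytes)   # size of the data allocation in bytes

# A transposed view shares the same data allocation; only the
# shape and strides differ, and the view does not own the data.
t = a.T
print(t.shape, t.strides)
print(np.shares_memory(a, t), t.flags.owndata)
```

The view illustrates why shape and strides are kept separately from the data: several arrays can describe the same data allocation in different ways.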
Historical overview#
Since version 1.7.0, NumPy has exposed a set of PyDataMem_* functions (PyDataMem_NEW, PyDataMem_FREE, PyDataMem_RENEW) which are backed by malloc, free, realloc respectively.
Since those early days, Python also improved its memory management capabilities, and began providing various management policies beginning in version 3.4. These routines are divided into a set of domains; each domain has a PyMemAllocatorEx structure of routines for memory management. Python also added a tracemalloc module to trace calls to the various routines. These tracking hooks were added to the NumPy PyDataMem_* routines.
NumPy added a small cache of allocated memory in its internal npy_alloc_cache, npy_alloc_cache_zero, and npy_free_cache functions. These wrap alloc, alloc-and-memset(0) and free respectively, but when npy_free_cache is called, it adds the pointer to a short list of available blocks marked by size. These blocks can be re-used by subsequent calls to npy_alloc*, avoiding memory thrashing.
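The idea of a short list of freed blocks keyed by size can be sketched in pure Python. This is a toy model only, not NumPy's implementation: the class name ToyBlockCache, the max_blocks_per_size limit and the use of bytearray as a stand-in for raw memory are all assumptions made for illustration.

```python
from collections import defaultdict


class ToyBlockCache:
    """Toy model of a size-bucketed cache of freed blocks (not NumPy's code)."""

    def __init__(self, max_blocks_per_size=4):
        # Hypothetical limit on how many freed blocks are kept per size.
        self.max_blocks_per_size = max_blocks_per_size
        self._free_blocks = defaultdict(list)  # size -> list of freed blocks

    def alloc(self, size):
        bucket = self._free_blocks[size]
        if bucket:
            return bucket.pop()   # re-use a cached block of this size
        return bytearray(size)    # stands in for a fresh allocation

    def free(self, block):
        bucket = self._free_blocks[len(block)]
        if len(bucket) < self.max_blocks_per_size:
            bucket.append(block)  # keep the block for later re-use
        # Otherwise the block is simply dropped, as a real free() would do.


cache = ToyBlockCache()
first = cache.alloc(64)
cache.free(first)
second = cache.alloc(64)
print(second is first)  # the cached block was handed out again
```

Repeatedly creating and discarding arrays of the same small size then hits the cache instead of the underlying allocator, which is the thrashing the cache is meant to avoid.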
Configurable memory routines in NumPy (NEP 49)#
Users may wish to override the internal data memory routines with ones of their own. Since NumPy does not use the Python domain strategy to manage data memory, it provides an alternative set of C-APIs to change memory routines. There are no Python domain-wide strategies for large chunks of object data, so those are less suited to NumPy’s needs. Users who wish to change the NumPy data memory management routines can use PyDataMem_SetHandler, which uses a PyDataMem_Handler structure to hold pointers to functions used to manage the data memory. The calls are still wrapped by internal routines to call PyTraceMalloc_Track and PyTraceMalloc_Untrack. Since the functions may change during the lifetime of the process, each ndarray carries with it the functions used at the time of its instantiation, and these will be used to reallocate or free the data memory of the instance.
- type PyDataMem_Handler#
A struct to hold function pointers used to manipulate memory
    typedef struct {
        char name[127];  /* multiple of 64 to keep the struct aligned */
        uint8_t version; /* currently 1 */
        PyDataMemAllocator allocator;
    } PyDataMem_Handler;
where the allocator structure is
    /* The declaration of free differs from PyMemAllocatorEx */
    typedef struct {
        void *ctx;
        void* (*malloc) (void *ctx, size_t size);
        void* (*calloc) (void *ctx, size_t nelem, size_t elsize);
        void* (*realloc) (void *ctx, void *ptr, size_t new_size);
        void (*free) (void *ctx, void *ptr, size_t size);
    } PyDataMemAllocator;
- PyObject *PyDataMem_SetHandler(PyObject *handler)#
Set a new allocation policy. If the input value is NULL, will reset the policy to the default. Return the previous policy, or return NULL if an error has occurred. We wrap the user-provided functions so they will still call the Python and NumPy memory management callback hooks.
- PyObject *PyDataMem_GetHandler()#
Return the current policy that will be used to allocate data for the next PyArrayObject. On failure, return NULL.
For an example of setting up and using the PyDataMem_Handler, see the test in numpy/_core/tests/test_mem_policy.py
What happens when deallocating if there is no policy set#
A rare but useful technique is to allocate a buffer outside NumPy, use PyArray_NewFromDescr to wrap the buffer in an ndarray, then switch the OWNDATA flag to true. When the ndarray is released, the appropriate function from the ndarray’s PyDataMem_Handler should be called to free the buffer. But because the PyDataMem_Handler field was never set, it will be NULL. For backward compatibility, NumPy will call free() to release the buffer. If NUMPY_WARN_IF_NO_MEM_POLICY is set to 1, a warning will be emitted. The current default is not to emit a warning; this may change in a future version of NumPy.
A better technique would be to use a PyCapsule as a base object:
    /* define a PyCapsule_Destructor, using the correct deallocator for buf */
    static void
    free_wrap(PyObject *capsule)
    {
        void *obj = PyCapsule_GetPointer(capsule, PyCapsule_GetName(capsule));
        free(obj);
    }

    /* then inside the function that creates arr from buf */
    ...
    arr = PyArray_NewFromDescr(... buf, ...);
    if (arr == NULL) {
        return NULL;
    }
    capsule = PyCapsule_New(buf, "my_wrapped_buffer", free_wrap);
    if (capsule == NULL) {
        Py_DECREF(arr);
        return NULL;
    }
    /* PyArray_SetBaseObject steals the reference to capsule */
    if (PyArray_SetBaseObject((PyArrayObject *)arr, capsule) == -1) {
        Py_DECREF(arr);
        return NULL;
    }
    ...
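The same ownership pattern, an array that borrows memory and relies on a base object to keep it alive, can be observed from Python. This is only an analogy to the C code above, sketched with np.frombuffer and a bytearray standing in for the externally allocated buffer; no PyCapsule is involved.

```python
import numpy as np

# A buffer allocated outside NumPy (here, by Python's bytearray).
buf = bytearray(8 * 4)

# Wrap it without copying; the array records a base object rather than
# taking ownership of the memory.
arr = np.frombuffer(buf, dtype=np.float64)
print(arr.flags.owndata)     # False: NumPy will not free this memory
print(arr.base is not None)  # True: a base object keeps the buffer alive

# Writes through the array land in the original buffer.
arr[0] = 1.5
print(np.frombuffer(buf, dtype=np.float64)[0])
```

Because OWNDATA stays false, releasing the array never calls a NumPy deallocator on the buffer; the memory is released by whatever owns the base object.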
Example of memory tracing with np.lib.tracemalloc_domain#
The builtin tracemalloc module can be used to track allocations inside NumPy. NumPy places its CPU memory allocations into the np.lib.tracemalloc_domain domain. For additional information, check: https://docs.python.org/3/library/tracemalloc.html.
Here is an example of how to use np.lib.tracemalloc_domain:

    """
       The goal of this example is to show how to trace memory
       from an application that has NumPy and non-NumPy sections.
       We only select the sections using NumPy related calls.
    """

    import tracemalloc
    import numpy as np

    # Flag to determine if we select NumPy domain
    use_np_domain = True

    nx = 300
    ny = 500

    # Start to trace memory
    tracemalloc.start()

    # Section 1
    # ---------

    # NumPy related call
    a = np.zeros((nx, ny))

    # non-NumPy related call
    b = [i**2 for i in range(nx * ny)]

    snapshot1 = tracemalloc.take_snapshot()
    # We filter the snapshot to only select NumPy related calls
    np_domain = np.lib.tracemalloc_domain
    dom_filter = tracemalloc.DomainFilter(inclusive=use_np_domain,
                                          domain=np_domain)
    snapshot1 = snapshot1.filter_traces([dom_filter])
    top_stats1 = snapshot1.statistics('traceback')

    print("================ SNAPSHOT 1 =================")
    for stat in top_stats1:
        print(f"{stat.count} memory blocks: {stat.size / 1024:.1f} KiB")
        print(stat.traceback.format()[-1])

    # Clear traces of memory blocks allocated by Python
    # before moving to the next section.
    tracemalloc.clear_traces()

    # Section 2
    # ---------

    # We are only using NumPy
    c = np.sum(a * a)

    snapshot2 = tracemalloc.take_snapshot()
    top_stats2 = snapshot2.statistics('traceback')

    print()
    print("================ SNAPSHOT 2 =================")
    for stat in top_stats2:
        print(f"{stat.count} memory blocks: {stat.size / 1024:.1f} KiB")
        print(stat.traceback.format()[-1])

    tracemalloc.stop()

    print()
    print("============================================")
    print("\nTracing Status : ", tracemalloc.is_tracing())

    try:
        print("\nTrying to Take Snapshot After Tracing is Stopped.")
        snap = tracemalloc.take_snapshot()
    except Exception as e:
        print("Exception : ", e)