Advanced debugging tools#

If you reached here, you want to dive into, or use, more advanced tooling. This is usually not necessary for first-time contributors and most day-to-day development. These tools are used more rarely, for example close to a new NumPy release, or when a large or particularly complex change was made.

Since not all of these tools are used on a regular basis and some are only available on some systems, please expect differences, issues, or quirks; we will be happy to help if you get stuck and appreciate any improvements or suggestions to these workflows.

Finding C errors with additional tooling#

Most development will not require more than a typical debugging toolchain as shown in Debugging. But memory leaks, for example, can be particularly subtle or difficult to narrow down.

We do not expect any of these tools to be run by most contributors. However, you can ensure that we can track down such issues more easily:

  • Tests should cover all code paths, including error paths.

  • Try to write short and simple tests. If you have a very complicated test, consider creating an additional simpler test as well. This can be helpful because it is often easy to find which test triggers an issue, but not which line of the test.

  • Never use np.empty if data is read/used. valgrind will notice this and report an error. When you do not care about values, you can generate random values instead.

This will help us catch any oversights before your change is released and means you do not have to worry about making reference counting errors, which can be intimidating.
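As an illustration of the np.empty advice above, a test can fill its input with seeded random values so that every element the code reads is initialized. This is only a minimal sketch: the test name and the operation under test are hypothetical and are not taken from the NumPy test suite.

```python
import numpy as np


def test_add_one_reads_initialized_data():
    # np.empty(10) would hand back uninitialized memory, and reading it
    # is what valgrind reports. Seeded random values are initialized
    # and reproducible from run to run.
    rng = np.random.default_rng(12345)
    arr = rng.random(10)

    result = arr + 1.0  # hypothetical operation under test

    # Check properties that hold for any input values.
    assert result.shape == arr.shape
    assert np.all(np.isfinite(result))
    assert np.all(result >= 1.0)
```

Seeding the generator keeps the test deterministic, which matters when a failure has to be reproduced under a slow tool.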

Python debug build#

Debug builds of Python are easily available, for example via the system package manager on Linux systems, but are also available on other platforms, possibly in a less convenient format. If you cannot easily install a debug build of Python from a system package manager, you can build one yourself using pyenv. For example, to install and globally activate a debug build of Python 3.13.3, one would do:

pyenv install -g 3.13.3
pyenv global 3.13.3

Note that pyenv install builds Python from source, so you must ensure that Python's dependencies are installed before building; see the pyenv documentation for platform-specific installation instructions. You can use pip to install Python dependencies you may need for your debugging session. If there is no debug wheel available on pypi, you will need to build the dependencies from source and ensure that your dependencies are also compiled as debug builds.

Often debug builds of Python name the Python executable pythond instead of python. To check if you have a debug build of Python installed, you can run e.g. pythond -m sysconfig to get the build configuration for the Python executable. A debug build will be built with debug compiler options in CFLAGS (e.g. -g -Og).
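The same check can be made from inside the interpreter. The sketch below is not part of the NumPy tooling, and the helper name is_debug_build is hypothetical. It assumes that the Py_DEBUG configuration variable and the presence of sys.gettotalrefcount are reasonable indicators of a debug build, which may not hold on every platform.

```python
import sys
import sysconfig


def is_debug_build():
    """Return True if the running interpreter looks like a debug build."""
    # Py_DEBUG is normally set in the build configuration of debug
    # builds; it may be None or 0 otherwise.
    configured = bool(sysconfig.get_config_var("Py_DEBUG"))
    # sys.gettotalrefcount is normally only present in debug builds.
    has_total_refcount = hasattr(sys, "gettotalrefcount")
    return configured or has_total_refcount


if __name__ == "__main__":
    print("debug build:", is_debug_build())
    # CFLAGS may be None on platforms that do not record it.
    print("CFLAGS:", sysconfig.get_config_var("CFLAGS"))
```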

Running the NumPy tests or an interactive terminal is usually as easy as:

python3.8d runtests.py
# or
python3.8d runtests.py --ipython

These commands were already mentioned in Debugging.

A Python debug build will help:

  • Find bugs which may otherwise cause random behaviour. One example is when an object is still used after it has been deleted.

  • Python debug builds allow checking for correct reference counting. This works using the additional functions:

    sys.gettotalrefcount()
    sys.getallocatedblocks()
  • Python debug builds allow easier debugging with gdb and other C debuggers.
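One way these two functions can be used is to compare their values before and after calling a piece of code many times. The following is a rough sketch rather than an established NumPy workflow: the helper names total_refs and measure_growth are hypothetical, and sys.gettotalrefcount is looked up defensively because it is assumed to be missing on non-debug builds.

```python
import gc
import sys


def total_refs():
    """Return the total reference count, or None if it is unavailable."""
    getter = getattr(sys, "gettotalrefcount", None)
    return getter() if getter is not None else None


def measure_growth(func, repeats=1000):
    """Call func repeatedly; return (reference growth, block growth)."""
    func()  # warm up so that one-time caches are not counted
    gc.collect()
    refs_before = total_refs()
    blocks_before = sys.getallocatedblocks()
    for _ in range(repeats):
        func()
    gc.collect()
    blocks_after = sys.getallocatedblocks()
    refs_after = total_refs()
    ref_growth = None if refs_before is None else refs_after - refs_before
    return ref_growth, blocks_after - blocks_before


if __name__ == "__main__":
    kept_alive = []

    def leaky():
        # Deliberately keeps every object alive to imitate a leak.
        kept_alive.append(object())

    print(measure_growth(leaky))
```

Growth that scales with the number of repeats is a hint worth investigating; small constant offsets are usually noise from caches or the measurement itself.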

Use together with pytest#

Running the test suite only with a debug Python build will not find many errors on its own. An additional advantage of a debug build of Python is that it allows detecting memory leaks.

A tool to make this easier is pytest-leaks, which can be installed using pip. Unfortunately, pytest itself may leak memory, but good results can usually (currently) be achieved by removing:

@pytest.fixture(autouse=True)
def add_np(doctest_namespace):
    doctest_namespace['np'] = numpy

@pytest.fixture(autouse=True)
def env_setup(monkeypatch):
    monkeypatch.setenv('PYTHONHASHSEED', '0')

from numpy/conftest.py (this may change with new pytest-leaks versions or pytest updates).

This allows you to run the test suite, or part of it, conveniently:

python3.8d runtests.py -t numpy/_core/tests/test_multiarray.py -- -R2:3 -s

where -R2:3 is the pytest-leaks command (see its documentation), and -s causes output to print and may be necessary (in some versions captured output was detected as a leak).

Note that some tests are known (or even designed) to leak references. We try to mark them, but expect some false positives.

valgrind#

Valgrind is a powerful tool to find certain memory access problems and should be run on complicated C code. Basic use of valgrind usually requires no more than:

PYTHONMALLOC=malloc valgrind python runtests.py

where PYTHONMALLOC=malloc is necessary to avoid false positives from Python itself. Depending on the system and valgrind version, you may see more false positives. valgrind supports “suppressions” to ignore some of these, and Python does have a suppression file (and even a compile time option) which may help if you find it necessary.

Valgrind helps:

  • Find use of uninitialized variables/memory.

  • Detect memory access violations (reading or writing outside of allocated memory).

  • Find many memory leaks. Note that for most leaks the Python debug build approach (and pytest-leaks) is much more sensitive. The reason is that valgrind can only detect if memory is definitely lost. If:

    dtype = np.dtype(np.int64)
    arr.astype(dtype=dtype)

    has incorrect reference counting for dtype, this is a bug, but valgrind cannot see it because np.dtype(np.int64) always returns the same object. However, not all dtypes are singletons, so this might leak memory for different input.

    In rare cases NumPy uses malloc and not the Python memory allocators; such allocations are invisible to the Python debug build. malloc should normally be avoided, but there are some exceptions (e.g. the PyArray_Dims structure is public API and cannot use the Python allocators).
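The singleton case can be imitated in pure Python to show why such a bug changes reference counts without losing any memory. This is an illustrative sketch only: SINGLETON is a hypothetical stand-in for a cached object such as the one returned by np.dtype(np.int64), and the missing decref of a C bug is simulated with ctypes.pythonapi.Py_IncRef.

```python
import ctypes
import sys

# Hypothetical long-lived object standing in for a cached singleton.
SINGLETON = object()


def leak_one_reference(obj):
    """Simulate a C bug: take a reference and never release it."""
    ctypes.pythonapi.Py_IncRef(ctypes.py_object(obj))


before = sys.getrefcount(SINGLETON)
leak_one_reference(SINGLETON)
after = sys.getrefcount(SINGLETON)

# The count grew, which reference-count based tooling can notice, yet no
# memory became unreachable: SINGLETON was going to stay alive anyway.
print("extra references:", after - before)
```

If the same extra reference were taken on a freshly created object instead, that object could never be freed, and the bug would turn into an actual memory leak.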

Even though using valgrind for memory leak detection is slow and less sensitive, it can be convenient: you can run most programs with valgrind without modification.

Things to be aware of:

  • Valgrind does not support the NumPy longdouble; this means that tests will fail or be flagged as errors even though they are completely fine.

  • Expect some errors before and after running your NumPy code.

  • Caches can mean that errors (specifically memory leaks) may not be detected or are only detected at a later, unrelated time.

A big advantage of valgrind is that it has no requirements aside from valgrind itself (although you probably want to use debug builds for better tracebacks).

Use together with pytest#

You can run the test suite with valgrind, which may be sufficient when you are only interested in a few tests:

PYTHONMALLOC=malloc valgrind python runtests.py \
    -t numpy/_core/tests/test_multiarray.py -- --continue-on-collection-errors

Note the --continue-on-collection-errors, which is currently necessary due to missing longdouble support causing failures (this will usually not be necessary if you do not run the full test suite).

If you wish to detect memory leaks you will also require --show-leak-kinds=definite and possibly more valgrind options. Just as for pytest-leaks, certain tests are known to leak or cause errors in valgrind and may or may not be marked as such.
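When such runs are repeated often, the command and its environment can be assembled in a small script. The sketch below is a convenience wrapper under stated assumptions, not an official NumPy helper: the function name valgrind_command is hypothetical, and --leak-check=full is a valgrind option added here as an assumption, since the text above only names --show-leak-kinds=definite.

```python
import os
import subprocess
import sys


def valgrind_command(test_file, check_leaks=False):
    """Build the argument list and environment for a valgrind test run."""
    cmd = ["valgrind"]
    if check_leaks:
        # --leak-check=full is an assumption; only
        # --show-leak-kinds=definite is named in the text above.
        cmd += ["--leak-check=full", "--show-leak-kinds=definite"]
    cmd += [
        sys.executable, "runtests.py", "-t", test_file,
        "--", "--continue-on-collection-errors",
    ]
    # PYTHONMALLOC=malloc avoids false positives from Python itself.
    env = dict(os.environ, PYTHONMALLOC="malloc")
    return cmd, env


if __name__ == "__main__":
    cmd, env = valgrind_command(
        "numpy/_core/tests/test_multiarray.py", check_leaks=True
    )
    print(" ".join(cmd))
    # Uncomment to run from a NumPy source checkout with valgrind installed:
    # subprocess.run(cmd, env=env, check=False)
```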

We have developed pytest-valgrind which:

  • Reports errors for each test individually

  • Narrows down memory leaks to individual tests (by default valgrind only checks for memory leaks after a program stops, which is very cumbersome).

Please refer to its README for more information (it includes an example command for NumPy).