

PEP 657 – Include Fine Grained Error Locations in Tracebacks

Author:
Pablo Galindo Salgado <pablogsal at python.org>, Batuhan Taskaya <batuhan at python.org>, Ammar Askar <ammar at ammaraskar.com>
Discussions-To:
Discourse thread
Status:
Final
Type:
Standards Track
Created:
08-May-2021
Python-Version:
3.11
Post-History:



Abstract

This PEP proposes adding a mapping from each bytecode instruction to the start and end column offsets of the line that generated them, as well as the end line number. This data will be used to improve tracebacks displayed by the CPython interpreter in order to improve the debugging experience. The PEP also proposes adding APIs that allow other tools (such as coverage analysis tools, profilers, tracers, debuggers) to consume this information from code objects.

Motivation

The primary motivation for this PEP is to improve the feedback presented about the location of errors to aid with debugging.

Python currently keeps a mapping of bytecode to line numbers from compilation. The interpreter uses this mapping to point to the source line associated with an error. While this line-level granularity for instructions is useful, a single line of Python code can compile into dozens of bytecode operations, making it hard to track which part of the line caused the error.

Consider the following line of Python code:

x['a']['b']['c']['d'] = 1

If any of the values in the dictionaries are None, the error shown is:

Traceback (most recent call last):
  File "test.py", line 2, in <module>
    x['a']['b']['c']['d'] = 1
TypeError: 'NoneType' object is not subscriptable

From the traceback, it is impossible to determine which one of the dictionaries had the None element that caused the error. Users often have to attach a debugger or split up their expression to track down the problem.

However, if the interpreter had a mapping of bytecode to column offsets as well as line numbers, it could helpfully display:

Traceback (most recent call last):
  File "test.py", line 2, in <module>
    x['a']['b']['c']['d'] = 1
    ~~~~~~~~~~~^^^^^
TypeError: 'NoneType' object is not subscriptable

indicating to the user that the object x['a']['b'] must have been None. This highlighting will occur for every frame in the traceback. For instance, if a similar error is part of a complex function call chain, the traceback would display the code associated with the current instruction in every frame:

Traceback (most recent call last):
  File "test.py", line 14, in <module>
    lel3(x)
    ^^^^^^^
  File "test.py", line 12, in lel3
    return lel2(x) / 23
           ^^^^^^^
  File "test.py", line 9, in lel2
    return 25 + lel(x) + lel(x)
                ^^^^^^
  File "test.py", line 6, in lel
    return 1 + foo(a, b, c=x['z']['x']['y']['z']['y'], d=e)
                           ~~~~~~~~~~~~~~~~^^^^^
TypeError: 'NoneType' object is not subscriptable

This problem presents itself in the following situations.

  • When passing down multiple objects to function calls while accessing the same attribute in them. For instance, this error:
    Traceback (most recent call last):
      File "test.py", line 19, in <module>
        foo(a.name, b.name, c.name)
    AttributeError: 'NoneType' object has no attribute 'name'

    With the improvements in this PEP this would show:

    Traceback (most recent call last):
      File "test.py", line 17, in <module>
        foo(a.name, b.name, c.name)
            ^^^^^^
    AttributeError: 'NoneType' object has no attribute 'name'
  • When dealing with lines with complex mathematical expressions, especially with libraries such as numpy where arithmetic operations can fail based on the arguments. For example:
    Traceback (most recent call last):
      File "test.py", line 1, in <module>
        x = (a + b) @ (c + d)
    ValueError: operands could not be broadcast together with shapes (1,2) (2,3)

    There is no clear indication as to which operation failed: was it the addition on the left, the one on the right, or the matrix multiplication in the middle? With this PEP the new error message would look like:

    Traceback (most recent call last):
      File "test.py", line 1, in <module>
        x = (a + b) @ (c + d)
             ~~^~~
    ValueError: operands could not be broadcast together with shapes (1,2) (2,3)

    This gives a much clearer and easier-to-debug error message.

Debugging aside, this extra information would also be useful for code coverage tools, enabling them to measure expression-level coverage instead of just line-level coverage. For instance, given the following line:

x = foo() if bar() else baz()

coverage, profile or state analysis tools will highlight the full line in both branches, making it impossible to differentiate what branch was taken. This is a known problem in pycoverage.

Similar efforts to this PEP have taken place in other languages such as Java in the form of JEP 358. NullPointerExceptions in Java were similarly nebulous when it came to lines with complicated expressions. A NullPointerException would provide very little aid in finding the root cause of an error. The implementation for JEP 358 is fairly complex, requiring walking back through the bytecode by using a control flow graph analyzer and decompilation techniques to recover the source code that led to the null pointer. Although the complexity of this solution is high and requires maintenance for the decompiler every time Java bytecode is changed, this improvement was deemed to be worth it for the extra information provided for just one exception type.

Rationale

In order to identify the range of source code being executed when exceptions are raised, this proposal requires adding new data for every bytecode instruction. This will have an impact on the size of pyc files on disk and the size of code objects in memory. The authors of this proposal have chosen the data types in a way that tries to minimize this impact. The proposed overhead is storing two uint8_t (one for the start offset and one for the end offset) and the end line information for every bytecode instruction (in the same encoded fashion as the start line is stored currently).

As an illustrative example to gauge the impact of this change, we have calculated that including the start and end offsets will increase the size of the standard library’s pyc files by 22% (6 MB), from 28.4 MB to 34.7 MB. The overhead in memory usage will be the same (assuming the full standard library is loaded into the same program). We believe that this is a very acceptable number since the order of magnitude of the overhead is very small, especially considering the storage size and memory capabilities of modern computers. Additionally, in general the memory size of a Python program is not dominated by code objects. To check this assumption we have executed the test suite of several popular PyPI projects (including NumPy, pytest, Django and Cython) as well as several applications (Black, pylint, mypy executed over either mypy or the standard library) and we found that code objects normally represent 3-6% of the average memory size of the program.

We understand that the extra cost of this information may not be acceptable for some users, so we propose an opt-out mechanism which will cause generated code objects to not have the extra information, while also allowing pyc files to not include the extra information.

Specification

In order to have enough information to correctly resolve the location within a given line where an error was raised, a map linking bytecode instructions to column offsets (start and end offset) and end line numbers is needed. This is similar in fashion to how line numbers are currently linked to bytecode instructions.

The following changes will be performed as part of the implementation of this PEP:

  • The offset information will be exposed to Python via a new attribute in the code object class called co_positions that will return a sequence of four-element tuples containing the full location of every instruction (including start line, end line, start column offset and end column offset) or None if the code object was created without the offset information.
  • One new C-API function:
    int PyCode_Addr2Location(
        PyCodeObject *co,
        int addrq,
        int *start_line,
        int *start_column,
        int *end_line,
        int *end_column)

    will be added so the end line, the start column offset and the end column offset can be obtained given the index of a bytecode instruction. This function will set the values to 0 if the information is not available.

The internal storage, compression and encoding of the information is left as an implementation detail and can be changed at any point as long as the public API remains unchanged.
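
As a purely illustrative sketch (not part of the specification; the helper name below is hypothetical), the information exposed through co_positions could be consumed from Python code roughly as follows, mirroring at the Python level what PyCode_Addr2Location offers at the C level:

def location_of_instruction(code, index):
    # Each entry yielded by co_positions() describes one bytecode instruction
    # as a (start_line, end_line, start_column, end_column) tuple; entries may
    # contain None when the location information is not available.
    for i, position in enumerate(code.co_positions()):
        if i == index:
            return position
    raise IndexError(f"code object has no instruction with index {index}")

code = compile("x['a']['b']['c']['d'] = 1", "<example>", "exec")
for index in range(4):
    print(index, location_of_instruction(code, index))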

Offset semantics

These offsets are propagated by the compiler from the ones stored currently in all AST nodes. The output of the public APIs (co_positions and PyCode_Addr2Location) that deal with these attributes uses 0-indexed offsets (just like the AST nodes), but the underlying implementation is free to represent the actual data in whatever form it chooses to be most efficient. Missing information is reported as None by the co_positions() API and as -1 by the PyCode_Addr2Location API. The availability of the information depends on whether the offsets fall within the representable range, as well as on the runtime configuration of the interpreter.

The AST nodes use int types to store these values. The current implementation, however, utilizes uint8_t types as an implementation detail to minimize storage impact. This decision allows offsets to go from 0 to 255, while offsets bigger than these values will be treated as missing (returning -1 from the PyCode_Addr2Location API and None from the co_positions() API).

As specified previously, the underlying storage of the offsets should be considered an implementation detail, as the public APIs to obtain these values will return either C int types or Python int objects. This allows better compression and encoding schemes to be implemented in the future if bigger ranges need to be supported. This PEP proposes to start with this simpler version and defer improvements to future work.
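
As a non-normative illustration of this 0-indexed convention, the offsets reported for a simple assignment can be compared with the ones stored on the corresponding AST node (the exact per-instruction values depend on the compiler version):

import ast

source = "x['a']['b']['c']['d'] = 1"

# The AST node for the assignment target stores 0-indexed column offsets...
target = ast.parse(source).body[0].targets[0]
print(target.col_offset, target.end_col_offset)

# ...and co_positions() reports locations using the same 0-indexed convention,
# with None used where the information is unavailable.
code = compile(source, "<example>", "exec")
for start_line, end_line, start_col, end_col in code.co_positions():
    print(start_line, end_line, start_col, end_col)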

Displaying tracebacks

When displaying tracebacks, the default exception hook will be modified to query this information from the code objects and use it to display a sequence of carets for every displayed line in the traceback if the information is available. For instance:

File"test.py",line6,inlelreturn1+foo(a,b,c=x['z']['x']['y']['z']['y'],d=e)~~~~~~~~~~~~~~~~^^^^^TypeError:'NoneType'objectisnotsubscriptable

When displaying tracebacks, instruction offsets will be taken from the traceback objects. This makes highlighting exceptions that are re-raised work naturally without the need to store the new information in the stack. For example, for this code:

def foo(x):
    1 + 1/0 + 2

def bar(x):
    try:
        1 + foo(x) + foo(x)
    except Exception as e:
        raise ValueError("oh no!") from e

bar(bar(bar(2)))

The printed traceback would look like this:

Traceback (most recent call last):
  File "test.py", line 6, in bar
    1 + foo(x) + foo(x)
        ^^^^^^
  File "test.py", line 2, in foo
    1 + 1/0 + 2
        ~^~
ZeroDivisionError: division by zero

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "test.py", line 10, in <module>
    bar(bar(bar(2)))
            ^^^^^^
  File "test.py", line 8, in bar
    raise ValueError("oh no!") from e
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: oh no!

While this code:

def foo(x):
    1 + 1/0 + 2

def bar(x):
    try:
        1 + foo(x) + foo(x)
    except Exception:
        raise

bar(bar(bar(2)))

Will be displayed as:

Traceback (most recent call last):
  File "test.py", line 10, in <module>
    bar(bar(bar(2)))
            ^^^^^^
  File "test.py", line 6, in bar
    1 + foo(x) + foo(x)
        ^^^^^^
  File "test.py", line 2, in foo
    1 + 1/0 + 2
        ~^~
ZeroDivisionError: division by zero

Maintaining the current behavior, only a single line will be displayed in tracebacks. For instructions that span multiple lines (the end offset and the start offset belong to different lines), the end line number must be inspected to know if the end offset applies to the same line as the starting offset.
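
As a hedged sketch of how a tool could consume this per-instruction data for the frame recorded in a traceback object (shown here through the dis module and the tb_lasti attribute as exposed by CPython 3.11, rather than through an API defined by this PEP):

import dis

def failing():
    x = {'a': {'b': None}}
    return x['a']['b']['c']

try:
    failing()
except TypeError as exc:
    tb = exc.__traceback__
    # Walk to the innermost frame, where the exception was actually raised.
    while tb.tb_next is not None:
        tb = tb.tb_next
    code = tb.tb_frame.f_code
    # tb_lasti records the offset of the instruction that raised; matching it
    # against the disassembly recovers the positions stored for that instruction.
    for instruction in dis.get_instructions(code):
        if instruction.offset == tb.tb_lasti:
            print(instruction.opname, instruction.positions)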

Opt-out mechanism

To offer an opt-out mechanism for those users that care about the storage and memory overhead, and to allow third party tools and other programs that are currently parsing tracebacks to catch up, the following methods will be provided to deactivate this feature:

  • A new environment variable: PYTHONNODEBUGRANGES.
  • A new command line option for the dev mode: python -X no_debug_ranges.

If any of these methods are used, the Python compiler will not populate code objects with the new information (None will be used instead), and any unmarshalled code objects that contain the extra information will have it stripped away and replaced with None. Additionally, the traceback machinery will not show the extended location information even if the information was present. This method allows users to:

  • Create smaller pyc files by using one of the two methods when said files are created.
  • Don’t load the extra information from pyc files if those were created with the extra information in the first place.
  • Deactivate the extra information when displaying tracebacks (the caret characters indicating the location of the error).

Doing this has a very small performance hit, as the interpreter state needs to be fetched when code objects are created to look up the configuration. Creating code objects is not a performance sensitive operation, so this should not be a concern.
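
As a rough, non-normative sketch, a tool that needs to cope with the opt-out (or with older interpreters) could probe a code object before relying on the extra data; the helper name is hypothetical:

def has_column_info(code):
    """Return True if ``code`` carries usable column offset information."""
    co_positions = getattr(code, "co_positions", None)
    if co_positions is None:
        # Interpreters that predate this PEP do not expose the attribute.
        return False
    # When the feature is deactivated (for example via PYTHONNODEBUGRANGES or
    # -X no_debug_ranges), the column entries are reported as None.
    return any(
        start_col is not None and end_col is not None
        for _start_line, _end_line, start_col, end_col in co_positions()
    )

print(has_column_info(has_column_info.__code__))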

Backwards Compatibility

The change is fully backwards compatible.

Reference Implementation

A reference implementation can be found in the implementation fork.

Rejected Ideas

Use a single caret instead of a range

It has been proposed to use a single caret instead of highlighting the full range when reporting errors as a way to simplify the feature. We have decided to not go this route for the following reasons:

  • Deriving the location of the caret is not straightforward using the current layout of the AST. This is because the AST nodes only record the start and end line numbers as well as the start and end column offsets. As the AST nodes do not preserve the original tokens (by design), deriving the exact location of some tokens is not possible without extra re-parsing. For instance, currently binary operators have nodes for the operands, but the type of the operator is stored in an enumeration, so its location cannot be derived from the node (this is just one example of how this problem manifests, and not the only one).
  • Deriving the ranges from AST nodes greatly simplifies the implementation and significantly reduces the maintenance cost and the possibility of errors. This is because using the ranges can always be done generically for any AST node, while any other custom information would need to be extracted differently from different types of nodes. Given how error-prone getting the locations was when generating the AST used to be a manual process, we believe that a generic solution is a very important property to pursue.
  • Storing the information to highlight a single caret will be very limiting for tools such as coverage tools and profilers, as well as for tools like IPython and IDEs that want to make use of this new feature. As this message from the author of “friendly-traceback” mentions, the reason is that without the full range (including end lines) these tools will find it very difficult to correctly highlight the relevant source code. For instance, for this code:
    something = foo(a, b, c) if bar(a, b, c) else other(b, c, d)

    tools (such as coverage reporters) want to be able to highlight the totality of the call that is covered by the executed bytecode (let’s say foo(a, b, c)) and not just a single character. Even if it is technically possible to re-parse and re-tokenize the source code to reconstruct the information, it is not possible to do this reliably and it would result in a much worse user experience.

  • Many users have reported that a single caret is much harder to read than a full range, and this motivated using ranges to highlight syntax errors, which was very well received. Additionally, it has been noted that users with vision problems can identify the ranges much more easily than a single caret character, which we believe is a great advantage of using ranges.

Have a configure flag to opt out

Having a configure flag to opt out of the overhead even when executing Python in non-optimized mode may sound desirable, but it may cause problems when reading pyc files that were created with a version of the interpreter that was not compiled with the flag activated. This can lead to crashes that would be very difficult to debug for regular users and will make different pyc files incompatible with each other. As these pyc files could be shipped as part of libraries or applications without the original source, it is also not always possible to force recompilation of said pyc files. For these reasons we have decided to use the -O flag to opt out of this behaviour.

Lazy loading of column information

One potential solution to reduce the memory usage of this feature is to not load the column information from the pyc file when code is imported. Only if an uncaught exception bubbles up or if a call to the C-API functions is made will the column information be loaded from the pyc file. This is similar to how we only read source lines to display them in the traceback when an exception bubbles up. While this would indeed lower memory usage, it also results in a far more complex implementation requiring changes to the importing machinery to selectively ignore a part of the code object. We consider this an interesting avenue to explore, but ultimately we think it is out of the scope for this particular PEP. It also means that column information will not be available if the user is not using pyc files or for code objects created dynamically at runtime.

Implement compression

Although it would be possible to implement some form of compression over the pyc files and the new data in code objects, we believe that this is out of the scope of this proposal due to its larger impact (in the case of pyc files) and the fact that we expect column offsets to not compress well due to the lack of patterns in them (in the case of the new data in code objects).

Acknowledgments

Thanks to Carl Friedrich Bolz-Tereick for showing an initial prototype of this idea for the PyPy interpreter and for the helpful discussion.

Copyright

This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.

