Debug emit

The F# compiler code base emits debug information and attributes. This article documents what we do, how it is implemented and the problem areas in our implementation.

There are mistakes and missing pieces to our debug information. Small improvements can make a major difference. Please help us fix mistakes and get things right.

The file `tests\walkthroughs\DebugStepping\TheBigFileOfDebugStepping.fsx` is crucial for testing the stepping experience for a range of constructs.

User experiences

Debugging information affects numerous user experiences:

Call stacks during debugging
Breakpoint placement before and during debugging
Locals during debugging
Just my code debugging (which limits the view of debug code to exclude libraries)
Exception debugging (e.g. "first chance" debugging when exceptions occur)
Stepping debugging
Watch window
Profiling results
Code coverage results

Some experiences are un-implemented by F# including:

Autos during debugging
Edit and Continue
Hot reload

Emitted information

Emitted debug information includes:

The names of methods in .NET IL
The PDB file/information (embedded or in PDB file) which contains
- Debug "sequence" points for IL code
- Names of locals and the IL code scopes over which those names are active
The attributes on IL methods such as `CompilerGeneratedAttribute` and `DebuggerNonUserCodeAttribute`, see below
We add some codegen to give better debug experiences, see below.

We almost always now emit the Portable PDB format.

Design-time services

IDE tooling performs queries into the F# language service, notably:

ValidateBreakpointLocation is called to validate every breakpoint before debugging is launched. This operates on syntax trees. See notes below.

Debugging and optimization

Nearly all optimizations are off when debug code is being generated.

The optimizer is run for forced inlining only
List and array expressions do generate collector code
State machines are generated for tasks and sequences
"let mutable" --> "ref" promotion happens for captured local mutables
Tailcalls are off by default and not emitted in IlxGen.

Otherwise, what comes out of the type checker is pretty much what goes into IlxGen.fs.
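As an illustration of the "let mutable" to "ref" promotion listed above, here is a minimal sketch. The function names are made up for this example, and the second function is only a hand-written approximation of the promoted form, not the compiler's actual output.

```fsharp
// Hypothetical names; a sketch of the promotion, not real compiler output.
let makeCounter () =
    let mutable count = 0 // captured by the closure below, so it is promoted to a ref cell
    fun () ->
        count <- count + 1
        count

// Roughly the same function with the reference cell written out by hand.
let makeCounterExplicit () =
    let count = ref 0
    fun () ->
        count.Value <- count.Value + 1
        count.Value
```

Both functions return counters that behave identically; the difference is only in whether the reference cell is visible in the source.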

We use the terms "sequence point" and "debug point" interchangeably. The word "sequence" has too many meanings in the F# compiler so in the actual code you'll see "DebugPoint" more often, though for abbreviations you may see `spFoo` or `mFoo`.

How breakpoints work (high level)

Breakpoints have two existences which must give matching behavior:

At design-time, before debugging is launched, `ValidateBreakpointLocation` is called to validate every breakpoint. This operates on the SyntaxTree and forms a kind of "gold-standard" about the exact places where break points are valid.
At run-time, breakpoints are "mapped" by the .NET runtime to actual sequence points found in the PDB data for .NET methods. The runtime searches all methods with debug points for the relevant document and determines where to "bind" the actual breakpoint to. A typical debugger can bind a breakpoint to multiple locations.

This means there is an invariant that `ValidateBreakpointLocation` and the emitted IL debug points correspond.

NOTE: The IL code can and does contain extra debug points that don't pass ValidateBreakpointLocation. It won't be possible to set a breakpoint for these, but they will appear in stepping.

Intended debug points based on syntax

The intended debug points for constructs are determined by syntax as follows. Processing depends on whether a construct is being processed as "control-flow" or not. This means at least one debug point will be placed, either over the whole expression or some of its parts.

The bodies of functions, methods, lambdas and initialization code for top-level-bindings are all processed as control flow
Each Upper-Cased EXPR below is processed as control-flow (the bodies of loops, conditionals etc.)
Leaf expressions are the other composite expressions like applications that are not covered by the other constructs.
The sub-expressions of leaf expressions are not processed as control-flow.

Construct	Debug points
`let x = leaf-expr in BODY-EXPR`	Debug point over `let x = leaf-expr`.
`let x = NON-LEAF-EXPR in BODY-EXPR`
`let f x = BODY-EXPR in BODY-EXPR`
`let rec f x = BODY-EXPR and g x = BODY-EXPR in BODY-EXPR`
`if guard-expr then THEN-EXPR`	Debug point over `if guard-expr then`
`if guard-expr then THEN-EXPR else ELSE-EXPR`	Debug point over `if .. then`
`match .. with ...`	Debug point over `match .. with`
`... -> TARGET-EXPR`
`... when WHEN-EXPR -> TARGET-EXPR`
`while .. do BODY-EXPR`	Debug point over `while .. do`
`for .. in collection-expr do BODY-EXPR`	Debug points over `for`, `in` and `collection-expr`
`try TRY-EXPR with .. -> HANDLER-EXPR`	Debug points over `try` and `with`
`try TRY-EXPR finally FINALLY-EXPR`	Debug points over `try` and `finally`
`use x = leaf-expr in BODY-EXPR`	Debug point over `use x = leaf-expr`.
`use x = NON-LEAF-EXPR in BODY-EXPR`
`EXPR; EXPR`
`(fun .. -> BODY-EXPR)`	Not a leaf, do not produce a debug point on outer expression, but include them on BODY-EXPR
`{ new C(args) with member ... = BODY-EXPR }`
Pipe `EXPR1 && EXPR2`
Pipe `EXPR1 || EXPR2`
Pipe `EXPR1 |> EXPR2`
Pipe `(EXPR1, EXPR2) ||> EXPR3`
Pipe `(EXPR1, EXPR2, EXPR3) |||> EXPR4`
`yield leaf-expr`	Debug point over `yield expr`
`yield! leaf-expr`	Debug point over `yield! expr`
`return leaf-expr`	Debug point over `return expr`
`return! leaf-expr`	Debug point over `return! expr`
`[ BODY ]`	See notes below. If a computed list expression with yields (explicit or implicit) then process as control-flow. Otherwise treat as leaf
`[| BODY |]`	See notes below. If a computed array expression with yields (explicit or implicit) then process as control-flow. Otherwise treat as leaf
`seq { BODY }`	See notes below
`builder { BODY }`	See notes below
`f expr`, `new C(args)`, constants or other leaf	Debug point when being processed as control-flow. The sub-expressions are processed as non-control-flow.
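To make a few rows of the table concrete, here is a small sketch with the intended debug points written as comments. The functions are hypothetical and the comments only restate the table; they have not been checked against what the compiler actually emits.

```fsharp
// Hypothetical functions; comments restate the intended debug points from the table.
let describe (values: int list) =
    let total = List.sum values // leaf binding: one debug point over `let total = List.sum values`
    if total > 10 then          // debug point over `if total > 10 then`
        "large"                 // THEN-EXPR is control-flow, so the leaf gets a debug point
    else
        "small"                 // ELSE-EXPR likewise

let countDown (start: int) =
    let mutable n = start       // leaf binding: debug point over the whole `let`
    while n > 0 do              // debug point over `while n > 0 do`
        n <- n - 1              // the loop body is processed as control-flow
    n
```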

Intended debug points for let-bindings

Simple `let` bindings get debug points that extend over the `let` (if the thing is not a function and the implementation is a leaf expression):

    let f () =
        let x = 1 // debug point for whole of `let x = 1`
        let f x = 1 // no debug point on `let f x =`, debug point on `1`
        let x = if today then 1 else tomorrow // no debug point on `let x =`, debug point on `if today then` and `1` and `tomorrow`
        let x = let y = 1 in y + y // no debug point on `let x =`, debug point on `let y = 1` and `y + y`
        ...

Intended debug points for nested control-flow

Debug points are not generally emitted for constituent parts of non-leaf constructs, in particular function applications, e.g. consider:

    let h1 x = g (f x)
    let h2 x = x |> f |> g

Here `g (f x)` gets one debug point covering the whole expression. The corresponding pipelining gets three debug points.

If however a nested expression is control-flow, then debug points start being emitted again e.g.

    let h3 x = f (if today then 1 else 2)

Here debug points are at `if today then` and `1` and `2` and all of `f (if today then 1 else 2)`

NOTE: these debug points are overlapping. That's life.

Intended debug points for `[...]`, `[| ... |]` code

The intended debug points for computed list and array expressions are the same as for the expressions inside the constructs. For example

    let x = [ for i in 1 .. 10 do yield 1 ]

This will have debug points on `for i in 1 .. 10 do` and `yield 1`.

Intended debug points for `seq { .. }` and `task { .. }` code

The intended debug points for tasks are the same as for the expressions inside the constructs. For example

    let f () = task { for i in 1 .. 10 do printfn "hello" }

This will have debug points on `for i in 1 .. 10 do` and `printfn "hello"`.

NOTE: there are glitches, see further below

Intended debug points for other computation expressions

Other computation expressions such as `async { .. }` or `builder { ... }` get debug points as follows:

A debug point for `builder` prior to the evaluation of the expression
In the de-sugaring of the computation expression, at each point where a lambda is created implicitly, the body of that lambda as specified by the F# language spec is treated as control-flow and debug points are added per the earlier spec.
For every `builder.Bind`, `builder.BindReturn` and similar call that corresponds to a `let` where there would be a debug point, a debug point is added immediately prior to the call.
For every `builder.For` call, a debug point covering the `for` keyword is added immediately prior to the call. No debug point is added for the `builder.For` call itself even if used in statement position.
For every `builder.While` call, a debug point covering the `while` keyword plus guard expression is added immediately prior to the execution of the guard within the guard lambda expression. No debug point is added for the `builder.While` call itself even if used in statement position.
For every `builder.TryFinally` call, a debug point covering the `try` keyword is added immediately within the body lambda expression. A debug point covering the `finally` keyword is added immediately within the finally lambda expression. No debug point is added for the `builder.TryFinally` call itself even if used in statement position.
For every `builder.Yield`, `builder.Return`, `builder.YieldFrom` or `builder.ReturnFrom` call, debug points are placed on the expression as if it were control flow. For example `yield 1` will place a debug point on `1` and `yield! printfn "hello"; [2]` will place two debug points.
No debug point is added for the `builder.Run` or `builder.Delay` calls at the entrance to the computation expression, nor the `builder.Delay` calls implied by `try/with` or `try/finally` or sequential `Combine` calls.

The computations are often "cold-start" anyway, leading to a two-phase debug problem.

The "step-into" and "step-over" behaviour for computation expressions is often buggy because it is performed with respect to the de-sugaring and inlining rather than the original source. For example, a "step over" on a "while" with a non-inlined `builder.While` will step over the whole call, when the user expects it to step the loop. One approach is to inline the `builder.While` method, and apply `[<InlineIfLambda>]` to the body function. This however has only limited success, as at some points inlining fails to fully flatten. Builders implemented with resumable code tend to be much better in this regard, as more complete inlining and code-flattening is applied.
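The inlining approach described above might look like the following minimal sketch. `LoopBuilder`, `loop` and `countTo` are hypothetical names invented for this example; the only assumption about FSharp.Core is the `InlineIfLambda` attribute (F# 6 or later). Whether stepping actually improves depends on how fully the compiler flattens the result.

```fsharp
// Hypothetical builder, for illustration only.
type LoopBuilder() =
    member inline _.Zero() = ()
    member inline _.Delay([<InlineIfLambda>] f: unit -> unit) : unit -> unit = f
    member inline _.Run([<InlineIfLambda>] f: unit -> unit) : unit = f ()
    // Inlined `While` whose guard and body lambdas are candidates for flattening
    // into a plain `while` loop at the call site.
    member inline _.While([<InlineIfLambda>] guard: unit -> bool, [<InlineIfLambda>] body: unit -> unit) : unit =
        while guard () do
            body ()

let loop = LoopBuilder()

let countTo (limit: int) =
    let i = ref 0
    loop {
        while i.Value < limit do
            i.Value <- i.Value + 1
    }
    i.Value
```

With a non-inlined `While`, the loop in `countTo` would be a single method call taking two closures, which is the shape that makes "step over" skip the whole loop.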

Intended debug points for implicit constructors

The `let` and `do` bindings of an implicit constructor generally get debug points as if it were a function.
`inherit SubClass(expr)` gets a debug point. If there is no `inherit`, an initial debug point is placed over the text of the arguments.

e.g.

    type C(args) =
        let x = 1+1 // debug point over `let x = 1+1` as the only side effect
        let f x = x + 1
        member _.P = x + f 4

    type C(args) =
        do printfn "hello" // debug point over `printfn "hello"` as side effect
        static do printfn "hello" // debug point over `printfn "hello"` as side effect for static init
        let f x = x + 1
        member _.P = f 4

    type C(args) = // debug point over `(args)` since there's no other place to stop on object construction
        let f x = x + 1
        member _.P = 4

Internal implementation of debug points in the compiler

Most (but not all) debug points are noted by the parser by adding `DebugPointAtTry`, `DebugPointAtWith`, `DebugPointAtFinally`, `DebugPointAtFor`, `DebugPointAtWhile`, `DebugPointAtBinding` or `DebugPointAtLeaf`.

These are then used by `ValidateBreakpointLocation`. These same values are also propagated unchanged all the way through to `IlxGen.fs` for actual code generation, and used for IL emit, e.g. a simple case like this:

    match spTry with
    | DebugPointAtTry.Yes m -> CG.EmitDebugPoint cgbuf m ...
    | DebugPointAtTry.No -> ...
    ...

For many constructs this is adequate. However, in practice the situation is far more complicated.
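The simple case can be modelled in isolation with a toy version of the types. Everything below is a hypothetical stand-in: `Range`, this `DebugPointAtTry` and `emitTry` are simplified for the example and are not the compiler's definitions.

```fsharp
// Hypothetical stand-ins for the compiler's types; only the shape matters here.
type Range = { StartLine: int; EndLine: int }

type DebugPointAtTry =
    | Yes of Range
    | No

/// Records a debug point for a `try` when the parser noted one, otherwise records nothing.
let emitTry (emitted: ResizeArray<Range>) (spTry: DebugPointAtTry) =
    match spTry with
    | DebugPointAtTry.Yes m -> emitted.Add m
    | DebugPointAtTry.No -> ()
```

The point of the shape is that the decision is made once, in the parser, and the code generator only has to act on it.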

Internals: Debug points for `[...]`, `[| ... |]`

The internal implementation of debug points for list and array expressions is conceptually simple but a little complex in practice.

Conceptually the task is easy, e.g. `[ while check() do yield x + x ]` is lowered to code like this:

    let $collector = ListCollector<int>()
    while check() do
        $collector.Add(x+x)
    $collector.Close()

Note the `while` loop is still a `while` loop - no magic here - and the debug points for the `while` loop can also apply to the actual generated `while` loop.
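The lowering can be tried out by hand. The sketch below assumes the `ListCollector<'T>` struct from `Microsoft.FSharp.Core.CompilerServices` (FSharp.Core 6 or later); the function names and the counter-based loop condition are made up for the example, and the second function is only an approximation of the generated code.

```fsharp
open Microsoft.FSharp.Core.CompilerServices

// The source form: a computed list expression containing a `while` loop.
let viaListExpression (x: int) (count: int) =
    let remaining = ref count
    [ while remaining.Value > 0 do
          remaining.Value <- remaining.Value - 1
          yield x + x ]

// A hand-written approximation of the lowered form.
let viaCollector (x: int) (count: int) =
    let mutable remaining = count
    let mutable collector = ListCollector<int>() // a mutable struct, so it must be `let mutable`
    while remaining > 0 do
        remaining <- remaining - 1
        collector.Add(x + x)
    collector.Close()
```

Both functions produce the same list; the second makes it visible that the `while` loop survives the lowering unchanged.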

However, the actual implementation is more complicated because there is a TypedTree representation of the code in-between that at first seems to bear little resemblance to what comes in.

SyntaxTree --[CheckComputationExpressions.fs]--> TypedTree --> IlxGen --[LowerComputedListOrArrayExpr.fs]--> IlxGen

The TypedTree is a functional encoding into `Seq.toList`, `Seq.singleton` and so on. How do the debug points get propagated?

In `CheckComputationExpressions.fs` we "note" the debug point for the For loop and attach it to one of the lambdas generated in the TypedTree form
In `LowerSequences.fs` we "recover" the debug point from precisely that lambda.
In `IlxGen.fs` this becomes an actual debug point in the actual generated "while" loop.

This then gives accurate debug points for these constructs.

Internals: debug points for `seq { .. }` code

Debug points for `seq { .. }` compiling to state machines pose similar problems.

The de-sugaring is as for list and array expressions
The debug points are recovered in the state machine generation

Internals: debug points for `task { .. }` code

Debug points for `task { .. }` pose much harder problems. We use "while" loops as an example:

The de-sugaring is as for computation expressions, and in `CheckComputationExpressions.fs` places a debug point for `while` directly before the evaluation of the guard
The code is then checked and optimized, and all the resumable code is inlined, and this debug point is preserved throughout this process.

Internals: debug points for other computation expressions

As mentioned above, other computation expressions such as `async { .. }` have significant problems with their debug points.

The main problem is stepping: even after inlining, the code for computation expressions is rarely "flattened" enough, so, for example, a "step-into" is required to get into the second part of an `expr1; expr2` construct (i.e. an `async.Combine(..., async.Delay(fun () -> ...))`) where the user expects to press "step-over".
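To see where the `Combine` and `Delay` calls come from, here is a small `async` and a hand-written approximation of its de-sugaring, using the public `AsyncBuilder` methods. The function names are made up, and the real translation may differ in detail.

```fsharp
// Source form: a control-flow construct followed by a second part.
let sugared (flag: bool) =
    async {
        if flag then
            do! Async.Sleep 1
        return 42
    }

// Hand-written approximation of the de-sugaring; the real translation may differ in detail.
let desugared (flag: bool) =
    async.Delay(fun () ->
        let first =
            if flag then async.Bind(Async.Sleep 1, fun () -> async.Zero())
            else async.Zero()
        // The second part lives in its own lambda, which is why reaching it
        // from the first part can need a "step-into".
        let second = async.Delay(fun () -> async.Return 42)
        async.Combine(first, second))
```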

Breakpoints tend to be less problematic.

NOTE: A systematic solution for quality debugging of computation expressions code is still elusive, and especially for `async { ... }`. Extensive use of inlining and `InlineIfLambda` can succeed in flattening most simple computation expression code. This is however not yet fully applied to `async` programming.

NOTE: The use of library code to implement "async" and similar computation expressions also interacts badly with "Just My Code" debugging, see https://github.com/dotnet/fsharp/issues/5539 for example.

NOTE: As mentioned, the use of many functions to implement "async" and friends interacts badly with "Step Into" and "Step Over" and related attributes, see for example https://github.com/dotnet/fsharp/issues/3359

FeeFee and F00F00 debug points (Hidden and JustMyCodeWithNoSource)

Some fragments of code use constructs that generate calls and other IL code that should not have debug points and should not participate in "Step Into", for example. These are generated in IlxGen as "FeeFee" debug points. See the Portable PDB spec.

TODO: There is also the future prospect of generating `JustMyCodeWithNoSource` (0xF00F00) debug points but these are not yet emitted by F#. We should check what this is and when the C# compiler emits these.

NOTE: We always make space for a debug point at the head of each method by emitting a FeeFee debug sequence point. This may be immediately replaced by a "real" debug point.

Generated code

The F# compiler generates entire IL classes and methods for constructs such as records, closures, state machines and so on. Each time code is generated we must carefully consider what attributes and debug points are generated.

Generated "augment" methods for records, unions and structs

Generated methods for equality, hash and comparison on records, unions and structs do not get debug points at all.

NOTE: Methods without debug points (or with only 0xFEEFEE debug points) are shown as "no code available" in Visual Studio - or in Just My Code they are hidden altogether - and are removed from profiling traces (in profiling, their costs are added to the cost of the calling method).

TODO: we should also consider emitting `ExcludeFromCodeCoverageAttribute`, being assessed at time of writing, however the absence of debug points should be sufficient to exclude these.

Generated "New", "Is", "Tag" etc. for unions

Discriminated unions generate `NewXYZ`, `IsXYZ`, `Tag` etc. members. These do not get debug points at all.

These methods also get `CompilerGeneratedAttribute` and `DebuggerNonUserCodeAttribute`.
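One way to look at these generated members is through reflection. The sketch below uses a made-up `Shape` union and only lists member names; it makes no claim about which attributes each member carries.

```fsharp
// Hypothetical union used only for this example.
type Shape =
    | Circle of radius: float
    | Square of side: float

// Collect the names of the public members that follow the New/Is/Tag pattern.
let generatedMemberNames =
    typeof<Shape>.GetMembers()
    |> Array.map (fun m -> m.Name)
    |> Array.filter (fun name -> name.StartsWith("New") || name.StartsWith("Is") || name = "Tag")
    |> Array.distinct
    |> Array.sort
```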

TODO: we should also consider emitting `ExcludeFromCodeCoverageAttribute`, being assessed at time of writing, however the absence of debug points should be sufficient to exclude these.

TODO: the `NewABC` methods are missing `CompilerGeneratedAttribute` and `DebuggerNonUserCodeAttribute`. However, the absence of debug points should be sufficient to exclude these from code coverage and profiling.

Generated closures for lambdas

The debug codegen involved in closures is as follows:

Source	Construct	Debug Points	Attributes
(fun x -> ...)	Closure class
	`.ctor` method	none	CompilerGenerated, DebuggerNonUserCode
	`Invoke` method	from body of closure
generic local defn	Closure class
	`.ctor` method	none	CompilerGenerated, DebuggerNonUserCode
	`Specialize` method	from body of closure
Intermediate closure classes	For long curried closures `fun a b c d e f -> ...`.		CompilerGenerated, DebuggerNonUserCode

Generated intermediate closure methods do not get debug points, and are labelled CompilerGenerated and DebuggerNonUserCode.

TODO: we should also consider emitting `ExcludeFromCodeCoverageAttribute`, being assessed at time of writing

Generated state machines for`seq { .. }`

Sequence expressions generate class implementations which resemble closures.

The debug points recovered for the generated state machine code for `seq { ... }` are covered above. The other codegen is as follows:

Source	Construct	Debug Points	Attributes
seq { ... }	State machine class		"Closure"
	`.ctor` method	none	none
	`GetFreshEnumerator`	none	CompilerGenerated, DebuggerNonUserCode
	`LastGenerated`	none	CompilerGenerated, DebuggerNonUserCode
	`Close`	none	none
	`get_CheckClose`	none	none
	`GenerateNext`	from desugaring	none

NOTE: it appears from the code that extraneous debug points are not being generated, which is good, though this should be checked

TODO: we should likely be generating `CompilerGeneratedAttribute` and `DebuggerNonUserCodeAttribute` attributes for the `Close` and `get_CheckClose` and `.ctor` methods

TODO: we should also consider emitting `ExcludeFromCodeCoverageAttribute`, being assessed at time of writing

Generated state machines for`task { .. }`

Resumable state machines used for `task { .. }` also generate struct implementations which resemble closures.

The debug points recovered for the generated state machine code for `task { ... }` are covered above. The other codegen is as follows:

Source	Construct	Debug Points	Attributes
task { ... }	State machine struct		"Closure"
	`.ctor` method	none	none
	TBD

TODO: we should be generating attributes for some of these

TODO: we should assess that only the "MoveNext" method gets any debug points at all

TODO: Currently stepping into a task-returning method needs a second `step-into` to get into the MoveNext method of the state machine. We should emit the `StateMachineMethod` and `StateMachineHoistedLocalScopes` tables into the PDB to get better debugging into `task` methods. See https://github.com/dotnet/fsharp/issues/12000.

Generated code for delegate constructions `Func<int,int,int>(fun x y -> x + y)`

A closure class is generated. Consider the code

    open System
    let d = Func<int,int,int>(fun x y -> x + y)

There is one debug point over all of `Func<int,int,int>(fun x y -> x + y)` and one over `x+y`.

Generated code for constant-sized array and list expressions

These are not generally problematic for debug.

Generated code for large constant arrays

These are not generally problematic for debug.

Generated code for pattern matching

The implementation is a little gnarly and complicated and has historically had glitches.

Generated code for conditionals and boolean logic

Generally straight-forward. See for example this proposed feature improvement.

Capture and closures

Captured locals are available via the `this` pointer of the immediate closure. Un-captured locals are not available as things stand. See for example this proposed feature improvement.

Consider this code:

    let F() =
        let x = 1
        let y = 2
        (fun () -> x + y)

Here `x` and `y` become closure fields of the closure class generated for the final lambda. When inspecting locals in the inner closure, the C# expression evaluator we rely on for Visual Studio takes local names like `x` and `y` and is happy to look them up via `this`. This means hovering over `x` correctly produces the value stored in `this.x`.
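The shape being described can be pictured with a hand-written class. `ClosureSketch` and `makeClosure` are hypothetical names, and the real generated class differs (for example, it is a compiler-generated function type rather than an ordinary class); the sketch only shows why a lookup of `x` via `this` succeeds.

```fsharp
// Hypothetical class, written by hand to mirror the shape described above.
type ClosureSketch(x: int, y: int) =
    member _.x = x // captured local, reachable as `this.x`
    member _.y = y // captured local, reachable as `this.y`
    member this.Invoke() = this.x + this.y

// Plays the role of `F` above: captures both locals into the closure object.
let makeClosure () =
    let x = 1
    let y = 2
    ClosureSketch(x, y)
```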

For nested closures, values are implicitly re-captured, and again the captured locals will be available.

However this doesn't work with "capture" from a class-defined "let" context. Consider the following variation:

    type C() =
        let x = 1
        member _.M() =
            let y = 2
            (fun () -> x + y)

Here the implicitly captured local is `y`, but `x` is not captured; instead it is implicitly rewritten by the F# compiler to `c.x` where `c` is the captured outer "this" pointer of the invocation of `M()`. This means that hovering over `x` does not produce a value. See issue 3759.

Provided code

Code provided by erasing type providers has all debugging points removed. It isn't possible to step into such code; if there are implicit debug points, they will have the same range as the construct that was macro-expanded by the code erasure.

For example, a provided if/then/else expression has no debug point.

Added code generation for better debugging

We do some "extra" code gen to improve debugging. It is likely much of this could be removed if we had an expression evaluator for F#.

'this' value

For `member x.Foo() = ...` the implementation of the member adds a local variable `x` containing the `this` pointer from `ldarg.0`. This means hovering over `x` in the method produces the right value, as does `x.Property` etc.

Pipeline debugging

For pipeline debugging we emit extra locals for each stage of a pipe and debug points at each stage.

See pipeline debugging mini-spec.
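As a rough picture of the effect, the pipeline below behaves under the debugger somewhat like the second function, where each stage has its own local. The names `stage1` and `stage2` are invented for this sketch; the names the compiler actually gives the extra locals are defined in the mini-spec.

```fsharp
let viaPipeline (xs: int list) =
    xs
    |> List.map (fun v -> v * 2)
    |> List.filter (fun v -> v > 2)
    |> List.sum

// Hand-written approximation: one named local (and one place to stop) per stage.
let viaStageLocals (xs: int list) =
    let stage1 = xs |> List.map (fun v -> v * 2)
    let stage2 = stage1 |> List.filter (fun v -> v > 2)
    List.sum stage2
```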

Shadowed locals

For shadowed locals we change the name of a local for the scope for which it is shadowed.

See shadowed locals mini-spec.

Discriminated union debug display text

For discriminated union types and all implied subtypes we emit a `DebuggerDisplayAttribute` and a private `__DebugDisplay()` method that uses `sprintf "%+0.8A" obj` to format the object.
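The same format string can be applied by hand to get a feel for the display text. The `Shape` union and `debugText` function are made up for this sketch, and the exact text produced may vary between FSharp.Core versions.

```fsharp
// Hypothetical union used only for this example.
type Shape =
    | Circle of radius: float
    | Square of side: float

// Formats a value with the same format string the generated method is described as using.
let debugText (value: Shape) = sprintf "%+0.8A" value
```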

Missing debug emit

Missing debug emit for PDBs

Our PDB emit is missing considerable information:

Not emitted: LocalConstants table
Not emitted: Compilation options table
Not emitted: Dynamic local variables table
Not emitted: StateMachineMethod table and StateMachineHoistedLocalScopes table
Not emitted: ImportScopes table

These are major holes in the F# experience. Some are required for things like hot-reload.

Missing design-time services

Some design-time services are un-implemented by F#:

Unimplemented: F# expression evaluator
Unimplemented: Proximity expressions (for Autos window)

These are major holes in the F# experience and should be implemented.
