Variable Formatting#

LLDB has a data formatters subsystem that allows users to define custom display options for their variables.

Usually, when you type frame variable or run some expression, LLDB will automatically choose the way to display your results on a per-type basis, as in the following example:

(lldb) frame variable
(uint8_t) x = 'a'
(intptr_t) y = 124752287

Note: frame variable without additional arguments prints the list of variables of the current frame.

However, in certain cases, you may want to associate a different style to the display for certain datatypes. To do so, you need to give hints to the debugger as to how variables should be displayed. The LLDB type command allows you to do just that.

Using it you can change your visualization to look like this:

(lldb) frame variable
(uint8_t) x = chr='a' dec=97 hex=0x61
(intptr_t) y = 0x76f919f

In addition, some data structures can encode their data in a way that is not easily readable to the user, in which case a data formatter can be used to show the data in a human readable way. For example, without a formatter, printing a std::deque<int> with the elements {2, 3, 4, 5, 6} would result in something like:

(lldb) frame variable a_deque
(std::deque<Foo, std::allocator<int> >) $0 = {
   std::_Deque_base<Foo, std::allocator<int> > = {
      _M_impl = {
         _M_map = 0x000000000062ceb0
         _M_map_size = 8
         _M_start = {
            _M_cur = 0x000000000062cf00
            _M_first = 0x000000000062cf00
            _M_last = 0x000000000062d2f4
            _M_node = 0x000000000062cec8
         }
         _M_finish = {
            _M_cur = 0x000000000062d300
            _M_first = 0x000000000062d300
            _M_last = 0x000000000062d6f4
            _M_node = 0x000000000062ced0
         }
      }
   }
}

which is very hard to make sense of.

Note: frame variable <var> prints out the variable <var> in the current frame.

On the other hand, a proper formatter is able to produce the following output:

(lldb) frame variable a_deque
(std::deque<Foo, std::allocator<int> >) $0 = size=5 {
   [0] = 2
   [1] = 3
   [2] = 4
   [3] = 5
   [4] = 6
}

which is what the user would expect from a good debugger.

Note: you can also use v <var> instead of frame variable <var>.

It’s worth mentioning that the size=5 string is produced by a summary provider and the list of children is produced by a synthetic child provider. More information about these providers is available later in this document.

There are several features related to data visualization: formats, summaries, filters, synthetic children.

To reflect this, the type command has five subcommands:

type format
type summary
type filter
type synthetic
type category

These commands are meant to bind printing options to types. When variables are printed, LLDB will first check if custom printing options have been associated to a variable’s type and, if so, use them instead of picking the default choices.

Each of the commands (except type category) has four subcommands available:

add: associates a new printing option to one or more types
delete: deletes an existing association
list: provides a listing of all associations
clear: deletes all associations

Type Format#

Type formats enable you to quickly override the default format for displaying primitive types (the usual basic C/C++/ObjC types: int, float, char, …).

If for some reason you want all int variables in your program to print out as hex, you can add a format to the int type.

This is done by typing

(lldb) type format add --format hex int

at the LLDB command line.

The --format (which you can shorten to -f) option accepts a format name. Then, you provide one or more types to which you want the new format applied.

A frequent scenario is that your program has a typedef for a numeric type that you know represents something that must be printed in a certain way. Again, you can add a format just to that typedef by using type format add with the name alias.

But things can quickly get hierarchical. Let’s say you have a situation like the following:

typedef int A;
typedef A B;
typedef B C;
typedef C D;

and you want to show all A’s as hex, all C’s as byte arrays and leave the defaults untouched for other types (despite its contrived look, the example is far from unrealistic in large software systems).

If you simply type

(lldb) type format add -f hex A
(lldb) type format add -f uint8_t[] C

values of type B will be shown as hex and values of type D as byte arrays, as in:

(lldb) frame variable -T
(A) a = 0x00000001
(B) b = 0x00000002
(C) c = {0x03 0x00 0x00 0x00}
(D) d = {0x04 0x00 0x00 0x00}

This is because by default LLDB cascades formats through typedef chains. To avoid that, you can use the option -C no to prevent cascading, which makes the two commands required to achieve your goal:

(lldb) type format add -C no -f hex A
(lldb) type format add -C no -f uint8_t[] C

which provides the desired output:

(lldb) frame variable -T
(A) a = 0x00000001
(B) b = 2
(C) c = {0x03 0x00 0x00 0x00}
(D) d = 4
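The cascading rule can be pictured with a small toy model. This is only a sketch of the behavior described above, not LLDB's implementation: the TYPEDEFS table and the resolve_format helper are hypothetical names introduced for illustration.

```python
# Toy model of typedef cascading; this is NOT how LLDB is implemented.
# alias -> underlying type, mirroring the typedef chain in the example.
TYPEDEFS = {"A": "int", "B": "A", "C": "B", "D": "C"}


def resolve_format(type_name, formats):
    """Return the format name that applies to type_name, or None.

    formats maps a type name to a (format_name, cascade) pair, where
    cascade plays the role of the -C yes/no option.
    """
    current = type_name
    exact = True
    while current is not None:
        entry = formats.get(current)
        if entry is not None:
            format_name, cascade = entry
            # An exact match always applies; a match found further down the
            # typedef chain applies only if it was registered as cascading.
            if exact or cascade:
                return format_name
        current = TYPEDEFS.get(current)
        exact = False
    return None


cascading = {"A": ("hex", True), "C": ("uint8_t[]", True)}
exact_only = {"A": ("hex", False), "C": ("uint8_t[]", False)}
print(resolve_format("D", cascading))   # D inherits the format registered for C
print(resolve_format("D", exact_only))  # D falls back to the default (None)
```

With the cascading table, B and D pick up the formats of A and C; with the exact-only table they keep their defaults, which matches the two transcripts above.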

Note that qualifiers such as const and volatile will be stripped when matching types, for example:

(lldb) frame var x y z
(int) x = 1
(const int) y = 2
(volatile int) z = 4
(lldb) type format add -f hex int
(lldb) frame var x y z
(int) x = 0x00000001
(const int) y = 0x00000002
(volatile int) z = 0x00000004

Two additional options that you will want to look at are --skip-pointers (-p) and --skip-references (-r). These two options prevent LLDB from applying a format for type T to values of type T* and T& respectively.

(lldb) type format add -f float32[] int
(lldb) frame variable pointer *pointer -T
(int *) pointer = {1.46991e-39 1.4013e-45}
(int) *pointer = {1.53302e-42}
(lldb) type format add -f float32[] int -p
(lldb) frame variable pointer *pointer -T
(int *) pointer = 0x0000000100100180
(int) *pointer = {1.53302e-42}

While they can be applied to pointers and references, formats will make no attempt to dereference the pointer and extract the value before applying the format, which means you are effectively formatting the address stored in the pointer rather than the pointee value. For this reason, you may want to use the -p option when defining formats.
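To see why a pointer shown as float32[] produces tiny meaningless numbers, the reinterpretation can be reproduced outside the debugger. The following is a rough sketch using Python's struct module; the little-endian layout and the pointee value 1094 are assumptions made for the illustration, and as_float32_array is a hypothetical helper.

```python
import struct


def as_float32_array(raw):
    """Reinterpret raw little-endian bytes as consecutive 32-bit floats."""
    count = len(raw) // 4
    return list(struct.unpack("<%df" % count, raw[: count * 4]))


# The 8 bytes of the address itself (assumed little-endian, 64-bit pointer).
pointer_bytes = struct.pack("<Q", 0x0000000100100180)
print(as_float32_array(pointer_bytes))  # two denormal floats, not the pointee

# The 4 bytes of the int being pointed to (1094 is an assumed value).
pointee_bytes = struct.pack("<i", 1094)
print(as_float32_array(pointee_bytes))
```

No conversion takes place in either call: the same bytes are simply read with a different interpretation, which is what happens when a format is applied to the address stored in a pointer.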

If you need to delete a custom format, simply type type format delete followed by the name of the type to which the format applies. Even if you defined the same format for multiple types on the same command, type format delete will only remove the format for the type name passed as argument.

To delete ALL formats, use type format clear. To see all the formats defined, use type format list.

If all you need to do, however, is display one variable in a custom format, while leaving the others of the same type untouched, you can simply type:

(lldb) frame variable counter -f hex

This has the effect of displaying the value of counter as a hexadecimal number, and will keep showing it this way until you either pick a different format or let your program run again.

Finally, this is a list of the formatting options you can pick from:

Format name	Abbreviation	Description
`default`		the default LLDB algorithm is used to pick a format
`boolean`	B	show this as a true/false boolean, using the customary rule that 0 is false and everything else is true
`binary`	b	show this as a sequence of bits
`bytes`	y	show the bytes one after the other
`bytes with ASCII`	Y	show the bytes, but try to display them as ASCII characters as well
`character`	c	show the bytes as ASCII characters
`printable character`	C	show the bytes as printable ASCII characters
`complex float`	F	interpret this value as the real and imaginary part of a complex floating-point number
`c-string`	s	show this as a 0-terminated C string
`decimal`	d	show this as a signed integer number (this does not perform a cast, it simply shows the bytes as an integer with sign)
`enumeration`	E	show this as an enumeration, printing the value’s name if available or the integer value otherwise
`hex`	x	show this in hexadecimal notation (this does not perform a cast, it simply shows the bytes as hex)
`float`	f	show this as a floating-point number (this does not perform a cast, it simply interprets the bytes as an IEEE754 floating-point value)
`octal`	o	show this in octal notation
`OSType`	O	show this as a MacOS OSType
`unicode16`	U	show this as UTF-16 characters
`unicode32`		show this as UTF-32 characters
`unsigned decimal`	u	show this as an unsigned integer number (this does not perform a cast, it simply shows the bytes as unsigned integer)
`pointer`	p	show this as a native pointer (unless this is really a pointer, the resulting address will probably be invalid)
`char[]`		show this as an array of characters
`int8_t[]`, `uint8_t[]`, `int16_t[]`, `uint16_t[]`, `int32_t[]`, `uint32_t[]`, `int64_t[]`, `uint64_t[]`, `uint128_t[]`		show this as an array of the corresponding integer type
`float32[]`, `float64[]`		show this as an array of the corresponding floating-point type
`complex integer`	I	interpret this value as the real and imaginary part of a complex integer number
`character array`	a	show this as a character array
`address`	A	show this as an address target (symbol/file/line + offset), possibly also the string this address is pointing to
`hex float`		show this as hexadecimal floating point
`instruction`	i	show this as a disassembled opcode
`void`	v	don’t show anything

Type Summary#

Type formats work by showing a different kind of display for the value of a variable. However, they only work for basic types. When you want to display a class or struct in a custom format, you cannot do that using formats.

A different feature, type summaries, works by extracting information from classes, structures, … (aggregate types) and arranging it in a user-defined format, as in the following example:

before adding a summary…

(lldb) frame variable -T one
(i_am_cool) one = {
  (int) x = 3
  (float) y = 3.14159
  (char) z = 'E'
}

after adding a summary…

(lldb) frame variable one
(i_am_cool) one = int = 3, float = 3.14159, char = 69

There are two ways to use type summaries: the first one is to bind a summary string to the type; the second is to write a Python script that returns the string to be used as summary. Both options are enabled by the type summary add command.

The command to obtain the output shown in the example is:

(lldb) type summary add --summary-string "int = ${var.x}, float = ${var.y}, char = ${var.z%u}" i_am_cool

Initially, we will focus on summary strings, and then describe the Python binding mechanism.

Summary Format Matching On Pointers#

A summary formatter for a type T might or might not be appropriate to use for pointers to that type. If the formatter is only appropriate for the type and not its pointers, use the -p option to restrict it to match SBValues of type T. If you want the formatter to also match pointers to the type, you can use the -d option to specify how many pointer layers the formatter should match. The default value is 1, so if you don’t specify -p or -d, your formatter will be used on SBValues of type T and T*. If you want to also match T**, set -d to 2, etc. In all cases, the SBValue passed to the summary formatter will be the matched ValueObject. lldb doesn’t dereference the matched value down to the SBValue of type T before passing it to your formatter.
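The matching rule for -p and -d can be summarized with a toy predicate. This is a simplified model of the description above and not LLDB code; summary_applies and its parameters are hypothetical names, and type names are compared as plain strings.

```python
def summary_applies(value_type, registered_type, skip_pointers=False, pointer_depth=1):
    """Toy check: would a summary registered for registered_type match value_type?

    skip_pointers models the -p option and pointer_depth models -d.
    """
    stripped = value_type.replace(" ", "")
    base = stripped.rstrip("*")
    depth = len(stripped) - len(base)  # number of trailing '*' layers
    if base != registered_type.replace(" ", ""):
        return False
    allowed = 0 if skip_pointers else pointer_depth
    return depth <= allowed


print(summary_applies("T *", "T"))                     # default depth of 1
print(summary_applies("T **", "T"))                    # too many pointer layers
print(summary_applies("T **", "T", pointer_depth=2))   # allowed once -d is 2
print(summary_applies("T *", "T", skip_pointers=True)) # -p restricts to T itself
```

Whatever the depth, the formatter receives the matched value as is, so a Python summary that accepts T* has to handle the pointer itself.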

Summary Strings#

Summary strings are written using a simple control language, exemplified by the snippet above. A summary string contains a sequence of tokens that are processed by LLDB to generate the summary.

Summary strings can contain plain text, control characters and special variables that have access to information about the current object and the overall program state.

Plain text is any sequence of characters that doesn’t contain a {, }, $, or \ character, which are the syntax control characters.

The special variables are found in between a “${” prefix, and end with a “}” suffix. Variables can be a simple name or they can refer to complex objects that have subitems themselves. In other words, a variable looks like ${object} or ${object.child.otherchild}. A variable can also be prefixed or suffixed with other symbols meant to change the way its value is handled. An example is ${*var.int_pointer[0-3]}.

Basically, the syntax is the same one described in Frame and Thread Formatting, plus additional symbols specific to summary strings. The main one is ${var, which is used to refer to the variable that a summary is being created for.

The simplest thing you can do is grab a member variable of a class or structure by typing its expression path. In the previous example, the expression path for the field float y is simply .y. Thus, to ask the summary string to display y you would type ${var.y}.

If you have code like the following:

struct A {
    int x;
    int y;
};
struct B {
    A x;
    A y;
    int *z;
};

the expression path for the y member of the x member of an object of type B would be .x.y and you would type ${var.x.y} to display it in a summary string for type B.

By default, a summary defined for type T also works for types T* and T& (you can disable this behavior if desired). For this reason, expression paths do not differentiate between . and ->, and the above expression path .x.y would be just as good if you were displaying a B*, or even if the actual definition of B were:

struct B {
    A *x;
    A y;
    int *z;
};

This is unlike the behavior of frame variable which, on the contrary, will enforce the distinction. As hinted above, the rationale for this choice is that waiving this distinction enables you to write a summary string once for type T and use it for both T and T* instances. As a summary string is mostly about extracting nested members’ information, a pointer to an object is just as good as the object itself for the purpose.

If you need to access the value of the integer pointed to by B::z, you cannot simply say ${var.z} because that symbol refers to the pointer z. In order to dereference it and get the pointed value, you should say ${*var.z}. The ${*var tells LLDB to get the object that the expression path leads to, and then dereference it. In this example it is equivalent to *(bObject.z) in C/C++ syntax. Because the . and -> operators can both be used, there is no need to have dereferences in the middle of an expression path (e.g. you do not need to type ${*(var.x).x} to read A::x as contained in *(B::x)). To achieve that effect you can simply write ${var.x->x}, or even ${var.x.x}. The * operator only binds to the result of the whole expression path, rather than piecewise, and there is no way to use parentheses to change that behavior.

Of course, a summary string can contain more than one ${var specifier, and can use ${var and ${*var specifiers together.

Formatting Summary Elements#

An expression path can include formatting codes. Much like the type formats discussed previously, you can also customize the way variables are displayed in summary strings, regardless of the format they have applied to their types. To do that, you can use %format inside an expression path, as in ${var.x->x%u}, which would display the value of x as an unsigned integer.

Additionally, custom output can be achieved by using an LLVM format string, commencing with the : marker. To illustrate, compare ${var.byte%x} and ${var.byte:x-}. The former uses lldb’s builtin hex formatting (x), which unconditionally inserts a 0x prefix, and also zero pads the value to match the size of the type. The latter uses llvm::formatv formatting (:x-), and will print only the hex value, with no 0x prefix, and no padding. This raw control is useful when composing multiple pieces into a larger whole.
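The difference between the two styles can be mimicked with ordinary Python string formatting. This is only an analogy for the behavior described above; it uses neither LLDB nor llvm::formatv, and both helper names are hypothetical.

```python
def lldb_style_hex(value, byte_size):
    """Mimic the described %x output: 0x prefix, zero padded to the type size."""
    return "0x%0*x" % (byte_size * 2, value)


def plain_hex(value):
    """Mimic the described :x- output: hex digits only, no prefix, no padding."""
    return "%x" % value


# Composing several pieces into one token is where the raw form helps.
red, green, blue = 0x0A, 0xFF, 0x07
print(" ".join(lldb_style_hex(c, 1) for c in (red, green, blue)))
print("#" + "".join(plain_hex(c) for c in (red, green, blue)))
```

The second line of output shows why the unpadded form needs care: single-digit values lose their leading zero, so the caller decides how much padding the composed result needs.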

You can also use some other special format markers, not available for formats themselves, but which carry a special meaning when used in this context:

Symbol	Description
`%S`	Use this object’s summary (the default for aggregate types)
`%V`	Use this object’s value (the default for non-aggregate types)
`%@`	Use a language-runtime specific description (for C++ this does nothing, for Objective-C it calls the NSPrintForDebugger API)
`%L`	Use this object’s location (memory address, register name, …)
`%#`	Use the count of the children of this object
`%T`	Use this object’s datatype name
`%N`	Print the variable’s basename
`%>`	Print the expression path for this item

Since lldb 3.7.0, you can also specify ${script.var:pythonFuncName}.

It is expected that the function name you use specifies a function whose signature is the same as a Python summary function. The return string from the function will be placed verbatim in the output.

You cannot use element access, or formatting symbols, in combination with this syntax. For example the following:

${script.var.element[0]:myFunctionName%@}

is not valid and will cause the summary to fail to evaluate.
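As an illustration of the expected signature, here is a minimal sketch of a function that could be named in ${script.var:byte_count_summary}. The function name is hypothetical, and the sketch assumes the function is reachable under that name in LLDB's script interpreter. It relies only on SBValue.GetByteSize().

```python
def byte_count_summary(valobj, internal_dict, options=None):
    """Return a short text describing how many bytes the value occupies."""
    size = valobj.GetByteSize()
    unit = "byte" if size == 1 else "bytes"
    # The returned string is what gets placed verbatim in the summary.
    return "%d %s" % (size, unit)
```

Because the result is inserted verbatim, any formatting has to happen inside the function rather than through % or : markers in the summary string.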

Element Inlining#

Option --inline-children (-c) to type summary add tells LLDB not to look for a summary string, but instead to just print a listing of all the object’s children on one line.

As an example, given a type pair:

(lldb) frame variable --show-types a_pair
(pair) a_pair = {
  (int) first = 1;
  (int) second = 2;
}

If one types the following commands:

(lldb) type summary add --inline-children pair

the output becomes:

(lldb) frame variable a_pair
(pair) a_pair = (first=1, second=2)

Of course, one can obtain the same effect by typing

(lldb) type summary add pair --summary-string "(first=${var.first}, second=${var.second})"

While the final result is the same, using --inline-children can often save time. If one does not need to see the names of the variables, but just their values, the option --omit-names (-O, uppercase letter o), can be combined with --inline-children to obtain:

(lldb) frame variable a_pair
(pair) a_pair = (1, 2)

which is of course the same as typing

(lldb) type summary add pair --summary-string "(${var.first}, ${var.second})"
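For comparison, roughly the same one-line rendering could be produced by a Python summary function. This is a hedged sketch rather than how --inline-children is implemented: the helper names are hypothetical, and the fallback used for children without a plain value is this sketch's own choice.

```python
def _render_children(valobj, omit_names):
    """Join the direct children of valobj into a single parenthesized line."""
    parts = []
    for index in range(valobj.GetNumChildren()):
        child = valobj.GetChildAtIndex(index)
        # Falling back to the summary, then to a placeholder, is an
        # arbitrary choice made for this sketch.
        text = child.GetValue() or child.GetSummary() or "{...}"
        parts.append(text if omit_names else "%s=%s" % (child.GetName(), text))
    return "(" + ", ".join(parts) + ")"


def inline_children(valobj, internal_dict, options=None):
    """Summary in the style of --inline-children."""
    return _render_children(valobj, omit_names=False)


def inline_children_no_names(valobj, internal_dict, options=None):
    """Summary in the style of --inline-children combined with --omit-names."""
    return _render_children(valobj, omit_names=True)
```

The built-in options remain the simpler route; a function like this is only worth writing when the listing needs extra logic, such as skipping some children.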

Bitfields And Array Syntax#

Sometimes, a basic type’s value actually represents several different values packed together in a bitfield.

With the classical view, there is no way to look at them. Hexadecimal display can help, but if the bits actually span nibble boundaries, the help is limited.

Binary view would show it all without ambiguity, but is often too detailed and hard to read for real-life scenarios.

To cope with the issue, LLDB supports native bitfield formatting in summary strings. If your expression path leads to a so-called scalar type (the usual int, float, char, double, short, long, long long, long double and unsigned variants), you can ask LLDB to only grab some bits out of the value and display them in any format you like. If you only need one bit you can use [n], just like indexing an array. To extract multiple bits, you can use a slice-like syntax: [n-m], e.g.

(lldb) frame variable float_point
(float) float_point = -3.14159

(lldb) type summary add --summary-string "Sign: ${var[31]%B} Exponent: ${var[30-23]%x} Mantissa: ${var[0-22]%u}" float
(lldb) frame variable float_point
(float) float_point = -3.14159 Sign: true Exponent: 0x00000080 Mantissa: 4788184

In this example, LLDB shows the internal representation of a float variable by extracting bitfields out of a float object.

When typing a range, the extremes n and m are always included, and the order of the indices is irrelevant.
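The same slicing can be reproduced in plain Python to check what a given [n-m] range selects. The sketch below assumes the IEEE 754 single-precision layout; bits and float_fields are hypothetical helper names.

```python
import struct


def bits(value, n, m):
    """Return bits n..m of value, both ends included, in either order."""
    low, high = min(n, m), max(n, m)
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


def float_fields(x):
    """Split the single-precision encoding of x into sign, exponent, mantissa."""
    raw = struct.unpack("<I", struct.pack("<f", x))[0]
    return bool(bits(raw, 31, 31)), bits(raw, 30, 23), bits(raw, 0, 22)


sign, exponent, mantissa = float_fields(-3.14159)
print("Sign: %s Exponent: 0x%08x Mantissa: %d" % (str(sign).lower(), exponent, mantissa))
```

The mantissa printed here depends on the exact value stored, so it need not agree digit for digit with the transcript above, where only the rounded value -3.14159 is visible.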

LLDB also allows you to use a similar syntax to display array members inside a summary string. For instance, you may want to display all arrays of a given type using a more compact notation than the default, and then just delve into individual array members that prove interesting to your debugging task. You can tell LLDB to format arrays in special ways, possibly independent of the way the array members’ datatype is formatted, e.g.

(lldb) frame variable sarray
(Simple [3]) sarray = {
  [0] = {
    x = 1
    y = 2
    z = '\x03'
  }
  [1] = {
    x = 4
    y = 5
    z = '\x06'
  }
  [2] = {
    x = 7
    y = 8
    z = '\t'
  }
}
(lldb) type summary add --summary-string "${var[].x}" "Simple [3]"
(lldb) frame variable sarray
(Simple [3]) sarray = [1,4,7]

The [] symbol amounts to: if var is an array and I know its size, apply this summary string to every element of the array. Here, we are asking LLDB to display .x for every element of the array, and in fact this is what happens. If you find some of those integers anomalous, you can then inspect that one item in greater detail, without the array format getting in the way:

(lldb) frame variable sarray[1]
(Simple) sarray[1] = {
  x = 4
  y = 5
  z = '\x06'
}

You can also ask LLDB to only print a subset of the array range by using the same syntax used to extract bits for bitfields:

(lldb) type summary add --summary-string "${var[1-2].x}" "Simple [3]"
(lldb) frame variable sarray
(Simple [3]) sarray = [4,7]

If you are dealing with a pointer that you know is an array, you can use this syntax to display the elements contained in the pointed array instead of just the pointer value. However, because pointers have no notion of their size, the empty brackets [] operator does not work, and you must explicitly provide higher and lower bounds.

In general, LLDB needs the square brackets operator [] in order to handle arrays and pointers correctly, and for pointers it also needs a range. However, a few special cases are defined to make your life easier:

you can print a 0-terminated string (C-string) using the %s format, omitting square brackets, as in:

(lldb) type summary add --summary-string "${var%s}" "char *"

This syntax works for char* as well as for char[] because LLDB can rely on the final 0 terminator to know when the string has ended.

LLDB has default summary strings for char* and char[] that use this special case. On debugger startup, the following are defined automatically:

(lldb) type summary add --summary-string "${var%s}" "char *"
(lldb) type summary add --summary-string "${var%s}" -x "char \[[0-9]+]"

any of the array formats (int8_t[], float32[], …), and the y, Y and a formats work to print an array of a non-aggregate type, even if square brackets are omitted.

(lldb) type summary add --summary-string "${var%int32_t[]}" "int [10]"

This feature, however, is not enabled for pointers because there is no way for LLDB to detect the end of the pointed data.

This also does not work for other formats (e.g. boolean), and you must specify the square brackets operator to get the expected output.

Python Scripting#

Most of the time, summary strings prove good enough for the job of summarizing the contents of a variable. However, as soon as you need to do more than picking some values and rearranging them for display, summary strings stop being an effective tool. This is because summary strings lack the power to actually perform any kind of computation on the value of variables.

To solve this issue, you can bind some Python scripting code as a summary for your datatype, and that script has the ability to both extract children variables as the summary strings do and to perform active computation on the extracted values. As a small example, let’s say we have a Rectangle class:

class Rectangle
{
private:
    int height;
    int width;
public:
    Rectangle() : height(3), width(5) {}
    Rectangle(int H) : height(H), width(H*2-1) {}
    Rectangle(int H, int W) : height(H), width(W) {}
    int GetHeight() { return height; }
    int GetWidth() { return width; }
};

Summary strings are effective to reduce the screen real estate used by the default viewing mode, but are not effective if we want to display the area and perimeter of Rectangle objects.

To obtain this, we can simply attach a small Python script to the Rectangle class, as shown in this example:

(lldb) type summary add -P Rectangle
Enter your Python command(s). Type 'DONE' to end.
def function (valobj,internal_dict,options):
    height_val = valobj.GetChildMemberWithName('height')
    width_val = valobj.GetChildMemberWithName('width')
    height = height_val.GetValueAsUnsigned(0)
    width = width_val.GetValueAsUnsigned(0)
    area = height*width
    perimeter = 2*(height + width)
    return 'Area: ' + str(area) + ', Perimeter: ' + str(perimeter)
DONE
(lldb) frame variable
(Rectangle) r1 = Area: 20, Perimeter: 18
(Rectangle) r2 = Area: 72, Perimeter: 36
(Rectangle) r3 = Area: 16, Perimeter: 16

In order to write effective summary scripts, you need to know the LLDB public API, which is the way Python code can access the LLDB object model. For further details on the API you should look at the LLDB API reference documentation.

As a brief introduction, your script is encapsulated into a function that is passed two parameters: valobj and internal_dict.

internal_dict is an internal support parameter used by LLDB and you should not touch it.

valobj is the object encapsulating the actual variable being displayed, and its type is SBValue. Out of the many possible operations on an SBValue, the basic one is to retrieve the children objects it contains (essentially, the fields of the object wrapped by it), by calling GetChildMemberWithName(), passing it the child’s name as a string.

If the variable has a value, you can ask for it, and return it as a string using GetValue(), or as a signed/unsigned number using GetValueAsSigned(), GetValueAsUnsigned(). It is also possible to retrieve an SBData object by calling GetData() and then read the object’s contents out of the SBData.

If you need to delve into several levels of hierarchy, as you can do with summary strings, you can use the method GetValueForExpressionPath(), passing it an expression path just like those you could use for summary strings (one of the differences is that dereferencing a pointer does not occur by prefixing the path with a *, but by calling the Dereference() method on the returned SBValue). If you need to access array slices, you cannot do that (yet) via this method call, and you must use GetChildAtIndex(), querying it for the array items one by one. Also, handling custom formats is something you have to deal with on your own.
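As a sketch of these calls, the function below summarizes the struct B used in the summary string examples (members x.y and the pointer z). The function name and the output layout are hypothetical; the SBValue methods are the ones named above.

```python
def b_summary(valobj, internal_dict, options=None):
    """Summarize a B object as 'x.y = <int>, *z = <int>'."""
    # Walk two levels down with a single expression path.
    nested = valobj.GetValueForExpressionPath(".x.y")
    # The path yields the pointer; dereferencing is an explicit method call
    # rather than a '*' prefix on the path.
    z_pointer = valobj.GetValueForExpressionPath(".z")
    pointee = z_pointer.Dereference()
    return "x.y = %d, *z = %d" % (
        nested.GetValueAsSigned(0),
        pointee.GetValueAsSigned(0),
    )
```

A production formatter would probably also check IsValid() on each intermediate value, since a bad path or a null pointer yields an invalid SBValue rather than raising an error.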

options: Python summary formatters can optionally define this third argument, which is an object of type lldb.SBTypeSummaryOptions, allowing for a few customizations of the result. The decision whether to adopt this third argument, and the meaning of its options, is up to the individual formatter’s writer.

Other than interactively typing a Python script, there are two other ways for you to input a Python script as a summary:

using the --python-script option to type summary add and typing the script code as an option argument, as in:

(lldb) type summary add --python-script "height = valobj.GetChildMemberWithName('height').GetValueAsUnsigned(0);width = valobj.GetChildMemberWithName('width').GetValueAsUnsigned(0); return 'Area:%d' % (height*width)" Rectangle

using the --python-function (-F) option to type summary add and giving the name of a Python function with the correct prototype. Most probably, you will define (or have already defined) the function in the interactive interpreter, or somehow loaded it from a file, using the command script import command. LLDB will emit a warning if it is unable to find the function you passed, but will still register the binding.

Regular Expression Typenames#

As you noticed, in order to associate the custom summary string to the array types, one must give the array size as part of the typename. This can quickly become tiresome when using arrays of different sizes, Simple [3], Simple [9], Simple [12], …

If you use the -x option, type names are treated as regular expressions instead of type names. This would let you rephrase the above example for arrays of type Simple [3] as:

(lldb) type summary add --summary-string "${var[].x}" -x "Simple \[[0-9]+\]"
(lldb) frame variable
(Simple [3]) sarray = [1,4,7]
(Simple [2]) sother = [3,6]

The above scenario works for Simple [3] as well as for any other array of Simple objects.

While this feature is mostly useful for arrays, you could also use regular expressions to catch other type sets grouped by name. However, as regular expression matching is slower than normal name matching, LLDB will first try to match by name in any way it can, and only when this fails, will it resort to regular expression matching.

One of the ways LLDB uses this feature internally is to match the names of STL container classes, regardless of the template arguments provided. The details for this are found in FormatManager.cpp.

The regular expression language used by LLDB is the POSIX extended language, as defined by the Single UNIX Specification, of which macOS is a compliant implementation.
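A pattern can be tried against candidate type names before handing it to -x. The sketch below uses Python's re module, whose syntax agrees with POSIX extended regular expressions for a simple pattern like this one but is not identical in general. Matching the whole name with fullmatch is a choice made for this sketch, since the text above does not say whether LLDB anchors the pattern.

```python
import re

# Same pattern as in the type summary add -x example above.
ARRAY_OF_SIMPLE = re.compile(r"Simple \[[0-9]+\]")


def matches_type(type_name):
    """Return True if type_name is an array of Simple of any explicit size."""
    return ARRAY_OF_SIMPLE.fullmatch(type_name) is not None


for name in ("Simple [3]", "Simple [12]", "Simple", "Simple []"):
    print(name, matches_type(name))
```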

Named Summaries#

For a given type, there may be different meaningful summary representations. However, currently, only one summary can be associated to a type at each moment. If you need to temporarily override the association for a variable, without changing the summary string for its type, you can use named summaries.

Named summaries work by attaching a name to a summary when creating it. Then, when there is a need to attach the summary to a variable, the frame variable command supports a --summary option that tells LLDB to use the named summary given instead of the default one.

(lldb) type summary add --summary-string "x=${var.integer}" --name NamedSummary
(lldb) frame variable one
(i_am_cool) one = int = 3, float = 3.14159, char = 69
(lldb) frame variable one --summary NamedSummary
(i_am_cool) one = x=3

When defining a named summary, binding it to one or more types becomes optional. Even if you bind the named summary to a type, and later change the summary string for that type, the named summary will not be changed by that. You can delete named summaries by using the type summary delete command, as if the summary name was the datatype that the summary is applied to.

A summary attached to a variable using the --summary option has the same semantics that a custom format attached using the -f option has: it stays attached until you attach a new one, or until you let your program run again.

Synthetic Children#

Summaries work well when one is able to navigate through an expression path. In order for LLDB to do so, appropriate debugging information must be available.

Some types are opaque, i.e. no knowledge of their internals is provided. When that’s the case, expression paths do not work correctly.

In other cases, the internals are available to use in expression paths, but they do not provide a user-friendly representation of the object’s value.

For instance, consider an STL vector, as implemented by the GNU C++ Library:

(lldb) frame variable numbers -T
(std::vector<int>) numbers = {
    (std::_Vector_base<int, std::allocator<int> >) std::_Vector_base<int, std::allocator<int> > = {
        (std::_Vector_base<int, std::allocator<int> >::_Vector_impl) _M_impl = {
            (int *) _M_start = 0x00000001001008a0
            (int *) _M_finish = 0x00000001001008a8
            (int *) _M_end_of_storage = 0x00000001001008a8
        }
    }
}

Here, you can see how the type is implemented, and you can write a summary for that implementation but that is not going to help you infer what items are actually stored in the vector.

What you would like to see is probably something like:

(lldb) frame variable numbers -T
(std::vector<int>) numbers = {
    (int) [0] = 1
    (int) [1] = 12
    (int) [2] = 123
    (int) [3] = 1234
}

Synthetic children are a way to get that result.

The feature is based upon the idea of providing a new set of children for a variable that replaces the ones available by default through the debug information. In the example, we can use synthetic children to provide the vector items as children for the std::vector object.

In order to create synthetic children, you need to provide a Python class that adheres to a given interface (the word is italicized because Python has no explicit notion of interface, by that word we mean a given set of methods must be implemented by the Python class):

class SyntheticChildrenProvider:
    def __init__(self, valobj, internal_dict):
        """this call should initialize the Python object using valobj as the variable to provide synthetic children for"""
    def num_children(self, max_children):
        """this call should return the number of children that you want your object to have [1]"""
    def get_child_index(self, name):
        """this call should return the index of the synthetic child whose name is given as argument"""
    def get_child_at_index(self, index):
        """this call should return a new LLDB SBValue object representing the child at the index given as argument"""
    def update(self):
        """this call should be used to update the internal state of this Python object whenever the state of the variables in LLDB changes. [2] Also, this method is invoked before any other method in the interface."""
    def has_children(self):
        """this call should return True if this object might have children, and False if this object can be guaranteed not to have children. [3]"""
    def get_value(self):
        """this call can return an SBValue to be presented as the value of the synthetic value under consideration. [4]"""

As a warning, exceptions that are thrown by Python formatters are caught silently by LLDB and should be handled appropriately by the formatter itself. Being more specific, in case of exceptions, LLDB might assume that the given object has no children or it might skip printing some children, as they are printed one by one.

[1] The max_children argument is optional (since lldb 3.8.0) and indicates the maximum number of children that lldb is interested in (at this moment). If the computation of the number of children is expensive (for example, requires traversing a linked list to determine its size) your implementation may return max_children rather than the actual number. If the computation is cheap (e.g., the number is stored as a field of the object), then you can always return the true number of children (that is, ignore the max_children argument).

[2] This method is optional. Also, a boolean value must be returned (since lldb 3.1.0). If False is returned, then whenever the process reaches a new stop, this method will be invoked again to generate an updated list of the children for a given variable. Otherwise, if True is returned, then the value is cached and this method won’t be called again, effectively freezing the state of the value in subsequent stops. Beware that returning True incorrectly could show misleading information to the user.

[3] This method is optional (since lldb 3.2.0). While implementing it in terms of num_children is acceptable, implementors are encouraged to look for optimized coding alternatives whenever reasonable.

[4] This method is optional (since lldb 3.5.2). The SBValue you return here will most likely be a numeric type (int, float, …) as its value bytes will be used as-if they were the value of the root SBValue proper. As a shortcut for this, you can inherit from lldb.SBSyntheticValueProvider, and just define get_value as other methods are defaulted in the superclass as returning default no-children responses.
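As an illustration of footnote [1], here is a minimal sketch of a num_children implementation that stops counting once max_children is reached. The list layout (a pointer member named head on the list and a pointer member named next on every node) and the class name are hypothetical; the remaining interface methods are omitted for brevity.

```python
class LinkedListProvider:
    """Sketch of a provider for a hypothetical singly linked list.

    Assumed layout (not taken from LLDB): the list object has a pointer
    member named 'head', and every node has a pointer member named 'next'.
    """

    def __init__(self, valobj, internal_dict):
        self.valobj = valobj  # an lldb.SBValue when running inside LLDB

    def num_children(self, max_children=None):
        count = 0
        node = self.valobj.GetChildMemberWithName("head")
        # A null pointer (value 0) terminates the list.
        while node.GetValueAsUnsigned(0) != 0:
            count += 1
            # Stop early: LLDB is not interested in more than max_children.
            if max_children is not None and count >= max_children:
                return max_children
            node = node.GetChildMemberWithName("next")
        return count
```

Giving max_children a default value keeps the method callable both with and without the argument.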

If a synthetic child provider supplies a special child named $$dereference$$ then it will be used when evaluating operator* and operator-> in the frame variable command and related SB API functions. It is possible to declare this synthetic child without including it in the range of children displayed by LLDB. For example, this subset of a synthetic children provider class would allow the synthetic value to be dereferenced without actually showing any synthetic children in the UI:

class SyntheticChildrenProvider:
    [...]
    def num_children(self):
        return 0
    def get_child_index(self, name):
        if name == '$$dereference$$':
            return 0
        return -1
    def get_child_at_index(self, index):
        if index == 0:
            return <valobj resulting from dereference>
        return None
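To make the placeholder concrete, the following sketch shows one way such a provider might look for a hypothetical smart pointer type that keeps its raw pointer in a member named ptr. Both the type and the member name are assumptions made for illustration; only SBValue methods that exist in the SB API are used.

```python
class SmartPtrProvider:
    """Sketch for a hypothetical smart pointer with a raw pointer member 'ptr'."""

    def __init__(self, valobj, internal_dict):
        self.valobj = valobj  # an lldb.SBValue when running inside LLDB
        self.pointee = None

    def update(self):
        # Recompute the pointee at every stop.
        raw = self.valobj.GetChildMemberWithName("ptr")
        self.pointee = raw.Dereference()
        return False  # do not cache across stops

    def num_children(self):
        return 0  # nothing is listed as a child in the UI

    def get_child_index(self, name):
        return 0 if name == "$$dereference$$" else -1

    def get_child_at_index(self, index):
        return self.pointee if index == 0 else None
```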

For examples of how synthetic children are created, you are encouraged to look at examples/synthetic in the LLDB trunk. Please be aware that the code in those files (except bitfield/) is legacy code and is not maintained. You may especially want to begin looking at this example to get a feel for this feature, as it is a very easy and well commented example.

The design pattern consistently used in synthetic providers shipping with LLDB is to use the __init__ to store the SBValue instance as a part of self. The update function is then used to perform the actual initialization. Once a synthetic children provider is written, one must load it into LLDB before it can be used. Currently, one can use the LLDB script command to type Python code interactively, or use the command script import fileName command to load Python code from a Python module (ordinary rules apply to importing modules this way). A third option is to type the code for the provider class interactively while adding it.
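The pattern described above (store the SBValue in __init__, do the real work in update) can be sketched as follows. The container layout is hypothetical: it is assumed to hold two pointer members named first and last that delimit a contiguous array. Real standard library containers use different member names.

```python
class SimpleVectorProvider:
    """Sketch for a hypothetical container with pointer members 'first' and 'last'."""

    def __init__(self, valobj, internal_dict):
        # Only store the value here; the real initialization happens in update().
        self.valobj = valobj
        self.first = None
        self.elem_type = None
        self.elem_size = 0
        self.count = 0

    def update(self):
        self.count = 0
        try:
            self.first = self.valobj.GetChildMemberWithName("first")
            last = self.valobj.GetChildMemberWithName("last")
            self.elem_type = self.first.GetType().GetPointeeType()
            self.elem_size = self.elem_type.GetByteSize()
            start = self.first.GetValueAsUnsigned(0)
            end = last.GetValueAsUnsigned(0)
            if self.elem_size > 0 and end >= start:
                self.count = (end - start) // self.elem_size
        except Exception:
            # LLDB swallows exceptions silently, so report "no children" instead.
            self.count = 0
        return False  # run again at every stop

    def num_children(self, max_children=None):
        return self.count

    def has_children(self):
        return self.count > 0

    def get_child_index(self, name):
        # Children are named "[0]", "[1]", ...
        try:
            return int(name.lstrip("[").rstrip("]"))
        except ValueError:
            return -1

    def get_child_at_index(self, index):
        if index < 0 or index >= self.count:
            return None
        offset = index * self.elem_size
        return self.first.CreateChildAtOffset("[%d]" % index, offset, self.elem_type)
```

Keeping __init__ trivial means a failure while reading the container is confined to update, which can then fall back to an empty child list.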

For example, let’s pretend we have a class Foo for which a synthetic children provider class Foo_Provider is available, in a Python module contained in file ~/Foo_Tools.py. The following interaction sets Foo_Provider as a synthetic children provider in LLDB:

(lldb) command script import ~/Foo_Tools.py
(lldb) type synthetic add Foo --python-class Foo_Tools.Foo_Provider
(lldb) frame variable a_foo
(Foo) a_foo = {
    x = 1
    y = "Hello world"
}

LLDB has synthetic children providers for a core subset of STL classes, both in the version provided by libstdcpp and by libcxx, as well as for several Foundation classes.

Synthetic children extend summary strings by enabling a new special variable: ${svar.

This symbol tells LLDB to refer expression paths to the synthetic children instead of the real ones. For instance,

(lldb) type summary add --expand -x "std::vector<" --summary-string "${svar%#} items"
(lldb) frame variable numbers
(std::vector<int>) numbers = 4 items {
    (int) [0] = 1
    (int) [1] = 12
    (int) [2] = 123
    (int) [3] = 1234
}

It’s important to mention that LLDB invokes the synthetic child provider before invoking the summary string provider, which allows the latter to have access to the actual displayable children. This applies to both inlined summary strings and Python-based summary providers.

As a warning, when programmatically accessing the children or children count of a variable that has a synthetic child provider, notice that LLDB hides the actual raw children. For example, suppose we have a std::vector, which has an actual in-memory property __begin marking the beginning of its data. After the synthetic child provider is executed, the std::vector variable won’t show __begin as a child anymore, even through the SB API. It will have instead the children calculated by the provider. In case the actual raw children are needed, a call to value.GetNonSyntheticValue() is enough to get a raw version of the value. It is important to remember this when implementing summary string providers, as they run after the synthetic child provider.
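A Python summary function that needs both views might look like the sketch below. The function name is hypothetical, and the member name __begin simply follows the text above; the real member name depends on the standard library implementation in use.

```python
def vector_summary(valobj, internal_dict):
    # valobj is the synthetic view here: its children are the ones computed
    # by the synthetic child provider.
    shown = valobj.GetNumChildren()
    # The raw view exposes the in-memory members again.
    raw = valobj.GetNonSyntheticValue()
    begin = raw.GetChildMemberWithName("__begin")
    return "%d items, data at 0x%x" % (shown, begin.GetValueAsUnsigned(0))
```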

In some cases, if LLDB is unable to use the real object to get a child specified in an expression path, it will automatically refer to the synthetic children. While in summaries it is best to always use ${svar to make your intentions clearer, interactive debugging can benefit from this behavior, as in:

(lldb) frame variable numbers[0] numbers[1]
(int) numbers[0] = 1
(int) numbers[1] = 12

Unlike many other visualization features, however, the access to synthetic children only works when using frame variable, and is not supported in expression:

(lldb) expression numbers[0]
Error [IRForTarget]: Call to a function '_ZNSt33vector<int, std::allocator<int> >ixEm' that is not present in the target
error: Couldn't convert the expression to DWARF

The reason for this is that classes might have an overloaded operator[], or other special provisions, and the expression command chooses to ignore synthetic children in the interest of equivalency with code you asked to have compiled from source.

Filters#

Filters are a solution to the display of complex classes. At times, classes have many member variables but not all of these are actually necessary for the user to see.

A filter will solve this issue by only letting the user see those member variables they care about. Of course, the equivalent of a filter can be implemented easily using synthetic children, but a filter lets you get the job done without having to write Python code.

For instance, if your class Foobar has member variables named A through Z, but you only need to see the ones named B, H and Q, you can define a filter:

(lldb) type filter add Foobar --child B --child H --child Q
(lldb) frame variable a_foobar
(Foobar) a_foobar = {
    (int) B = 1
    (char) H = 'H'
    (std::string) Q = "Hello world"
}

Callback-based type matching#

Even though regular expression matching works well for the vast majority of data formatters (you normally know the name of the type you’re writing a formatter for), there are some cases where it’s useful to look at the type before deciding what formatter to apply.

As an example scenario, imagine we have a code generator that produces some classes that inherit from a common GeneratedObject class, and we have a summary function and a synthetic child provider that work for all GeneratedObject instances (they all follow the same pattern). However, there is no common pattern in the name of these classes, so we can’t register the formatter either by name or by regular expression.

In that case, you can write a recognizer function like this:

def is_generated_object(sbtype, internal_dict):
    # Accept the type if one of its direct base classes is GeneratedObject.
    for base in sbtype.get_bases_array():
        if base.GetName() == "GeneratedObject":
            return True
    return False

And pass this function to type summary add and type synthetic add using the flag --recognizer-function.

(lldb) type summary add --expand --python-function my_summary_function --recognizer-function is_generated_object
(lldb) type synthetic add --python-class my_child_provider --recognizer-function is_generated_object
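The recognizer above only inspects direct base classes. If the generated classes may inherit from GeneratedObject indirectly, a recursive variant along the following lines could be used instead. This is a sketch; the helper name derives_from and the function name is_generated_object_deep are hypothetical.

```python
def derives_from(sbtype, base_name):
    """Return True if base_name is a direct or indirect base class of sbtype."""
    for i in range(sbtype.GetNumberOfDirectBaseClasses()):
        base = sbtype.GetDirectBaseClassAtIndex(i)  # an lldb.SBTypeMember
        if base.GetName() == base_name:
            return True
        # Walk further up the inheritance chain.
        if derives_from(base.GetType(), base_name):
            return True
    return False


def is_generated_object_deep(sbtype, internal_dict):
    return derives_from(sbtype, "GeneratedObject")
```

Since a recognizer may be consulted for many types, keeping the walk this simple helps keep the lookup cheap.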

Objective-C Dynamic Type Discovery#

When doing Objective-C development, you may notice that some of your variables come out as of type id (for instance, items extracted from NSArray). By default, LLDB will not show you the real type of the object. It can actually dynamically discover the type of an Objective-C variable, much like the runtime itself does when invoking a selector. In order to be shown the result of that discovery, however, a special option to frame variable or expression is required: --dynamic-type.

--dynamic-type can have one of three values:

no-dynamic-values: the default, prevents dynamic type discovery
no-run-target: enables dynamic type discovery as long as running code on the target is not required
run-target: enables code execution on the target in order to perform dynamic type discovery

If you specify a value of either no-run-target or run-target, LLDB will detect the dynamic type of your variables and show the appropriate formatters for them. As an example:

(lldb) expr @"Hello"
(NSString *) $0 = 0x00000001048000b0 @"Hello"
(lldb) expr -d no-run @"Hello"
(__NSCFString *) $1 = 0x00000001048000b0 @"Hello"

Because LLDB uses a detection algorithm that does not need to invoke any functions on the target process, no-run-target is enough for this to work.

As a side note, the summary for NSString shown in the example is built right into LLDB. It was initially implemented through Python (the code is still available for reference at CFString.py). However, this is out of sync with the current implementation of the NSString formatter (which is a C++ function compiled into the LLDB core).

Categories#

Categories are a way to group related formatters. For instance, LLDB itself groups the formatters for STL types in a category named cplusplus. Basically, categories act like containers in which to store formatters for the same library or OS release.

By default, several categories are created in LLDB:

default: this is the category where every formatter ends up, unless another category is specified
objc: formatters for basic and common Objective-C types that do not specifically depend on macOS
cplusplus: formatters for STL types (currently only libc++ and libstdc++ are supported). Enabled when debugging C++ targets.
system: truly basic types for which a formatter is required
AppKit: Cocoa classes
CoreFoundation: CF classes
CoreGraphics: CG classes
CoreServices: CS classes
VectorTypes: compact display for several vector types

If you want to use a custom category for your formatters, all the type ... add commands provide a --category (-w) option that names the category to add the formatter to. To delete the formatter, you then have to specify the correct category.

Categories can be in one of two states: enabled and disabled. A category is initially disabled, and can be enabled using the type category enable command. To disable an enabled category, the command to use is type category disable.

The order in which categories are enabled or disabled is significant, in that LLDB uses that order when looking for formatters. Therefore, when you enable a category, it becomes the second one to be searched (after default, which always stays on top of the list). The default categories are enabled in such a way that the search order is:

default
objc
CoreFoundation
AppKit
CoreServices
CoreGraphics
cplusplus
VectorTypes
system
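The ordering rule can be pictured with a small toy model. This is only an illustration of the rule as stated above, not LLDB's implementation; the function names are hypothetical, and leaving an already enabled category where it is is a simplifying assumption of the sketch.

```python
def enable_category(order, name):
    """Return the search order after enabling a currently disabled category."""
    if name in order:
        return list(order)  # simplification: already enabled, leave as-is
    if order and order[0] == "default":
        # The newly enabled category is searched right after 'default'.
        return ["default", name] + list(order[1:])
    return [name] + list(order)


def disable_category(order, name):
    """Return the search order after disabling a category."""
    return [c for c in order if c != name]
```

For example, enabling a category named newcategory on top of the order default, objc, cplusplus places it between default and objc.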

As said, cplusplus contains formatters for C++ STL data types. system contains formatters for char* and char[], which reflect the behavior of older versions of LLDB which had built-in formatters for these types. Because these are now formatters, you can even replace them with your own if you so wish.

There is no special command to create a category. When you place a formatter in a category, if that category does not exist, it is automatically created. For instance,

(lldb) type summary add Foobar --summary-string "a foobar" --category newcategory

automatically creates a (disabled) category named newcategory.

Another way to create a new (empty) category, is to enable it, as in:

(lldb) type category enable newcategory

However, in this case LLDB warns you that enabling an empty category has no effect. If you add formatters to the category after enabling it, they will be honored. But an empty category per se does not change the way any type is displayed. The reason the debugger warns you is that enabling an empty category might be a typo, and you actually wanted to enable a similarly named but non-empty category.

Finding Formatters 101#

Searching for a formatter (including formats, since lldb 3.4.0) given a variable goes through a rather intricate set of rules. Namely, what happens is that LLDB starts looking in each enabled category, according to the order in which they were enabled (latest enabled first). In each category, LLDB does the following:

If there is a formatter for the type of the variable, use it
If this object is a pointer, and there is a formatter for the pointee type that does not skip pointers, use it
If this object is a reference, and there is a formatter for the referred type that does not skip references, use it
If this object is an Objective-C class and dynamic types are enabled, look for a formatter for the dynamic type of the object. If dynamic types are disabled, or the search failed, look for a formatter for the declared type of the object
If this object’s type is a typedef, go through typedef hierarchy (LLDB might not be able to do this if the compiler has not emitted enough information. If the required information to traverse typedef hierarchies is missing, type cascading will not work. The clang compiler, part of the LLVM project, emits the correct debugging information for LLDB to cascade). If at any level of the hierarchy there is a valid formatter that can cascade, use it.
If everything has failed, repeat the above search, looking for regular expressions instead of exact matches

If any of those attempts returned a valid formatter to be used, that one is used, and the search is terminated (without going to look in other categories). If nothing was found in the current category, the next enabled category is scanned according to the same algorithm. If there are no more enabled categories, the search has failed.
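The search described above can be summarized with a simplified model. The sketch below is not LLDB's implementation: the Formatter, TypeInfo and Category classes are hypothetical stand-ins, the Objective-C dynamic type step is left out, and the categories are assumed to be passed in search order.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Formatter:
    label: str
    skip_pointers: bool = False
    skip_references: bool = False
    cascade: bool = True


@dataclass
class TypeInfo:
    name: str
    pointee: Optional["TypeInfo"] = None      # set if this is a pointer type
    referent: Optional["TypeInfo"] = None     # set if this is a reference type
    typedef_of: Optional["TypeInfo"] = None   # set if this is a typedef


@dataclass
class Category:
    name: str
    exact: dict = field(default_factory=dict)   # type name -> Formatter
    regex: list = field(default_factory=list)   # (pattern, Formatter) pairs


def candidates(t):
    """Yield (type name, acceptance test) pairs in the order of the steps above."""
    yield t.name, lambda f: True
    if t.pointee is not None:
        yield t.pointee.name, lambda f: not f.skip_pointers
    if t.referent is not None:
        yield t.referent.name, lambda f: not f.skip_references
    alias = t.typedef_of
    while alias is not None:
        yield alias.name, lambda f: f.cascade
        alias = alias.typedef_of


def find_in_category(category, t):
    # First pass: exact type names.
    for name, accepts in candidates(t):
        f = category.exact.get(name)
        if f is not None and accepts(f):
            return f
    # Second pass: the same search with regular expressions.
    for name, accepts in candidates(t):
        for pattern, f in category.regex:
            if re.search(pattern, name) and accepts(f):
                return f
    return None


def find_formatter(categories, t):
    for category in categories:  # already in search order
        f = find_in_category(category, t)
        if f is not None:
            return f  # the first hit ends the search
    return None
```

In this model a formatter in an earlier category always wins over one in a later category, even if the later one matches the type more directly.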

Warning: previous versions of LLDB defined cascading to mean not only going through typedef chains, but also through inheritance chains. This feature has been removed since it significantly degrades performance. You need to set up your formatters for every type in inheritance chains to which you want the formatter to apply.
