26.1 Analyzer Internals

26.1.1 Overview

At a high level, we’re doing coverage-guided symbolic execution of the user’s code.

The analyzer implementation works on the gimple-SSA representation. (I chose this in the hopes of making it easy to work with LTO to do whole-program analysis).

The implementation is read-only: it doesn’t attempt to change anything, just emit warnings.

The gimple representation can be seen using -fdump-ipa-analyzer.

Tip: If the analyzer ICEs before this is written out, one workaround is to use --param=analyzer-bb-explosion-factor=0 to force the analyzer to bail out after analyzing the first basic block.

First, we build a directed graph to represent the user’s code. For historical reasons we call this the supergraph, although this is now a misnomer as we no longer add callgraph edges to this graph. The nodes and edges in the supergraph are called “supernodes” and “superedges”, and often referred to in code as snodes and sedges.

We make a node in the supergraph before every gimple statement, with edges representing the transitions between statements within a basic block, along with additional nodes and edges at CFG edges.

The nodes in the supergraph represent locations in the user’s code, and discrete points between operations. The edges represent transitions between these locations. Each edge in the supergraph can have an optional operation associated with it, representing a single state transition that occurs along the edge, such as:

individual non-control-flow gimple statements (such as an assignment)
control flow statements on a CFG edge that impose a condition for the transition to be possible (e.g. a branch of a conditional or a switch case)
the collection of phi nodes at the entry to a basic block, with an associated CFG edge (so that these all take effect simultaneously)
etc.

There can be multiple nodes and edges in the supergraph corresponding to a single CFG edge so that e.g. we can handle filtering states on a condition separately from handling the effect of the phi nodes if the condition was satisfied.
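
As a rough illustration of this structure, here is a minimal C++ sketch of a graph with one node before each statement and an optional operation on each edge. It is not the analyzer's actual representation: every name in it (toy_supergraph, toy_op, toy_sedge, add_block, add_cfg_edge) is hypothetical, and statements are stood in for by plain strings.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy types; these are NOT the analyzer's real classes.
enum class toy_op_kind { stmt, cond, phis };

struct toy_op
{
  toy_op_kind kind;
  std::string text;   // stand-in for the statement or condition
};

struct toy_sedge
{
  std::size_t src;
  std::size_t dst;
  std::optional<toy_op> op;   // empty: an edge with no state transition
};

struct toy_supergraph
{
  std::size_t num_nodes = 0;
  std::vector<toy_sedge> edges;

  std::size_t add_node () { return num_nodes++; }

  void add_edge (std::size_t src, std::size_t dst,
                 std::optional<toy_op> op = std::nullopt)
  {
    assert (src < num_nodes && dst < num_nodes);
    edges.push_back ({src, dst, std::move (op)});
  }

  // Add a chain for one basic block: a node before each statement (plus
  // one after the last), and one edge per statement.
  // Returns {entry node, exit node}.
  std::pair<std::size_t, std::size_t>
  add_block (const std::vector<std::string> &stmts)
  {
    std::size_t entry = add_node ();
    std::size_t cur = entry;
    for (const std::string &s : stmts)
      {
        std::size_t next = add_node ();
        add_edge (cur, next, toy_op {toy_op_kind::stmt, s});
        cur = next;
      }
    return {entry, cur};
  }

  // Model a single CFG edge as two edges via an intermediate node, so
  // that filtering on the condition is kept separate from applying the
  // phi nodes of the destination block.
  void add_cfg_edge (std::size_t src, std::size_t dst,
                     const std::string &cond)
  {
    std::size_t mid = add_node ();
    add_edge (src, mid, toy_op {toy_op_kind::cond, cond});
    add_edge (mid, dst, toy_op {toy_op_kind::phis, "phis at block entry"});
  }
};
```

The intermediate node in add_cfg_edge is the point of the sketch: a state that fails the condition can be rejected on the first edge without ever evaluating the phi nodes on the second.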

The analyzer in GCC 10 - GCC 15 attempted to have a single supernode per basic block for the sake of efficiency, but given that state transitions can happen mid-block, this became unmaintainable, hence we now have fine-grained nodes with one node/edge per gimple statement.

Having built the supergraph from the CFGs of all of the functions in the user’s code, we manipulate it:

We fix up locations to try to ensure that every supernode has a reasonable location_t value referring to the location in the user’s source. This is necessary, since in the gimple IR seen by the analyzer, many gimple statements have no location associated with them.
We simplify the supergraph to remove redundant nodes and edges, such as those that are simply no-ops that add no useful location information. This can eliminate about 5-10% of the nodes.
We sort and renumber the nodes into an order that we hope will lead to efficient state merging when exploring the graph (see below).

The supergraph can be seen at each stage using -fdump-analyzer-supergraph, which creates a series of SRC.supergraph.N.KIND.dot GraphViz files showing the state of the supergraph after each of the above.

We then build an analysis_plan which walks the callgraph to determine which calls might be suitable for being summarized (rather than fully explored) and thus in what order to explore the functions.

Next is the heart of the analyzer: we use a worklist to explore state within the supergraph, building an "exploded graph". Nodes in the exploded graph correspond to <point, state> pairs, as in "Precise Interprocedural Dataflow Analysis via Graph Reachability" (Thomas Reps, Susan Horwitz and Mooly Sagiv) - but note that we’re not using the algorithm described in that paper, just the “exploded graph” terminology.

We reuse nodes for <point, state> pairs we’ve already seen, and avoid tracking state too closely, so that (hopefully) we rapidly converge on a final exploded graph, and terminate the analysis. We also bail out if the number of exploded <point, state> nodes gets larger than a particular multiple of the total number of supernodes (to ensure termination in the face of pathological state-explosion cases, or bugs). We also stop exploring a point once we hit a limit of states for that point.
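
The worklist loop with its two safety limits can be sketched as follows. This is a simplified model under stated assumptions, not the analyzer's code: points and states are plain integers, each edge carries a hypothetical transfer function, and the names explore, toy_edge, explosion_factor and max_per_point are invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical toy edge: maps the state at SRC to the state at DST.
struct toy_edge
{
  int src;
  int dst;
  std::function<int (int)> transfer;
};

struct toy_result
{
  std::size_t num_enodes = 0;
  bool bailed_out = false;
};

toy_result
explore (std::size_t num_points, const std::vector<toy_edge> &edges,
         int entry_point, int entry_state,
         std::size_t explosion_factor, std::size_t max_per_point)
{
  assert (entry_point >= 0
          && static_cast<std::size_t> (entry_point) < num_points);

  using enode = std::pair<int, int>;        // <point, state>
  std::map<enode, std::size_t> seen;        // enode -> index of creation
  std::vector<std::size_t> per_point (num_points, 0);
  std::deque<enode> worklist;
  toy_result result;

  auto get_or_create = [&] (enode en) {
    if (seen.count (en))
      return;   // reuse the node for a pair we have already seen
    if (per_point[en.first] >= max_per_point)
      return;   // per-point limit hit: stop exploring this point
    seen.emplace (en, seen.size ());
    ++per_point[en.first];
    worklist.push_back (en);
  };

  get_or_create ({entry_point, entry_state});
  while (!worklist.empty ())
    {
      // Global limit: a multiple of the number of points.
      if (seen.size () > explosion_factor * num_points)
        {
          result.bailed_out = true;
          break;
        }
      enode en = worklist.front ();
      worklist.pop_front ();
      for (const toy_edge &e : edges)
        if (e.src == en.first)
          get_or_create ({e.dst, e.transfer (en.second)});
    }
  result.num_enodes = seen.size ();
  return result;
}
```

With a self-loop whose transfer function keeps producing new states, the exploration would never converge on its own; either the per-point limit quietly stops it, or the global limit causes a bail-out, depending on which is reached first.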

We can identify problems directly when processing a <point, state> instance. For example, if we’re finding the successors of

   <point: before-stmt: "free (ptr);",
    state: {"ptr": freed}>

then we can detect a double-free of "ptr". We can then emit a path to reach the problem by finding the simplest route through the graph.

Program points in the analysis are a combination of a supernode together with a "call string" identifying the stack of callsites below them, so that paths in the exploded graph correspond to interprocedurally valid paths: we always return to the correct call site, propagating state information accordingly. We avoid infinite recursion by stopping the analysis if a callsite appears more than analyzer-max-recursion-depth times in a call string (defaulting to 2).
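
The recursion check can be pictured with a small C++ sketch. It assumes a call string is simply a vector of integer callsite IDs, which is a simplification; toy_call_string and too_deep are hypothetical names, and the exact point at which the real analyzer performs this check is not specified here.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical model: a call string as a stack of callsite IDs,
// outermost call first.
using toy_call_string = std::vector<int>;

// Return true if any callsite appears more than MAX_RECURSION_DEPTH
// times in CS, i.e. if exploration along this call string should stop.
bool
too_deep (const toy_call_string &cs, int max_recursion_depth = 2)
{
  assert (max_recursion_depth >= 0);
  std::map<int, int> counts;
  for (int callsite : cs)
    if (++counts[callsite] > max_recursion_depth)
      return true;
  return false;
}
```

For example, with the default limit of 2, a call string in which callsite 1 occurs twice is still explored, whereas a third occurrence of the same callsite stops the analysis along that path.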

26.1.2 Graphs

Nodes and edges in the exploded graph are called “exploded nodes” and “exploded edges” and often referred to in the code as enodes and eedges (especially when distinguishing them from the snodes and sedges in the supergraph).

Each graph numbers its nodes, giving unique identifiers - supernodes are referred to throughout dumps in the form ‘SN: index’ and exploded nodes in the form ‘EN: index’ (e.g. ‘SN: 2’ and ‘EN:29’).

The supergraph can be seen using -fdump-analyzer-supergraph.

The exploded graph can be seen using -fdump-analyzer-exploded-graph and other dump options. Exploded nodes are color-coded in the .dot output based on state-machine states to make it easier to see state changes at a glance.

26.1.3 State Tracking

There’s a tension between:

precision of analysis in the straight-line case, vs
exponential blow-up in the face of control flow.

For example, in general, given this CFG:

      A
     / \
    B   C
     \ /
      D
     / \
    E   F
     \ /
      G

we want to avoid differences in state-tracking in B and C from leading to blow-up. If we don’t prevent state blowup, we end up with exponential growth of the exploded graph like this:

           1:A
          /   \
         /     \
        /       \
      2:B       3:C
       |         |
      4:D       5:D        (2 exploded nodes for D)
     /   \     /   \
   6:E   7:F 8:E   9:F
    |     |   |     |
   10:G 11:G 12:G  13:G    (4 exploded nodes for G)

Similar issues arise with loops.

To prevent this, we follow various approaches:

state pruning, which tries to discard state that won’t be relevant later on within the function. This can be disabled via -fno-analyzer-state-purge.
state merging. We can try to find the commonality between two program_state instances to make a third, simpler program_state. We have two strategies here:
1. the worklist keeps new nodes for the same program_point together, and tries to merge them before processing, and thus before they have successors. Hence, in the above, the two nodes for D (4 and 5) reach the front of the worklist together, and we create a node for D with the merger of the incoming states.
2. try merging with the state of existing enodes for the program_point (which may have already been explored). There will be duplication, but only one set of duplication; subsequent duplicates are more likely to hit the cache. In particular, (hopefully) all merger chains are finite, and so we guarantee termination. This is intended to help with loops: we ought to explore the first iteration, and then have a "subsequent iterations" exploration, which uses a state merged from that of the first, to be more abstract.
We avoid merging pairs of states that have state-machine differences, as these are the kinds of differences that are likely to be most interesting. So, for example, given:
```
      if (condition)
        ptr = malloc (size);
      else
        ptr = local_buf;

      .... do things with 'ptr'

      if (condition)
        free (ptr);

      ...etc
```
then we end up with an exploded graph that looks like this:
```
                   if (condition)
                     / T      \ F
            ---------          ----------
           /                             \
      ptr = malloc (size)             ptr = local_buf
          |                               |
      copy of                         copy of
        "do things with 'ptr'"          "do things with 'ptr'"
      with ptr: heap-allocated        with ptr: stack-allocated
          |                               |
      if (condition)                  if (condition)
          | known to be T                 | known to be F
      free (ptr);                         |
           \                             /
            -----------------------------
                         | ('ptr' is pruned, so states can be merged)
                        etc
```
where some duplication has occurred, but only for the places where the different paths are worth exploring separately.
Merging can be disabled via -fno-analyzer-state-merge.
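
The merging policy described above ("find the commonality, but refuse when state-machine states differ") can be sketched in C++ as follows. This is only an illustration under strong simplifying assumptions: toy_state and try_merge are hypothetical, values are plain integers keyed by variable name, and state-machine states are strings; the analyzer's real program_state merging is considerably richer.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical, heavily simplified stand-in for a program state.
struct toy_state
{
  std::map<std::string, int> values;       // known constant values
  std::map<std::string, std::string> sm;   // state-machine state, e.g. "freed"
};

// Try to merge A and B into a third, simpler state.  Returns an empty
// optional if the states have state-machine differences, since those
// are the differences most worth exploring separately.
std::optional<toy_state>
try_merge (const toy_state &a, const toy_state &b)
{
  if (a.sm != b.sm)
    return std::nullopt;

  toy_state merged;
  merged.sm = a.sm;
  for (const auto &[name, val] : a.values)
    {
      auto it = b.values.find (name);
      if (it != b.values.end () && it->second == val)
        merged.values[name] = val;   // common to both: keep it
      // Otherwise the binding is dropped, i.e. treated as unknown.
    }
  // The merged state never knows more than either input.
  assert (merged.values.size () <= a.values.size ());
  return merged;
}
```

Dropping a binding that differs between the two inputs is what makes the merged state "simpler": it is less precise, but it covers both incoming states with a single exploded node.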

26.1.4 Region Model

Part of the state stored at an exploded_node is a region_model. This is an implementation of the region-based ternary model described in "A Memory Model for Static Analysis of C Programs" (Zhongxing Xu, Ted Kremenek, and Jian Zhang).

A region_model encapsulates a representation of the state of memory, with a store recording bindings from region instances to svalue instances. The bindings are organized into clusters, where regions accessible via well-defined pointer arithmetic are in the same cluster. The representation is graph-like because values can be pointers to regions. It also stores a constraint_manager, capturing relationships between the values.

Because each node in the exploded_graph has a region_model, and each of the latter is graph-like, the exploded_graph is in some ways a graph of graphs.

There are several “dump” functions for use when debugging the analyzer.

Consider this example C code:

void *calls_malloc (size_t n)
{
  void *result = malloc (1024);
  return result; /* HERE */
}

void test (size_t n)
{
  void *ptr = calls_malloc (n * 4);
  /* etc.  */
}

and the state at the point /* HERE */ for the interprocedural analysis case where calls_malloc returns back to test.

Here’s an example of printing a program_state at /* HERE */, showing the region_model within it, along with state for the malloc state machine.

(gdb) break region_model::on_return
[..snip...]
(gdb) run
[..snip...]
(gdb) up
[..snip...]
(gdb) call state->dump()
State
├─ Region Model
│  ├─ Current Frame: frame: ‘calls_malloc’@2
│  ├─ Store
│  │  ├─ m_called_unknown_fn: false
│  │  ├─ frame: ‘test’@1
│  │  │  ╰─ _1: (INIT_VAL(n_2(D))*(size_t)4)
│  │  ╰─ frame: ‘calls_malloc’@2
│  │     ├─ result_4: &HEAP_ALLOCATED_REGION(27)
│  │     ╰─ _5: &HEAP_ALLOCATED_REGION(27)
│  ╰─ Dynamic Extents
│     ╰─ HEAP_ALLOCATED_REGION(27): (INIT_VAL(n_2(D))*(size_t)4)
╰─ ‘malloc’ state machine
   ╰─ 0x468cb40: &HEAP_ALLOCATED_REGION(27): unchecked ({free}) (‘result_4’)

Within the store, there are binding clusters for the SSA names for the various local variables within frames for test and calls_malloc. For example,

within test the whole cluster for _1 is bound to a binop_svalue representing n * 4, and
within calls_malloc the whole cluster for result_4 is bound to a region_svalue pointing at HEAP_ALLOCATED_REGION(27).

Additionally, this latter pointer has the unchecked state for the malloc state machine indicating it hasn’t yet been checked against NULL since the allocation call.

We also see that the state has captured the size of the heap-allocated region (“Dynamic Extents”).

This visualization can also be seen within the output of -fdump-analyzer-exploded-nodes-2 and -fdump-analyzer-exploded-nodes-3.

As well as the above visualizations of states, there are tree-like visualizations for instances of svalue and region, showing their IDs and how they are constructed from simpler symbols:

(gdb) break region_model::set_dynamic_extents
[..snip...]
(gdb) run
[..snip...]
(gdb) up
[..snip...]
(gdb) call size_in_bytes->dump()
(17): ‘long unsigned int’: binop_svalue(mult_expr: ‘*’)
├─ (15): ‘size_t’: initial_svalue
│  ╰─ m_reg: (12): ‘size_t’: decl_region(‘n_2(D)’)
│     ╰─ parent: (9): frame_region(‘test’, index: 0, depth: 1)
│        ╰─ parent: (1): stack region
│           ╰─ parent: (0): root region
╰─ (16): ‘size_t’: constant_svalue (‘4’)

i.e. that size_in_bytes is a binop_svalue expressing the result of multiplying

the initial value of the PARM_DECL n_2(D) for the parameter n within the frame for test by
the constant value 4.

The above visualizations rely on the text_art::widget framework, which performs significant work to lay out the output, so there is also an earlier, simpler, form of dumping available. For states there is:

(gdb) call state->dump(eg.m_ext_state, true)
rmodel:
stack depth: 2
  frame (index 1): frame: ‘calls_malloc’@2
  frame (index 0): frame: ‘test’@1
clusters within frame: ‘test’@1
  cluster for: _1: (INIT_VAL(n_2(D))*(size_t)4)
clusters within frame: ‘calls_malloc’@2
  cluster for: result_4: &HEAP_ALLOCATED_REGION(27)
  cluster for: _5: &HEAP_ALLOCATED_REGION(27)
m_called_unknown_fn: FALSE
constraint_manager:
  equiv classes:
  constraints:
dynamic_extents:
  HEAP_ALLOCATED_REGION(27): (INIT_VAL(n_2(D))*(size_t)4)
malloc:
  0x468cb40: &HEAP_ALLOCATED_REGION(27): unchecked ({free}) (‘result_4’)

or for region_model just:

(gdb) call state->m_region_model->debug()
stack depth: 2
  frame (index 1): frame: ‘calls_malloc’@2
  frame (index 0): frame: ‘test’@1
clusters within frame: ‘test’@1
  cluster for: _1: (INIT_VAL(n_2(D))*(size_t)4)
clusters within frame: ‘calls_malloc’@2
  cluster for: result_4: &HEAP_ALLOCATED_REGION(27)
  cluster for: _5: &HEAP_ALLOCATED_REGION(27)
m_called_unknown_fn: FALSE
constraint_manager:
  equiv classes:
  constraints:
dynamic_extents:
  HEAP_ALLOCATED_REGION(27): (INIT_VAL(n_2(D))*(size_t)4)

and for instances of svalue and region there is this older dump implementation, which takes a bool simple flag controlling the verbosity of the dump:

(gdb) call size_in_bytes->dump(true)
(INIT_VAL(n_2(D))*(size_t)4)

(gdb) call size_in_bytes->dump(false)
binop_svalue (mult_expr, initial_svalue(‘size_t’, decl_region(frame_region(‘test’, index: 0, depth: 1), ‘size_t’, ‘n_2(D)’)), constant_svalue(‘size_t’, 4))

26.1.5 Analyzer Paths

We need to explain to the user what the problem is, and to persuade them that there really is a problem. Hence having a diagnostics::paths::path isn’t just an incidental detail of the analyzer; it’s required.

Paths ought to be:

interprocedurally-valid
feasible

Without state-merging, all paths in the exploded graph are feasible (in terms of constraints being satisfied). With state-merging, paths in the exploded graph can be infeasible.

We collate warnings and only emit them for the simplest path; e.g. for a bug in a utility function with lots of routes to calling it, we only emit the simplest path (which could be intraprocedural, if it can be reproduced without a caller).

We thus want to find the shortest feasible path through the exploded graph from the origin to the exploded node at which the diagnostic was saved. Unfortunately, if we simply find the shortest such path and check if it’s feasible we might falsely reject the diagnostic, as there might be a longer path that is feasible. Examples include the cases where the diagnostic requires us to go at least once around a loop for a later condition to be satisfied, or where for a later condition to be satisfied we need to enter a suite of code that the simpler path skips.

We attempt to find the shortest feasible path to each diagnostic by first constructing a “trimmed graph” from the exploded graph, containing only those nodes and edges from which there are paths to the target node, and using Dijkstra’s algorithm to order the trimmed nodes by minimal distance to the target.
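
The trimming step can be sketched as Dijkstra’s algorithm run over the reversed edges, starting from the target. The C++ below is an illustration rather than the analyzer’s implementation: toy_eedge, no_path and distances_to_target are hypothetical names, nodes are plain indices, and each edge is assumed to carry an explicit non-negative length.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical toy exploded edge with an explicit length.
struct toy_eedge
{
  std::size_t src;
  std::size_t dst;
  unsigned length;
};

// Marker for "no path to the target": such nodes are outside the
// trimmed graph.
constexpr unsigned no_path = std::numeric_limits<unsigned>::max ();

// For each node, compute the minimal distance to TARGET by running
// Dijkstra's algorithm backwards from TARGET over reversed edges.
std::vector<unsigned>
distances_to_target (std::size_t num_nodes,
                     const std::vector<toy_eedge> &edges,
                     std::size_t target)
{
  assert (target < num_nodes);

  // Reverse adjacency lists: for each node, its <predecessor, length>.
  std::vector<std::vector<std::pair<std::size_t, unsigned>>> preds (num_nodes);
  for (const toy_eedge &e : edges)
    preds[e.dst].push_back ({e.src, e.length});

  std::vector<unsigned> dist (num_nodes, no_path);
  using item = std::pair<unsigned, std::size_t>;   // <distance, node>
  std::priority_queue<item, std::vector<item>, std::greater<item>> queue;
  dist[target] = 0;
  queue.push ({0u, target});
  while (!queue.empty ())
    {
      auto [d, n] = queue.top ();
      queue.pop ();
      if (d > dist[n])
        continue;   // stale queue entry; a shorter route was found
      for (const auto &[pred, len] : preds[n])
        if (d + len < dist[pred])
          {
            dist[pred] = d + len;
            queue.push ({dist[pred], pred});
          }
    }
  return dist;
}
```

Nodes whose distance remains no_path cannot reach the target and so fall outside the trimmed graph; the remaining distances give the ordering by minimal distance to the target.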

We then use a worklist to iteratively build a “feasible graph” (actually a tree), capturing the pertinent state along each path, in which every path to a “feasible node” is feasible by construction, restricting ourselves to the trimmed graph to ensure we stay on target, and ordering the worklist so that the first feasible path we find to the target node is the shortest possible path. Hence we start by trying the shortest possible path, but if that fails, we explore progressively longer paths, eventually trying iterations through loops. The exploration is captured in the feasible_graph, which can be dumped as a .dot file via -fdump-analyzer-feasibility to visualize the exploration. The indices of the feasible nodes show the order in which they were created. We effectively explore the tree of feasible paths in order of shortest path until we either find a feasible path to the target node, or hit a limit and give up.

This is something of a brute-force approach, but the trimmed graph hopefully keeps the complexity manageable.

This algorithm can be disabled (for debugging purposes) via -fno-analyzer-feasibility, which simply uses the shortest path, and notes if it is infeasible.

The above gives us a shortest feasible exploded_path through the exploded_graph (a list of exploded_edge *). We use this exploded_path to build a diagnostics::paths::path (a list of events for the diagnostic subsystem) - specifically a checker_path.

Having built the checker_path, we prune it to try to eliminate events that aren’t relevant, to minimize how much the user has to read.

After pruning, we notify each event in the path of its ID and record the IDs of interesting events, allowing for events to refer to other events in their descriptions. The pending_diagnostic class has various vfuncs to support emitting more precise descriptions, so that e.g.

a deref-of-unchecked-malloc diagnostic might use:
```
  returning possibly-NULL pointer to 'make_obj' from 'allocator'
```
for a return_event to make it clearer how the unchecked value moves from callee back to caller

a double-free diagnostic might use:

  second 'free' here; first 'free' was at (3)

and a use-after-free might use

  use after 'free' here; memory was freed at (2)

At this point we can emit the diagnostic.

26.1.6 Limitations

Only for C so far
The implementation of call summaries is currently very simplistic.
Lack of function pointer analysis
The constraint-handling code assumes reflexivity in some places (that values are equal to themselves), which is not the case for NaN. As a simple workaround, constraints on floating-point values are currently ignored.
There are various other limitations in the region model (grep for TODO/xfail in the testsuite).
The constraint_manager’s implementation of transitivity is currently too expensive to enable by default and so must be manually enabled (via -fanalyzer-transitivity).
The checkers are currently hardcoded and don’t allow for user extensibility (e.g. adding allocate/release pairs).
Although the analyzer’s test suite has a proof-of-concept test case for LTO, LTO support hasn’t had extensive testing. There are various lang-specific things in the analyzer that assume C rather than LTO. For example, SSA names are printed to the user in “raw” form, rather than printing the underlying variable name.