Movatterモバイル変換


[0]ホーム

URL:


Next:, Previous:, Up:Control Flow Graph   [Contents][Index]


14.3 Profile information

In many cases a compiler must make a choice whether to trade speed inone part of code for speed in another, or to trade code size for codespeed. In such cases it is useful to know information about how oftensome given block will be executed. That is the purpose formaintaining profile within the flow graph.GCC can handle profile information obtained throughprofilefeedback, but it can also estimate branch probabilities based onstatics and heuristics.

The feedback based profile is produced by compiling the program withinstrumentation, executing it on a train run and reading the numbersof executions of basic blocks and edges back to the compiler whilere-compiling the program to produce the final executable. This methodprovides very accurate information about where a program spends mostof its time on the train run. Whether it matches the average run ofcourse depends on the choice of train data set, but several studieshave shown that the behavior of a program usually changes justmarginally over different data sets.

When profile feedback is not available, the compiler may be asked toattempt to predict the behavior of each branch in the program using aset of heuristics (seepredict.def for details) and computeestimated frequencies of each basic block by propagating theprobabilities over the graph.

Eachbasic_block contains two integer fields to representprofile information:frequency andcount. Thefrequency is an estimation how often is basic block executedwithin a function. It is represented as an integer scaled in therange from 0 toBB_FREQ_BASE. The most frequently executedbasic block in function is initially set toBB_FREQ_BASE andthe rest of frequencies are scaled accordingly. During optimization,the frequency of the most frequent basic block can both decrease (forinstance by loop unrolling) or grow (for instance by cross-jumpingoptimization), so scaling sometimes has to be performed multipletimes.

Thecount contains hard-counted numbers of execution measuredduring training runs and is nonzero only when profile feedback isavailable. This value is represented as the host’s widest integer(typically a 64 bit integer) of the special typegcov_type.

Most optimization passes can use only the frequency information of abasic block, but a few passes may want to know hard execution counts.The frequencies should always match the counts after scaling, howeverduring updating of the profile information numerical error mayaccumulate into quite large errors.

Each edge also contains a branch probability field: an integer in therange from 0 toREG_BR_PROB_BASE. It represents probability ofpassing control from the end of thesrc basic block to thedest basic block, i.e. the probability that control will flowalong this edge. TheEDGE_FREQUENCY macro is available tocompute how frequently a given edge is taken. There is acountfield for each edge as well, representing same information as for abasic block.

The basic block frequencies are not represented in the instructionstream, but in the RTL representation the edge frequencies arerepresented for conditional jumps (via theREG_BR_PROBmacro) since they are used when instructions are output to theassembly file and the flow graph is no longer maintained.

The probability that control flow arrives via a given edge to itsdestination basic block is calledreverse probability and is notdirectly represented, but it may be easily computed from frequenciesof basic blocks.

Updating profile information is a delicate task that can unfortunatelynot be easily integrated with the CFG manipulation API. Many of thefunctions and hooks to modify the CFG, such asredirect_edge_and_branch, do not have enough information toeasily update the profile, so updating it is in the majority of casesleft up to the caller. It is difficult to uncover bugs in the profileupdating code, because they manifest themselves only by producingworse code, and checking profile consistency is not possible becauseof numeric error accumulation. Hence special attention needs to begiven to this issue in each pass that modifies the CFG.

It is important to point out thatREG_BR_PROB_BASE andBB_FREQ_BASE are both set low enough to be possible to computesecond power of any frequency or probability in the flow graph, it isnot possible to even square thecount field, as modern CPUs arefast enough to execute $2^32$ operations quickly.


Next:Maintaining the CFG, Previous:Edges, Up:Control Flow Graph   [Contents][Index]


[8]ページ先頭

©2009-2026 Movatter.jp