CodeViz: A CallGraph Visualiser
petersenna/codeviz
WHERE DOES THIS CONTENT COME FROM?
----------------------------------

The CodeViz website was down today and it was difficult to find the source
code for download. I used:

  https://web.archive.org/web/20150502053825/http://www.csn.ul.ie/~mel/projects/codeviz/#download

to find and download the source code you see here. The only change I made
was adding this header to this file.

README
------

This is the README for the CodeViz set of scripts. The tools are for
the creation of call graphs for C and C++ so that function flow can be
visualised. They are licensed under the GPL, see the COPYING file for details.

Introduction
------------

At some stage in everyone's programming career, they will need to read through
a lot of code written by another programmer. An important part of program
comprehension is building a picture of how the program is structured from
a high-level view, and call graphs can be an invaluable aid when building
this picture. This is particularly useful if the original programmer uses
clear function names.

This project provides the ability to generate call graphs to aid the task of
understanding code. It uses a highly modular set of collection methods and
can be adapted to support any language, although only C and C++ are currently
supported. Each collection method has different advantages and disadvantages.

Installing
----------

  cd codeviz
  cp ./lib/* -rv /usr/lib/       (Or your preferred perl library path)
  cp ./bin/* /usr/local/bin

Alternatively, just run the scripts directly from the package bin/ directory.
The libraries will be found as long as they are installed in a global perl
directory or the lib/ directory is at the same level as bin/.

The graphs are rendered using dot, which is part of the GraphViz project.
Install the package for your distribution or obtain it directly from
http://www.graphviz.org

Scripts
-------

genfull  - Use this to generate the full call graph for a project. This will
           be quite large and probably should be pared down with gengraph.
           A number of collection methods are available but cdepn is the
           default. Run genfull --man to get a full man page. Do not bother
           putting the output full.graph through dot yourself as it is
           unlikely to be rendered within a reasonable amount of time.

gengraph - This will generate a small subgraph and postscript file for a
           given set of functions. Run gengraph --man for full details.

Generating cdepn Files for genfull
----------------------------------

If the full.graph for the source you are interested in has already been
created, you can skip this section. See ./graphs to see if a full.graph
is available.

The cobjdump and cppobjdump methods (for C and C++ respectively) will generate
adequate call graphs but the information is a bit lacking. For example,
the source file of a function declaration is unknown, and macros and inline
functions will be totally missing. Ideally, the cdepn method should be used
but it requires a patched version of gcc and g++ to work. The patches and
some scripts are available in the compilers/ directory.

The patched version of gcc and g++ outputs .cdepn files for every C and C++
file compiled. This .cdepn file contains information such as when functions
are called, where they are declared and so on. Earlier versions of CodeViz
supported multiple gcc versions but this one only supports 7.4.0.

First, the source tar has to be downloaded. For those who have better things
to do than read the gcc install doc, just do the following:

  cd compilers
  ncftpget ftp://ftp.gnu.org/pub/gnu/gcc/gcc-7.4.0/gcc-7.4.0.tar.gz
  ./install_gcc-7.4.0.sh <optional install path>

This script will untar gcc, patch it and install it to the supplied path. If
no path is given, it will be installed to $HOME/gcc-graph . I usually install
it to /usr/local/gcc-graph with

  ./install_gcc-7.4.0.sh /usr/local/gcc-graph

If you seriously want to patch by hand, just read the script as it goes through
each of the steps one at a time.
There is one step to note though.

For now, we will presume a patched version of gcc and g++ is now in
$HOME/gcc-graph/. Most projects will use the variable CC for deciding
which version of gcc to use. The handiest way to use the patched one is with
something like

  make CC=$HOME/gcc-graph/bin/gcc CXX=$HOME/gcc-graph/bin/g++

Or alternatively, adjust your path so that gcc-graph will appear before the
normal gcc. As each source file is compiled, the corresponding cdepn file
will be created.

In the case of building the Linux Kernel, the commands would be:

  make CC=$HOME/gcc-graph/bin/gcc bzImage
  make CC=$HOME/gcc-graph/bin/gcc modules

Similar methods will work for other projects presuming that the Makefile
uses the CC or CXX macros correctly to indicate the compiler to use. If it is
a Makefile of your own type or it does not use proper macros, you may have
to edit the Makefile yourself or else adjust your path to put gcc-graph first.
For example, with bash, the following will work.

  PATH=$HOME/gcc-graph/bin:$PATH

When building, watch the compiler output to make sure the .cdepn files are
being created.

Generating nccout files for genfull
-----------------------------------

An alternative to using a patched version of gcc is to use ncc
(http://freshmeat.net/projects/ncc) which is a C compiler specifically
designed for code browsing. It comes with its own navigation tool and
is well worth checking out.

CodeViz supports ncc with the cncc collection method (just like cdepn is for
use with gcc) and supports C only. The really big thing going for the ncc
collection method is that it can traverse function pointers. If you download
and install ncc, use the cncc collection method if it is C code and function
pointers are common.

Once ncc is installed, in the case of building the Linux Kernel, the commands
would be:

  make -i CC='ncc -ncoo -ncfabs' bzImage
  make -i CC='ncc -ncoo -ncfabs' modules
  find . -name \*.nccout | xargs cat > code.map.nccout

Generating full.graph
---------------------

Some full.graph files are provided with the tar in the downloads section. If
the one you want is not available, read on.

To create a full.graph, the script genfull is used. Run genfull --help to see
all options, but the easiest thing to do is run the script with no arguments
in the top level source directory after a compile and a file full.graph will
be created in the top level source directory.

While it should be possible to put full.graph through dot and see the
postscript file, it is recommended you do not try. A full graph is extremely
large and unlikely to be rendered in a reasonable amount of time. One really
should use the gengraph program to create smaller graphs.

Problems that might exist with full.graph
-----------------------------------------

In more complex code, the full.graph may not be perfect. For example, there may
be naming collisions where there are duplicate function names between modules,
or if there are multiple binaries being compiled, genfull will not distinguish
between them. If you think this will be a problem, there are two steps you can
take.

First, compare the graph generated by cdepn with the one generated by
cobjdump. As cobjdump is analysing a binary, it is highly unlikely the graph
is wrong, it just will have no information on inline functions or macros. With
the Linux kernel, this test would look something like

  genfull -g cobjdump -o full.graph-objdump
  genfull -g cdepn -o full.graph-cdepn
  gengraph -t -d 5 -g full.graph-objdump -f kswapd -o kswapd-objdump.ps
  gengraph -t -d 5 -g full.graph-cdepn -f kswapd -o kswapd-cdepn.ps

This would generate two full.graphs and two call graphs of the function
kswapd() which could be compared to make sure the cdepn graph is accurate. A
similar method can be used for other projects.

The second problem that may occur is where function names are duplicated
between modules.
In this case, the best course of action is to use the -s
switch to genfull to limit which branches of the tree are examined. For
example, in the Linux kernel there is an alloc_pages() function in mm/ and
drivers/char/drm . If one was examining the VM alone and naming collisions were
expected to be a problem, genfull could be invoked as

  genfull -s "mm include/linux drivers/block arch/i386"

which would cover most of the functions of interest. In other projects, it will
be a case of different libraries colliding with each other. For instance, with
avifile, genfull with no arguments will create a horrible mess. Instead, the
-s switch must be used to generate a full.graph for each part of the project.
For example, the player would be graphed with

  genfull -s "player" -o full.graph-player

and each of the libraries would be graphed separately.

Generating Call Graphs
----------------------

The script gengraph generates a call graph for a specified function based on
the full.graph file. gengraph --man will provide all the information you need.
The most important switch to note is -g which determines what collection method
to use. Once the script completes, a postscript file will be available which
can be viewed with any postscript viewer. By default, the output filename will
be functionname.ps

If it takes a long time to generate a graph, it is usually a good idea to first
limit its depth to something reasonable with -d . We'll take an example of
graphing alloc_pages() with kernel 2.4.20

Step 1: gengraph -f alloc_pages
Result: Taking way too long, hit ctrl-c and limited by some reasonable depth
        to get an idea of what was happening

Step 2: gengraph -d 10 -f alloc_pages
Result: Output graph is massive, mainly with kernel stock functions of no
        interest.
        Use the -t switch to omit functions that are usually of no
        interest. For other projects, edit the gengraph script and go to the
        line "sub generate_trimlist"; this function has a list of functions
        to "trim" when the -t switch is used

Step 3: gengraph -t -d 10 -f alloc_pages
Result: Output graph is still massive but a glance at the graph shows that a
        call to "shrink_cache()" is resulting in a massive graph below it that
        does not look like it is directly related to page allocation. Let's
        just show that function but not traverse it with the -s switch

Step 4: gengraph -t -d 10 -s "shrink_cache" -f alloc_pages
Result: Graph size is drastically reduced. Most of the remaining graph
        involves two functions "try_to_free_pages_zone()" and
        "__free_pages_ok()". We'll not traverse try_to_free_pages_zone() and
        will ignore __free_pages_ok() altogether with the -i switch

Step 5: gengraph -t -d 10 -s "shrink_cache try_to_free_pages_zone" \
                 -i "__free_pages_ok" -f alloc_pages
Result: Perfect, shows a nice graph which clearly shows what the important
        functions are in relation to just page allocation. Later the branches
        that were not traversed in this graph can be graphed separately

The bottom line is that the first graph is usually too large and needs to be
cut down. How to pare it down is a combination of experience with the code
and common sense. I find it usually helps to first limit the depth to 4,
ignore functions that are obviously not of current interest, and
traverse them later.

Generating Graphs based on Regular Expressions
----------------------------------------------

Support is available for selecting functions to graph, show and ignore based
on regular expressions. The format of the expression is the same as perl
except without the //'s.
For example, to generate a graph of the functions that
look like an alloc function in the kernel, this would work

  gengraph -t -d 4 --func-re "^.?.?alloc(_page)?$" -i "pmd_alloc" -o allocs.ps

Note that with --func-re in particular, it is important that you use the -o
switch or dot will fail to create a graph with complaints about bad output
filenames.

Post-Processing Options
-----------------------

Both genfull and gengraph support the use of post-processing steps. Currently,
two are supported. The first is stack usage by a single function. This is x86
specific as it depends on object files regardless of the collection method
used. This is mainly of benefit to the Linux kernel as normal applications
can expand their stack and do not need to worry about stack usage as
much. The second module shows cumulative stack usage in gengraph between pairs
of functions. This is really handy for showing the usage between a system call
and a lower-level function to identify places where stack is used too much.

See the man pages for genfull and gengraph for more information on the use
of the post-processing options.

Daemon/Client Support
---------------------

With a large input graph, the longest operation for the generation of the call
graph is the reading of the input file. For comparison, generating a small
graph on the author's machine takes 4 seconds to read the input graph and 0.1
seconds to generate the output file. To address this, gengraph can run as a
daemon if the -q (--daemon) switch is specified. Use -v if you want to see what
it is doing.

  gengraph -q -g /usr/src/linux-2.4.20-clean/full.graph

When this returns, the daemon is running. To generate a graph using the daemon,
run

  gengraph -q -t -d 2 -f alloc_pages

Note the use of the -q switch which says that gengraph should run as a client
to the daemon instance. If you are bored, compare the difference in running
times between normal gengraph and when it is used as a client :-) .
To stop the daemon, do the following

  echo QUIT > /tmp/codeviz.pipe

and the daemon will shut down and clean up.

Generating Graphs for the Web
-----------------------------

Gengraph is now suitable for use with CGI scripts. To generate GIF output
instead of postscript, use the -w switch. How you choose to implement it is up
to yourself but what I did was the following

  o Have CGI script call gengraph to output GIF to /tmp
  o In the HTML, have <img src=/cgi-bin/readgif?output.gif>

Where output.gif is actually some temporary file in /tmp created by the CGI
script with a unique name and readgif is a simple executable which reads
GIFs from /tmp and then unlinks them.

There is no demo of this available because the webserver which hosts this
project is a bit loaded. While I could run a demo, my popularity would take
a bit of a dent.

There is also support for generating HTML files with source-highlight if you
have it installed. See the detailed manpage for the --shighlight option for
more details.

Misc Notes
----------

Reports of success or failure, especially with C++, using any of the collection
methods are appreciated.

Bugs and Feedback
-----------------

Email any comments, feedback and bug reports to mel@csn.ul.ie. Enjoy.....

Credits
-------

The vast majority of this has been implemented by Mel Gorman
<mel@csn.ul.ie>. However, the diff to gcc and original cdepn.pl that
this project was originally based on was written by Martin Devera (Devik)
(http://luxik.cdi.cz/~devik). They have since changed considerably to support
other languages and be more flexible but the original idea was his, thanks
Martin. Encouragement and prodding to support ncc is courtesy of the author
of ncc, Xanthakis Stelios (sxanth@ceid.upatras.gr). The main guts of the
implementation of the regular expression support and HTML rendering are
courtesy of Robert Lehr (bozzio@the-lehrs.com).
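
Sketch: How the Pruning Switches Combine
----------------------------------------

The walkthrough in "Generating Call Graphs" pares a graph down with a depth
limit (-d), functions that are shown but not traversed (-s) and functions that
are ignored altogether (-i). The Python sketch below models only that behaviour
as the README describes it. It is not how gengraph is implemented: the function
prune_call_graph and the toy call graph are hypothetical, and the edges in the
toy graph are invented for illustration rather than taken from the kernel.

```python
from collections import deque


def prune_call_graph(calls, root, max_depth=None, show_only=(), ignore=()):
    """Return the set of (caller, callee) edges kept after pruning.

    calls      -- dict mapping a function name to the functions it calls
    max_depth  -- stop descending after this many levels (modelled on -d)
    show_only  -- functions drawn but not descended into (modelled on -s)
    ignore     -- functions dropped from the graph entirely (modelled on -i)
    """
    show_only, ignore = set(show_only), set(ignore)
    edges = set()
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        caller, depth = queue.popleft()
        # Depth limit reached: keep the node, but do not expand it.
        if max_depth is not None and depth >= max_depth:
            continue
        # "Show but do not traverse": the node appears, its callees do not.
        if caller in show_only:
            continue
        for callee in calls.get(caller, ()):
            if callee in ignore:
                continue  # ignored functions never appear in the output
            edges.add((caller, callee))
            if callee not in seen:
                seen.add(callee)
                queue.append((callee, depth + 1))
    return edges


# Hypothetical toy graph; the call relationships are made up.
toy = {
    "alloc_pages": ["__alloc_pages"],
    "__alloc_pages": ["rmqueue", "try_to_free_pages_zone", "__free_pages_ok"],
    "try_to_free_pages_zone": ["shrink_cache"],
    "shrink_cache": ["writepage"],
}

if __name__ == "__main__":
    kept = prune_call_graph(
        toy,
        "alloc_pages",
        max_depth=10,
        show_only=["try_to_free_pages_zone"],
        ignore=["__free_pages_ok"],
    )
    for caller, callee in sorted(kept):
        print(f"{caller} -> {callee}")
```

With the toy graph, the unpruned traversal keeps all six edges, while the call
in the usage example keeps three: try_to_free_pages_zone still appears as a
leaf, and __free_pages_ok is gone entirely.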