In software engineering, profiling (program profiling, software profiling) is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, the usage of particular instructions, or the frequency and duration of function calls. Most commonly, profiling information serves to aid program optimization, and more specifically, performance engineering.
Profiling is achieved by instrumenting either the program source code or its binary executable form using a tool called a profiler (or code profiler). Profilers may use a number of different techniques, such as event-based, statistical, instrumented, and simulation methods.
Profilers use a wide variety of techniques to collect data, including hardware interrupts, code instrumentation, instruction set simulation, operating system hooks, and performance counters.
Program analysis tools are extremely important for understanding program behavior. Computer architects need such tools to evaluate how well programs will perform on new architectures. Software writers need tools to analyze their programs and identify critical sections of code. Compiler writers often use such tools to find out how well their instruction scheduling or branch prediction algorithm is performing...
— ATOM,PLDI
The output of a profiler may be:
    /* ------------ source ------------------------- count */
    0001             IF X = "A"                     0055
    0002                THEN DO
    0003                  ADD 1 to XCOUNT           0032
    0004                ELSE
    0005             IF X = "B"                     0055
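A per-line count listing like the one above can be approximated in Python with the interpreter's tracing hook. The following is a minimal sketch, not how any particular profiler is implemented; `count_lines` and `classify` are hypothetical names introduced for illustration, and line numbers are reported relative to the `def` line.

```python
import sys
from collections import Counter

def count_lines(func, *args):
    """Run func(*args) and count how often each of its source lines executes."""
    counts = Counter()
    code = func.__code__

    def tracer(frame, event, arg):
        # Trace only frames that execute func itself; skip everything else.
        if frame.f_code is not code:
            return None
        if event == "line":
            # Key each count by the line's offset from the 'def' line.
            counts[frame.f_lineno - code.co_firstlineno] += 1
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # always restore the earlier trace hook
    return counts

# Hypothetical function to profile (offsets: 1 = init, 3 = if, 4 = increment).
def classify(values):
    a_count = 0
    for x in values:
        if x == "A":
            a_count += 1
    return a_count

line_counts = count_lines(classify, ["A", "B", "A"])
for offset in sorted(line_counts):
    print(f"{offset:04d}  executed {line_counts[offset]} time(s)")
```

The size of the result grows with the number of source lines, not with the running time of the program.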
A profiler can be applied to an individual method or at the scale of a module or program, to identify performance bottlenecks by making long-running code obvious.[1] A profiler can be used to understand code from a timing point of view, with the objective of optimizing it to handle various runtime conditions[2] or various loads.[3] Profiling results can be ingested by a compiler that provides profile-guided optimization.[4] Profiling results can be used to guide the design and optimization of an individual algorithm; the Krauss matching wildcards algorithm is an example.[5] Profilers are built into some application performance management systems that aggregate profiling data to provide insight into transaction workloads in distributed applications.[6]
Performance-analysis tools existed on IBM/360 and IBM/370 platforms from the early 1970s, usually based on timer interrupts which recorded the program status word (PSW) at set timer-intervals to detect "hot spots" in executing code.[citation needed] This was an early example of sampling (see below). In early 1974 instruction-set simulators permitted full trace and other performance-monitoring features.[citation needed]
Profiler-driven program analysis on Unix dates back to 1973,[7] when Unix systems included a basic tool, prof, which listed each function and how much of program execution time it used. In 1982 gprof extended the concept to a complete call graph analysis.[8]
In 1994, Amitabh Srivastava and Alan Eustace of Digital Equipment Corporation published a paper describing ATOM[9] (Analysis Tools with OM). The ATOM platform converts a program into its own profiler: at compile time, it inserts code into the program to be analyzed. That inserted code outputs analysis data. This technique - modifying a program to analyze itself - is known as "instrumentation".
In 2004 both the gprof and ATOM papers appeared on the list of the 50 most influential PLDI papers for the 20-year period ending in 1999.[10]
Flat profilers compute the average time per call for each function from the recorded calls, and do not break those times down by callee or calling context.
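As an illustration only, a flat profile can be sketched in Python with `sys.setprofile`, which reports function call and return events. The class `FlatProfiler` and the functions `leaf` and `work` are hypothetical names; the sketch keeps one total per function name and deliberately records nothing about who called whom.

```python
import sys
import time
from collections import defaultdict

class FlatProfiler:
    """Per-function call counts and cumulative time, with no caller context."""

    def __init__(self):
        self.calls = defaultdict(int)
        self.total = defaultdict(float)
        self._starts = []  # stack of (function name, entry timestamp)

    def _hook(self, frame, event, arg):
        # Only Python-level call/return events are used; C calls are ignored.
        if event == "call":
            self._starts.append((frame.f_code.co_name, time.perf_counter()))
        elif event == "return" and self._starts:
            name, t0 = self._starts.pop()
            self.calls[name] += 1
            self.total[name] += time.perf_counter() - t0

    def run(self, func, *args):
        sys.setprofile(self._hook)
        try:
            return func(*args)
        finally:
            sys.setprofile(None)

    def mean_time(self, name):
        """Average time per call, the figure a flat profile reports."""
        return self.total[name] / self.calls[name]

# Hypothetical workload.
def leaf(n):
    return sum(range(n))

def work():
    for _ in range(5):
        leaf(1000)

prof = FlatProfiler()
prof.run(work)
for name in prof.calls:
    print(f"{name}: {prof.calls[name]} call(s), mean {prof.mean_time(name):.6f} s")
```

The time recorded for `work` includes the time spent in `leaf`; a flat profile of this kind cannot say how much of `leaf`'s cost belongs to each of its callers.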
Call graph profilers[8] show the call times and frequencies of the functions, as well as the call chains involved, based on the callee. In some tools the full context is not preserved.
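The difference from a flat profile can be sketched by recording caller-callee pairs instead of single function names. The following Python sketch is an assumption-laden illustration, not the method of gprof or any other tool: `profile_call_graph` and the sample functions are hypothetical, and only the immediate caller is kept, so the full call chain is not preserved.

```python
import sys
from collections import Counter

def profile_call_graph(func, *args):
    """Run func(*args) and count calls per (caller, callee) pair."""
    edges = Counter()

    def hook(frame, event, arg):
        # On each Python-level call, look one frame up to find the caller.
        if event == "call" and frame.f_back is not None:
            caller = frame.f_back.f_code.co_name
            callee = frame.f_code.co_name
            edges[(caller, callee)] += 1

    sys.setprofile(hook)
    try:
        func(*args)
    finally:
        sys.setprofile(None)
    return edges

# Hypothetical program: helper() is reached through two different callers.
def helper():
    pass

def fast_path():
    helper()

def slow_path():
    for _ in range(3):
        helper()

def main():
    fast_path()
    slow_path()

edges = profile_call_graph(main)
for (caller, callee), n in sorted(edges.items()):
    print(f"{caller} -> {callee}: {n}")
```

A flat profile would report four calls to `helper`; the edge counts additionally show that three of them come from `slow_path`.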
Input-sensitive profilers[11][12][13] add a further dimension to flat or call-graph profilers by relating performance measures to features of the input workloads, such as input size or input values. They generate charts that characterize how an application's performance scales as a function of its input.
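A rough sketch of the idea in Python: run the same routine on inputs of increasing size and record a cost measure for each. Here the cost is the number of executed source lines, chosen as a repeatable stand-in for time; that choice, and the names `executed_lines` and `count_inversions`, are assumptions made for illustration and do not describe the cited tools.

```python
import sys

def executed_lines(func, *args):
    """Return how many 'line' trace events occur while running func(*args)."""
    count = 0

    def tracer(frame, event, arg):
        nonlocal count
        if event == "line":
            count += 1
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)
    return count

# Hypothetical routine under test: deliberately quadratic in the input size.
def count_inversions(values):
    inversions = 0
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            if values[i] > values[j]:
                inversions += 1
    return inversions

sizes = [50, 100, 200]
cost_by_size = {}
for n in sizes:
    worst_case = list(range(n, 0, -1))  # reverse-sorted input
    cost_by_size[n] = executed_lines(count_inversions, worst_case)

for n in sizes:
    print(f"n={n:4d}  executed lines={cost_by_size[n]}")
```

Doubling the input size roughly quadruples the cost, which is the kind of scaling relationship an input-sensitive profile is meant to expose.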
Profilers, which are also programs themselves, analyze target programs by collecting information on the target program's execution. Based on their data granularity, which depends upon how profilers collect information, they are classified as event-based or statistical profilers. Profilers interrupt program execution to collect information. Those interrupts can limit time measurement resolution, which implies that timing results should be taken with a grain of salt. Basic block profilers report a number of machine clock cycles devoted to executing each line of code, or timing based on adding those together; the timings reported per basic block may not reflect a difference between cache hits and misses.[14][15]
Event-based profilers are available for the following programming languages:
These profilers operate by sampling. A sampling profiler probes the target program's call stack at regular intervals using operating system interrupts. Sampling profiles are typically less numerically accurate and specific, providing only a statistical approximation, but allow the target program to run at near full speed. "The actual amount of error is usually more than one sampling period. In fact, if a value is n times the sampling period, the expected error in it is the square-root of n sampling periods."[16]
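The sampling approach can be imitated in Python, as a sketch under stated assumptions. Instead of operating system interrupts, a helper thread wakes at a fixed interval and inspects the target thread's current frame through `sys._current_frames()`, which is a CPython implementation detail. The names `sample_profile` and `spin` are hypothetical, and the sample counts vary from run to run.

```python
import sys
import threading
import time
from collections import Counter

def sample_profile(func, interval=0.005):
    """Run func() while a helper thread samples the calling thread's stack."""
    samples = Counter()
    target_id = threading.get_ident()
    done = threading.Event()

    def sampler():
        # wait() returns False on timeout, so the loop runs once per interval
        # until the main thread signals completion.
        while not done.wait(interval):
            frame = sys._current_frames().get(target_id)
            if frame is not None:
                # Attribute the sample to the innermost Python function.
                samples[frame.f_code.co_name] += 1

    thread = threading.Thread(target=sampler, daemon=True)
    thread.start()
    try:
        func()
    finally:
        done.set()
        thread.join()
    return samples

# Hypothetical hot spot: busy-wait for a fixed duration.
def spin(duration=0.3):
    end = time.perf_counter() + duration
    while time.perf_counter() < end:
        pass

samples = sample_profile(spin)
for name, hits in samples.most_common():
    print(f"{name}: {hits} sample(s)")
```

Applying the error estimate quoted above, a function observed in about 100 samples would carry an expected error of about 10 samples, or roughly 10 percent, so longer runs give proportionally tighter estimates.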
In practice, sampling profilers can often provide a more accurate picture of the target program's execution than other approaches, as they are not as intrusive to the target program and thus don't have as many side effects (such as on memory caches or instruction decoding pipelines). Also since they don't affect the execution speed as much, they can detect issues that would otherwise be hidden. They are also relatively immune to over-evaluating the cost of small, frequently called routines or 'tight' loops. They can show the relative amount of time spent in user mode versus interruptible kernel mode such as system call processing.
Unfortunately, running kernel code to handle the interrupts incurs a minor loss of CPU cycles from the target program, diverts cache usage, and cannot distinguish the various tasks occurring in uninterruptible kernel code (microsecond-range activity) from user code. Dedicated hardware can do better: ARM Cortex-M3 and some recent MIPS processors' JTAG interfaces have a PCSAMPLE register, which samples the program counter in a truly undetectable manner, allowing non-intrusive collection of a flat profile.
Some commonly used[17] statistical profilers for Java/managed code are SmartBear Software's AQtime[18] and Microsoft's CLR Profiler.[19] Those profilers also support native code profiling, along with Apple Inc.'s Shark (OSX),[20] OProfile (Linux),[21] Intel VTune and Parallel Amplifier (part of Intel Parallel Studio), and Oracle Performance Analyzer,[22] among others.
This technique effectively adds instructions to the target program to collect the required information. Note that instrumenting a program can cause performance changes, and may in some cases lead to inaccurate results and/or heisenbugs. The effect will depend on what information is being collected, on the level of timing details reported, and on whether basic block profiling is used in conjunction with instrumentation.[23] For example, adding code to count every procedure/routine call will probably have less effect than counting how many times each statement is obeyed. A few computers have special hardware to collect information; in this case the impact on the program is minimal.
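Manual source-level instrumentation can be sketched in Python with a decorator that wraps a function in counting and timing code. This is an illustration of the general idea only; `instrument`, `call_counts`, `elapsed` and `square` are hypothetical names, and the wrapper's own bookkeeping is exactly the kind of added overhead described above.

```python
import functools
import time
from collections import defaultdict

call_counts = defaultdict(int)   # calls per instrumented function
elapsed = defaultdict(float)     # cumulative seconds per instrumented function

def instrument(func):
    """Wrap func so that each call is counted and timed."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Record the call even if func raises an exception.
            call_counts[func.__name__] += 1
            elapsed[func.__name__] += time.perf_counter() - start
    return wrapper

# Hypothetical function chosen for instrumentation.
@instrument
def square(x):
    return x * x

results = [square(i) for i in range(4)]
print(results, call_counts["square"], f"{elapsed['square']:.6f} s")
```

For a function as small as `square`, the measured time is dominated by the wrapper rather than by the function body, which is why counting calls is usually less distorting than timing them.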
Instrumentation is key to determining the level of control and amount of time resolution available to the profilers.