svenstaro/cargo-profiler

Cargo subcommand to profile binaries

To install

NOTE: This subcommand can only be used on Linux machines.

First install valgrind:

$ sudo apt-get install valgrind

Then you can install cargo-profiler via cargo install.

$ cargo install cargo-profiler

Alternatively, you can clone this repo and build the binary from source.

$ cargo build --release

Now, copy the built binary to the same directory as cargo.

$ sudo cp ./target/release/cargo-profiler $(dirname $(which cargo))/

To run

Cargo profiler currently supports callgrind and cachegrind.

You can run cargo profiler from anywhere inside a Rust project directory that contains a Cargo.toml.

$ cargo profiler callgrind
$ cargo profiler cachegrind --release

You can also specify a binary directly:

$ cargo profiler callgrind --bin $PATH_TO_BINARY

To specify command line arguments to the executable being profiled, append them after a --:

$ cargo profiler callgrind --bin $PATH_TO_BINARY -- -a 3 --like this
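The -- convention above can be illustrated with a small sketch. This is not taken from the cargo-profiler source; the function name split_at_double_dash is hypothetical, and the code only shows the general idea that everything before the first -- belongs to the profiler and everything after it is passed to the profiled executable.

```rust
/// Split an argument list at the first `--`.
/// Hypothetical helper for illustration only; not cargo-profiler's own code.
/// Returns (arguments for the profiler, arguments for the profiled program).
fn split_at_double_dash(args: &[String]) -> (&[String], &[String]) {
    match args.iter().position(|a| a.as_str() == "--") {
        // Skip the `--` marker itself.
        Some(i) => (&args[..i], &args[i + 1..]),
        // No marker: nothing is forwarded to the profiled program.
        None => (args, &args[args.len()..]),
    }
}

fn main() {
    // Mirrors the example invocation above, with a placeholder binary path.
    let args: Vec<String> = ["callgrind", "--bin", "PATH_TO_BINARY", "--", "-a", "3", "--like", "this"]
        .iter()
        .map(|s| s.to_string())
        .collect();

    let (profiler_args, program_args) = split_at_double_dash(&args);
    println!("profiler args: {:?}", profiler_args);
    println!("program args:  {:?}", program_args);
}
```

Because only the first -- is treated as the separator in this sketch, later flags such as --like reach the profiled program unchanged.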

You can choose to keep the callgrind/cachegrind output files by using the --keep option:

$ cargo profiler callgrind --keep

You can limit the number of functions you'd like to look at:

$ cargo profiler callgrind --bin ./target/debug/rsmat -n 10

Profiling rsmat with callgrind...

Total Instructions...198,466,456

78,346,775 (39.5%) dgemm_kernel.rs:matrixmultiply::gemm::masked_kernel
-----------------------------------------------------------------------
23,528,320 (11.9%) iter.rs:_..std..ops..Range..A....as..std..iter..Iterator..::next
-----------------------------------------------------------------------
16,824,925 (8.5%) loopmacros.rs:matrixmultiply::gemm::masked_kernel
-----------------------------------------------------------------------
10,236,864 (5.2%) mem.rs:core::mem::swap
-----------------------------------------------------------------------
7,712,846 (3.9%) memset.S:memset
-----------------------------------------------------------------------
7,197,344 (3.6%) ???:core::cmp::impls::_..impl..cmp..PartialOrd..for..usize..::lt
-----------------------------------------------------------------------
6,979,680 (3.5%) ops.rs:_..usize..as..ops..Add..::add
-----------------------------------------------------------------------
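The percentages in the callgrind output appear to be each function's share of the total instruction count; the sample numbers are consistent with that reading. The sketch below reproduces the arithmetic for the first three rows. The function name percent_of_total is hypothetical and the code is an illustration, not cargo-profiler's implementation.

```rust
/// Share of `count` in `total`, as a percentage.
/// Hypothetical helper for illustration; returns 0.0 for an empty total.
fn percent_of_total(count: u64, total: u64) -> f64 {
    if total == 0 {
        return 0.0;
    }
    count as f64 / total as f64 * 100.0
}

fn main() {
    // Figures copied from the sample callgrind output above.
    let total_instructions: u64 = 198_466_456;
    let rows: [(&str, u64); 3] = [
        ("dgemm_kernel.rs:matrixmultiply::gemm::masked_kernel", 78_346_775),
        ("iter.rs:...Iterator..::next", 23_528_320),
        ("loopmacros.rs:matrixmultiply::gemm::masked_kernel", 16_824_925),
    ];

    for &(name, count) in rows.iter() {
        // One decimal place, matching the precision shown in the report.
        println!("{} ({:.1}%) {}", count, percent_of_total(count, total_instructions), name);
    }
}
```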

With cachegrind, you can also sort the data by a particular metric column:

$ cargo profiler cachegrind --bin ./target/debug/rsmat -n 10 --sort dr

Profiling rsmat with cachegrind...

Total Memory Accesses...320,385,356

Total L1 I-Cache Misses...371 (0%)
Total LL I-Cache Misses...308 (0%)
Total L1 D-Cache Misses...58,549 (0%)
Total LL D-Cache Misses...8,451 (0%)

 Ir  I1mr ILmr  Dr  D1mr DLmr  Dw  D1mw DLmw
0.40 0.18 0.21 0.35 0.93 1.00 0.38 0.00 0.00 dgemm_kernel.rs:matrixmultiply::gemm::masked_kernel
-----------------------------------------------------------------------
0.08 0.04 0.05 0.12 0.00 0.00 0.02 0.00 0.00 loopmacros.rs:matrixmultiply::gemm::masked_kernel
-----------------------------------------------------------------------
0.12 0.02 0.02 0.10 0.00 0.00 0.15 0.00 0.00 iter.rs:_std..ops..RangeAasstd..iter..Iterator::next
-----------------------------------------------------------------------
0.05 0.01 0.01 0.07 0.00 0.00 0.08 0.00 0.00 mem.rs:core::mem::swap
-----------------------------------------------------------------------
0.03 0.00 0.00 0.05 0.00 0.00 0.00 0.00 0.00 ???:core::cmp::impls::_implcmp..PartialOrdforusize::lt
-----------------------------------------------------------------------
0.03 0.01 0.01 0.04 0.00 0.00 0.03 0.00 0.00 ops.rs:_busizeasops..Addausize::add
-----------------------------------------------------------------------
0.04 0.01 0.01 0.04 0.00 0.00 0.03 0.00 0.00 ptr.rs:core::ptr::_implconstT::offset
-----------------------------------------------------------------------
0.02 0.01 0.00 0.03 0.00 0.00 0.01 0.00 0.00 ???:_usizeasops..Add::add
-----------------------------------------------------------------------
0.01 0.01 0.01 0.02 0.00 0.00 0.01 0.00 0.00 mem.rs:core::mem::uninitialized
-----------------------------------------------------------------------
0.02 0.01 0.01 0.02 0.00 0.00 0.04 0.00 0.00 wrapping.rs:_XorShiftRngasRng::next_u32
-----------------------------------------------------------------------
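In the sample output, --sort dr orders the rows by the Dr column from largest to smallest. A minimal sketch of that kind of column sort follows. The Row type, the COLUMNS constant, and sort_by_metric are hypothetical names introduced for illustration, and the descending order and case-insensitive key lookup are assumptions inferred from the sample, not confirmed behavior of cargo-profiler.

```rust
/// Column order as printed in the cachegrind report header.
const COLUMNS: [&str; 9] = ["Ir", "I1mr", "ILmr", "Dr", "D1mr", "DLmr", "Dw", "D1mw", "DLmw"];

/// One report row: a function label plus its nine metric values.
/// Hypothetical type for illustration only.
struct Row {
    function: &'static str,
    metrics: [f64; 9],
}

/// Sort rows in descending order of the named metric column.
/// The key is matched case-insensitively (an assumption), so "dr" selects "Dr".
fn sort_by_metric(rows: &mut [Row], key: &str) -> Result<(), String> {
    let idx = COLUMNS
        .iter()
        .position(|c| c.eq_ignore_ascii_case(key))
        .ok_or_else(|| format!("unknown metric: {}", key))?;
    rows.sort_by(|a, b| b.metrics[idx].total_cmp(&a.metrics[idx]));
    Ok(())
}

fn main() {
    // Three rows from the sample output above, deliberately out of order.
    let mut rows = vec![
        Row { function: "mem.rs:core::mem::swap",
              metrics: [0.05, 0.01, 0.01, 0.07, 0.00, 0.00, 0.08, 0.00, 0.00] },
        Row { function: "dgemm_kernel.rs:matrixmultiply::gemm::masked_kernel",
              metrics: [0.40, 0.18, 0.21, 0.35, 0.93, 1.00, 0.38, 0.00, 0.00] },
        Row { function: "loopmacros.rs:matrixmultiply::gemm::masked_kernel",
              metrics: [0.08, 0.04, 0.05, 0.12, 0.00, 0.00, 0.02, 0.00, 0.00] },
    ];

    match sort_by_metric(&mut rows, "dr") {
        Ok(()) => {
            for row in &rows {
                println!("{:.2} {}", row.metrics[3], row.function);
            }
        }
        Err(e) => eprintln!("{}", e),
    }
}
```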

What are the cachegrind metrics?

  • Ir -> Total Instructions
  • I1mr -> Level 1 I-Cache misses
  • ILmr -> Last Level I-Cache misses
  • Dr -> Total Memory Reads
  • D1mr -> Level 1 D-Cache read misses
  • DLmr -> Last Level D-Cache read misses
  • Dw -> Total Memory Writes
  • D1mw -> Level 1 D-Cache write misses
  • DLmw -> Last Level D-Cache write misses

TODO

  • cmp subcommand - compare binary profiles
  • profiler macros
  • better context around expensive functions
  • support for more profiling tools
