# `test`
The tracking issue for this feature is: None.
The internals of the `test` crate are unstable, behind the `test` flag. The
most widely used part of the `test` crate is benchmark tests, which can measure
the performance of your code. Let's make our `src/lib.rs` look like this
(comments elided):
```rust
#![feature(test)]

extern crate test;

pub fn add_two(a: i32) -> i32 {
    a + 2
}

#[cfg(test)]
mod tests {
    use super::*;
    use test::Bencher;

    #[test]
    fn it_works() {
        assert_eq!(4, add_two(2));
    }

    #[bench]
    fn bench_add_two(b: &mut Bencher) {
        b.iter(|| add_two(2));
    }
}
```
Note the `test` feature gate, which enables this unstable feature.
We've imported the `test` crate, which contains our benchmarking support.
We have a new function as well, with the `bench` attribute. Unlike regular
tests, which take no arguments, benchmark tests take a `&mut Bencher`. This
`Bencher` provides an `iter` method, which takes a closure. This closure
contains the code we'd like to benchmark.
We can run benchmark tests with `cargo bench`:
```text
$ cargo bench
   Compiling adder v0.0.1 (file:///home/steve/tmp/adder)
     Running target/release/adder-91b3e234d4ed382a

running 2 tests
test tests::it_works ... ignored
test tests::bench_add_two ... bench: 1 ns/iter (+/- 0)

test result: ok. 0 passed; 0 failed; 1 ignored; 1 measured
```
Our non-benchmark test was ignored. You may have noticed that `cargo bench`
takes a bit longer than `cargo test`. This is because Rust runs our benchmark
a number of times, and then takes the average. Because we're doing so little
work in this example, we have a `1 ns/iter (+/- 0)`, but this would show
the variance if there was one.
Advice on writing benchmarks:
- Move setup code outside the `iter` loop; only put the part you want to
  measure inside (see the sketch after this list)
- Make the code do "the same thing" on each iteration; do not accumulate
  or change state
- Make the outer function idempotent too; the benchmark runner is likely
  to run it many times
- Make the inner `iter` loop short and fast so benchmark runs are fast and
  the calibrator can adjust the run-length at fine resolution
- Make the code in the `iter` loop do something simple, to assist in
  pinpointing performance improvements (or regressions)
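To illustrate the first point, here is a minimal sketch (the `make_inputs`
helper and the input size are made up for illustration) that performs its
setup once, before calling `iter`, so that only the loop body is measured:

```rust
#![feature(test)]

extern crate test;
use test::Bencher;

// Hypothetical setup helper; building the input vector is not what
// we want to measure.
fn make_inputs() -> Vec<i32> {
    (0..1000).collect()
}

#[bench]
fn bench_xor_prepared_inputs(b: &mut Bencher) {
    // Setup runs once, outside the `iter` loop.
    let inputs = make_inputs();

    b.iter(|| {
        // Measured part: does the same work on every iteration and
        // returns the result so the optimizer considers it used.
        inputs.iter().fold(0, |old, &new| old ^ new)
    });
}
```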
## Gotcha: optimizations
There's another tricky part to writing benchmarks: benchmarks compiled with
optimizations activated can be dramatically changed by the optimizer so that
the benchmark is no longer benchmarking what one expects. For example, the
compiler might recognize that some calculation has no external effects and
remove it entirely.
```rust
#![feature(test)]

extern crate test;
use test::Bencher;

#[bench]
fn bench_xor_1000_ints(b: &mut Bencher) {
    b.iter(|| {
        (0..1000).fold(0, |old, new| old ^ new);
    });
}
```
gives the following results
```text
running 1 test
test bench_xor_1000_ints ... bench: 0 ns/iter (+/- 0)

test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured
```
The benchmarking runner offers two ways to avoid this. Either, the closure that
the `iter` method receives can return an arbitrary value which forces the
optimizer to consider the result used and ensures it cannot remove the
computation entirely. This could be done for the example above by adjusting the
`b.iter` call to
```rust
b.iter(|| {
    // Note lack of `;` (could also use an explicit `return`).
    (0..1000).fold(0, |old, new| old ^ new)
});
```
Or, the other option is to call the generic `test::black_box` function, which
is an opaque "black box" to the optimizer and so forces it to consider any
argument as used.
```rust
#![feature(test)]

extern crate test;

b.iter(|| {
    let n = test::black_box(1000);

    (0..n).fold(0, |a, b| a ^ b)
})
```
Neither of these read or modify the value, and are very cheap for small values.
Larger values can be passed indirectly to reduce overhead
(e.g. `black_box(&huge_struct)`).
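For example, a sketch of that indirection (the `BigTable` struct and
`checksum` function here are hypothetical) could look like:

```rust
#![feature(test)]

extern crate test;

// Hypothetical large value; passing it to `black_box` by value would
// copy 8 KiB on every iteration.
struct BigTable {
    entries: [u64; 1024],
}

fn checksum(t: &BigTable) -> u64 {
    t.entries.iter().fold(0, |old, &new| old ^ new)
}

#[bench]
fn bench_checksum(b: &mut test::Bencher) {
    let table = BigTable { entries: [7; 1024] };

    b.iter(|| {
        // `black_box(&table)` is cheap, but still forces the optimizer
        // to assume the table's contents are read.
        checksum(test::black_box(&table))
    });
}
```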
Performing either of the above changes gives the following benchmarking results
```text
running 1 test
test bench_xor_1000_ints ... bench: 131 ns/iter (+/- 3)

test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured
```
However, the optimizer can still modify a test case in an undesirable manner
even when using either of the above.