Getting Started with Core_bench
Core_bench is a micro-benchmarking library for OCaml that can measure the execution cost of operations that take from 1ns to about 100ms. It tries to measure such short-lived computations precisely while accounting for delayed GC costs and noise introduced by other activity on the system.
The easiest way to get started is using an example:
    open Core
    open Core_bench

    let main () =
      Random.self_init ();
      let x = Random.float 10.0 in
      let y = Random.float 10.0 in
      Command.run (Bench.make_command [
        Bench.Test.create ~name:"Float add" (fun () ->
          ignore (x +. y));
        Bench.Test.create ~name:"Float mul" (fun () ->
          ignore (x *. y));
        Bench.Test.create ~name:"Float div" (fun () ->
          ignore (x /. y));
      ])

    let () = main ()
When compiled, this gives you an executable:
    $ ./z.exe
    Estimated testing time 30s (3 benchmarks x 10s). Change using -quota SECS.
    ┌───────────┬──────────┬─────────┬────────────┐
    │ Name      │ Time/Run │ mWd/Run │ Percentage │
    ├───────────┼──────────┼─────────┼────────────┤
    │ Float add │   2.53ns │   2.00w │     41.04% │
    │ Float mul │   2.50ns │   2.00w │     40.63% │
    │ Float div │   6.16ns │   2.00w │    100.00% │
    └───────────┴──────────┴─────────┴────────────┘
If any of the functions allocated words on the major heap (mjWd) or caused promotions, the corresponding columns would be displayed automatically. In general, a column that has no significant values is not displayed. The options you will most commonly want to change are the `-q` flag, which controls the time quota for testing, and the set of columns that are enabled or disabled.
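To make the mWd and mjWd columns more concrete, here is a minimal sketch that reads the runtime's allocation counters directly through the standard library's `Gc.counters`. It is not how Core_bench itself collects its numbers; the helper `allocated_words` is a hypothetical name introduced for illustration. It relies on the fact that the OCaml runtime allocates blocks larger than 256 words directly on the major heap.

```ocaml
(* Report how many words [f] allocates on the minor and major heaps.
   Gc.counters returns (minor_words, promoted_words, major_words).
   Note: calling Gc.counters allocates a few words itself, so the minor
   figure includes a small constant overhead. *)
let allocated_words f =
  let minor0, _, major0 = Gc.counters () in
  ignore (Sys.opaque_identity (f ()));
  let minor1, _, major1 = Gc.counters () in
  (minor1 -. minor0, major1 -. major0)

(* 100 elements plus one header word: small enough for the minor heap. *)
let small_minor, small_major =
  allocated_words (fun () -> Array.make 100 0)

(* 300 elements exceed the 256-word limit for minor-heap blocks, so the
   array is allocated directly on the major heap. *)
let large_minor, large_major =
  allocated_words (fun () -> Array.make 300 0)

let () =
  Printf.printf "small array: minor=%.0f major=%.0f\n" small_minor small_major;
  Printf.printf "large array: minor=%.0f major=%.0f\n" large_minor large_major
```

The small array should show up in the minor count and the large one in the major count, which is the same distinction the mWd and mjWd columns draw.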
In the simple case, a benchmark is just a `unit -> unit` thunk and a name:
    Bench.Test.create ~name:"Float add" (fun () ->
      ignore (x +. y));
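A thunk is used because the harness needs to run the same computation many times. The sketch below illustrates that idea with the standard library only. The `test` record, `create` and `time_runs` are hypothetical names, not Core_bench's API, and the timing loop is deliberately naive: it makes no attempt to account for GC costs or system noise.

```ocaml
(* Hypothetical stand-in for a named benchmark; NOT Core_bench's type. *)
type test = { name : string; thunk : unit -> unit }

let create ~name thunk = { name; thunk }

(* Call the thunk [runs] times and return the elapsed CPU time in seconds. *)
let time_runs test ~runs =
  let start = Sys.time () in
  for _ = 1 to runs do
    test.thunk ()
  done;
  Sys.time () -. start

(* A thunk with a visible side effect, so we can see it is re-run each time. *)
let calls = ref 0
let counting = create ~name:"count" (fun () -> incr calls)
let elapsed = time_runs counting ~runs:1000

let () =
  Printf.printf "%s: %d calls in %.6fs\n" counting.name !calls elapsed
```

Because the work sits behind `fun () -> ...`, nothing is computed when the test is created; the body runs only when the harness invokes it.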
One can also create indexed benchmarks, which can be helpful in understanding non-linearities in the execution profiles of functions. For example:
    open Core.Std
    open Core_bench.Std

    let main () =
      Command.run (Bench.make_command [
        Bench.Test.create_indexed
          ~name:"Array.create"
          ~args:[1;10;100;200;300;400]
          (fun len ->
            Staged.stage (fun () -> ignore(Array.create ~len 0)));
      ])

    let () = main ()
which produces:
    $ ./z.exe -q 3
    Estimated testing time 18s (6 benchmarks x 3s). Change using -quota SECS.
    ┌──────────────────┬────────────┬─────────┬──────────┬────────────┐
    │ Name             │   Time/Run │ mWd/Run │ mjWd/Run │ Percentage │
    ├──────────────────┼────────────┼─────────┼──────────┼────────────┤
    │ Array.create:1   │    26.60ns │   2.00w │          │      0.99% │
    │ Array.create:10  │    35.29ns │  11.00w │          │      1.31% │
    │ Array.create:100 │   108.39ns │ 101.00w │          │      4.03% │
    │ Array.create:200 │   178.45ns │ 201.00w │          │      6.64% │
    │ Array.create:300 │ 1_996.86ns │         │  301.00w │     74.25% │
    │ Array.create:400 │ 2_689.28ns │         │  401.00w │    100.00% │
    └──────────────────┴────────────┴─────────┴──────────┴────────────┘
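As the output shows, one indexed benchmark expands into one named test per argument. The following standard-library sketch shows one way such an expansion could work. The `test` record and this `create_indexed` are hypothetical stand-ins rather than Core_bench's implementation, and a plain curried function takes the place of `Staged.stage`.

```ocaml
(* Hypothetical stand-in for a benchmark: a display name plus a thunk. *)
type test = { name : string; thunk : unit -> unit }

(* Expand an indexed benchmark into one test per argument. [f arg] is
   evaluated once per argument (the setup stage); only the thunk it
   returns would be run repeatedly and timed. *)
let create_indexed ~name ~args f =
  List.map
    (fun arg -> { name = Printf.sprintf "%s:%d" name arg; thunk = f arg })
    args

let tests =
  create_indexed ~name:"Array.make" ~args:[ 1; 10; 100 ] (fun len ->
      (* Anything computed here happens once, outside the measured thunk. *)
      fun () -> ignore (Sys.opaque_identity (Array.make len 0)))

let names = List.map (fun t -> t.name) tests

let () = List.iter print_endline names
```

Splitting the function into a setup stage and a returned thunk keeps per-argument preparation out of the measured code, which appears to be the purpose of wrapping the thunk in `Staged.stage` in the example above.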
Core_bench produces self-documenting executables. This documentation closely corresponds to the functionality exposed through the .mli file and is a great way to interactively explore what the various options do. At the time of this writing, `-?` displays:
    Benchmark for Float add, Float mul, Float div

      z.exe [COLUMN ...]

    Columns that can be specified are:
        time       - Number of nano secs taken.
        cycles     - Number of CPU cycles (RDTSC) taken.
        alloc      - Allocation of major, minor and promoted words.
        gc         - Show major and minor collections per 1000 runs.
        percentage - Relative execution time as a percentage.
        speedup    - Relative execution cost as a speedup.
        samples    - Number of samples collected for profiling.

    Columns with no significant values will not be displayed. The
    following columns will be displayed by default:
        time alloc percentage

    Error Estimates
    ===============
    To display error estimates, prefix the column name (or
    regression) with a '+'. Example +time.

    (1) R^2 is the fraction of the variance of the responder (such as
    runtime) that is accounted for by the predictors (such as number of
    runs). More informally, it describes how good a fit we're getting,
    with R^2 = 1 indicating a perfect fit and R^2 = 0 indicating a
    horrible fit. Also see:
    http://en.wikipedia.org/wiki/Coefficient_of_determination

    (2) Bootstrapping is used to compute 95% confidence intervals
    for each estimate.

    Because we expect runtime to be very highly correlated with number of
    runs, values very close to 1 are typical; an R^2 value for 'time' that
    is less than 0.99 should cause some suspicion, and a value less than
    0.9 probably indicates either a shortage of data or that the data is
    erroneous or peculiar in some way.

    Specifying additional regressions
    =================================
    The builtin in columns encode common analysis that apply to most
    functions. Bench allows the user to specify custom analysis to help
    understand relationships specific to a particular function using the
    flag "-regression" . It is worth noting that this feature requires
    some understanding of both linear regression and how various quatities
    relate to each other in the OCaml runtime. To specify a regression
    one must specify the responder variable and a command separated list
    of predictor variables.

    For example: +Time:Run,mjGC,Comp

    which asks bench to estimate execution time using three predictors
    namely the number of runs, major GCs and compaction stats and display
    error estimates. Drop the prefix '+' to suppress error estimation. The
    variables available for regression include:
        Time  - Time
        Cycls - Cycles
        Run   - Runs per sampled batch
        mGC   - Minor Collections
        mjGC  - Major Collections
        Comp  - Compactions
        mWd   - Minor Words
        mjWd  - Major Words
        Prom  - Promoted Words
        One   - Constant predictor for estimating measurement overhead

    === flags ===

      [-all-values]         Show all column values, including very small ones.
      [-ascii]              Display data in simple ascii based tables.
      [-ci-absolute]        Display 95% confidence interval in absolute numbers
      [-clear-columns]      Don't display default columns. Only show user
                            specified ones.
      [-display STYLE]      Table style (short, tall, line, blank or column).
                            Default short.
      [-fork]               Fork and run each benchmark in separate child-process
      [-geometric SCALE]    Use geometric sampling. (default 1.01)
      [-linear INCREMENT]   Use linear sampling to explore number of runs,
                            example 1.
      [-load FILE]          Analyze previously saved data files and don't run
                            tests. [-load] can be specified multiple times.
      [-no-compactions]     Disable GC compactions.
      [-overheads]          Show measurement overheads, when applicable.
      [-quota SECS]         Time quota allowed per test (default 10s).
      [-reduced-bootstrap]  Reduce the number of bootstrapping iterations
      [-regression REGR]    Specify additional regressions (See -? help).
      [-save]               Save benchmark data to .txt files.
      [-stabilize-gc]       Stabilize GC between each sample capture.
      [-v]                  High verbosity level.
      [-width WIDTH]        width limit on column display (default 200).
      [-build-info]         print info about this build and exit
      [-version]            print the version of this build and exit
      [-help]               print this help text and exit
                            (alias: -?)
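The help text describes estimating time by regressing it against the number of runs and judging the fit by R^2. As a rough illustration of that idea, the sketch below fits a single-predictor ordinary least squares line with the standard library. It is an assumption-laden simplification, not Core_bench's actual estimator (which supports several predictors and bootstrapped confidence intervals), and the sample data is made up for the example.

```ocaml
(* Fit time = slope * runs + intercept by ordinary least squares and
   return (slope, intercept, R^2). Each sample is (runs, total time). *)
let fit (samples : (float * float) list) =
  let n = float_of_int (List.length samples) in
  let sum f = List.fold_left (fun acc s -> acc +. f s) 0.0 samples in
  let mean_x = sum fst /. n in
  let mean_y = sum snd /. n in
  let sxy = sum (fun (x, y) -> (x -. mean_x) *. (y -. mean_y)) in
  let sxx = sum (fun (x, _) -> (x -. mean_x) ** 2.0) in
  let slope = sxy /. sxx in
  let intercept = mean_y -. (slope *. mean_x) in
  (* R^2 = 1 - (residual sum of squares / total sum of squares). *)
  let ss_res =
    sum (fun (x, y) -> (y -. ((slope *. x) +. intercept)) ** 2.0)
  in
  let ss_tot = sum (fun (_, y) -> (y -. mean_y) ** 2.0) in
  (slope, intercept, 1.0 -. (ss_res /. ss_tot))

(* Synthetic samples: 2.5 ns per run plus a fixed 40 ns overhead per batch. *)
let samples =
  List.map
    (fun runs ->
      let r = float_of_int runs in
      (r, (2.5 *. r) +. 40.0))
    [ 100; 200; 400; 800; 1600 ]

let slope, intercept, r2 = fit samples

let () =
  Printf.printf "per-run cost ~ %.2f ns, overhead ~ %.2f ns, R^2 = %.4f\n"
    slope intercept r2
```

Here the slope plays the role of the per-run cost, and the intercept is loosely analogous to what the `One` constant predictor is described as estimating. With real, noisy measurements R^2 would fall below 1, which is why the help text treats low values as a warning sign.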