A C++ header-only library of statistical distribution functions.
kthohr/stats
StatsLib is a templated C++ library of statistical distribution functions, featuring unique compile-time computing capabilities and seamless integration with several popular linear algebra libraries.
Features:
- A header-only library of probability density functions, cumulative distribution functions, quantile functions, and random sampling methods.
- Functions are written in C++11 `constexpr` format, enabling the library to operate as both a compile-time and run-time computation engine.
- Designed with a simple R-like syntax.
- Optional vector-matrix functionality with wrappers to support `std::vector` and the Armadillo, Blaze, and Eigen linear algebra libraries.
- Matrix-based operations are parallelizable with OpenMP.
- Released under a permissive, non-GPL license.
- Distributions
- Installation
- Documentation
- Jupyter Notebook
- Options
- Syntax and Examples
- Compile-time Computation Capabilities
- Author and License
Functions to compute the cdf, pdf, quantile, as well as random sampling methods, are available for the following distributions:
- Bernoulli
- Beta
- Binomial
- Cauchy
- Chi-squared
- Exponential
- F
- Gamma
- Inverse-Gamma
- Inverse-Gaussian
- Laplace
- Logistic
- Log-Normal
- Normal (Gaussian)
- Poisson
- Rademacher
- Student's t
- Uniform
- Weibull
In addition, pdf and random sampling functions are available for several multivariate distributions:
- Inverse-Wishart
- Multivariate Normal
- Wishart
StatsLib is a header-only library. Simply add the header files to your project using
    #include "stats.hpp"
The only dependencies are the latest version of GCEM and a C++11-compatible compiler.
Full documentation is available online:
A PDF version of the documentation is available here.
You can test the library online using an interactive Jupyter notebook:
The following options should be declared before including the StatsLib header files.
- For inline-only functionality (i.e., no `constexpr` specifiers):
    #define STATS_GO_INLINE
- OpenMP functionality is enabled by default if the `_OPENMP` macro is detected (e.g., by invoking `-fopenmp` with GCC or Clang). To explicitly enable OpenMP features, use:
    #define STATS_USE_OPENMP
- To disable OpenMP functionality:
    #define STATS_DONT_USE_OPENMP
- To use StatsLib with Armadillo, Blaze or Eigen:
    #define STATS_ENABLE_ARMA_WRAPPERS
    #define STATS_ENABLE_BLAZE_WRAPPERS
    #define STATS_ENABLE_EIGEN_WRAPPERS
- To enable wrappers for `std::vector`:
    #define STATS_ENABLE_STDVEC_WRAPPERS
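The reason these options must come before the include is that the preprocessor decides what the header expands to at the moment it reads it. The sketch below imitates that mechanism in a single file. `DEMO_GO_INLINE`, `DEMO_SPEC`, and `half_of` are hypothetical names invented for illustration; they are not StatsLib macros or functions, and the library's internal implementation may differ.

```cpp
#include <cassert>  // for the run-time check shown in the usage note

// Hypothetical option macro. In a real project this line would sit above
// the #include of the header that tests for it.
// #define DEMO_GO_INLINE

#ifdef DEMO_GO_INLINE
    #define DEMO_SPEC inline      // run-time evaluation only
#else
    #define DEMO_SPEC constexpr   // usable at compile time and at run time
#endif

// Stand-in for a library function whose specifier depends on the option.
DEMO_SPEC double half_of(double x) { return 0.5 * x; }

#ifndef DEMO_GO_INLINE
// Only valid while half_of is constexpr.
static_assert(half_of(4.0) == 2.0, "half_of is evaluated by the compiler");
#endif
```

With the `#define` left commented out, `half_of` is `constexpr` and the `static_assert` is checked during compilation. Uncommenting it switches the function to `inline`, and a run-time check such as `assert(half_of(4.0) == 2.0);` still passes. Defining the macro after this point would have no effect, because the `#ifdef` has already been evaluated.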
Functions are called using an R-like syntax. Some general rules:
- density functions: `stats::d*`. For example, the Normal (Gaussian) density is called using
    stats::dnorm(<value>,<mean parameter>,<standard deviation>);
- cumulative distribution functions: `stats::p*`. For example, the Gamma CDF is called using
    stats::pgamma(<value>,<shape parameter>,<scale parameter>);
- quantile functions: `stats::q*`. For example, the Beta quantile is called using
    stats::qbeta(<value>,<a parameter>,<b parameter>);
- random sampling: `stats::r*`. For example, to generate a single draw from the Logistic distribution:
    stats::rlogis(<location parameter>,<scale parameter>,<seed value or random number engine>);
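To make the d/p/q/r naming convention concrete, here is a minimal standard-library sketch of the four function kinds for the Logistic distribution, using its textbook closed forms. The `sketch` namespace and the function bodies are assumptions made for illustration. They are not StatsLib's implementation, and the inverse-transform sampler in `rlogis` is one common technique rather than a statement about how the library draws samples.

```cpp
#include <cassert>
#include <cmath>
#include <random>

namespace sketch {

// d*: density of the Logistic(mu, s) distribution at x.
inline double dlogis(double x, double mu, double s) {
    const double z = std::exp(-(x - mu) / s);
    return z / (s * (1.0 + z) * (1.0 + z));
}

// p*: cumulative distribution function at x.
inline double plogis(double x, double mu, double s) {
    return 1.0 / (1.0 + std::exp(-(x - mu) / s));
}

// q*: quantile function, the inverse of plogis for 0 < p < 1.
inline double qlogis(double p, double mu, double s) {
    return mu + s * std::log(p / (1.0 - p));
}

// r*: one draw by inverse-transform sampling, r = q(U) with U ~ Uniform(0,1).
inline double rlogis(double mu, double s, std::mt19937_64& engine) {
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    double u = unif(engine);
    while (u <= 0.0) { u = unif(engine); }  // avoid log(0) in qlogis
    return qlogis(u, mu, s);
}

}  // namespace sketch
```

The quantile and CDF are inverses of each other, so `sketch::plogis(sketch::qlogis(0.1, 1.0, 2.0), 1.0, 2.0)` returns 0.1 up to rounding error.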
All of these functions have matrix-based equivalents using Armadillo, Blaze, and Eigen dense matrices.
- The pdf, cdf, and quantile functions can take matrix-valued arguments. For example,
    // Using Armadillo:
    arma::mat norm_pdf_vals = stats::dnorm(arma::ones(10,20),1.0,2.0);
- The randomization functions (`r*`) can output random matrices of arbitrary size. For example, the following code will generate a 100-by-50 matrix of iid draws from a Gamma(3,2) distribution:
    // Armadillo:
    arma::mat gamma_rvs = stats::rgamma<arma::mat>(100,50,3.0,2.0);
    // Blaze:
    blaze::DynamicMatrix<double> gamma_rvs = stats::rgamma<blaze::DynamicMatrix<double>>(100,50,3.0,2.0);
    // Eigen:
    Eigen::MatrixXd gamma_rvs = stats::rgamma<Eigen::MatrixXd>(100,50,3.0,2.0);
- All matrix-based operations are parallelizable with OpenMP. For GCC and Clang compilers, simply include the `-fopenmp` option during compilation.
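As a rough picture of what a matrix-valued call does, the sketch below applies a scalar normal density element by element to a `std::vector<double>` and marks the loop with an OpenMP pragma. Everything in the `sketch` namespace is a hypothetical stand-in written against the standard library only. It does not use StatsLib or any of the linear algebra wrappers, and the library's own loops may be organized differently.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

namespace sketch {

// Scalar normal density at x with mean mu and standard deviation sigma.
inline double dnorm(double x, double mu, double sigma) {
    const double pi = 3.14159265358979323846;
    const double z = (x - mu) / sigma;
    return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * pi));
}

// Element-wise overload: each output entry depends only on the matching
// input entry, so the iterations can be shared among threads.
inline std::vector<double> dnorm(const std::vector<double>& xs,
                                 double mu, double sigma) {
    std::vector<double> out(xs.size());
    const long n = static_cast<long>(xs.size());
    // Takes effect only when compiled with OpenMP support (e.g., -fopenmp);
    // otherwise the loop simply runs serially.
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        out[i] = dnorm(xs[i], mu, sigma);
    }
    return out;
}

}  // namespace sketch
```

Calling `sketch::dnorm(std::vector<double>(10, 1.0), 1.0, 2.0)` mirrors the all-ones Armadillo example above on a smaller scale: every entry is the N(1, 2) density evaluated at its mean, 1/(2·sqrt(2π)) ≈ 0.1995.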
Random number seeding is available in two forms: seed values and random number engines.
- Seed values are passed as unsigned integers. For example, to generate a draw from a normal distribution N(1,2) with seed value 1776:
    stats::rnorm(1,2,1776);
- Random engines in StatsLib use the 64-bit Mersenne-Twister generator (`std::mt19937_64`) and are passed by reference. Example:
    std::mt19937_64 engine(1776);
    stats::rnorm(1,2,engine);
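The difference between the two seeding forms can be sketched with the standard `<random>` facilities alone. The two `sketch::rnorm` overloads below are hypothetical stand-ins that draw through `std::normal_distribution`. StatsLib's sampler may use a different algorithm, so the numbers produced here should not be expected to match the library's output for the same seed.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

namespace sketch {

// Engine form: the engine is taken by reference, so its state advances
// and successive calls continue the same random stream.
inline double rnorm(double mu, double sigma, std::mt19937_64& engine) {
    std::normal_distribution<double> dist(mu, sigma);
    return dist(engine);
}

// Seed form: a fresh engine is built from the seed on every call,
// so equal seeds always reproduce the same draw.
inline double rnorm(double mu, double sigma, std::uint64_t seed) {
    std::mt19937_64 engine(seed);
    return rnorm(mu, sigma, engine);
}

}  // namespace sketch
```

In this sketch, `sketch::rnorm(1.0, 2.0, 1776)` returns the same value on every call, and that value equals the first draw from an engine constructed as `std::mt19937_64 engine(1776)`. Reusing one engine across calls is the usual choice when many draws should come from a single stream.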
More examples with code:
    // evaluate the normal PDF at x = 1, mu = 0, sigma = 1
    double dval_1 = stats::dnorm(1.0,0.0,1.0);
    // evaluate the normal PDF at x = 1, mu = 0, sigma = 1, and return the log value
    double dval_2 = stats::dnorm(1.0,0.0,1.0,true);
    // evaluate the normal CDF at x = 1, mu = 0, sigma = 1
    double pval = stats::pnorm(1.0,0.0,1.0);
    // evaluate the Laplacian quantile at p = 0.1, mu = 0, sigma = 1
    double qval = stats::qlaplace(0.1,0.0,1.0);
    // draw from a t-distribution dof = 30
    double rval = stats::rt(30);
    // matrix output
    arma::mat beta_rvs = stats::rbeta<arma::mat>(100,100,3.0,2.0);
    // matrix input
    arma::mat beta_cdf_vals = stats::pbeta(beta_rvs,3.0,2.0);
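For readers who want to sanity-check the normal-distribution calls above, the closed forms can be written with the standard library only. The functions below are hypothetical stand-ins in a `sketch` namespace. The boolean log flag is modeled on the fourth argument shown in the example above; the code is not taken from StatsLib.

```cpp
#include <cassert>
#include <cmath>

namespace sketch {

// Normal density at x; returns the log-density when log_form is true.
inline double dnorm(double x, double mu, double sigma, bool log_form = false) {
    const double pi = 3.14159265358979323846;
    const double z = (x - mu) / sigma;
    const double log_dens = -0.5 * z * z - std::log(sigma) - 0.5 * std::log(2.0 * pi);
    return log_form ? log_dens : std::exp(log_dens);
}

// Normal CDF expressed through the complementary error function.
inline double pnorm(double x, double mu, double sigma) {
    return 0.5 * std::erfc(-(x - mu) / (sigma * std::sqrt(2.0)));
}

}  // namespace sketch
```

At x = 1, mu = 0, sigma = 1 these give a density of about 0.24197, a log-density of about -1.41894, and a CDF value of about 0.84134, which are the standard reference values for the unit normal.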
StatsLib is designed to operate equally well as a compile-time computation engine. Compile-time computation allows the compiler to replace function calls (e.g., `dnorm(0,0,1)`) with static values in the source code. That is, functions are evaluated during the compilation process, rather than at run-time. This capability is made possible due to the templated `constexpr` design of the library and can be verified by inspecting the assembly code generated by the compiler.
The compile-time features are enabled using the `constexpr` specifier. The example below computes the pdf, cdf, and quantile function of the Laplace distribution.
    #include "stats.hpp"

    int main()
    {
        constexpr double dens_1 = stats::dlaplace(1.0,1.0,2.0); // answer = 0.25
        constexpr double prob_1 = stats::plaplace(1.0,1.0,2.0); // answer = 0.5
        constexpr double quant_1 = stats::qlaplace(0.1,1.0,2.0); // answer = -2.218875...

        return 0;
    }
Assembly code generated by Clang without any optimization:
    LCPI0_0:
        .quad   -4611193153885729483    ## double -2.2188758248682015
    LCPI0_1:
        .quad   4602678819172646912     ## double 0.5
    LCPI0_2:
        .quad   4598175219545276417     ## double 0.25000000000000006
        .section    __TEXT,__text,regular,pure_instructions
        .globl  _main
        .p2align    4, 0x90
    _main:                              ## @main
        push    rbp
        mov     rbp, rsp
        xor     eax, eax
        movsd   xmm0, qword ptr [rip + LCPI0_0] ## xmm0 = mem[0],zero
        movsd   xmm1, qword ptr [rip + LCPI0_1] ## xmm1 = mem[0],zero
        movsd   xmm2, qword ptr [rip + LCPI0_2] ## xmm2 = mem[0],zero
        mov     dword ptr [rbp - 4], 0
        movsd   qword ptr [rbp - 16], xmm2
        movsd   qword ptr [rbp - 24], xmm1
        movsd   qword ptr [rbp - 32], xmm0
        pop     rbp
        ret
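To illustrate how C++11 `constexpr` functions make this kind of compile-time evaluation possible, here is a small self-contained sketch that does not use StatsLib or GCEM. The names `abs_c`, `exp_c`, `dlaplace`, and `plaplace` in the `sketch` namespace are hypothetical, and the truncated Taylor series is a simplification chosen for brevity that is adequate only for arguments of small magnitude. It is not the algorithm the library uses.

```cpp
#include <cassert>
#include <cmath>

namespace sketch {

constexpr double abs_c(double x) { return x < 0.0 ? -x : x; }

// Taylor series for exp(x) as a C++11-style single-return recursion.
// On entry, `term` is x^(n-1)/(n-1)! and `sum` is the partial sum so far.
constexpr double exp_c(double x, double term = 1.0, double sum = 1.0, int n = 1) {
    return n > 40 ? sum
                  : exp_c(x, term * x / n, sum + term * x / n, n + 1);
}

// Laplace density with location mu and scale sigma.
constexpr double dlaplace(double x, double mu, double sigma) {
    return exp_c(-abs_c(x - mu) / sigma) / (2.0 * sigma);
}

// Laplace CDF with location mu and scale sigma.
constexpr double plaplace(double x, double mu, double sigma) {
    return x < mu ? 0.5 * exp_c((x - mu) / sigma)
                  : 1.0 - 0.5 * exp_c(-(x - mu) / sigma);
}

// Both values are computed by the compiler; the checks fail the build if wrong.
constexpr double dens = dlaplace(1.0, 1.0, 2.0);
constexpr double prob = plaplace(1.0, 1.0, 2.0);
static_assert(dens > 0.2499999 && dens < 0.2500001, "pdf at x = mu is 1/(2*sigma)");
static_assert(prob > 0.4999999 && prob < 0.5000001, "cdf at x = mu is 0.5");

}  // namespace sketch
```

Because each function body is a single return expression, the code stays within the C++11 `constexpr` rules, and the same functions remain callable at run time with non-constant arguments.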
Author: Keith O'Hara
License: Apache Version 2