Rcpp Integration for Numerical Computing Libraries
Rcpp is a powerful tool to write fast C++ code to speed up R programs. However, it is not easy, or at least not straightforward, to compute numerical integration or do optimization using pure C++ code inside Rcpp.

RcppNumerical integrates a number of open source numerical computing libraries into Rcpp, so that users can call convenient functions to accomplish such tasks.
- To use `RcppNumerical` with `Rcpp::sourceCpp()`, add

  ```cpp
  // [[Rcpp::depends(RcppEigen)]]
  // [[Rcpp::depends(RcppNumerical)]]
  ```

  in the C++ source file.
- To use `RcppNumerical` in your package, add `Imports: RcppNumerical` and `LinkingTo: Rcpp, RcppEigen, RcppNumerical` to the `DESCRIPTION` file, and `import(RcppNumerical)` to the `NAMESPACE` file, as sketched below.
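As a concrete sketch, a hypothetical package (the name `myPackage` is a placeholder, and other mandatory fields are omitted) would have `DESCRIPTION` entries like

```
Package: myPackage
Imports: RcppNumerical
LinkingTo: Rcpp, RcppEigen, RcppNumerical
```

and a `NAMESPACE` line

```
import(RcppNumerical)
```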
The one-dimensional numerical integration code contained in `RcppNumerical` is based on the NumericalIntegration library developed by Sreekumar Thaithara Balan, Mark Sauder, and Matt Beall.
To compute the integral of a function, first define a functor derived from the `Func` class (under the namespace `Numer`):
```cpp
class Func
{
public:
    virtual double operator()(const double& x) const = 0;
    virtual void eval(double* x, const int n) const
    {
        for(int i = 0; i < n; i++)
            x[i] = this->operator()(x[i]);
    }

    virtual ~Func() {}
};
```
The first function evaluates one point at a time, and the second version overwrites each point in the array with the corresponding function value. Only the second function is used by the integration code, but it is usually easier to implement the first one.
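For expensive integrands it can pay off to override `eval()` as well, so that all quadrature nodes are processed in one pass. Below is a minimal sketch; the class name `ExpFunc` and the integrand are purely illustrative, not part of the library.

```cpp
// [[Rcpp::depends(RcppEigen)]]
// [[Rcpp::depends(RcppNumerical)]]
#include <RcppNumerical.h>
#include <cmath>

// Hypothetical integrand f(x) = exp(-x^2)
class ExpFunc: public Numer::Func
{
public:
    // Scalar version required by the Func interface
    double operator()(const double& x) const
    {
        return std::exp(-x * x);
    }

    // Batch version actually called by the integration code; overriding it
    // lets any per-call setup be shared across all n quadrature nodes
    void eval(double* x, const int n) const
    {
        for(int i = 0; i < n; i++)
            x[i] = std::exp(-x[i] * x[i]);
    }
};
```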
`RcppNumerical` provides a wrapper function for the NumericalIntegration library with the following interface:
```cpp
inline double integrate(
    const Func& f, const double& lower, const double& upper,
    double& err_est, int& err_code,
    const int subdiv = 100, const double& eps_abs = 1e-8, const double& eps_rel = 1e-6,
    const Integrator<double>::QuadratureRule rule = Integrator<double>::GaussKronrod41
)
```
- `f`: The functor of the integrand.
- `lower`, `upper`: Limits of the integral.
- `err_est`: Estimate of the error (output).
- `err_code`: Error code (output). See `inst/include/integration/Integrator.h`, lines 676-704.
- `subdiv`: Maximum number of subintervals.
- `eps_abs`, `eps_rel`: Absolute and relative tolerance.
- `rule`: Integration rule. Possible values are `GaussKronrod{15, 21, 31, 41, 51, 61, 71, 81, 91, 101, 121, 201}`. Rules with larger values have better accuracy, but may involve more function calls.
- Return value: The final estimate of the integral.
See a full example below, which can be compiled using the `Rcpp::sourceCpp` function in Rcpp.
```cpp
// [[Rcpp::depends(RcppEigen)]]
// [[Rcpp::depends(RcppNumerical)]]
#include <RcppNumerical.h>

using namespace Numer;

// P(0.3 < X < 0.8), X ~ Beta(a, b)
class BetaPDF: public Func
{
private:
    double a;
    double b;
public:
    BetaPDF(double a_, double b_) : a(a_), b(b_) {}

    double operator()(const double& x) const
    {
        return R::dbeta(x, a, b, 0);
    }
};

// [[Rcpp::export]]
Rcpp::List integrate_test()
{
    const double a = 3, b = 10;
    const double lower = 0.3, upper = 0.8;
    const double true_val = R::pbeta(upper, a, b, 1, 0) -
                            R::pbeta(lower, a, b, 1, 0);

    BetaPDF f(a, b);
    double err_est;
    int err_code;
    const double res = integrate(f, lower, upper, err_est, err_code);
    return Rcpp::List::create(
        Rcpp::Named("true") = true_val,
        Rcpp::Named("approximate") = res,
        Rcpp::Named("error_estimate") = err_est,
        Rcpp::Named("error_code") = err_code
    );
}
```
Running the `integrate_test()` function in R gives
```r
integrate_test()
## $true
## [1] 0.2528108
##
## $approximate
## [1] 0.2528108
##
## $error_estimate
## [1] 2.806764e-15
##
## $error_code
## [1] 0
```
Multi-dimensional integration in `RcppNumerical` is done by the Cuba library developed by Thomas Hahn.
To calculate the integral of a multivariate function, one needs to define a functor that inherits from the `MFunc` class:
```cpp
class MFunc
{
public:
    virtual double operator()(Constvec& x) = 0;

    virtual ~MFunc() {}
};
```
Here `Constvec` represents a read-only vector with the definition
```cpp
// Constant reference to a vector
typedef const Eigen::Ref<const Eigen::VectorXd> Constvec;
```
(Basically you can treat `Constvec` as a `const Eigen::VectorXd`. Using `Eigen::Ref` is mainly to avoid memory copies. See the explanation in the Eigen documentation.)
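To see why `Eigen::Ref` matters, compare it with a plain `const Eigen::VectorXd&` parameter: the latter forces construction of a temporary vector (a copy) whenever the caller passes an expression such as a sub-vector, while `Eigen::Ref` binds to the existing memory. A standalone sketch, with illustrative function names:

```cpp
#include <Eigen/Dense>

// Passing v.head(3) here creates a temporary Eigen::VectorXd,
// i.e. the three elements are copied before the call.
double sum_copy(const Eigen::VectorXd& x) { return x.sum(); }

// Passing v.head(3) here binds directly to v's memory; no copy is made.
double sum_ref(const Eigen::Ref<const Eigen::VectorXd>& x) { return x.sum(); }

int main()
{
    Eigen::VectorXd v = Eigen::VectorXd::LinSpaced(5, 1.0, 5.0);
    double s1 = sum_copy(v.head(3));  // copies the sub-vector
    double s2 = sum_ref(v.head(3));   // no copy
    return (s1 == s2) ? 0 : 1;
}
```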
The function provided by `RcppNumerical` for multi-dimensional integration is
```cpp
inline double integrate(
    MFunc& f, Constvec& lower, Constvec& upper,
    double& err_est, int& err_code,
    const int maxeval = 1000,
    const double& eps_abs = 1e-6, const double& eps_rel = 1e-6
)
```
- `f`: The functor of the integrand.
- `lower`, `upper`: Limits of the integral. Both are vectors with the same dimension as the domain of `f`.
- `err_est`: Estimate of the error (output).
- `err_code`: Error code (output). Non-zero values indicate failure of convergence.
- `maxeval`: Maximum number of function evaluations.
- `eps_abs`, `eps_rel`: Absolute and relative tolerance.
- Return value: The final estimate of the integral.
See the example below:
```cpp
// [[Rcpp::depends(RcppEigen)]]
// [[Rcpp::depends(RcppNumerical)]]
#include <RcppNumerical.h>

using namespace Numer;

// P(a1 < X1 < b1, a2 < X2 < b2), (X1, X2) ~ N([0], [1   rho])
//                                            ([0], [rho   1])
class BiNormal: public MFunc
{
private:
    const double rho;
    double const1;  // 2 * (1 - rho^2)
    double const2;  // 1 / (2 * PI) / sqrt(1 - rho^2)
public:
    BiNormal(const double& rho_) : rho(rho_)
    {
        const1 = 2.0 * (1.0 - rho * rho);
        const2 = 1.0 / (2 * M_PI) / std::sqrt(1.0 - rho * rho);
    }

    // PDF of bivariate normal
    double operator()(Constvec& x)
    {
        double z = x[0] * x[0] - 2 * rho * x[0] * x[1] + x[1] * x[1];
        return const2 * std::exp(-z / const1);
    }
};

// [[Rcpp::export]]
Rcpp::List integrate_test2()
{
    BiNormal f(0.5);  // rho = 0.5
    Eigen::VectorXd lower(2);
    lower << -1, -1;
    Eigen::VectorXd upper(2);
    upper << 1, 1;
    double err_est;
    int err_code;
    const double res = integrate(f, lower, upper, err_est, err_code);
    return Rcpp::List::create(
        Rcpp::Named("approximate") = res,
        Rcpp::Named("error_estimate") = err_est,
        Rcpp::Named("error_code") = err_code
    );
}
```
We can test the result in R:
```r
library(mvtnorm)
trueval = pmvnorm(c(-1, -1), c(1, 1), sigma = matrix(c(1, 0.5, 0.5, 1), 2))

integrate_test2()
## $approximate
## [1] 0.4979718
##
## $error_estimate
## [1] 4.612333e-09
##
## $error_code
## [1] 0

trueval - integrate_test2()$approximate
## [1] 2.893336e-11
```
Currently `RcppNumerical` contains the L-BFGS algorithm for unconstrained minimization problems, based on the LBFGS++ library.
Again, one needs to first define a functor to represent the multivariate function to be minimized.
```cpp
class MFuncGrad
{
public:
    virtual double f_grad(Constvec& x, Refvec grad) = 0;

    virtual ~MFuncGrad() {}
};
```
As in the case of multi-dimensional integration, `Constvec` represents a read-only vector and `Refvec` a writable vector. Their definitions are
```cpp
// Reference to a vector
typedef Eigen::Ref<Eigen::VectorXd>             Refvec;
typedef const Eigen::Ref<const Eigen::VectorXd> Constvec;
```
The `f_grad()` member function returns the function value at the vector `x`, and overwrites `grad` with the gradient.
The wrapper function for LBFGS++ is
```cpp
inline int optim_lbfgs(
    MFuncGrad& f, Refvec x, double& fx_opt,
    const int maxit = 300,
    const double& eps_f = 1e-6, const double& eps_g = 1e-5
)
```
- `f`: The function to be minimized.
- `x`: In: the initial guess. Out: the best values of the variables found.
- `fx_opt`: Out: the function value at the output `x`.
- `maxit`: Maximum number of iterations.
- `eps_f`: Algorithm stops if `|f_{k+1} - f_k| < eps_f * |f_k|`.
- `eps_g`: Algorithm stops if `||g|| < eps_g * max(1, ||x||)`.
- Return value: Error code. Negative values indicate errors.
Below is an example that illustrates the optimization of the Rosenbrock function `f(x1, x2) = 100 * (x2 - x1^2)^2 + (1 - x1)^2`:
```cpp
// [[Rcpp::depends(RcppEigen)]]
// [[Rcpp::depends(RcppNumerical)]]
#include <RcppNumerical.h>

using namespace Numer;

// f = 100 * (x2 - x1^2)^2 + (1 - x1)^2
// True minimum: x1 = x2 = 1
class Rosenbrock: public MFuncGrad
{
public:
    double f_grad(Constvec& x, Refvec grad)
    {
        double t1 = x[1] - x[0] * x[0];
        double t2 = 1 - x[0];
        grad[0] = -400 * x[0] * t1 - 2 * t2;
        grad[1] = 200 * t1;
        return 100 * t1 * t1 + t2 * t2;
    }
};

// [[Rcpp::export]]
Rcpp::List optim_test()
{
    Eigen::VectorXd x(2);
    x[0] = -1.2;
    x[1] = 1;
    double fopt;
    Rosenbrock f;
    int res = optim_lbfgs(f, x, fopt);
    return Rcpp::List::create(
        Rcpp::Named("xopt") = x,
        Rcpp::Named("fopt") = fopt,
        Rcpp::Named("status") = res
    );
}
```
Calling the generated R function `optim_test()` gives
```r
optim_test()
## $xopt
## [1] 1 1
##
## $fopt
## [1] 3.12499e-15
##
## $status
## [1] 0
```
It may be more meaningful to look at a real application of the `RcppNumerical` package. Below is an example that fits a logistic regression model using the L-BFGS algorithm; it also demonstrates the performance of the library.
```cpp
// [[Rcpp::depends(RcppEigen)]]
// [[Rcpp::depends(RcppNumerical)]]
#include <RcppNumerical.h>

using namespace Numer;

typedef Eigen::Map<Eigen::MatrixXd> MapMat;
typedef Eigen::Map<Eigen::VectorXd> MapVec;

class LogisticReg: public MFuncGrad
{
private:
    const MapMat X;
    const MapVec Y;
public:
    LogisticReg(const MapMat x_, const MapVec y_) : X(x_), Y(y_) {}

    double f_grad(Constvec& beta, Refvec grad)
    {
        // Negative log likelihood
        //   sum(log(1 + exp(X * beta))) - y' * X * beta
        Eigen::VectorXd xbeta = X * beta;
        const double yxbeta = Y.dot(xbeta);
        // X * beta => exp(X * beta)
        xbeta = xbeta.array().exp();
        const double f = (xbeta.array() + 1.0).log().sum() - yxbeta;

        // Gradient
        //   X' * (p - y), p = exp(X * beta) / (1 + exp(X * beta))
        // exp(X * beta) => p
        xbeta.array() /= (xbeta.array() + 1.0);
        grad.noalias() = X.transpose() * (xbeta - Y);

        return f;
    }
};

// [[Rcpp::export]]
Rcpp::NumericVector logistic_reg(Rcpp::NumericMatrix x, Rcpp::NumericVector y)
{
    const MapMat xx = Rcpp::as<MapMat>(x);
    const MapVec yy = Rcpp::as<MapVec>(y);
    // Negative log likelihood
    LogisticReg nll(xx, yy);
    // Initial guess
    Eigen::VectorXd beta(xx.cols());
    beta.setZero();

    double fopt;
    int status = optim_lbfgs(nll, beta, fopt);
    if(status < 0)
        Rcpp::stop("fail to converge");

    return Rcpp::wrap(beta);
}
```
Here is the R code to test the function:
```r
set.seed(123)
n = 5000
p = 100
x = matrix(rnorm(n * p), n)
beta = runif(p)
xb = c(x %*% beta)
p = exp(xb) / (1 + exp(xb))
y = rbinom(n, 1, p)

system.time(res1 <- glm.fit(x, y, family = binomial())$coefficients)
##    user  system elapsed
##   0.229   0.006   0.234

system.time(res2 <- logistic_reg(x, y))
##    user  system elapsed
##   0.005   0.000   0.006

max(abs(res1 - res2))
## [1] 0.0001873564
```
It is much faster than the standard `glm.fit()` function in R! (Although `glm.fit()` calculates some other quantities besides beta.)
`RcppNumerical` also provides the `fastLR()` function to run fast logistic regression, which is a modified and more stable version of the code above.
```r
system.time(res3 <- fastLR(x, y)$coefficients)
##    user  system elapsed
##   0.007   0.001   0.008

max(abs(res1 - res3))
## [1] 7.066969e-06
```