Sören Künzel, Theo Saarinen, Simon Walter, Edward Liu, Allen Tang, Jasjeet Sekhon
Rforestry is a fast implementation of Random Forests, Gradient Boosting, and Linear Random Forests, with an emphasis on inference and interpretability.
install.packages("devtools").devtools::has_devel() to check whether you do. If nodevelopment environment exists, Windows users download and installRtools andmacOS users download and installXcode.devtools::install_github("forestry-labs/Rforestry"). ForWindows users, you’ll need to skip 64-bit compilationdevtools::install_github("forestry-labs/Rforestry", INSTALL_opts = c('--no-multiarch'))due to an outstanding gcc issue.set.seed(292315)library(Rforestry)test_idx<-sample(nrow(iris),3)x_train<- iris[-test_idx,-1]y_train<- iris[-test_idx,1]x_test<- iris[test_idx,-1]rf<-forestry(x = x_train,y = y_train)weights=predict(rf, x_test,aggregation ="weightMatrix")$weightMatrixweights%*% y_trainpredict(rf, x_test)A fast implementation of random forests using ridge penalizedsplitting and ridge regression for predictions.
### Ridge Random Forest

A fast implementation of random forests using ridge-penalized splitting and ridge regression for predictions.

Example:
```r
set.seed(49)
library(Rforestry)

# Simulate an outcome that is exactly linear in three features
n <- 100
a <- rnorm(n)
b <- rnorm(n)
c <- rnorm(n)
y <- 4 * a + 5.5 * b - .78 * c
x <- data.frame(a, b, c)

# ridgeRF = TRUE enables ridge-penalized splitting and ridge
# regression for the predictions
forest <- forestry(x, y, ridgeRF = TRUE)
predict(forest, x)
```
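Since the outcome above is exactly linear in the features, a ridge forest should track it more closely in-sample than a default forest. A short illustrative comparison (the `plain_forest` object and the RMSE comparison are ours, not part of the package examples):

```r
# Fit an unpenalized forest on the same data for comparison
plain_forest <- forestry(x, y)

# In-sample RMSE for the ridge forest and the default forest;
# the ridge forest should sit closer to the linear signal
sqrt(mean((predict(forest, x) - y)^2))
sqrt(mean((predict(plain_forest, x) - y)^2))
```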
library(Rforestry)x<-rnorm(150)+5y<- .15*x+ .5*sin(3*x)data_train<-data.frame(x1 = x,x2 =rnorm(150)+5,y = y+rnorm(150,sd = .4))monotone_rf<-forestry(x = data_train%>%select(-y),y = data_train$y,monotonicConstraints =c(-1,-1),nodesizeStrictSpl =5,nthread =1,ntree =25)predict(monotone_rf,feature.new = data_train%>%select(-y))We can return the predictions for the training dataset using only thetrees in which each observation was out of bag. Note that when there arefew trees, or a high proportion of the observations sampled, there maybe some observations which are not out of bag for any trees. Thepredictions for these are returned NaN.
### OOB Predictions

We can return the predictions for the training dataset using only the trees in which each observation was out of bag. Note that when there are few trees, or a high proportion of the observations is sampled, there may be some observations which are not out of bag for any tree; the predictions for these observations are returned as NaN.

Example:

```r
library(Rforestry)

# Train a forest
rf <- forestry(x = iris[, -1],
               y = iris[, 1],
               ntree = 500)

# Get the OOB predictions for the training set
oob_preds <- getOOBpreds(rf)

# This should be equal to the OOB error
sum((oob_preds - iris[, 1])^2)
getOOB(rf)
```
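Because never-out-of-bag observations come back as NaN, it can be worth checking for them before computing the error by hand, especially with a small `ntree`. A base-R sketch (the `valid` mask is ours):

```r
# Count observations that were never out of bag; with ntree = 500
# on 150 rows this should essentially always be zero
sum(is.na(oob_preds))

# If any are NaN, drop them before computing the error by hand
valid <- !is.na(oob_preds)
sum((oob_preds[valid] - iris[valid, 1])^2)
```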