| Version: | 2.2.2 |
| Title: | Generalized Boosted Regression Models |
| Depends: | R (≥ 2.9.0) |
| Imports: | lattice, parallel, survival |
| Suggests: | covr, gridExtra, knitr, pdp, RUnit, splines, tinytest, vip, viridis |
| Description: | An implementation of extensions to Freund and Schapire's AdaBoost algorithm and Friedman's gradient boosting machine. Includes regression methods for least squares, absolute loss, t-distribution loss, quantile regression, logistic, multinomial logistic, Poisson, Cox proportional hazards partial likelihood, AdaBoost exponential loss, Huberized hinge loss, and Learning to Rank measures (LambdaMart). Originally developed by Greg Ridgeway. Newer version available at github.com/gbm-developers/gbm3. |
| License: | GPL-2 | GPL-3 | file LICENSE [expanded from: GPL (≥ 2) | file LICENSE] |
| URL: | https://github.com/gbm-developers/gbm |
| BugReports: | https://github.com/gbm-developers/gbm/issues |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.1 |
| VignetteBuilder: | knitr |
| NeedsCompilation: | yes |
| Packaged: | 2024-06-26 12:33:00 UTC; greg_ |
| Author: | Greg Ridgeway |
| Maintainer: | Greg Ridgeway <gridge@upenn.edu> |
| Repository: | CRAN |
| Date/Publication: | 2024-06-28 06:20:02 UTC |
Generalized Boosted Regression Models (GBMs)
Description
This package implements extensions to Freund and Schapire's AdaBoost algorithm and J. Friedman's gradient boosting machine. Includes regression methods for least squares, absolute loss, logistic, Poisson, Cox proportional hazards partial likelihood, multinomial, t-distribution, AdaBoost exponential loss, Learning to Rank, and Huberized hinge loss. This gbm package is no longer under further development. Consider https://github.com/gbm-developers/gbm3 for the latest version.
Details
Further information is available in the vignettes: browseVignettes(package = "gbm")
Author(s)
Greg Ridgeway gridge@upenn.edu with contributions by Daniel Edwards, Brian Kriegler, Stefan Schroedl, Harry Southworth, and Brandon Greenwell
References
Y. Freund and R.E. Schapire (1997) “A decision-theoretic generalization of on-line learning and an application to boosting,” Journal of Computer and System Sciences, 55(1):119-139.
G. Ridgeway (1999). “The state of boosting,” Computing Science and Statistics 31:172-181.
J.H. Friedman, T. Hastie, R. Tibshirani (2000). “Additive Logistic Regression: a Statistical View of Boosting,” Annals of Statistics 28(2):337-374.
J.H. Friedman (2001). “Greedy Function Approximation: A Gradient Boosting Machine,” Annals of Statistics 29(5):1189-1232.
J.H. Friedman (2002). “Stochastic Gradient Boosting,” Computational Statistics and Data Analysis 38(4):367-378.
The MART website.
See Also
Useful links:
https://github.com/gbm-developers/gbm
Report bugs at https://github.com/gbm-developers/gbm/issues
Baseline hazard function
Description
Computes the Breslow estimator of the baseline hazard function for aproportional hazard regression model.
Usage
basehaz.gbm(t, delta, f.x, t.eval = NULL, smooth = FALSE, cumulative = TRUE)
Arguments
t | The survival times. |
delta | The censoring indicator. |
f.x | The predicted values of the regression model on the log hazardscale. |
t.eval | Values at which the baseline hazard will be evaluated. |
smooth | If TRUE, basehaz.gbm will smooth the estimated baseline hazard using Friedman's super smoother supsmu. |
cumulative | If TRUE, the cumulative hazard function is computed. |
Details
The proportional hazard model assumes h(t|x) = lambda(t)*exp(f(x)). gbm can estimate the f(x) component via partial likelihood. After estimating f(x), basehaz.gbm can compute a nonparametric estimate of lambda(t).
Value
A vector of length equal to the length of t (or of length t.eval if t.eval is not NULL) containing the baseline hazard evaluated at t (or at t.eval if t.eval is not NULL). If cumulative is set to TRUE then the returned vector evaluates the cumulative hazard function at those values.
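As an illustration, the following sketch (simulated data, arbitrary variable names and tuning values, not taken from the package examples) fits a Cox proportional hazards GBM and then estimates the cumulative baseline hazard with basehaz.gbm:

library(survival)
library(gbm)

set.seed(1)
N <- 500
x <- runif(N)
time <- rexp(N, rate = exp(x))       # survival times that depend on x
delta <- rbinom(N, 1, 0.8)           # censoring indicator (1 = event observed)
dat <- data.frame(time = time, delta = delta, x = x)

fit <- gbm(Surv(time, delta) ~ x, data = dat, distribution = "coxph",
           n.trees = 200, shrinkage = 0.05, n.cores = 1)

f.x <- predict(fit, newdata = dat, n.trees = 200)   # f(x) on the log hazard scale
H0 <- basehaz.gbm(t = dat$time, delta = dat$delta, f.x = f.x, cumulative = TRUE)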
Author(s)
Greg Ridgeway gregridgeway@gmail.com
References
N. Breslow (1972). "Discussion of 'Regression Models and Life-Tables' by D.R. Cox," Journal of the Royal Statistical Society, Series B, 34(2):216-217.
N. Breslow (1974). "Covariance analysis of censored survival data," Biometrics 30:89-99.
See Also
Calibration plot
Description
An experimental diagnostic tool that plots the fitted values versus the actual average values. Currently only available when distribution = "bernoulli".
Usage
calibrate.plot(
  y,
  p,
  distribution = "bernoulli",
  replace = TRUE,
  line.par = list(col = "black"),
  shade.col = "lightyellow",
  shade.density = NULL,
  rug.par = list(side = 1),
  xlab = "Predicted value",
  ylab = "Observed average",
  xlim = NULL,
  ylim = NULL,
  knots = NULL,
  df = 6,
  ...
)
Arguments
y | The outcome 0-1 variable. |
p | The predictions estimating E(y|x). |
distribution | The loss function used in creating p. |
replace | Determines whether this plot will replace or overlay the current plot. |
line.par | Graphics parameters for the line. |
shade.col | Color for shading the 2 SE region. |
shade.density | The density parameter for polygon used when shading the 2 SE region. |
rug.par | Graphics parameters passed to rug. |
xlab | x-axis label corresponding to the predicted values. |
ylab | y-axis label corresponding to the observed average. |
xlim,ylim | x- and y-axis limits. If not specified the function will select limits. |
knots,df | These parameters are passed directly to ns for constructing a natural spline smoother for the calibration curve. |
... | Additional optional arguments to be passed onto plot. |
Details
Uses natural splines to estimate E(y|p). Well-calibrated predictions imply that E(y|p) = p. The plot also includes a pointwise 95% confidence band.
Value
No return values.
Author(s)
Greg Ridgewaygregridgeway@gmail.com
References
J.F. Yates (1982). "External correspondence: decomposition of the mean probability score," Organisational Behaviour and Human Performance 30:132-156.
D.J. Spiegelhalter (1986). "Probabilistic Prediction in Patient Managementand Clinical Trials," Statistics in Medicine 5:421-433.
Examples
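A runnable sketch that avoids the rpart dependency noted below; it simulates a binary outcome, obtains probabilities from a simple logistic regression, and passes them to calibrate.plot (variable names are illustrative):

set.seed(123)
n <- 500
x <- runif(n)
y <- rbinom(n, 1, plogis(2 * x - 1))   # 0-1 outcome
p <- predict(glm(y ~ x, family = binomial), type = "response")   # estimates of E(y|x)
calibrate.plot(y, p, xlim = c(0, 1), ylim = c(0, 1))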
# Don't want R CMD check to think there is a dependency on rpart
# so comment out the example
#library(rpart)
#data(kyphosis)
#y <- as.numeric(kyphosis$Kyphosis) - 1
#x <- kyphosis$Age
#glm1 <- glm(y ~ poly(x, 2), family = binomial)
#p <- predict(glm1, type = "response")
#calibrate.plot(y, p, xlim = c(0, 0.6), ylim = c(0, 0.6))
Generalized Boosted Regression Modeling (GBM)
Description
Fits generalized boosted regression models. For technical details, see the vignette: utils::browseVignettes("gbm").
Usage
gbm(
  formula = formula(data),
  distribution = "bernoulli",
  data = list(),
  weights,
  var.monotone = NULL,
  n.trees = 100,
  interaction.depth = 1,
  n.minobsinnode = 10,
  shrinkage = 0.1,
  bag.fraction = 0.5,
  train.fraction = 1,
  cv.folds = 0,
  keep.data = TRUE,
  verbose = FALSE,
  class.stratify.cv = NULL,
  n.cores = NULL
)
Arguments
formula | A symbolic description of the model to be fit. The formula may include an offset term (e.g. y~offset(n)+x). If keep.data = FALSE in the initial call to gbm then it is the user's responsibility to resupply the offset to gbm.more. |
distribution | Either a character string specifying the name of the distribution to use or a list with a component name specifying the distribution and any additional parameters needed. If not specified, gbm will try to guess: bernoulli if the response has only two unique values, multinomial if it is a factor, coxph if it has class "Surv", and gaussian otherwise. Currently available options are "gaussian" (squared error), "laplace" (absolute loss), "tdist" (t-distribution loss), "bernoulli" (logistic regression for 0-1 outcomes), "huberized" (huberized hinge loss for 0-1 outcomes), "adaboost" (the AdaBoost exponential loss for 0-1 outcomes), "poisson" (count outcomes), "coxph" (right censored observations), "quantile", and "pairwise" (ranking measure using the LambdaMart algorithm). If quantile regression is specified, distribution must be a list of the form list(name = "quantile", alpha = 0.25), where alpha is the quantile to estimate. If "tdist" is specified, the default degrees of freedom is 4; this can be controlled with, e.g., distribution = list(name = "tdist", df = 10). If "pairwise" regression is specified, distribution must be a list with a group component naming the column(s) of data that jointly indicate the group an instance belongs to (typically a query in Information Retrieval applications), and optionally a metric component (one of "conc", "mrr", "map", or "ndcg") and a max.rank component limiting the number of top-ranked instances used in the metric. |
Note that splitting of instances into training and validation sets follows group boundaries and therefore only approximates the specified train.fraction ratio. Weights can be used in conjunction with pairwise metrics; however, it is assumed that they are constant for instances from the same group. For details and background on the algorithm, see e.g. Burges (2010).
data | an optional data frame containing the variables in the model. By default the variables are taken from environment(formula), typically the environment from which gbm is called. If keep.data = TRUE in the initial call to gbm then gbm stores a copy with the object. |
weights | an optional vector of weights to be used in the fitting process. Must be positive but do not need to be normalized. If keep.data = FALSE in the initial call to gbm then it is the user's responsibility to resupply the weights to gbm.more. |
var.monotone | an optional vector, the same length as the number of predictors, indicating which variables have a monotone increasing (+1), decreasing (-1), or arbitrary (0) relationship with the outcome. |
n.trees | Integer specifying the total number of trees to fit. This is equivalent to the number of iterations and the number of basis functions in the additive expansion. Default is 100. |
interaction.depth | Integer specifying the maximum depth of each tree (i.e., the highest level of variable interactions allowed). A value of 1 implies an additive model, a value of 2 implies a model with up to 2-way interactions, etc. Default is 1. |
n.minobsinnode | Integer specifying the minimum number of observations in the terminal nodes of the trees. Note that this is the actual number of observations, not the total weight. |
shrinkage | a shrinkage parameter applied to each tree in the expansion. Also known as the learning rate or step-size reduction; 0.001 to 0.1 usually work, but a smaller learning rate typically requires more trees. Default is 0.1. |
bag.fraction | the fraction of the training set observations randomly selected to propose the next tree in the expansion. This introduces randomness into the model fit. If bag.fraction < 1 then running the same model twice will result in similar but different fits. gbm uses the R random number generator, so set.seed ensures that the model can be reconstructed. Default is 0.5. |
train.fraction | The first train.fraction * nrows(data) observations are used to fit the gbm and the remainder are used for computing out-of-sample estimates of the loss function. |
cv.folds | Number of cross-validation folds to perform. If cv.folds > 1 then gbm, in addition to the usual fit, will perform a cross-validation and calculate an estimate of generalization error returned in cv.error. |
keep.data | a logical variable indicating whether to keep the data and an index of the data stored with the object. Keeping the data and index makes subsequent calls to gbm.more faster. |
verbose | Logical indicating whether or not to print out progress and performance indicators (TRUE). Default is FALSE. |
class.stratify.cv | Logical indicating whether or not the cross-validation should be stratified by class. Defaults to TRUE for distribution = "multinomial" and is ignored for other distributions. Stratification helps ensure that each training fold contains all outcome classes. |
n.cores | The number of CPU cores to use. The cross-validation loop will attempt to send different CV folds off to different cores. If n.cores is not specified, gbm will attempt to guess how many cores are available and use that number. |
Details
gbm.fit provides the link between R and the C++ gbm engine. gbm is a front-end to gbm.fit that uses the familiar R modeling formulas. However, model.frame is very slow if there are many predictor variables. For power users with many variables use gbm.fit. For general practice gbm is preferable.
This package implements the generalized boosted modeling framework. Boosting is the process of iteratively adding basis functions in a greedy fashion so that each additional basis function further reduces the selected loss function. This implementation closely follows Friedman's Gradient Boosting Machine (Friedman, 2001).
In addition to many of the features documented in the Gradient Boosting Machine, gbm offers additional features including the out-of-bag estimator for the optimal number of iterations, the ability to store and manipulate the resulting gbm object, and a variety of other loss functions that had not previously had associated boosting algorithms, including the Cox partial likelihood for censored data, the Poisson likelihood for count outcomes, and a gradient boosting implementation to minimize the AdaBoost exponential loss function. This gbm package is no longer under further development. Consider https://github.com/gbm-developers/gbm3 for the latest version.
Value
A gbm.object object.
Author(s)
Greg Ridgeway gregridgeway@gmail.com
Quantile regression code developed by Brian Kriegler bk@stat.ucla.edu
t-distribution and multinomial code developed by Harry Southworth and Daniel Edwards
Pairwise code developed by Stefan Schroedl schroedl@a9.com
References
Y. Freund and R.E. Schapire (1997) “A decision-theoretic generalization of on-line learning and an application to boosting,” Journal of Computer and System Sciences, 55(1):119-139.
G. Ridgeway (1999). “The state of boosting,” Computing Science and Statistics 31:172-181.
J.H. Friedman, T. Hastie, R. Tibshirani (2000). “Additive Logistic Regression: a Statistical View of Boosting,” Annals of Statistics 28(2):337-374.
J.H. Friedman (2001). “Greedy Function Approximation: A Gradient Boosting Machine,” Annals of Statistics 29(5):1189-1232.
J.H. Friedman (2002). “Stochastic Gradient Boosting,” Computational Statistics and Data Analysis 38(4):367-378.
B. Kriegler (2007). Cost-Sensitive Stochastic Gradient Boosting Within a Quantitative Regression Framework. Ph.D. Dissertation. University of California at Los Angeles, Los Angeles, CA, USA. Advisor(s) Richard A. Berk. https://dl.acm.org/doi/book/10.5555/1354603.
C. Burges (2010). “From RankNet to LambdaRank to LambdaMART: An Overview,” Microsoft Research Technical Report MSR-TR-2010-82.
See Also
gbm.object, gbm.perf, plot.gbm, predict.gbm, summary.gbm, and pretty.gbm.tree.
Examples
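Before the full least-squares example, here is a brief sketch of specifying distribution as a list rather than a character string, using quantile regression as the illustration (data and tuning values are arbitrary):

set.seed(1)
d <- data.frame(x = runif(500))
d$y <- d$x + rnorm(500, sd = 0.2)
# Model the conditional 75th percentile by passing a list to `distribution`
fit.q <- gbm(y ~ x, data = d,
             distribution = list(name = "quantile", alpha = 0.75),
             n.trees = 200, shrinkage = 0.05, n.cores = 1)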
## A least squares regression example

# Simulate data
set.seed(101)  # for reproducibility
N <- 1000
X1 <- runif(N)
X2 <- 2 * runif(N)
X3 <- ordered(sample(letters[1:4], N, replace = TRUE), levels = letters[4:1])
X4 <- factor(sample(letters[1:6], N, replace = TRUE))
X5 <- factor(sample(letters[1:3], N, replace = TRUE))
X6 <- 3 * runif(N)
mu <- c(-1, 0, 1, 2)[as.numeric(X3)]
SNR <- 10  # signal-to-noise ratio
Y <- X1 ^ 1.5 + 2 * (X2 ^ 0.5) + mu
sigma <- sqrt(var(Y) / SNR)
Y <- Y + rnorm(N, 0, sigma)
X1[sample(1:N, size = 500)] <- NA  # introduce some missing values
X4[sample(1:N, size = 300)] <- NA  # introduce some missing values
data <- data.frame(Y, X1, X2, X3, X4, X5, X6)

# Fit a GBM
set.seed(102)  # for reproducibility
gbm1 <- gbm(Y ~ ., data = data, var.monotone = c(0, 0, 0, 0, 0, 0),
            distribution = "gaussian", n.trees = 100, shrinkage = 0.1,
            interaction.depth = 3, bag.fraction = 0.5, train.fraction = 0.5,
            n.minobsinnode = 10, cv.folds = 5, keep.data = TRUE,
            verbose = FALSE, n.cores = 1)

# Check performance using the out-of-bag (OOB) error; the OOB error typically
# underestimates the optimal number of iterations
best.iter <- gbm.perf(gbm1, method = "OOB")
print(best.iter)

# Check performance using the 50% heldout test set
best.iter <- gbm.perf(gbm1, method = "test")
print(best.iter)

# Check performance using 5-fold cross-validation
best.iter <- gbm.perf(gbm1, method = "cv")
print(best.iter)

# Plot relative influence of each variable
par(mfrow = c(1, 2))
summary(gbm1, n.trees = 1)          # using first tree
summary(gbm1, n.trees = best.iter)  # using estimated best number of trees

# Compactly print the first and last trees for curiosity
print(pretty.gbm.tree(gbm1, i.tree = 1))
print(pretty.gbm.tree(gbm1, i.tree = gbm1$n.trees))

# Simulate new data
set.seed(103)  # for reproducibility
N <- 1000
X1 <- runif(N)
X2 <- 2 * runif(N)
X3 <- ordered(sample(letters[1:4], N, replace = TRUE))
X4 <- factor(sample(letters[1:6], N, replace = TRUE))
X5 <- factor(sample(letters[1:3], N, replace = TRUE))
X6 <- 3 * runif(N)
mu <- c(-1, 0, 1, 2)[as.numeric(X3)]
Y <- X1 ^ 1.5 + 2 * (X2 ^ 0.5) + mu + rnorm(N, 0, sigma)
data2 <- data.frame(Y, X1, X2, X3, X4, X5, X6)

# Predict on the new data using the "best" number of trees; by default,
# predictions will be on the link scale
Yhat <- predict(gbm1, newdata = data2, n.trees = best.iter, type = "link")

# least squares error
print(sum((data2$Y - Yhat)^2))

# Construct univariate partial dependence plots
plot(gbm1, i.var = 1, n.trees = best.iter)
plot(gbm1, i.var = 2, n.trees = best.iter)
plot(gbm1, i.var = "X3", n.trees = best.iter)  # can use index or name

# Construct bivariate partial dependence plots
plot(gbm1, i.var = 1:2, n.trees = best.iter)
plot(gbm1, i.var = c("X2", "X3"), n.trees = best.iter)
plot(gbm1, i.var = 3:4, n.trees = best.iter)

# Construct trivariate partial dependence plots
plot(gbm1, i.var = c(1, 2, 6), n.trees = best.iter, continuous.resolution = 20)
plot(gbm1, i.var = 1:3, n.trees = best.iter)
plot(gbm1, i.var = 2:4, n.trees = best.iter)
plot(gbm1, i.var = 3:5, n.trees = best.iter)

# Add more (i.e., 100) boosting iterations to the ensemble
gbm2 <- gbm.more(gbm1, n.new.trees = 100, verbose = FALSE)

Generalized Boosted Regression Modeling (GBM)
Description
Workhorse function providing the link between R and the C++ gbm engine. gbm is a front-end to gbm.fit that uses the familiar R modeling formulas. However, model.frame is very slow if there are many predictor variables. For power users with many variables use gbm.fit. For general practice gbm is preferable.
Usage
gbm.fit(
  x,
  y,
  offset = NULL,
  misc = NULL,
  distribution = "bernoulli",
  w = NULL,
  var.monotone = NULL,
  n.trees = 100,
  interaction.depth = 1,
  n.minobsinnode = 10,
  shrinkage = 0.001,
  bag.fraction = 0.5,
  nTrain = NULL,
  train.fraction = NULL,
  keep.data = TRUE,
  verbose = TRUE,
  var.names = NULL,
  response.name = "y",
  group = NULL
)
Arguments
x | A data frame or matrix containing the predictor variables. The number of rows in x must be the same as the length of y. |
y | A vector of outcomes. The number of rows in x must be the same as the length of y. |
offset | A vector of offset values. |
misc | An R object that is simply passed on to the gbm engine. It can be used for additional data for the specific distribution. Currently it is only used for passing the censoring indicator for the Cox proportional hazards model. |
distribution | Either a character string specifying the name of the distribution to use or a list with a component name specifying the distribution and any additional parameters needed. If not specified, gbm will try to guess: bernoulli if the response has only two unique values, multinomial if it is a factor, coxph if it has class "Surv", and gaussian otherwise. Currently available options are "gaussian" (squared error), "laplace" (absolute loss), "tdist" (t-distribution loss), "bernoulli" (logistic regression for 0-1 outcomes), "huberized" (huberized hinge loss for 0-1 outcomes), "adaboost" (the AdaBoost exponential loss for 0-1 outcomes), "poisson" (count outcomes), "coxph" (right censored observations), "quantile", and "pairwise" (ranking measure using the LambdaMart algorithm). If quantile regression is specified, distribution must be a list of the form list(name = "quantile", alpha = 0.25), where alpha is the quantile to estimate. If "tdist" is specified, the default degrees of freedom is 4; this can be controlled with, e.g., distribution = list(name = "tdist", df = 10). If "pairwise" regression is specified, distribution must be a list with a group component naming the column(s) of data that jointly indicate the group an instance belongs to (typically a query in Information Retrieval applications), and optionally a metric component (one of "conc", "mrr", "map", or "ndcg") and a max.rank component limiting the number of top-ranked instances used in the metric. |
Note that splitting of instances into training and validation sets follows group boundaries and therefore only approximates the specified train.fraction ratio. Weights can be used in conjunction with pairwise metrics; however, it is assumed that they are constant for instances from the same group. For details and background on the algorithm, see e.g. Burges (2010).
w | A vector of weights of the same length as y. |
var.monotone | an optional vector, the same length as the number of predictors, indicating which variables have a monotone increasing (+1), decreasing (-1), or arbitrary (0) relationship with the outcome. |
n.trees | the total number of trees to fit. This is equivalent to the number of iterations and the number of basis functions in the additive expansion. |
interaction.depth | The maximum depth of variable interactions. A value of 1 implies an additive model, a value of 2 implies a model with up to 2-way interactions, etc. Default is 1. |
n.minobsinnode | Integer specifying the minimum number of observations in the terminal nodes of the trees. Note that this is the actual number of observations, not the total weight. |
shrinkage | The shrinkage parameter applied to each tree in the expansion. Also known as the learning rate or step-size reduction; 0.001 to 0.1 usually work, but a smaller learning rate typically requires more trees. Default is 0.001. |
bag.fraction | The fraction of the training set observations randomly selected to propose the next tree in the expansion. This introduces randomness into the model fit. If bag.fraction < 1 then running the same model twice will result in similar but different fits. gbm uses the R random number generator, so set.seed ensures that the model can be reconstructed. Default is 0.5. |
nTrain | An integer representing the number of cases on which to train. This is the preferred way of specification for gbm.fit; the option train.fraction is deprecated and only maintained for backward compatibility. |
train.fraction | The first train.fraction * nrows(data) observations are used to fit the gbm and the remainder are used for computing out-of-sample estimates of the loss function. |
keep.data | Logical indicating whether or not to keep the data and an index of the data stored with the object. Keeping the data and index makes subsequent calls to gbm.more faster. |
verbose | Logical indicating whether or not to print out progress and performance indicators (TRUE). Default is TRUE. |
var.names | Vector of strings of length equal to the number of columns of x containing the names of the predictor variables. |
response.name | Character string label for the response variable. |
group | The group to use when distribution = "pairwise". |
Details
This package implements the generalized boosted modeling framework. Boosting is the process of iteratively adding basis functions in a greedy fashion so that each additional basis function further reduces the selected loss function. This implementation closely follows Friedman's Gradient Boosting Machine (Friedman, 2001).
In addition to many of the features documented in the Gradient Boosting Machine, gbm offers additional features including the out-of-bag estimator for the optimal number of iterations, the ability to store and manipulate the resulting gbm object, and a variety of other loss functions that had not previously had associated boosting algorithms, including the Cox partial likelihood for censored data, the Poisson likelihood for count outcomes, and a gradient boosting implementation to minimize the AdaBoost exponential loss function.
Value
A gbm.object object.
Author(s)
Greg Ridgeway gregridgeway@gmail.com
Quantile regression code developed by Brian Kriegler bk@stat.ucla.edu
t-distribution and multinomial code developed by Harry Southworth and Daniel Edwards
Pairwise code developed by Stefan Schroedl schroedl@a9.com
References
Y. Freund and R.E. Schapire (1997) “A decision-theoretic generalization of on-line learning and an application to boosting,” Journal of Computer and System Sciences, 55(1):119-139.
G. Ridgeway (1999). “The state of boosting,” Computing Science and Statistics 31:172-181.
J.H. Friedman, T. Hastie, R. Tibshirani (2000). “Additive Logistic Regression: a Statistical View of Boosting,” Annals of Statistics 28(2):337-374.
J.H. Friedman (2001). “Greedy Function Approximation: A Gradient Boosting Machine,” Annals of Statistics 29(5):1189-1232.
J.H. Friedman (2002). “Stochastic Gradient Boosting,” Computational Statistics and Data Analysis 38(4):367-378.
B. Kriegler (2007). Cost-Sensitive Stochastic Gradient Boosting Within a Quantitative Regression Framework. Ph.D. Dissertation. University of California at Los Angeles, Los Angeles, CA, USA. Advisor(s) Richard A. Berk. https://dl.acm.org/doi/book/10.5555/1354603.
C. Burges (2010). “From RankNet to LambdaRank to LambdaMART: An Overview,” Microsoft Research Technical Report MSR-TR-2010-82.
See Also
gbm.object, gbm.perf, plot.gbm, predict.gbm, summary.gbm, and pretty.gbm.tree.
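A minimal sketch of the matrix/data-frame interface (simulated data; names and tuning values are illustrative). gbm.fit takes the predictors in x and the outcome in y rather than a formula:

set.seed(1)
N <- 500
x <- data.frame(x1 = runif(N), x2 = runif(N))
y <- rbinom(N, 1, plogis(3 * x$x1 - 1.5))
fit <- gbm.fit(x = x, y = y, distribution = "bernoulli",
               n.trees = 200, shrinkage = 0.01, interaction.depth = 2,
               nTrain = 400,        # first 400 rows used for training
               verbose = FALSE)
gbm.perf(fit, method = "test", plot.it = FALSE)   # iterations minimizing test error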
Generalized Boosted Regression Modeling (GBM)
Description
Adds additional trees to a gbm.object object.
Usage
gbm.more(
  object,
  n.new.trees = 100,
  data = NULL,
  weights = NULL,
  offset = NULL,
  verbose = NULL
)
Arguments
object | A gbm.object object created from an initial call to gbm. |
n.new.trees | Integer specifying the number of additional trees to add to object. Default is 100. |
data | An optional data frame containing the variables in the model. By default the variables are taken from environment(formula), typically the environment from which gbm is called. If keep.data = TRUE in the initial call to gbm then gbm stores a copy with the object. |
weights | An optional vector of weights to be used in the fitting process. Must be positive but do not need to be normalized. If keep.data = FALSE in the initial call to gbm then it is the user's responsibility to resupply the weights to gbm.more. |
offset | A vector of offset values. |
verbose | Logical indicating whether or not to print out progress and performance indicators (TRUE). |
Value
A gbm.object object.
Examples
## A least squares regression example

# Simulate data
set.seed(101)  # for reproducibility
N <- 1000
X1 <- runif(N)
X2 <- 2 * runif(N)
X3 <- ordered(sample(letters[1:4], N, replace = TRUE), levels = letters[4:1])
X4 <- factor(sample(letters[1:6], N, replace = TRUE))
X5 <- factor(sample(letters[1:3], N, replace = TRUE))
X6 <- 3 * runif(N)
mu <- c(-1, 0, 1, 2)[as.numeric(X3)]
SNR <- 10  # signal-to-noise ratio
Y <- X1 ^ 1.5 + 2 * (X2 ^ 0.5) + mu
sigma <- sqrt(var(Y) / SNR)
Y <- Y + rnorm(N, 0, sigma)
X1[sample(1:N, size = 500)] <- NA  # introduce some missing values
X4[sample(1:N, size = 300)] <- NA  # introduce some missing values
data <- data.frame(Y, X1, X2, X3, X4, X5, X6)

# Fit a GBM
set.seed(102)  # for reproducibility
gbm1 <- gbm(Y ~ ., data = data, var.monotone = c(0, 0, 0, 0, 0, 0),
            distribution = "gaussian", n.trees = 100, shrinkage = 0.1,
            interaction.depth = 3, bag.fraction = 0.5, train.fraction = 0.5,
            n.minobsinnode = 10, cv.folds = 5, keep.data = TRUE,
            verbose = FALSE, n.cores = 1)

# Check performance using the out-of-bag (OOB) error; the OOB error typically
# underestimates the optimal number of iterations
best.iter <- gbm.perf(gbm1, method = "OOB")
print(best.iter)

# Check performance using the 50% heldout test set
best.iter <- gbm.perf(gbm1, method = "test")
print(best.iter)

# Check performance using 5-fold cross-validation
best.iter <- gbm.perf(gbm1, method = "cv")
print(best.iter)

# Plot relative influence of each variable
par(mfrow = c(1, 2))
summary(gbm1, n.trees = 1)          # using first tree
summary(gbm1, n.trees = best.iter)  # using estimated best number of trees

# Compactly print the first and last trees for curiosity
print(pretty.gbm.tree(gbm1, i.tree = 1))
print(pretty.gbm.tree(gbm1, i.tree = gbm1$n.trees))

# Simulate new data
set.seed(103)  # for reproducibility
N <- 1000
X1 <- runif(N)
X2 <- 2 * runif(N)
X3 <- ordered(sample(letters[1:4], N, replace = TRUE))
X4 <- factor(sample(letters[1:6], N, replace = TRUE))
X5 <- factor(sample(letters[1:3], N, replace = TRUE))
X6 <- 3 * runif(N)
mu <- c(-1, 0, 1, 2)[as.numeric(X3)]
Y <- X1 ^ 1.5 + 2 * (X2 ^ 0.5) + mu + rnorm(N, 0, sigma)
data2 <- data.frame(Y, X1, X2, X3, X4, X5, X6)

# Predict on the new data using the "best" number of trees; by default,
# predictions will be on the link scale
Yhat <- predict(gbm1, newdata = data2, n.trees = best.iter, type = "link")

# least squares error
print(sum((data2$Y - Yhat)^2))

# Construct univariate partial dependence plots
plot(gbm1, i.var = 1, n.trees = best.iter)
plot(gbm1, i.var = 2, n.trees = best.iter)
plot(gbm1, i.var = "X3", n.trees = best.iter)  # can use index or name

# Construct bivariate partial dependence plots
plot(gbm1, i.var = 1:2, n.trees = best.iter)
plot(gbm1, i.var = c("X2", "X3"), n.trees = best.iter)
plot(gbm1, i.var = 3:4, n.trees = best.iter)

# Construct trivariate partial dependence plots
plot(gbm1, i.var = c(1, 2, 6), n.trees = best.iter, continuous.resolution = 20)
plot(gbm1, i.var = 1:3, n.trees = best.iter)
plot(gbm1, i.var = 2:4, n.trees = best.iter)
plot(gbm1, i.var = 3:5, n.trees = best.iter)

# Add more (i.e., 100) boosting iterations to the ensemble
gbm2 <- gbm.more(gbm1, n.new.trees = 100, verbose = FALSE)

Generalized Boosted Regression Model Object
Description
These are objects representing fitted gbms.
Value
initF | The "intercept" term, the initial predicted value towhich trees make adjustments. |
fit | A vector containing the fitted values on the scale of the regression function (e.g. log-odds scale for bernoulli, log scale for poisson). |
train.error | A vector of length equal to the number of fitted trees containing the value of the loss function for each boosting iteration evaluated on the training data. |
valid.error | A vector of length equal to the number of fitted trees containing the value of the loss function for each boosting iteration evaluated on the validation data. |
cv.error | If cv.folds > 1 in the call to gbm, a vector of length equal to the number of fitted trees containing a cross-validated estimate of the loss function for each boosting iteration; otherwise NULL. |
oobag.improve | A vector of length equal to the number of fitted trees containing an out-of-bag estimate of the marginal reduction in the expected value of the loss function. The out-of-bag estimate uses only the training data and is useful for estimating the optimal number of boosting iterations. See gbm.perf. |
trees | A list containing the tree structures. The components are best viewed using pretty.gbm.tree. |
c.splits | A list of all the categorical splits in the collection of trees. If a tree node has a categorical split, its splitting value refers to a component of c.splits; that component is a vector with one element per level of the split variable, where -1 indicates left, +1 indicates right, and 0 indicates that the level was not present in the training data. |
cv.fitted | If cross-validation was performed, the cross-validation predicted values on the scale of the linear predictor. That is, the fitted values from the i-th CV-fold, for the model having been trained on the data in all other folds. |
Structure
The following components must be included in a legitimate gbm object.
Author(s)
Greg Ridgeway gregridgeway@gmail.com
See Also
GBM performance
Description
Estimates the optimal number of boosting iterations for a gbm object and optionally plots various performance measures.
Usage
gbm.perf(object, plot.it = TRUE, oobag.curve = FALSE, overlay = TRUE, method)
Arguments
object | A gbm.object created from an initial call to gbm. |
plot.it | An indicator of whether or not to plot the performance measures. Setting plot.it = TRUE plots the training error (black) and validation error (red) as a function of the iteration number. |
oobag.curve | Indicates whether to plot the out-of-bag performancemeasures in a second plot. |
overlay | If TRUE and oobag.curve = TRUE then a right y-axis is added to the training and test error plot and the estimated cumulative improvement in the loss function is plotted versus the iteration number. |
method | Indicates the method used to estimate the optimal number of boosting iterations. method = "OOB" computes the out-of-bag estimate, method = "test" uses the test (or validation) dataset to compute an out-of-sample estimate, and method = "cv" extracts the optimal number of iterations using cross-validation if gbm was called with cv.folds > 1. |
Value
gbm.perf returns the estimated optimal number of iterations. The method of computation depends on the method argument.
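A short sketch (simulated data, arbitrary settings) comparing the estimates produced by the different method options:

set.seed(1)
d <- data.frame(x = runif(300))
d$y <- d$x^2 + rnorm(300, sd = 0.1)
fit <- gbm(y ~ x, data = d, distribution = "gaussian", n.trees = 500,
           shrinkage = 0.05, cv.folds = 3, train.fraction = 0.8, n.cores = 1)
gbm.perf(fit, method = "OOB",  plot.it = FALSE)   # out-of-bag estimate
gbm.perf(fit, method = "test", plot.it = FALSE)   # held-out 20% test set
gbm.perf(fit, method = "cv",   plot.it = FALSE)   # 3-fold cross-validation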
Author(s)
Greg Ridgeway gregridgeway@gmail.com
See Also
Compute Information Retrieval measures.
Description
Functions to compute Information Retrieval measures for pairwise loss for a single group. The function returns the respective metric, or a negative value if it is undefined for the given group.
Usage
gbm.roc.area(obs, pred)
gbm.conc(x)
ir.measure.conc(y.f, max.rank = 0)
ir.measure.auc(y.f, max.rank = 0)
ir.measure.mrr(y.f, max.rank)
ir.measure.map(y.f, max.rank = 0)
ir.measure.ndcg(y.f, max.rank)
perf.pairwise(y, f, group, metric = "ndcg", w = NULL, max.rank = 0)
Arguments
obs | Observed value. |
pred | Predicted value. |
x | Numeric vector. |
y,y.f,f,w,group,max.rank | Used internally. |
metric | What type of performance measure to compute. |
Details
For simplicity, we have no special handling for ties; instead, we break tiesrandomly. This is slightly inaccurate for individual groups, but should haveonly a small effect on the overall measure.
gbm.conc computes the concordance index: Fraction of all pairs (i,j)with i<j, x[i] != x[j], such that x[j] < x[i]
If obs is binary, then gbm.roc.area(obs, pred) = gbm.conc(obs[order(-pred)]).
gbm.conc is more general as it allows non-binary targets, but issignificantly slower.
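A small numeric sketch of the relationship described above (toy values):

obs  <- c(1, 1, 0, 0, 1, 0)
pred <- c(0.9, 0.8, 0.3, 0.6, 0.7, 0.2)
gbm.roc.area(obs, pred)          # area under the ROC curve
gbm.conc(obs[order(-pred)])      # same value for a binary target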
Value
The requested performance measure.
Author(s)
Stefan Schroedl
References
C. Burges (2010). "From RankNet to LambdaRank to LambdaMART: AnOverview", Microsoft Research Technical Report MSR-TR-2010-82.
See Also
Cross-validate a gbm
Description
Functions for cross-validating gbm. These functions are used internally andare not intended for end-user direct usage.
Usage
gbmCrossVal(cv.folds, nTrain, n.cores, class.stratify.cv, data, x, y, offset, distribution, w, var.monotone, n.trees, interaction.depth, n.minobsinnode, shrinkage, bag.fraction, var.names, response.name, group)
gbmCrossValErr(cv.models, cv.folds, cv.group, nTrain, n.trees)
gbmCrossValPredictions(cv.models, cv.folds, cv.group, best.iter.cv, distribution, data, y)
gbmCrossValModelBuild(cv.folds, cv.group, n.cores, i.train, x, y, offset, distribution, w, var.monotone, n.trees, interaction.depth, n.minobsinnode, shrinkage, bag.fraction, var.names, response.name, group)
gbmDoFold(X, i.train, x, y, offset, distribution, w, var.monotone, n.trees, interaction.depth, n.minobsinnode, shrinkage, bag.fraction, cv.group, var.names, response.name, group, s)
Arguments
cv.folds | The number of cross-validation folds. |
nTrain | The number of training samples. |
n.cores | The number of cores to use. |
class.stratify.cv | Whether or not stratified cross-validation samples are used. |
data | The data. |
x | The model matrix. |
y | The response variable. |
offset | The offset. |
distribution | The type of loss function. See gbm. |
w | Observation weights. |
var.monotone | See |
n.trees | The number of trees to fit. |
interaction.depth | The degree of allowed interactions. See gbm. |
n.minobsinnode | See gbm. |
shrinkage | See gbm. |
bag.fraction | See gbm. |
var.names | See gbm.fit. |
response.name | See gbm.fit. |
group | Used when distribution = "pairwise". See gbm. |
cv.models | A list containing the models for each fold. |
cv.group | A vector indicating the cross-validation fold for each member of the training set. |
best.iter.cv | The iteration with lowest cross-validation error. |
i.train | Items in the training set. |
X | Index (cross-validation fold) on which to subset. |
s | Random seed. |
Details
These functions are not intended for end-user direct usage, but are used internally by gbm.
Value
A list containing the cross-validation error and predictions.
Author(s)
Greg Ridgeway gregridgeway@gmail.com
References
J.H. Friedman (2001). "Greedy Function Approximation: A Gradient Boosting Machine," Annals of Statistics 29(5):1189-1232.
L. Breiman (2001). https://www.stat.berkeley.edu/users/breiman/randomforest2001.pdf.
See Also
gbm internal functions
Description
Helper functions for preprocessing data prior to building a "gbm" object.
Usage
guessDist(y)
getCVgroup(distribution, class.stratify.cv, y, i.train, cv.folds, group)
getStratify(strat, d)
checkMissing(x, y)
checkWeights(w, n)
checkID(id)
checkOffset(o, y)
getVarNames(x)
gbmCluster(n)
Arguments
y | The response variable. |
class.stratify.cv | Whether or not to stratify, if provided by the user. |
i.train | Computed internally by gbm. |
cv.folds | The number of cross-validation folds. |
group | The group, if using distribution = "pairwise". |
strat | Whether or not to stratify. |
d,distribution | The distribution, either specified by the user orimplied. |
x | The design matrix. |
w | The weights. |
n | The number of cores to use in the cluster. |
id | The interaction depth. |
o | The offset. |
Details
These are functions used internally by gbm and are not intended for direct use by the user.
Estimate the strength of interaction effects
Description
Computes Friedman's H-statistic to assess the strength of variable interactions.
Usage
interact.gbm(x, data, i.var = 1, n.trees = x$n.trees)
Arguments
x | A |
data | The dataset used to construct x. If the original dataset is large, a random subsample may be used to speed up the computation. |
i.var | A vector of indices or the names of the variables for which to compute the interaction effect. If using indices, the variables are indexed in the same order that they appear in the initial gbm formula. |
n.trees | The number of trees used to generate the plot. Only the first n.trees trees will be used. |
Details
interact.gbm computes Friedman's H-statistic to assess the relative strength of interaction effects in non-linear models. H is on the scale of [0,1] with higher values indicating larger interaction effects. To connect to a more familiar measure, if x_1 and x_2 are uncorrelated covariates with mean 0 and variance 1 and the model is of the form
y=\beta_0+\beta_1x_1+\beta_2x_2+\beta_3x_3
then
H=\frac{\beta_3}{\sqrt{\beta_1^2+\beta_2^2+\beta_3^2}}
Note that if the main effects are weak, the estimated H will be unstable. For example, if (in the case of a two-way interaction) neither main effect is in the selected model (relative influence is zero), the result will be 0/0. Also, with weak main effects, rounding errors can result in values of H > 1 which are not possible.
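A sketch with a simulated x1:x2 interaction (variable names and settings are arbitrary); H should be noticeably larger for the (x1, x2) pair than for (x1, x3):

set.seed(1)
n <- 500
d <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n))
d$y <- d$x1 * d$x2 + rnorm(n, sd = 0.1)        # true interaction between x1 and x2
fit <- gbm(y ~ ., data = d, distribution = "gaussian", n.trees = 200,
           interaction.depth = 2, shrinkage = 0.05, n.cores = 1)
interact.gbm(fit, data = d, i.var = c("x1", "x2"), n.trees = 200)
interact.gbm(fit, data = d, i.var = c("x1", "x3"), n.trees = 200)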
Value
Returns the value of H.
Author(s)
Greg Ridgeway gregridgeway@gmail.com
References
J.H. Friedman and B.E. Popescu (2005). “Predictive Learning via Rule Ensembles.” Section 8.1
See Also
Marginal plots of fitted gbm objects
Description
Plots the marginal effect of the selected variables by "integrating" out theother variables.
Usage
## S3 method for class 'gbm'
plot(
  x,
  i.var = 1,
  n.trees = x$n.trees,
  continuous.resolution = 100,
  return.grid = FALSE,
  type = c("link", "response"),
  level.plot = TRUE,
  contour = FALSE,
  number = 4,
  overlap = 0.1,
  col.regions = viridis::viridis,
  ...
)
Arguments
x | A |
i.var | Vector of indices or the names of the variables to plot. If using indices, the variables are indexed in the same order that they appear in the initial gbm formula. |
n.trees | Integer specifying the number of trees to use to generate the plot. Default is to use x$n.trees (i.e., the entire ensemble). |
continuous.resolution | Integer specifying the number of equally spaced points at which to evaluate continuous predictors. |
return.grid | Logical indicating whether or not to return the grid of evaluation points and their average predictions instead of producing graphics. This is useful for customizing the graphics for special variable types, or for higher dimensional graphs. |
type | Character string specifying the type of prediction to plot on the vertical axis. See predict.gbm for details. |
level.plot | Logical indicating whether or not to use a false color level plot (TRUE) or a 3-D surface (FALSE). Default is TRUE. |
contour | Logical indicating whether or not to add contour lines to the level plot. Only used when level.plot = TRUE. Default is FALSE. |
number | Integer specifying the number of conditional intervals to use for the continuous panel variables. See co.intervals and equal.count for further details. |
overlap | The fraction of overlap of the conditioning variables. See co.intervals and equal.count for further details. |
col.regions | Color vector to be used if level.plot is TRUE. Default is viridis::viridis. |
... | Additional optional arguments to be passed onto plot. |
Details
plot.gbm produces low dimensional projections of the gbm.object by integrating out the variables not included in the i.var argument. The function selects a grid of points and uses the weighted tree traversal method described in Friedman (2001) to do the integration. Based on the variable types included in the projection, plot.gbm selects an appropriate display choosing amongst line plots, contour plots, and lattice plots. If the default graphics are not sufficient the user may set return.grid = TRUE, store the result of the function, and develop another graphic display more appropriate to the particular example.
Value
If return.grid = TRUE, a grid of evaluation points and their average predictions. Otherwise, a plot is returned.
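A sketch of using return.grid = TRUE to obtain the evaluation grid for custom graphics (simulated data, arbitrary settings):

set.seed(1)
d <- data.frame(x1 = runif(400), x2 = runif(400))
d$y <- d$x1^2 + d$x2 + rnorm(400, sd = 0.1)
fit <- gbm(y ~ ., data = d, distribution = "gaussian", n.trees = 100,
           interaction.depth = 2, n.cores = 1)
plot(fit, i.var = "x1", n.trees = 100)                 # univariate line plot
grid <- plot(fit, i.var = c("x1", "x2"), n.trees = 100,
             return.grid = TRUE)                       # data frame of grid points
head(grid)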
Note
More flexible plotting is available using the partial and plotPartial functions.
References
J. H. Friedman (2001). "Greedy Function Approximation: A Gradient Boosting Machine," Annals of Statistics 29(4).
B. M. Greenwell (2017). "pdp: An R Package for Constructing Partial Dependence Plots," The R Journal 9(1), 421–436. https://journal.r-project.org/archive/2017/RJ-2017-016/index.html.
See Also
partial, plotPartial, gbm, and gbm.object.
Predict method for GBM Model Fits
Description
Predicted values based on a generalized boosted model object
Usage
## S3 method for class 'gbm'
predict(object, newdata, n.trees, type = "link", single.tree = FALSE, ...)
Arguments
object | Object of class inheriting from gbm (i.e., a gbm.object). |
newdata | Data frame of observations for which to make predictions |
n.trees | Number of trees used in the prediction. |
type | The scale on which gbm makes the predictions |
single.tree | If single.tree = TRUE, predict.gbm returns only the predictions from tree(s) n.trees. |
... | further arguments passed to or from other methods |
Details
predict.gbm produces predicted values for each observation in newdata using the first n.trees iterations of the boosting sequence. If n.trees is a vector then the result is a matrix with each column representing the predictions from gbm models with n.trees[1] iterations, n.trees[2] iterations, and so on.
The predictions from gbm do not include the offset term. The user may add the value of the offset to the predicted value if desired.
If object was fit using gbm.fit there will be no Terms component. Therefore, the user has greater responsibility to make sure that newdata is of the same format (order and number of variables) as the one originally used to fit the model.
Value
Returns a vector of predictions. By default the predictions are on the scale of f(x). For example, for the Bernoulli loss the returned value is on the log odds scale, poisson loss on the log scale, and coxph is on the log hazard scale.
If type = "response" then gbm converts back to the same scale as the outcome. Currently the only effect this will have is returning probabilities for bernoulli and expected counts for poisson. For the other distributions "response" and "link" return the same.
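A sketch illustrating the link vs. response scales and the matrix returned for a vector of n.trees (simulated data, arbitrary settings):

set.seed(1)
d <- data.frame(x = runif(300))
d$y <- rbinom(300, 1, plogis(4 * d$x - 2))
fit <- gbm(y ~ x, data = d, distribution = "bernoulli", n.trees = 300,
           shrinkage = 0.05, n.cores = 1)
predict(fit, newdata = head(d), n.trees = 300, type = "link")      # log odds
predict(fit, newdata = head(d), n.trees = 300, type = "response")  # probabilities
dim(predict(fit, newdata = d, n.trees = c(100, 200, 300)))         # one column per value of n.trees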
Author(s)
Greg Ridgeway gregridgeway@gmail.com
See Also
Print gbm tree components
Description
gbm stores the collection of trees used to construct the model in a compact matrix structure. This function extracts the information from a single tree and displays it in a slightly more readable form. This function is mostly for debugging purposes and to satisfy some users' curiosity.
Usage
## S3 method for class 'gbm.tree'
pretty(object, i.tree = 1)
Arguments
object | a gbm.object initially fit using gbm. |
i.tree | the index of the tree component to extract from object and display. |
Value
pretty.gbm.tree returns a data frame. Each row corresponds to a node in the tree. Columns indicate
SplitVar | index of which variableis used to split. -1 indicates a terminal node. |
SplitCodePred | if the split variable is continuous then this component is the split point. If the split variable is categorical then this component contains the index of the c.splits element that describes the categorical split. If the node is a terminal node then this is the prediction. |
LeftNode | the index of the row corresponding to the left node. |
RightNode | the index of the row corresponding to the right node. |
ErrorReduction | the reduction in the loss function as a result of splitting this node. |
Weight | the total weight of observations in the node. If weights are all equal to 1 then this is the number of observations in the node. |
Author(s)
Greg Ridgeway gregridgeway@gmail.com
See Also
Print model summary
Description
Display basic information about a gbm object.
Usage
## S3 method for class 'gbm'
print(x, ...)
show.gbm(x, ...)
Arguments
x | an object of class gbm. |
... | arguments passed to print.default. |
Details
Prints some information about the model object. In particular, this method prints the call to gbm(), the type of loss function that was used, and the total number of iterations.
If cross-validation was performed, the 'best' number of trees as estimated by cross-validation error is displayed. If a test set was used, the 'best' number of trees as estimated by the test set error is displayed.
The number of available predictors, and the number of those having non-zero influence on predictions is given (which might be interesting in data mining applications).
If multinomial, bernoulli or adaboost was used, the confusion matrix and prediction accuracy are printed (objects being allocated to the class with highest probability for multinomial and bernoulli). These classifications are performed on the entire training data using the model with the 'best' number of trees as described above, or the maximum number of trees if the 'best' cannot be computed.
If the 'distribution' was specified as gaussian, laplace, quantile or t-distribution, a summary of the residuals is displayed. The residuals are for the training data with the model at the 'best' number of trees, as described above, or the maximum number of trees if the 'best' cannot be computed.
Author(s)
Harry Southworth, Daniel Edwards
See Also
Examples
data(iris)
iris.mod <- gbm(Species ~ ., distribution = "multinomial", data = iris,
                n.trees = 2000, shrinkage = 0.01, cv.folds = 5,
                verbose = FALSE, n.cores = 1)
iris.mod
#data(lung)
#lung.mod <- gbm(Surv(time, status) ~ ., distribution = "coxph", data = lung,
#                n.trees = 2000, shrinkage = 0.01, cv.folds = 5, verbose = FALSE)
#lung.mod
Quantile rug plot
Description
Marks the quantiles on the axes of the current plot.
Usage
## S3 method for class 'rug'
quantile(x, prob = 0:10/10, ...)
Arguments
x | A numeric vector. |
prob | The quantiles of x to mark on the x-axis. |
... | Additional optional arguments to be passed onto rug. |
Value
No return values.
Author(s)
Greg Ridgeway gregridgeway@gmail.com.
See Also
Examples
x <- rnorm(100)
y <- rnorm(100)
plot(x, y)
quantile.rug(x)
Reconstruct a GBM's Source Data
Description
Helper function to reconstitute the data for plots and summaries. This function is not intended for the user to call directly.
Usage
reconstructGBMdata(x)
Arguments
x | a gbm.object initially fit using gbm. |
Value
Returns the data used to fit the gbm in a format that can subsequently be used for plots and summaries.
Author(s)
Harry Southworth
See Also
Methods for estimating relative influence
Description
Helper functions for computing the relative influence of each variable in the gbm object.
Usage
relative.influence(object, n.trees, scale. = FALSE, sort. = FALSE)
permutation.test.gbm(object, n.trees)
gbm.loss(y, f, w, offset, dist, baseline, group = NULL, max.rank = NULL)
Arguments
object | a gbm.object created from an initial call to gbm. |
n.trees | the number of trees to use for computations. If not provided, the function will guess: if a test set was used in fitting, the number of trees resulting in lowest test set error will be used; otherwise, if cross-validation was performed, the number of trees resulting in lowest cross-validation error will be used; otherwise, all trees will be used. |
scale. | whether or not the result should be scaled. Defaults to FALSE. |
sort. | whether or not the results should be (reverse) sorted. Defaults to FALSE. |
y,f,w,offset,dist,baseline | For gbm.loss: the outcome, predicted value, observation weights, offset, distribution, and baseline loss, used internally when evaluating the loss. |
group,max.rank | Used internally when distribution = "pairwise". |
Details
This is not intended for end-user use. These functions offer the different methods for computing the relative influence in summary.gbm. gbm.loss is a helper function for permutation.test.gbm.
Value
By default, returns an unprocessed vector of estimated relative influences. If the scale. and sort. arguments are used, returns a processed version of the same.
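A sketch of calling relative.influence directly (simulated data, arbitrary settings); x1 drives the response, so it should dominate:

set.seed(1)
d <- data.frame(x1 = runif(300), x2 = runif(300), x3 = runif(300))
d$y <- 2 * d$x1 + rnorm(300, sd = 0.1)
fit <- gbm(y ~ ., data = d, distribution = "gaussian", n.trees = 100, n.cores = 1)
relative.influence(fit, n.trees = 100, scale. = TRUE, sort. = TRUE)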
Author(s)
Greg Ridgeway gregridgeway@gmail.com
References
J.H. Friedman (2001). "Greedy Function Approximation: A Gradient Boosting Machine," Annals of Statistics 29(5):1189-1232.
L. Breiman (2001). https://www.stat.berkeley.edu/users/breiman/randomforest2001.pdf.
See Also
Summary of a gbm object
Description
Computes the relative influence of each variable in the gbm object.
Usage
## S3 method for class 'gbm'
summary(
  object,
  cBars = length(object$var.names),
  n.trees = object$n.trees,
  plotit = TRUE,
  order = TRUE,
  method = relative.influence,
  normalize = TRUE,
  ...
)
Arguments
object | a gbm object created from an initial call to gbm. |
cBars | the number of bars to plot. If order = TRUE only the cBars variables with the largest relative influence will appear in the barplot. If order = FALSE then the first cBars variables will appear in the plot. In either case, the function returns the relative influence of all of the variables. |
n.trees | the number of trees used to generate the plot. Only the first n.trees trees will be used. |
plotit | an indicator as to whether the plot is generated. |
order | an indicator as to whether the plotted and/or returned relativeinfluences are sorted. |
method | The function used to compute the relative influence. |
normalize | if FALSE then summary.gbm returns the unnormalized influence. |
... | other arguments passed to the plot function. |
Details
For distribution = "gaussian" this returns exactly the reduction of squared error attributable to each variable. For other loss functions this returns the reduction attributable to each variable in sum of squared error in predicting the gradient on each iteration. It describes the relative influence of each variable in reducing the loss function. See the references below for exact details on the computation.
Value
Returns a data frame where the first component is the variable nameand the second is the computed relative influence, normalized to sum to 100.
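A sketch of the summary method (simulated data, arbitrary settings); plotit = FALSE suppresses the barplot and returns only the data frame of relative influences:

set.seed(1)
d <- data.frame(x1 = runif(300), x2 = runif(300), x3 = runif(300))
d$y <- 2 * d$x1 + d$x2 + rnorm(300, sd = 0.1)
fit <- gbm(y ~ ., data = d, distribution = "gaussian", n.trees = 100, n.cores = 1)
summary(fit, n.trees = 100, plotit = FALSE, normalize = TRUE)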
Author(s)
Greg Ridgeway gregridgeway@gmail.com
References
J.H. Friedman (2001). "Greedy Function Approximation: A Gradient Boosting Machine," Annals of Statistics 29(5):1189-1232.
L. Breiman (2001). https://www.stat.berkeley.edu/users/breiman/randomforest2001.pdf.
See Also
Test the gbm package.
Description
Run tests on gbm functions to perform logical checks and check reproducibility.
Usage
test.gbm()
Details
The function uses functionality in the RUnit package. A fairly small validation suite is executed that checks to see that relative influence identifies sensible variables from simulated data, and that predictions from GBMs with Gaussian, Cox or binomial distributions are sensible.
Value
An object of class RUnitTestData. See the help for RUnit for details.
Note
The test suite is not comprehensive.
Author(s)
Harry Southworth
See Also
Examples
# Uncomment the following lines to run - commented out to make CRAN happy
#library(RUnit)
#val <- test.gbm()
#printHTMLProtocol(val, "gbmReport.html")