Function pre now allows use of function randomForest (from the package of the same name) for rule induction, through argument randomForest.
Functions singleplot and pairplot now plot observed values for the predictor(s) of interest as a rug on the x-axis, and invisibly return the evaluated partial dependence function(s) for further use.
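For illustration, a minimal sketch of capturing the invisibly returned partial dependence values, using the familiar airquality example; the exact structure of the returned object may differ by version:

    library("pre")
    airq <- airquality[complete.cases(airquality), ]
    set.seed(42)
    airq.ens <- pre(Ozone ~ ., data = airq)
    # The plot is drawn as a side effect; the evaluated partial dependence
    # function is returned invisibly and can be stored for further use:
    pd <- singleplot(airq.ens, varname = "Temp")
    str(pd)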
Bug fixed in computation of importances: factors included as linear terms did not contribute to variable importances. Now they do.
Function mi_pre added, to allow for fitting a single prediction rule ensemble on multiply imputed datasets.
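A hedged sketch of how mi_pre might be called, assuming its data argument accepts a list of completed (imputed) datasets; the mice-based imputation setup is illustrative only:

    library("pre")
    library("mice")
    # airquality contains missing values; create a few imputed datasets
    set.seed(42)
    imp <- mice(airquality, m = 5, printFlag = FALSE)
    implist <- complete(imp, action = "all")  # list of completed datasets
    # Fit a single rule ensemble on the multiply imputed data
    # (data assumed to accept a list of imputed datasets)
    fit <- mi_pre(Ozone ~ ., data = implist)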
Function prune_pre added, to extract optimal penalty values for an ensemble of given size.
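A sketch of the intended use, assuming prune_pre() takes a fitted pre object and the desired number of non-zero terms; the argument name nonzero is an assumption here:

    library("pre")
    airq <- airquality[complete.cases(airquality), ]
    set.seed(42)
    airq.ens <- pre(Ozone ~ ., data = airq)
    # Obtain the penalty value yielding an ensemble of about 5 terms;
    # the 'nonzero' argument name is assumed
    prune_pre(airq.ens, nonzero = 5)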
Adaptive lasso has been implemented in function pre. Arguments ad_alpha and ad_penalty allow for fitting the final rule ensemble using the adaptive lasso. See the documentation of function pre and vignette("relaxed", "pre").
Partial dependence plotting functions singleplot and pairplot now also support multinomial and multivariate outcomes.
Vignette added detailing how function pre's computation time can be reduced.
Implemented relaxed lasso fits and added a vignette on relaxed fits; see vignette("relaxed", "pre").
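A hedged sketch, assuming the relaxed fit is requested through a logical relax argument that pre() passes on to glmnet::cv.glmnet(); see vignette("relaxed", "pre") for the authoritative interface:

    library("pre")
    airq <- airquality[complete.cases(airquality), ]
    set.seed(42)
    # 'relax = TRUE' is assumed to be forwarded to glmnet::cv.glmnet()
    airq.ens.rel <- pre(Ozone ~ ., data = airq, relax = TRUE)
    print(airq.ens.rel)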
More informative warning message if no rules could be derived.
Added vignette explaining how to tune parameters of function pre() using package caret.
Added vignette on how to deal with missing values.
Added support for supplying a tibble instead of a data.frame to function pre().
Function importance is now an S3 method.
Argument cex.axis of function (method) importance is now passed correctly.
Argument weights of functions pre() and gpe() now correctly passed to rule induction functions.
Added progress bar indicating tree fitting progress.
Argument singleconditions implemented in function pre() (experimental).
Minor changes for compatibility with the changed data.frame() function in R version 4.0.0.
References to the paper in the Journal of Statistical Software included.
Bugs fixed in caret_pre_model: tuning of the penalty.par.val argument always yielded results for "lambda.1se" only. Results are now correctly returned for "lambda.min" and "lambda.1se". caret's varImp() and predictors() are not supported (perhaps temporarily), as these would always employ the default penalty.par.val of "lambda.1se".
Bugs fixed in explain().
Added support for a sparse rule matrix, which can be invoked through the sparse argument in pre(). If sparse = TRUE, memory usage will be reduced and computation speed may be improved for large datasets.
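A minimal sketch of requesting the sparse rule matrix; only the sparse argument named above is used:

    library("pre")
    airq <- airquality[complete.cases(airquality), ]
    set.seed(42)
    # Use a sparse rule matrix to reduce memory usage on larger data
    airq.ens.sp <- pre(Ozone ~ ., data = airq, sparse = TRUE)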
Added function explain(), which provides (graphical) explanations of the ensemble's predictions at the individual observation level.
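A sketch of obtaining observation-level explanations, assuming explain() takes the fitted ensemble and the observations to be explained through a newdata argument (an assumption here):

    library("pre")
    airq <- airquality[complete.cases(airquality), ]
    set.seed(42)
    airq.ens <- pre(Ozone ~ ., data = airq)
    # Explain the predictions for the first two observations;
    # the 'newdata' argument name is an assumption
    explain(airq.ens, newdata = airq[1:2, ])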
Added support for survival responses (i.e., family = "cox") in pre().
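A hedged sketch of a survival ensemble, assuming the response is supplied as a survival::Surv() object in the formula when family = "cox"; the lung data and the small ntrees value are purely illustrative:

    library("pre")
    library("survival")
    lung2 <- lung[complete.cases(lung), ]
    set.seed(42)
    # Survival response assumed to be specified via Surv() in the formula;
    # ntrees kept small here only to keep the sketch quick
    lung.ens <- pre(Surv(time, status) ~ ., data = lung2,
                    family = "cox", ntrees = 50)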
Added summary methods for pre and gpe.
Extended support to all response variable types available in pre() for functions plot(), importance() and cvpre().
plot.pre now allows for specifying separate plotting colors for rules with positive and negative coefficients.
coef and print methods for pre now return descriptions for the intercept (and factor variables), thanks to a suggestion by Stephen Milborrow.
Bug fix in pre(): ordered factors no longer yield an error. Implemented new argument 'ordinal' in pre(), which specifies how ordered factors should be processed.
Bug fix in cvpre(): pclass argument now processed correctly.
Bug fix in cvpre(): previously, SDs instead of SEs were returned for binary classification. Accurate standard errors are now returned.
Bugs fixed in coef.pre(), print.pre(), plot.pre() and importance() when tree.unbiased = FALSE, thanks to a bug report by Stephen Milborrow.
Function pre() now also supports multinomial and multivariate gaussian response variables.
Function pre() now has argument 'tree.unbiased'; if set to FALSE, the CART algorithm (as implemented in package 'rpart') is employed for rule induction.
Argument 'maxdepth' of function pre() allows for specifying a varying maximum depth across trees, by supplying a vector of length ntrees or a random number generating function. See ?maxdepth.sampler for details.
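For illustration, a sketch combining the two entries above: rule induction with CART via tree.unbiased = FALSE, and a tree-specific maximum depth supplied as a vector of length ntrees (the argument names come from the entries above; the specific values are illustrative):

    library("pre")
    airq <- airquality[complete.cases(airquality), ]
    set.seed(42)
    # CART (rpart) instead of unbiased trees for rule induction
    airq.cart <- pre(Ozone ~ ., data = airq, tree.unbiased = FALSE)
    # Varying maximum depth: one value per tree, so the vector length
    # must equal ntrees (set explicitly here)
    depths <- sample(2:4, size = 500, replace = TRUE)
    airq.vard <- pre(Ozone ~ ., data = airq, ntrees = 500, maxdepth = depths)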
Added dataset 'carrillo'.
By default, a gradient boosting approach is now taken for all response types. That is, partykit::ctree() and a learning rate of .01 are employed by default. Alternatively, glmtree() can be employed for tree induction by specifying use.grad = FALSE.
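A sketch of switching from the default ctree()-based induction to glmtree(); only the use.grad argument named above is used:

    library("pre")
    airq <- airquality[complete.cases(airquality), ]
    set.seed(42)
    # Employ partykit::glmtree() instead of ctree() for tree induction
    airq.glmtree <- pre(Ozone ~ ., data = airq, use.grad = FALSE)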
The 'family' argument in pre() now takes character strings as well as glm family objects.
Functions pairplot() and interact() now use HCL instead of highly saturated HSV colors as default plotting colors.
Bug fixed in plot.pre: node directions are now in accordance with the rule definitions.
Bug fixed in predict.pre: no error is printed anymore when the response variable is not supplied.
Function gpe() added, which fits general prediction ensembles. By default, it fits an ensemble of rules, linear terms and hinge functions. Function gpe() allows for specifying custom baselearner-generating functions and a custom fitting function for the final model.
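A minimal sketch of the default gpe() call, which fits rules, linear terms and hinge functions; custom base-learner generators and fitting functions are left at their defaults here:

    library("pre")
    airq <- airquality[complete.cases(airquality), ]
    set.seed(42)
    # Default general prediction ensemble: rules + linear + hinge functions
    airq.gpe <- gpe(Ozone ~ ., data = airq)
    print(airq.gpe)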
Numerous bugs fixed, yielding faster computation times and clearer plots with more customization options.
Added support for count responses. Function pre() now has a 'family' argument, which should be set to 'poisson' for count outcomes (the 'family' argument is set automatically to 'gaussian' for numeric response variables and to 'binomial' for binary response variables (factors)).
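A hedged sketch of fitting a count-response ensemble; the warpbreaks data (a count outcome with two factor predictors) is used purely as an illustration, and only the family argument named above is assumed:

    library("pre")
    set.seed(42)
    # warpbreaks$breaks is a count; request a Poisson ensemble explicitly
    wb.ens <- pre(breaks ~ wool + tension, data = warpbreaks,
                  family = "poisson")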
A gradient boosting approach for binary outcomes is applied, by default, substantially reducing computation times. This can be turned off through the 'use.grad' argument in function pre().
The default of the 'learnrate' argument of function pre() has been changed to .01. Before, it was .01 for continuous outcomes but 0 for binary outcomes, to reduce computation time. With gradient boosting implemented, computation time is much reduced.
Argument 'tree.control' in function pre() allows for passing arguments to the partykit tree fitting functions.
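A sketch of passing control parameters to the partykit tree fitting functions; the tree.control argument is named above, and partykit::ctree_control() is assumed to be the appropriate control constructor for the default ctree()-based induction:

    library("pre")
    library("partykit")
    airq <- airquality[complete.cases(airquality), ]
    set.seed(42)
    # Pass ctree() settings (e.g., a larger minimum node size) to pre()
    ctrl <- ctree_control(minbucket = 10L)
    airq.ens <- pre(Ozone ~ ., data = airq, tree.control = ctrl)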
Arguments for the cv.glmnet() function are now passed directly through better use of the ellipsis (...). Most importantly, this means that argument 'mod.sel.crit' can no longer be used and should be referred to as 'type.measure' (which will be passed directly to cv.glmnet()). Similarly, 'thres' and 'standardize' are no longer explicit arguments of function pre() and can now be passed directly to cv.glmnet() using the ellipsis (...).
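A sketch of passing cv.glmnet() arguments through the ellipsis, for example choosing mean absolute error as the cross-validation criterion; type.measure and standardize are documented cv.glmnet() arguments, forwarded as described in the entry above:

    library("pre")
    airq <- airquality[complete.cases(airquality), ]
    set.seed(42)
    # 'type.measure' and 'standardize' are forwarded to glmnet::cv.glmnet()
    airq.ens <- pre(Ozone ~ ., data = airq,
                    type.measure = "mae", standardize = TRUE)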
Better use of sample weights: weights specified with the 'weights' argument in pre() are now used as weights in the subsampling procedure, instead of as observation weights in the tree-fitting procedure.
Added corplot() function, which shows the correlation between the baselearners in the ensemble.
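A minimal sketch of inspecting base-learner correlations with corplot():

    library("pre")
    airq <- airquality[complete.cases(airquality), ]
    set.seed(42)
    airq.ens <- pre(Ozone ~ ., data = airq)
    # Plot correlations between the baselearners in the final ensemble
    corplot(airq.ens)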
Function pairplot() returns a heatmap by default; a 3D or contour plot can also be requested.
Appearance of the plot resulting from interact() improved.
Added print() and plot() methods for objects of class pre.
Added support for using functions like factor() and log() in the formula statement of function pre() (thanks to Bill Venables for suggesting this).
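A sketch of using transformation functions directly in the formula; treating Month as a factor and log-transforming Wind is purely illustrative:

    library("pre")
    airq <- airquality[complete.cases(airquality), ]
    set.seed(42)
    # Transformations inside the formula are supported
    airq.ens <- pre(Ozone ~ log(Wind) + Temp + factor(Month), data = airq)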
Added support for parallel computation in functions pre(), cvpre(), bsnullinteract() and interact().
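A hedged sketch of parallel computation; it assumes pre() exposes logical par.init and par.final arguments and that a foreach backend must be registered first (e.g., via doParallel), which may not match the exact interface:

    library("pre")
    library("doParallel")
    registerDoParallel(cores = 2)  # register a parallel backend first
    airq <- airquality[complete.cases(airquality), ]
    set.seed(42)
    # 'par.init' and 'par.final' are assumed argument names for the
    # parallelized initial (tree fitting) and final (glmnet) stages
    airq.ens <- pre(Ozone ~ ., data = airq, par.init = TRUE, par.final = TRUE)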
Winsorizing points used for the linear terms are reported in the description of the base learners returned by coef() and importance() (thanks to Rishi Sadhir for suggesting this).
Added README file.
Legend included in plot for interaction test statistics.
Fixed importance() function to allow for selecting the final ensemble with a value other than 'lambda.1se'.
Cleaned up all occurrences of set.seed().
Fixed cvpre() function: penalty.par.val argument now included.
Many minor bug fixes.