An introduction to plasso

Michael Knaus

Stefan Glaisner

October 28, 2025

This notebook provides a detailed overview of the plasso package and its two main functions plasso and cv.plasso, which were developed in the course of Knaus (2022). This package is strongly oriented around the glmnet package and, at its core, builds on its standard function glmnet. Related theory and algorithms are described in Friedman, Hastie, and Tibshirani (2010).

Getting started

The very latest version of the package can be installed from its GitHub page. For this installation you will need the devtools package. The latest 'official' version can be installed from CRAN using install.packages(). We recommend the latter.

General dependencies are: glmnet, Matrix, methods, parallel, doParallel, foreach and iterators.

Code
library(devtools)
devtools::install_github("MCKnaus/plasso")  # development version from GitHub
install.packages("plasso")                  # 'official' version from CRAN

Load plasso using library().

Code
library(plasso)

The package generally provides two functions, plasso and cv.plasso, which are both built on top of the glmnet functionality. Specifically, a glmnet object lives within both functions and also in their outputs (list item lasso_full).

The term plasso refers to a Post-Lasso model, which estimates a least squares regression using only the active (i.e. non-zero) coefficients of a previously estimated Lasso model. This follows the idea that we want to do selection but without shrinkage. A minimal sketch of this logic follows the data setup below.

The package comes with some simulated data representing the following DGP:

The covariate matrix \(X\) consists of 10 variables whose effect sizes on the target \(Y\) are defined by the vector \(\boldsymbol{\pi} = [1, -0.83, 0.67, -0.5, 0.33, -0.17, 0, ..., 0]'\), where the first six effect sizes decrease continuously in absolute terms from 1 to 0 and alternate in sign. The true causal effect of all other covariates is 0. The variables in \(X\) follow a normal distribution with mean zero, while the covariance matrix is a Toeplitz matrix, which is characterized by constant diagonals: \[\boldsymbol{\Sigma} = \begin{bmatrix} 1 & 0.7 & 0.7^2 & \cdots & 0.7^{9} \\ 0.7 & 1 & 0.7 & \cdots & 0.7^{8} \\ 0.7^2 & 0.7 & 1 & \cdots & 0.7^{7} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0.7^{9} & 0.7^{8} & 0.7^{7} & \cdots & 1 \end{bmatrix}\]

The target \(\boldsymbol{y}\) is then a linear transformation of \(\boldsymbol{X}\) plus a normally distributed error term with variance 4. Each element of \(\boldsymbol{y}\) is given by: \[y_i = \boldsymbol{X}_i \boldsymbol{\pi} + \varepsilon_i\] where \(\varepsilon_i \sim \mathcal{N}(0,4)\).
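For intuition, this DGP could be simulated along the following lines. This is a hedged sketch, not the code that generated the shipped toeplitz data: the seed and the sample size n are our own assumptions.

Code
library(MASS)
set.seed(1234)                           # arbitrary seed (assumption)
n <- 1000                                # hypothetical sample size (assumption)
p <- 10
Sigma <- 0.7^abs(outer(1:p, 1:p, "-"))   # Toeplitz covariance with constant diagonals
pi_vec <- c(1, -0.83, 0.67, -0.5, 0.33, -0.17, rep(0, p - 6))  # effect sizes
X_sim <- mvrnorm(n, mu = rep(0, p), Sigma = Sigma)             # mean-zero normal covariates
y_sim <- X_sim %*% pi_vec + rnorm(n, mean = 0, sd = 2)         # error variance 4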

Code
data(toeplitz)
y = as.matrix(toeplitz[, 1])  # target variable
X = toeplitz[, -1]            # covariate matrix
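To make the Post-Lasso idea from above concrete, here is a minimal sketch using plain glmnet and lm() at a single lambda value. plasso itself handles this for the whole lambda path; the helper names (Xm, sel, ols) are ours.

Code
library(glmnet)
Xm <- as.matrix(X)
fit <- glmnet(Xm, y)                                     # Lasso path
b <- as.matrix(coef(fit, s = 0.01))                      # coefficients at one lambda
sel <- setdiff(rownames(b)[b[, 1] != 0], "(Intercept)")  # active set
# Post-Lasso: plain OLS on the selected covariates only (no shrinkage)
ols <- lm(y ~ ., data = data.frame(y = y, Xm[, sel, drop = FALSE]))
coef(ols)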

plasso

plasso returns least squares estimates for all lambda values of a standard glmnet object for both a simple Lasso and a Post-Lasso model.

Code
p = plasso::plasso(X, y)
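As the underlying glmnet object is carried along in the output (list item lasso_full, as noted above), it can be inspected with the usual glmnet tooling:

Code
class(p$lasso_full)        # a "glmnet" object
head(p$lasso_full$lambda)  # lambda sequence of the underlying fit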

You can plot the coefficient paths for both the Post-Lasso model as well as the underlying 'original' Lasso model. This nicely illustrates the difference between the Lasso and Post-Lasso models, where the latter is characterized by jumps in its coefficient paths every time a new variable enters the active set.

Code
plot(p, lasso = FALSE, xvar = "lambda")

Code
plot(p, lasso = TRUE, xvar = "lambda")

We can also have a look at which coefficients are active for a chosen lambda value. Here, the difference between Post-Lasso and Lasso becomes clearly visible. For the Lasso model, there is not only feature selection but also shrinkage, which results in the active coefficients being smaller in absolute value than for the Post-Lasso model:

Code
coef_p = coef(p, s = 0.01)
as.vector(coef_p$plasso)
##  [1]  0.1438137  1.0187628 -0.6214926  0.4673645 -0.2300834 -0.3575276
##  [7]  0.2180390  0.1180676 -0.2138268  0.1975462 -0.1047983
Code
as.vector(coef_p$lasso)
##  [1]  0.14498611  0.98729386 -0.56374511  0.40656768 -0.20023679 -0.33156564
##  [7]  0.18985685  0.08930237 -0.16087044  0.13798825 -0.06639638
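To see this shrinkage numerically, the absolute magnitudes can be put side by side; this small check is our own addition, not part of the package output:

Code
# Post-Lasso coefficients are (mostly) larger in absolute value than Lasso
cbind(plasso = abs(as.vector(coef_p$plasso)),
      lasso = abs(as.vector(coef_p$lasso)))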

cv.plasso

The cv.plasso function uses cross-validation to determine the performance of different values of the lambda penalty term for both models (Post-Lasso and Lasso). The returned output of class cv.plasso includes the mean squared errors.
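Conceptually, the cross-validation for the Lasso part corresponds to the following sketch using plain glmnet with 5 folds; cv.plasso additionally refits OLS on each fold's active sets to obtain the Post-Lasso MSEs. The helper names (folds, mse, mean_mse) are ours:

Code
library(glmnet)
Xm <- as.matrix(X)
fit <- glmnet(Xm, y)                        # fixes the lambda sequence
folds <- sample(rep(1:5, length.out = nrow(Xm)))
mse <- sapply(1:5, function(k) {
  f <- glmnet(Xm[folds != k, ], y[folds != k], lambda = fit$lambda)
  pred <- predict(f, newx = Xm[folds == k, ])
  colMeans((y[folds == k] - pred)^2)        # MSE per lambda on the held-out fold
})
mean_mse <- rowMeans(mse)                   # cross-validated MSE per lambda
fit$lambda[which.min(mean_mse)]             # lambda at minimum CV MSE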

When applying the summary method with the default parameter set to FALSE, you get some informative output concerning the optimal choice of lambda.

Code
p.cv = plasso::cv.plasso(X, y, kf = 5)
summary(p.cv, default = FALSE)
## 
## Call:
##  plasso::cv.plasso(x = X, y = y, kf = 5)
## 
## Lasso:
##  Minimum CV MSE Lasso:  15.35
##  Lambda at minimum:  0.03247
##  Active variables at minimum:  (Intercept) X1 X2 X3 X4 X5 X6 X7 X8 X9
## Post-Lasso:
##  Minimum CV MSE Post-Lasso:  15.36
##  Lambda at minimum:  0.006085
##  Active variables at minimum:  (Intercept) X1 X2 X3 X4 X5 X6 X7 X8 X9 X10

The plot method extends the basic glmnet visualization with the cross-validated MSEs for the Post-Lasso model.

Code
plot(p.cv, legend_pos = "left", legend_size = 0.5)

We can use the following code to get the optimal lambda value (for the Post-Lasso model here) and the associated coefficients at that value of \(\lambda\).

Code
p.cv$lambda_min_pl
## [1] 0.006084554
Code
coef_pcv = coef(p.cv, s = "optimal")
as.vector(coef_pcv$plasso)
##  [1]  0.1438137  1.0187628 -0.6214926  0.4673645 -0.2300834 -0.3575276
##  [7]  0.2180390  0.1180676 -0.2138268  0.1975462 -0.1047983



Friedman, Jerome, Trevor Hastie, and Robert Tibshirani. 2010. "Regularization Paths for Generalized Linear Models via Coordinate Descent." Journal of Statistical Software 33 (1): 1–22. https://doi.org/10.18637/jss.v033.i01.
Knaus, Michael C. 2022. "Double Machine Learning-Based Programme Evaluation Under Unconfoundedness." The Econometrics Journal 25 (3): 602–27. https://doi.org/10.1093/ectj/utac015.
