**By Gabriel Vasconcelos**

## Motivation

If you are close to the data science world you have probably heard about the LASSO, which stands for Least Absolute Shrinkage and Selection Operator. The LASSO is a model that penalizes the size of the parameters in the objective function in order to exclude irrelevant variables from the model. It has two very natural uses: the first is variable selection and the second is forecasting. Since the LASSO normally selects far fewer variables than Ordinary Least Squares (OLS), its forecasts have much less variance, at the cost of a small amount of in-sample bias.
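To see how the penalty works in the simplest case: with a single standardized regressor, the LASSO estimate is just the OLS coefficient soft-thresholded by the penalty. Below is a minimal base-R sketch (my own toy function, not from any package):

```
# Soft-thresholding: shrink each coefficient toward zero by lambda and set
# the small ones exactly to zero -- this is what the L1 penalty does.
soft <- function(b, lambda) sign(b) * pmax(abs(b) - lambda, 0)
soft(c(-2, -0.3, 0.1, 1.5), lambda = 0.5)
# -1.5  0.0  0.0  1.0
```

Note how the two small coefficients are set exactly to zero, which is where the variable selection comes from.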

One of the most important features of the LASSO is that it can deal with many more variables than observations, even thousands of variables. This is one of the main reasons for its recent popularity: in the last six days alone (April 1-6), five related packages were published on CRAN.
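To see why OLS cannot handle this setting, here is a quick simulated example (hypothetical data, base R only): with more regressors than observations the design matrix is rank deficient, so lm() cannot estimate all the coefficients.

```
set.seed(1)
n <- 50; p <- 100                 # more variables than observations
X <- matrix(rnorm(n * p), n, p)
y <- X[, 1] + rnorm(n)
fit <- lm(y ~ X)                  # OLS is not identified here
sum(is.na(coef(fit)))             # many coefficients come back NA
```

The LASSO penalty resolves this by forcing most coefficients to zero, so a unique, sparse solution exists even when p > n.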

## Example

In this example I am going to use one of the most popular LASSO packages, glmnet. It allows us to estimate the LASSO very fast and select the best model using cross-validation. In my experience, especially in a time-series context, it is better to select the best model using an information criterion such as the BIC: it is faster and avoids some of the complications of cross-validation in time series.
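For intuition, an information criterion trades off fit against the number of parameters. A rough base-R sketch of the idea (my own helper for a Gaussian model, not the exact one implemented in glmnet or HDeconometrics):

```
# BIC = n*log(RSS/n) + log(n)*k : a better fit lowers the RSS term, but
# each extra parameter pays a log(n) penalty. Smaller BIC wins.
bic <- function(fit) {
  res <- residuals(fit)
  n <- length(res)
  n * log(sum(res^2) / n) + log(n) * length(coef(fit))
}
fit1 <- lm(dist ~ speed, data = cars)           # built-in dataset
fit2 <- lm(dist ~ poly(speed, 4), data = cars)  # more flexible model
c(linear = bic(fit1), quartic = bic(fit2))
```

Along a LASSO path, the same idea applies with k equal to the number of nonzero coefficients at each value of the penalty.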

The package HDeconometrics (under development on GitHub) uses glmnet to estimate the LASSO and selects the best model with an information criterion chosen by the user. The data we are going to use, from Garcia, Medeiros and Vasconcelos (2017), is also available in the package. We are going to use the LASSO to forecast Brazilian inflation.
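One detail of the code below worth understanding is embed(), which builds the lagged regressors. A tiny base-R illustration with a made-up two-variable series:

```
# embed(z, 2) lines up each row of z with the previous row: the first
# columns hold the current values, the last columns the one-period lag.
z <- cbind(a = 1:4, b = 5:8)  # toy series with 4 periods
embed(z, 2)
# row 1 is c(2, 6, 1, 5): period-2 values followed by period-1 values
```

This is exactly what happens with BRinf: the first column of embed(BRinf, 2) is current inflation (the target y), and the columns after the first ncol(BRinf) are the lagged values used as regressors.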

```
#library(devtools)
#install_github("gabrielrvsc/HDeconometrics")
library(HDeconometrics)
data("BRinf")
data=embed(BRinf,2) # current values and one lag, side by side
y=data[,1]; x=data[,-c(1:ncol(BRinf))] # current inflation on lagged regressors
## == Break the data into in-sample and out-of-sample
y.in=y[1:100]; y.out=y[-c(1:100)]
x.in=x[1:100,]; x.out=x[-c(1:100),]
## == LASSO == ##
lasso=ic.glmnet(x.in,y.in,crit = "bic")
plot(lasso$glmnet,"lambda",ylim=c(-2,2))
```

```
plot(lasso)
```

The first plot above shows the variables going to zero as we increase the penalty in the objective function of the LASSO. The second plot shows the BIC curve and the selected model. Now we can calculate the forecast:

```
## == Forecasting == ##
pred.lasso=predict(lasso,newdata=x.out)
plot(y.out, type="l")
lines(pred.lasso, col=2)
```

## Adaptive LASSO

The LASSO has an adaptive version with some better properties for variable selection. Note that this does not always mean better forecasts. The idea behind the model is to use previously known information to select the variables more efficiently. In general, this information is the set of coefficients estimated by a first-step LASSO or by some other model.
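A toy illustration of the adaptive weights (made-up first-step coefficients and sample size): variables that the first step shrank toward zero receive a heavy penalty in the second step, while variables with large first-step coefficients are penalized lightly.

```
tau <- 1
first.step.coef <- c(2.0, 0.5, 0.0)  # hypothetical first-step estimates
n <- 100                             # hypothetical sample size
# The 1/sqrt(n) term keeps the weight finite when a coefficient is zero
w <- abs(first.step.coef + 1/sqrt(n))^(-tau)
round(w, 2)
# 0.48  1.67 10.00
```

These weights enter the second-step estimation through the penalty.factor argument, as in the code below.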

```
## = adaLASSO = ##
tau=1
first.step.coef=coef(lasso)[-1]
penalty.factor=abs(first.step.coef+1/sqrt(nrow(x)))^(-tau)
adalasso=ic.glmnet(x.in,y.in,crit="bic",penalty.factor=penalty.factor)
pred.adalasso=predict(adalasso,newdata=x.out)
plot(y.out, type="l")
lines(pred.lasso, col=2)
lines(pred.adalasso, col=4)
```

```
## = comparing the errors = ##
c(LASSO=sqrt(mean((y.out-pred.lasso)^2)), adaLASSO=sqrt(mean((y.out-pred.adalasso)^2)))
```

```
##     LASSO  adaLASSO
## 0.1810612 0.1678397
```

The adaLASSO produced a more precise forecast in this case. In general, the adaLASSO forecasts better than the simple LASSO, but this is not an absolute truth: I have seen many cases where the simple LASSO did better.

## More information

If you are interested in going deeper, here are some suggestions:

[1] Bühlmann, Peter, and Sara van de Geer. Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer Science & Business Media, 2011.

[2] Friedman, Jerome, Trevor Hastie, and Robert Tibshirani (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1-22. URL http://www.jstatsoft.org/v33/i01/

[3] Garcia, Marcio, Marcelo C. Medeiros, and Gabriel F. R. Vasconcelos (2017). Real-time inflation forecasting with high-dimensional models: The case of Brazil. International Journal of Forecasting, in press.
