How is Elastic Net used?

Question:

This is a beginner question on regularization with regression. Most information about Elastic Net and Lasso Regression online replicates the information from Wikipedia or the original 2005 paper by Zou and Hastie (Regularization and variable selection via the elastic net).

Resource for simple theory? Is there a simple and easy explanation somewhere of what it does, when and why regularization is necessary, and how to use it, for those who are not statistically inclined? I understand that the original paper is the ideal source if you can understand it, but is there somewhere that explains the problem and solution more simply?

How to use in sklearn? Is there a step-by-step example showing why elastic net is chosen (over ridge, lasso, or just simple OLS) and how the parameters are calculated? Many of the examples on sklearn just pass the alpha and rho parameters directly to the prediction model, for example:

from sklearn.linear_model import ElasticNet
alpha = 0.1
# Note: the rho parameter is named l1_ratio in current scikit-learn releases.
enet = ElasticNet(alpha=alpha, l1_ratio=0.7)
y_pred_enet = enet.fit(X_train, y_train).predict(X_test)

However, they don’t explain how these were calculated. How do you calculate the parameters for the lasso or the elastic net?

Asked By: Zach


Answers:

The documentation is lacking. I created a new issue to improve it. As Andreas said, the best resource is probably The Elements of Statistical Learning (ESL II), which is freely available online as a PDF.

To automatically tune the value of alpha, you can use ElasticNetCV, which avoids the redundant computation you would incur by running GridSearchCV on the ElasticNet class to tune alpha. In addition, you can use a regular GridSearchCV to find the optimal value of rho (named l1_ratio in current scikit-learn releases). See the docstring of ElasticNetCV for more details.
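As a minimal sketch of that tuning step, the snippet below fits ElasticNetCV on synthetic data. The dataset, the candidate l1_ratio values, and cv=5 are assumptions chosen only for illustration. In current scikit-learn releases ElasticNetCV also accepts a list of l1_ratio values and cross-validates over them together with alpha, which can stand in for the separate grid search over rho.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNetCV
from sklearn.model_selection import train_test_split

# Synthetic regression data, used here purely for illustration.
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=5, noise=10.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Candidate mixing values (assumed for this sketch); 1.0 is pure Lasso.
l1_ratios = [0.1, 0.5, 0.7, 0.9, 1.0]

# For each l1_ratio, ElasticNetCV evaluates a path of alpha values by
# cross-validation and keeps the best (alpha, l1_ratio) pair.
enet_cv = ElasticNetCV(l1_ratio=l1_ratios, cv=5)
enet_cv.fit(X_train, y_train)

print("chosen alpha:", enet_cv.alpha_)
print("chosen l1_ratio:", enet_cv.l1_ratio_)

# The fitted object predicts with the selected parameters.
y_pred_enet = enet_cv.predict(X_test)
```

The selected values are available as alpha_ and l1_ratio_ after fitting, so they do not have to be hard-coded as in the examples from the question.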

As for Lasso vs ElasticNet, ElasticNet will tend to select more variables and hence lead to larger models (which are also more expensive to train), but it will also be more accurate in general. In particular, Lasso is very sensitive to correlation between features and might randomly select one out of two highly correlated informative features, while ElasticNet will be more likely to select both, which should lead to a more stable model (in terms of its ability to generalize to new samples).
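The correlated-feature behaviour can be checked on a toy problem. The sketch below builds two nearly identical features and fits both models. The data, alpha=0.1, and l1_ratio=0.5 are assumptions for illustration, and the exact coefficients will vary with the data and the penalty strength.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

rng = np.random.default_rng(0)
n = 200

# Two informative features that are almost perfectly correlated.
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)
X = np.column_stack([x1, x2])

# Both features contribute equally to the target.
y = 3 * x1 + 3 * x2 + rng.normal(size=n)

lasso = Lasso(alpha=0.1).fit(X, y)
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)

# Compare how each model distributes weight across the correlated pair.
print("Lasso coefficients:     ", lasso.coef_)
print("ElasticNet coefficients:", enet.coef_)
```

On data like this, the L2 part of the ElasticNet penalty tends to spread the weight across both features, whereas Lasso is free to concentrate it on one of them.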

Answered By: ogrisel

I will try to help with the question ‘What is ElasticNet?’

The Elastic-Net is a regularised regression method that linearly combines the L1 and L2 penalties of the Lasso and Ridge regression methods.
It is useful when there are multiple correlated features. The difference between Lasso and Elastic-Net is that Lasso is likely to pick one of these features at random, while Elastic-Net is likely to pick both at once.
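To make the "linear combination of penalties" concrete, here is a small sketch of the penalty term using scikit-learn's parameterization, alpha * (l1_ratio * ||w||_1 + 0.5 * (1 - l1_ratio) * ||w||_2^2). The helper name elastic_net_penalty and the example weights are hypothetical, introduced only for this illustration.

```python
import numpy as np

def elastic_net_penalty(w, alpha, l1_ratio):
    """Penalty added to the squared-error loss (hypothetical helper)."""
    w = np.asarray(w, dtype=float)
    l1 = np.abs(w).sum()            # Lasso part: sum of absolute values
    l2 = 0.5 * np.square(w).sum()   # Ridge part: half the sum of squares
    return alpha * (l1_ratio * l1 + (1.0 - l1_ratio) * l2)

w = [1.0, -2.0, 0.0]

lasso_like = elastic_net_penalty(w, alpha=1.0, l1_ratio=1.0)  # L1 only
ridge_like = elastic_net_penalty(w, alpha=1.0, l1_ratio=0.0)  # L2 only
mixed = elastic_net_penalty(w, alpha=1.0, l1_ratio=0.5)       # equal mix

print(lasso_like, ridge_like, mixed)
```

Setting l1_ratio to 1 recovers the Lasso penalty and setting it to 0 recovers a Ridge-style penalty, so alpha controls the overall strength and l1_ratio controls the mix.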

The two links below give good explanations of ElasticNet.

  1. ElasticNet- TutorialsPoint
  2. Lasso, Ridge and Elastic Net Regularization