Machine Learning : Elastic Net Regression
2 min read · Jan 24, 2024
Previously, we talked about lasso and ridge regression. If we were to combine the best of both of those worlds, we would get
Elastic Net Regression
- It is a linear regression model that combines both L1 (Lasso) and L2 (Ridge) regularization techniques.
- Designed to address some of the limitations of Lasso and Ridge regression by introducing a mixing parameter (l1_ratio) that controls the relative contribution of the L1 and L2 penalties.
This is visible in the objective function that elastic net regression minimizes:

min over (β0, β) of (1/2n) Σᵢ (yᵢ − β0 − xᵢᵀβ)² + α · l1_ratio · ∥β∥1 + (α/2) · (1 − l1_ratio) · ∥β∥2²

where:
- β0 is the intercept term.
- β is the vector of coefficients for the features.
- xi is the feature vector for the ith observation.
- yi is the target variable for the ith observation.
- ∥β∥1 is the L1 norm (sum of absolute values of coefficients).
- ∥β∥2² is the squared L2 norm (sum of squared values of coefficients).
- α is the regularization parameter that controls the overall strength of the penalty.
- l1_ratio is the mixing parameter that determines the ratio of L1 to L2 penalty. It ranges from 0 (pure Ridge) to 1 (pure Lasso).
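To make the objective concrete, here is a minimal sketch of it in NumPy. The function name and signature are my own; the penalty follows the scikit-learn convention, where the squared-error term is averaged over 2n observations.

```python
import numpy as np

def elastic_net_objective(beta0, beta, X, y, alpha, l1_ratio):
    """Elastic net objective: mean squared error term plus the
    mixed L1/L2 penalty controlled by alpha and l1_ratio."""
    n = len(y)
    residuals = y - beta0 - X @ beta
    mse_term = (residuals @ residuals) / (2 * n)
    l1_term = alpha * l1_ratio * np.sum(np.abs(beta))
    l2_term = 0.5 * alpha * (1 - l1_ratio) * np.sum(beta ** 2)
    return mse_term + l1_term + l2_term
```

Setting l1_ratio=1 recovers the Lasso objective and l1_ratio=0 recovers Ridge, which is exactly the "best of both worlds" framing above.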
Why and when should you use it?
- Use Lasso when you suspect that many features are irrelevant or redundant. Lasso performs feature selection by driving some coefficients to exactly zero.
- Use Ridge when you have a high-dimensional dataset with multicollinearity among features. Ridge helps to mitigate multicollinearity by adding a penalty term to the squared magnitudes of the coefficients.
- Use Elastic Net when you want a combination of L1 and L2 regularization.
Elastic Net combines the benefits of both Lasso and Ridge, providing a compromise between feature selection and handling multicollinearity.
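A quick sketch of fitting an elastic net model with scikit-learn's ElasticNet estimator. The data here is a toy example of my own, and the alpha and l1_ratio values are arbitrary illustrations, not tuned choices.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

# Toy data: y depends only on the first feature; the second is irrelevant.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=100)

# alpha sets overall penalty strength; l1_ratio mixes L1 vs. L2.
model = ElasticNet(alpha=0.1, l1_ratio=0.5)
model.fit(X, y)
print(model.coef_)  # first coefficient near 3, second shrunk toward 0
```

In practice, both alpha and l1_ratio are usually chosen by cross-validation (scikit-learn provides ElasticNetCV for exactly this).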