Personally, I am wary of using the Lasso for feature selection, for two reasons. 1) If the aim is predictive power, then dropping variables through an automatic process like the Lasso is usually unhelpful. In such cases, the appeal of the Lasso is presumably that it prevents overfitting, but you don't want to lose accuracy either. Ridge regularization, which shrinks coefficients without setting any of them to zero, is often better for this purpose.
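To make the difference concrete, here is a minimal sketch of how the two penalties treat the same coefficients. It assumes the textbook special case of an orthonormal design (X'X = I), with the Lasso minimising (1/2)||y - Xb||² + λ||b||₁ and ridge minimising (1/2)||y - Xb||² + (λ/2)||b||². The OLS coefficients and the value of λ are made-up numbers for illustration only.

```python
def lasso_shrink(ols_coef, lam):
    """Lasso solution for one coefficient under an orthonormal design (soft-thresholding)."""
    if ols_coef > lam:
        return ols_coef - lam
    if ols_coef < -lam:
        return ols_coef + lam
    return 0.0  # anything with |OLS coefficient| <= lam is dropped entirely


def ridge_shrink(ols_coef, lam):
    """Ridge solution for one coefficient under an orthonormal design (proportional shrinkage)."""
    return ols_coef / (1.0 + lam)


# Hypothetical OLS coefficients: two strong effects and two weak ones.
ols_coefs = [3.0, 0.25, -0.125, -1.5]
lam = 0.5

lasso_coefs = [lasso_shrink(b, lam) for b in ols_coefs]
ridge_coefs = [ridge_shrink(b, lam) for b in ols_coefs]

print(lasso_coefs)  # the two weak effects are set exactly to zero
print(ridge_coefs)  # every effect is shrunk, none is removed
```

In this toy case the Lasso discards the two weak effects outright, while ridge keeps them at reduced size. If those weak effects carry real signal, discarding them is where the lost accuracy would come from.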

2) If the aim is statistical inference, the indiscriminate manner in which the Lasso drops variables is problematic. A highly collinear variable is not "useless". If the Lasso finds two highly collinear variables, for example, it will tend to keep one and drop the other more or less arbitrarily. Sometimes an inferential model needs highly collinear variables because including them makes substantive sense, and you still want to estimate a coefficient for each of them, such as the impact of parental income (feature 1) and parental education (feature 2) on a person's income level (target variable). Alternatively, you may need to model these effects in a more sophisticated way, rather than letting an algorithm arbitrarily drop one of the variables.
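The arbitrariness is easiest to see in the limiting case of two perfectly collinear features. The sketch below is not any library's implementation: it is a small hand-written cyclic coordinate descent for the Lasso, with no intercept, run on made-up numbers in which "parental_income" and "parental_education" are deliberately exact copies of each other. Real features would only be highly, not perfectly, correlated, so treat this as an exaggeration of the effect. A closed-form two-feature ridge fit is included for comparison.

```python
def soft_threshold(value, threshold):
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(columns, y, alpha, sweeps=100):
    """Minimise (1/2n)*||y - Xb||^2 + alpha*||b||_1 by cyclic coordinate descent."""
    n = len(y)
    coefs = [0.0] * len(columns)
    residual = list(y)  # equals y - Xb while every coefficient is zero
    for _ in range(sweeps):
        for j, column in enumerate(columns):
            scale = sum(v * v for v in column) / n
            # Correlation of feature j with the residual, adding back its own contribution.
            rho = sum(v * r for v, r in zip(column, residual)) / n + scale * coefs[j]
            updated = soft_threshold(rho, alpha) / scale
            # Keep the residual in sync with the new coefficient.
            residual = [r - (updated - coefs[j]) * v for r, v in zip(residual, column)]
            coefs[j] = updated
    return coefs


def ridge_two_features(x1, x2, y, lam):
    """Solve (X'X/n + lam*I) b = X'y/n for exactly two features via Cramer's rule."""
    n = len(y)

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v)) / n

    a11, a12, a22 = dot(x1, x1) + lam, dot(x1, x2), dot(x2, x2) + lam
    c1, c2 = dot(x1, y), dot(x2, y)
    det = a11 * a22 - a12 * a12
    return [(c1 * a22 - a12 * c2) / det, (a11 * c2 - a12 * c1) / det]


# Made-up, centred data; the two features are exact copies (assumption for illustration).
signal = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
features = {"parental_income": signal, "parental_education": list(signal)}
target = [-3.5, -2.0, -2.0, -0.5, 0.0, 1.5, 2.5, 3.5]


def fit_lasso(order, alpha=0.25):
    coefs = lasso_coordinate_descent([features[name] for name in order], target, alpha)
    return dict(zip(order, coefs))


income_first = fit_lasso(["parental_income", "parental_education"])
education_first = fit_lasso(["parental_education", "parental_income"])
ridge_pair = ridge_two_features(
    features["parental_income"], features["parental_education"], target, lam=0.25
)

print(income_first)     # {'parental_income': 1.5, 'parental_education': 0.0}
print(education_first)  # {'parental_education': 1.5, 'parental_income': 0.0}
print(ridge_pair)       # [0.765625, 0.765625]
```

Which feature survives here depends on nothing but the column order: the data are identical in both Lasso runs. Ridge, by contrast, splits the weight evenly between the two. With features that are merely highly correlated the tie is broken by small differences in the sample rather than by column order, which is hardly a better basis for inference.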

