How to lasso: Least Absolute Shrinkage and Selection Operator

Drew Altschul, Department of Psychology

The Problem of Model Building

  • How can we eliminate the subjectivity of model selection?
    • Stepwise procedures are inherently subjective
    • and they don't even perform well!
  • Can modern computing power improve model selection procedures?
  • The lasso
    • shrinks coefficient estimates, sometimes all the way to zero

What is the lasso?

  • A regularization method
  • Penalizes coefficient estimates
    • reduces model complexity
    • increases generalizability
  • Similar to ridge regression
    • but the lasso can shrink coefficients all the way to zero
    • and usually fits better

What does the lasso do?

Basic linear model: \( \hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p \)

The lasso fits this model subject to a constraint:

  • minimize: \( \sum_i (y_i - \hat{y}_i)^2 \)
  • such that: \( \sum_j |b_j| \leq s \)

  • when \( s \) is large, the constraint has no effect and the solution is the usual multiple regression

  • as \( s \) becomes small, the coefficients are shrunk, sometimes all the way to 0
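
A minimal sketch of fitting the lasso in R with glmnet, on simulated data (the simulated setup and object names are illustrative, not from the talk):

    library(glmnet)

    set.seed(1)
    n <- 100; p <- 10
    x <- matrix(rnorm(n * p), n, p)
    # only the first three predictors truly matter
    y <- 2 * x[, 1] - 1.5 * x[, 2] + x[, 3] + rnorm(n)

    # alpha = 1 is the lasso penalty; glmnet fits a whole path of penalties
    fit <- glmnet(x, y, alpha = 1)

    # coefficients at a fairly strong penalty: several are exactly zero,
    # i.e. those variables have been dropped from the model
    coef(fit, s = 0.5)

    # the full coefficient path as the penalty relaxes
    plot(fit, xvar = "lambda", label = TRUE)

Note that glmnet parameterizes the constraint through a penalty weight \( \lambda \) rather than the bound \( s \); a large \( \lambda \) corresponds to a small \( s \).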

Features of the lasso

  • Estimates are biased due to penalization
    • bias-variance trade-off
    • lasso estimates are usually more accurate overall, despite the bias
    • no standard errors are produced
  • Performs well when data are sparse
  • Copes with correlated variables
  • Can fit data with more variables than observations (see the sketch below)
    • a variant called the elastic net is particularly good at this
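
A short sketch of the p > n case, again on simulated data (sizes and names are illustrative):

    library(glmnet)

    set.seed(2)
    n <- 50; p <- 200                              # more predictors than observations
    x <- matrix(rnorm(n * p), n, p)
    y <- drop(x[, 1:5] %*% rep(1, 5)) + rnorm(n)   # only 5 real predictors

    # ordinary least squares cannot be fit here, but a penalized model can;
    # alpha = 0.5 mixes the lasso and ridge penalties (an elastic net)
    enet <- glmnet(x, y, alpha = 0.5)

    # number of variables entering the model along the penalty path
    enet$df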

Working with the elastic net

Elastic nets use a mixing parameter \( \alpha \) to combine the lasso and ridge penalties: \( \alpha = 1 \) gives the lasso, \( \alpha = 0 \) gives ridge regression

A typical analysis has two steps (sketched below):

  1. Parameter optimization
  2. Prediction

Package glmnet

glmnet works with binomial, multinomial, Poisson, and Cox models, in addition to ordinary linear (Gaussian) regression
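
A sketch of that two-step workflow with cv.glmnet (simulated data; names and settings are illustrative):

    library(glmnet)

    set.seed(3)
    x <- matrix(rnorm(100 * 20), 100, 20)
    y <- drop(x[, 1:3] %*% c(2, -1, 1)) + rnorm(100)

    ## Step 1: parameter optimization --
    ## cross-validation picks the penalty strength lambda for a given alpha
    ## (alpha itself can be tuned by comparing cv.glmnet fits over a grid)
    cvfit <- cv.glmnet(x, y, alpha = 0.5, nfolds = 10)
    cvfit$lambda.min   # lambda with the lowest cross-validated error
    cvfit$lambda.1se   # a more conservative, sparser choice

    ## Step 2: prediction --
    ## predict responses for new data at the chosen lambda
    newx <- matrix(rnorm(5 * 20), 5, 20)
    predict(cvfit, newx = newx, s = "lambda.min")

    ## other outcome types use the family argument, e.g.
    ## cv.glmnet(x, ybin, family = "binomial")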

Related Packages

  • Cross-validation
    • tune {e1071}
  • Mixed Models
    • glmmLasso
  • Genomics
    • LDlasso
  • Bayesian
    • EBglmnet
  • Hazard models
    • ahaz
  • Significance testing
    • covTest
  • SEM
    • regSEM
    • sparseSEM
    • qgraph
  • Variable, or rather, stability selection
    • stabs
    • c060

Stability selection

If you're using glmnet to its full potential, in many cases you won't need a separate variable selection step

But if you do...

Stability selection lets you use these regularization techniques to identify which variables consistently contribute to the model (see the sketch below)
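
A minimal sketch with the stabs package, reusing simulated data as above (the cutoff and PFER values are illustrative choices, not recommendations):

    library(stabs)
    library(glmnet)

    set.seed(4)
    x <- matrix(rnorm(100 * 20), 100, 20)
    y <- drop(x[, 1:3] %*% c(2, -1, 1)) + rnorm(100)

    # stabsel repeatedly refits the lasso on subsamples and records how
    # often each variable is selected; variables above the cutoff are kept
    stab <- stabsel(x = x, y = y,
                    fitfun = glmnet.lasso,  # lasso as the base selector
                    cutoff = 0.75,          # required selection frequency
                    PFER = 1)               # tolerated expected false positives

    stab$selected   # indices of the stably selected variables
    plot(stab)      # selection frequencies per variable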

Summary

  • The lasso is a flexible, effective tool
    • for both predictive modeling
    • and variable selection
  • highly generalizable
  • well supported in R
  • not hard to get the hang of

Questions