Drew Altschul, Department of Psychology

- How can we eliminate the subjectivity of model selection?
- Inherently subjective stepwise procedures
- which aren't even good!

- Can modern computing power improve model selection procedures?
- The lasso
- shrinking variable estimates to zero

- Regularization method
- Penalizes estimates
- reduce complexity
- increase generalizability

- Similar to ridge regression
- but lasso can shrink to zero
- and usually fits better
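The ridge/lasso contrast is easiest to see in the orthonormal-design case, where ridge rescales each least-squares coefficient toward zero while the lasso soft-thresholds it, so the lasso can land exactly on zero. A minimal sketch (plain Python; the function names are mine, not from any package):

```python
def ridge_shrink(b, lam):
    # Ridge (orthonormal design): rescale toward zero, never exactly zero
    return b / (1.0 + lam)

def lasso_shrink(b, lam):
    # Lasso (orthonormal design): soft-threshold, can hit exactly zero
    sign = 1.0 if b >= 0 else -1.0
    return sign * max(abs(b) - lam, 0.0)

ols = 0.8  # an ordinary least-squares coefficient
print(ridge_shrink(ols, 1.0))  # 0.4 -- shrunk but nonzero
print(lasso_shrink(ols, 1.0))  # 0.0 -- shrunk all the way to zero
```

This is why ridge keeps every variable in the model while the lasso performs variable selection as a side effect of the penalty.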

Basic linear model: \( \hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p \)

The lasso fits this with criteria

- minimize: \( \sum (y - \hat{y})^2 \)
- such that: \( \sum |b_j| \leq s \)

When \( s \) is large, the constraint has no effect and the solution is the usual multiple regression.

As \( s \) becomes small, the coefficients are shrunk, sometimes all the way to 0.
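The constrained problem above is equivalent to the penalized form, minimize \( \sum (y - \hat{y})^2 + \lambda \sum |b_j| \), which is what software actually solves; glmnet does so by coordinate descent. A from-scratch sketch of the standard coordinate-descent update (plain Python; `lasso_cd` is a hypothetical name, and this omits all of glmnet's optimizations):

```python
def lasso_cd(X, y, lam, n_iter=100):
    """Tiny lasso via cyclic coordinate descent on the penalized form:
    minimize  sum (y - y_hat)^2 + lam * sum |b_j|."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # partial residuals with variable j held out
            r = [y[i] - sum(b[k] * X[i][k] for k in range(p) if k != j)
                 for i in range(n)]
            rho = sum(X[i][j] * r[i] for i in range(n))
            z = sum(X[i][j] ** 2 for i in range(n))
            # soft-thresholding: this step is what can zero a coefficient
            if rho > lam / 2:
                b[j] = (rho - lam / 2) / z
            elif rho < -lam / 2:
                b[j] = (rho + lam / 2) / z
            else:
                b[j] = 0.0
    return b

# Toy data: y depends on the first predictor only
X = [[1.0, 1.0], [2.0, -1.0], [3.0, -1.0], [4.0, 1.0]]
y = [2.0, 4.0, 6.0, 8.0]
print(lasso_cd(X, y, lam=0.0))    # ~[2.0, 0.0]: ordinary least squares
print(lasso_cd(X, y, lam=6.0))    # ~[1.9, 0.0]: first coefficient shrunk
print(lasso_cd(X, y, lam=200.0))  # [0.0, 0.0]: everything shrunk to zero
```

Larger \( \lambda \) corresponds to smaller \( s \): the coefficients shrink, and past a point they are set exactly to zero.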

- Estimates are biased due to penalization
- Bias-variance trade-off
- lasso estimates usually more accurate, if biased
- Does not produce standard errors

- Performs well when data are sparse
- Copes with correlated variables
- Can fit data with more variables than observations
- a variant called the Elastic Net is particularly good at this

Elastic nets use a mixing parameter \( \alpha \) to combine lasso and ridge regression

- Parameter optimization
- Prediction

Package `glmnet`

glmnet works with binomial, multinomial, Poisson, & Cox models
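As I read glmnet's parameterization, the combined penalty is \( \lambda \left[ \frac{(1-\alpha)}{2} \sum b_j^2 + \alpha \sum |b_j| \right] \), with \( \alpha = 1 \) giving the pure lasso and \( \alpha = 0 \) pure ridge (in R, `cv.glmnet` then picks \( \lambda \) by cross-validation). A small sketch of just the penalty term (plain Python; `enet_penalty` is a hypothetical name):

```python
def enet_penalty(b, lam, alpha):
    """Elastic-net penalty, following glmnet's parameterization:
    lam * ( (1 - alpha)/2 * sum(b^2) + alpha * sum(|b|) ).
    alpha = 1 recovers the lasso penalty, alpha = 0 recovers ridge."""
    l1 = sum(abs(bj) for bj in b)        # lasso part
    l2 = sum(bj * bj for bj in b)        # ridge part
    return lam * ((1 - alpha) / 2.0 * l2 + alpha * l1)

b = [1.0, -2.0, 0.0]
print(enet_penalty(b, lam=0.5, alpha=1.0))  # 1.5  -- pure lasso
print(enet_penalty(b, lam=0.5, alpha=0.0))  # 1.25 -- pure ridge
```

Mixing the two penalties is what lets the elastic net keep groups of correlated predictors together while still zeroing out the irrelevant ones.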

- Cross Validation
`tune {e1071}`

- Mixed Models
`glmmLasso`

- Genomics
`LDlasso`

- Bayesian
`EBglmnet`

- Hazard models
`ahaz`

- Significance testing
`covTest`

- SEM
`regsem`

`sparseSEM`

`qgraph`

- Variable, or rather, stability selection
`stabs`

`c060`

If you're using glmnet to its fullest potential, in many cases you won't need variable selection anymore

But if you do, stability selection will let you use these regularization techniques to identify which variables consistently contribute to the model.
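The core of stability selection (the idea behind packages like `stabs`): refit the lasso on many random half-subsamples of the data and keep the variables whose coefficients are nonzero in a large fraction of the fits. A from-scratch sketch (plain Python; the function names are hypothetical, the lasso fit is a minimal coordinate-descent toy, and the published method adds error-control theory this ignores):

```python
import random

def lasso_cd(X, y, lam, n_iter=50):
    """Toy coordinate-descent lasso: minimize sum (y - y_hat)^2 + lam * sum |b_j|."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            r = [y[i] - sum(b[k] * X[i][k] for k in range(p) if k != j)
                 for i in range(n)]
            rho = sum(X[i][j] * r[i] for i in range(n))
            z = sum(X[i][j] ** 2 for i in range(n))
            # soft-thresholding update; sets weak coefficients exactly to zero
            b[j] = (1.0 if rho > 0 else -1.0) * max(abs(rho) - lam / 2, 0.0) / z
    return b

def stability_selection(X, y, lam, n_subsamples=50, threshold=0.6, rng=None):
    """Fraction of half-subsample lasso fits in which each variable is nonzero;
    variables at or above `threshold` count as stably selected."""
    rng = rng or random.Random(0)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)       # random half of the rows
        b = lasso_cd([X[i] for i in idx], [y[i] for i in idx], lam)
        for j in range(p):
            if b[j] != 0.0:
                counts[j] += 1
    freqs = [c / n_subsamples for c in counts]
    stable = [j for j in range(p) if freqs[j] >= threshold]
    return freqs, stable

# Toy data: only the first of three predictors actually drives y
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(40)]
y = [3 * row[0] + 0.1 * rng.gauss(0, 1) for row in X]
freqs, stable = stability_selection(X, y, lam=40.0)
```

The signal variable should be selected in nearly every subsample while the noise variables appear only sporadically, which is exactly the separation stability selection exploits.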

- the lasso is a flexible, effective tool
- for both prediction modeling
- and variable selection

- highly generalizable
- good support in R
- not hard to get the hang of

@dremalt