Online resources

Slides and R code that produced them are online: https://github.com/tjmahr/Psych710_BayesLecture

I gave a similar, more code-heavy version of this talk to the R Users Group: https://github.com/tjmahr/MadR_RStanARM

Overview

Background

About me

  • I am a dissertator in Communication Sciences and Disorders
  • I study word recognition in preschoolers
  • For statistics, I mostly do multilevel logistic regression models
  • R enthusiast

I was once in your shoes

I learned stats and R in this course with Markus Brauer and John Curtin.

I still refer to the slides from this course on contrast codes.

But now I’m a “Bayesian”.

A timeline

August 2015: The “Crisis” in Psychology

Open Science Collaboration (2015) tries to replicate 100 studies published in 3 different psychology journals in 2008.

  • Boil a study down to 1 test statistic and 1 effect size.
  • Replicate the study.
  • Compare replication’s test statistic and effect size against original.

Scatter plot of original vs replicated effect sizes

  • Approximately 36% of the studies are replicated (the replication also yields a statistically significant result).
  • On average, effect sizes in replications are half that of the original studies.

Reactions

I don’t know how to turn off the figure labeling feature

Reactions

  • We’re doomed.
  • Most findings are probably false, and we knew that already.
  • No, this is business as usual.
  • Any credible discipline has to do this kind of house-cleaning from time to time.

Lots of hand wringing and soul searching

Some reactionary:

Some constructive:

Crisis made me think more about questionable practices

All those unintentional acts and rituals to appease the Statistical Significance gods.

  • HARKing: Hypothesizing after results are known. Telling a story to fit the data.
  • Garden of forking paths: Conducting countless sub-tests and sub-analyses on the data.
  • p-hacking: Doing these tests in order to find a significant effect.
  • Selective reporting: Reporting only the tests that yielded a significant result.
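
To make the cost of these practices concrete, here is a minimal simulation sketch. It is not from the original slides, and the numbers of studies, tests and participants are arbitrary assumptions. Every simulated study has no true effect, but the researcher runs 10 sub-tests and counts the study as a success if any of them is significant.

```r
# Hypothetical simulation: all settings below are illustrative assumptions.
set.seed(710)
n_sims  <- 2000  # simulated studies
n_tests <- 10    # sub-tests run within each study
n       <- 30    # participants per group

one_study <- function() {
  # Both groups come from the same distribution, so every null is true
  p <- replicate(n_tests, t.test(rnorm(n), rnorm(n))$p.value)
  c(first = p[1] < .05, any = any(p < .05))
}

results <- replicate(n_sims, one_study())

# Proportion of studies with a "significant" finding under each strategy
false_positive_rates <- rowMeans(results)
false_positive_rates
```

Reporting only the first, pre-planned test should keep the false positive rate near the nominal 5%. Reporting whichever of the 10 tests came out significant should push it to roughly 1 - 0.95^10, or about 40%.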

My sense

The usual way of doing things is insecure.

  • Perfectly fine if you know what you’re doing.
  • Works great if you pre-register analyses. Provides error control.
  • But vulnerable to exploitation.
  • And many people don’t know what they’re doing.

My response to the crisis

I want to avoid these questionable practices.

I want to level up my stats and explore new techniques.

  • Maybe more robust estimation techniques?
  • Maybe machine learning techniques to complement conventional analyses?

I want something less finicky than statistical significance.

  • p-values don’t mean what many people think they mean.
  • Neither do confidence intervals.
  • Statistical significance is not the same as practical significance.
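
The last point can be illustrated with a short sketch. It is not from the original slides, and the sample size and effect size are assumptions picked for the demonstration. With a large enough sample, a practically negligible difference still produces a very small p-value.

```r
# Hypothetical example: a tiny true difference measured in a huge sample.
set.seed(710)
n <- 100000                            # observations per group (assumed)
control   <- rnorm(n, mean = 0,    sd = 1)
treatment <- rnorm(n, mean = 0.03, sd = 1)  # true effect: 0.03 SD units

test <- t.test(treatment, control)

# Standardized effect size (Cohen's d with a pooled SD)
pooled_sd <- sqrt((var(treatment) + var(control)) / 2)
cohens_d  <- (mean(treatment) - mean(control)) / pooled_sd

test$p.value  # far below .05
cohens_d      # still a trivially small effect
```

The p-value here answers "is the difference distinguishable from zero?", not "is the difference big enough to matter?".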

December 2015

Cover of Data Analysis Using Regression and Multilevel/Hierarchical Models

I started reading the Gelman and Hill book.

  • This is the book for the arm package.
  • Still the best treatment of multilevel models in R despite being 10 years old.

It emphasizes estimation, uncertainty and simulation.

Midway through, the book pivots to Bayesian estimation. (Multilevel models are kinda Bayesian because they borrow information across different clusters.)
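
A rough sketch of what "borrowing information" means. It is not from the slides or the book's code. It assumes the within-cluster and between-cluster standard deviations are known, which a real multilevel model would estimate, and the cluster sizes are made up. Each cluster's mean is pulled toward the overall mean, and small clusters are pulled the most.

```r
# Hypothetical partial-pooling sketch with assumed (known) variances.
set.seed(710)
sigma <- 10                  # within-cluster SD (assumed known)
tau   <- 5                   # between-cluster SD (assumed known)
n_j   <- c(2, 5, 20, 100)    # made-up cluster sizes

# Simulate one observed mean per cluster
true_means <- rnorm(length(n_j), mean = 50, sd = tau)
ybar <- mapply(function(n, m) mean(rnorm(n, m, sigma)), n_j, true_means)

# Overall mean, weighting clusters by their size
grand <- weighted.mean(ybar, n_j)

# Share of each cluster's own mean that is kept (0 = fully pooled, 1 = unpooled)
weight <- tau^2 / (tau^2 + sigma^2 / n_j)
shrunk <- grand + weight * (ybar - grand)

data.frame(n_j, ybar, weight, shrunk)
```

The cluster with 2 observations keeps only about a third of its own deviation from the overall mean, while the cluster with 100 observations keeps almost all of it.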

January 2016

I’m down a rabbit hole, writing Stan (Bayesian) models to fit the models from the ARM book, and there is an influx of Bayesian tools for R.

I eat all this up. I become a convert.

Long story short

The replication crisis sparked my curiosity, and a wave of new tools and resources made it really easy to get started with Bayesian stats.

My goal with this approach has been to make better, more honest scientific summaries of observed data.

Classical regression versus Bayesian regression in a few plots

The data

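# Some toy data: heights (cm) and weights (kg) from the Davis data set
# (in car 3.0 and later, the data set lives in carData::Davis)
library(dplyr)
# Drop the one row with an implausible height (under 100 cm)
davis <- car::Davis %>% filter(100 < height) %>% as_tibble()
davis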
#> # A tibble: 199 × 5
#>       sex weight height repwt repht
#>    <fctr>  <int>  <int> <int> <int>
#> 1       M     77    182    77   180
#> 2       F     58    161    51   159
#> 3       F     53    161    54   158
#> 4       M     68    177    70   175
#> 5       F     59    157    59   155
#> 6       M     76    170    76   165
#> 7       M     76    167    77   165
#> 8       M     69    186    73   180
#> 9       M     71    178    71   175
#> 10      M     65    171    64   170
#> # ... with 189 more rows

The data