Bayesian Regression Models with RStanARM

TJ Mahr
Sept. 21, 2016

Madison R Users Group

GitHub repository
@tjmahr

Overview

August 2015: The "Crisis" in Psychology

Open Science Collaboration (2015) tries to replicate 100 studies published in 3 different psychology journals in 2008.

Scatter plot of original vs replicated effect sizes

  • Boil a study down to 1 test statistic and 1 effect size.
    • Compare replication's test and effect size against original.
  • Approximately 36% of the studies are replicated (same test statistic).
  • On average, effect sizes in replications are half that of the original studies.

Reactions

We're doomed.

You sit on a throne of lies!

Most findings are probably false.

No, this is business as usual.

Busy body penguin with a hat and briefcase

Any credible discipline has to do this kind of house-cleaning from time to time.

Lots of hand-wringing and soul-searching

Some reactionary:

dumpster fire dot gif

Some constructive:

I'm in the business-as-usual camp, by the way

Better research practices are catching on.

  • Pre-registration
  • Power analyses and other foresight.
  • Meta-analytic tools to assess the health of a field
  • Better disclosure of other unanalyzed measurements

The crisis made me think more about questionable data analysis practices

Basically, unintentional acts and rituals to appease Statistical Significance gods.

  • HARKing (hypothesizing after results are known)
    • Telling a story to fit the data.
  • Garden of forking paths
    • Conducting countless sub-tests and sub-analyses on the data
  • p-hacking
    • Doing these tests in order to find a significant effect.
  • Selective reporting
    • Reporting only the tests that yielded a significant result.
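
To see why these habits matter, here is a rough simulation sketch. It is my own illustration, not part of the Open Science Collaboration analysis. It assumes every test is run on pure noise (no real effect anywhere), uses a simple z-test with a known standard deviation of 1, and treats the sub-tests as independent. The function names and the choice of 20 sub-tests per study are hypothetical.

```python
import random
from statistics import NormalDist, mean


def z_test_p_value(sample, sigma=1.0):
    """Two-sided p-value for H0: mean == 0, assuming a known sigma."""
    z = mean(sample) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def any_significant(rng, n_tests, n_per_test=30, alpha=0.05):
    """Run n_tests tests on pure noise; True if at least one has p < alpha."""
    for _ in range(n_tests):
        sample = [rng.gauss(0, 1) for _ in range(n_per_test)]
        if z_test_p_value(sample) < alpha:
            return True
    return False


def false_positive_rate(n_tests, n_studies=2000, seed=1):
    """Share of simulated null studies that can report a 'finding'."""
    rng = random.Random(seed)
    hits = sum(any_significant(rng, n_tests) for _ in range(n_studies))
    return hits / n_studies


one_test = false_positive_rate(n_tests=1)
twenty_tests = false_positive_rate(n_tests=20)
print(f"1 test per study:   {one_test:.2f}")
print(f"20 tests per study: {twenty_tests:.2f}")
```

With one planned test, the false-positive rate should land near the nominal 5%. With 20 sub-tests and selective reporting, it should land near 1 - 0.95^20, or roughly 64%, even though there is nothing to find.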

My response to the crisis

I want to avoid these errors.

  • I want to level up my stats and explore new techniques.
    • Maybe more robust estimation techniques
    • Maybe machine learning techniques to complement conventional analyses.
  • I want something less finicky than statistical significance.
    • p-values don't mean what many people think they mean.
    • Neither do confidence intervals.
    • Statistical significance is not related to practical significance.
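
The last point can be sketched with a little arithmetic. This is a toy example under assumptions of my own: an observed mean difference of 0.02 standard deviations (a hypothetical, practically negligible effect), a known standard deviation of 1, and a two-sided z-test. The only thing that changes is the sample size.

```python
from statistics import NormalDist


def p_value_for_mean(observed_mean, n, sigma=1.0):
    """Two-sided z-test p-value for H0: mean == 0, assuming a known sigma."""
    z = observed_mean / (sigma / n ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


tiny_effect = 0.02  # hypothetical effect: 2% of a standard deviation

# Same effect size, three different sample sizes
p_values = {n: p_value_for_mean(tiny_effect, n) for n in (100, 10_000, 1_000_000)}

for n, p in p_values.items():
    print(f"n = {n:>9,}: p = {p:.4f}")
```

The effect never changes, but it goes from clearly non-significant at n = 100 to just under 0.05 at n = 10,000 to a p-value of essentially zero at n = 1,000,000. The p-value is tracking the amount of data as much as the size of the effect.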

December 2015

  • I started reading the Gelman and Hill book.
  • It emphasizes estimation, uncertainty and simulation.
  • Midway through, the book pivots to Bayesian estimation.
  • I roll with it.

Cover of Data Analysis Using Regression and Multilevel/Hierarchical Models

I try to download the example BUGS scripts, but they are no longer supported.

Instead, I find a page with the title:

Use Stan instead.


Okay, I'll give it a try…

January 2016

I'm down a rabbit hole, writing Stan models to fit the models from the ARM book, and there is an influx of Bayesian tools for R.

I eat all this up. I become a convert.

Long story short

The replication crisis sparked my curiosity, and a wave of new tools and resources made it really easy to get started with Bayesian stats.

No formal coursework. I learned about the math/statistics from playing with code.

My goal with these tools has been to make better, more honest scientific summaries of observed data.

Next