Bayesian Regression Models with RStanARM

TJ Mahr
Sept. 21, 2016

Madison R Users Group

Github repository
@tjmahr

Overview

August 2015: The "Crisis" in Psychology

Open Science Collaboration (2015) tries to replicate 100 studies published in 3 psychology different journals in 2008.

Scatter plot of original vs replicated effect sizes

Boil a study down to 1 test statistic and 1 effect size.
- Compare replication's test and effect size against original.
Approximately 36% of the studies are replicated (same test statistic.
On average, effect sizes in replications are half that of the original studies.

Reactions

We're doomed.

Most findings are probably false.

No, this is business as usual.

Busy body penguin with a hat and briefcase

Any credible discipline has to do this kind of house-cleaning from time to time.

Lots of hand-wringing and soul-searching

Some reactionary:

Replication creates an industry for incompetent hacks.
Here come the methodological terrorists!

dumpster fire dot gif

Some constructive:

Everything is f'ed – so what else is new?
Increased rigor and openness are a good thing.

I'm in the business-as-usual camp, by the way

Better research practices are catching on.

Pre-registration
Power analyses and other foresight.
Meta-analytic tools to assess the health of a field
Better disclosure of other unanalyzed measurements

Crisis made me think more of questionable data analysis practices

Basically, unintentional acts and rituals to appease Statistical Significance gods.

HARKing (hypothesizing after results are known)
- Telling a story to fit the data.
Garden of forking data
- Conducting countless sub-tests and sub-analyses on the data
p-hacking
- Doing these tests in order to find a significant effect.
Selective reporting
- Reporting only the tests that yielded a significant result.

My response to the crisis

I want to avoid these errors.

I want to level up my stats and explore new techniques.
- Maybe more robust estimation techniques
- Maybe machine learning techniques to complement conventional analyses.
I want something less finicky than statistical significance.
- p-values don't mean what many people think they mean.
- Neither do confidence intervals.
- Statistical significance is not related to practical significance.

December 2015

I started reading the Gelman and Hill book.
- This is the book for the arm package.
It emphasizes estimation, uncertainty and simulation.
Midway through, the book pivots to Bayesian estimation.
I roll with it.

Cover of Data Analysis USing Regression and Multilevel/Hierarchical Models

I try to download the example BUGS scripts, but they are no longer supported.

Instead, I find a page with the title:

Use Stan instead.

Okay, I'll give it try…

January 2016

I'm down a rabbit hole, writing Stan models to fit the models from the ARM book, and there is an influx of Bayesian tools for R.

Statistical Rethinking, a book that reteaches regression from a Bayesian perspective with R and Stan, is released.
New version of brms is released. This package converts R model code into Stan programs.
RStanARM is released.
This blog post circulates: “R Users Will Now Inevitably Become Bayesians”.

I eat all this up. I become a convert.

Long story short

The replication crisis sparked my curiosity, and a wave of new tools and resources made it really easy to get started with Bayesian stats.

No formal coursework. I learned about the math/statistics from playing with code.

My goal with these toosl has been to make better, more honest scientific summaries of observed data.