Subreddit communities often develop extensive rules governing the kinds of participation they allow. But does making those policies visible have any effect on newcomers' compliance with the rules? And might there also be a side effect on the number of people who participate? Over 29 days in September, I worked with the moderators of r/science to answer these questions with a field experiment: an A/B test in which we posted sticky comments to some threads and not to others.
TLDR: Yes! Adding a sticky comment with the rules has a positive 7.3 percentage point effect on the chance that a newcomer’s comment will comply with the rules, on average across r/science, holding all else constant. Furthermore, posting the rules increases the incidence rate of newcomer comments by 38.1% on average.
But there’s a catch! In followup analyses, I found that sticky comments had opposite effects in AMAs (question-answer discussions with prominent scientists) compared to non-AMA posts. Posting the rules to a non-AMA thread caused a 59% increase in the incidence rate of newcomer comments, but in AMA threads, sticky comments caused a 65.5% decrease on average, the opposite outcome. Sticky comments also affected the amount of moderator work per post. Posting a sticky comment increased the incidence rate of all comment removals by 36.1% in non-AMA posts and decreased the incidence rate by 28.6% in AMA posts on average across r/science.
This experiment was a collaboration between J. Nathan Matias and the moderators of r/science as part of my PhD research at the MIT Media Lab & Center for Civic Media. A full description of the experiment procedures is at osf.io/knb48/.
To test the effect of sticky comments, I wrote a bot, /u/CivilServantBot, that continuously monitors all comments in the subreddit, including the actions of moderators. The full experiment design is at osf.io/jhkcf/; here is a brief summary. During the experiment, the bot randomly assigned sticky comments within the group of non-AMA posts and, separately, within the group of AMA posts. Partway through, we adjusted the bot to wait two minutes before commenting, so posts removed by AutoModerator in that window are not included. All replies to the sticky comments were automatically removed.
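The assignment procedure can be sketched as follows. This is an illustrative Python sketch, not CivilServantBot's actual code; the function name is made up, and the balanced block size of 10 matches the blocks mentioned later in the analysis.

```python
import random

def assign_in_blocks(posts, block_size=10, seed=42):
    """Randomly assign treatment within fixed-size blocks, so that each
    block contains an equal number of treated and control posts."""
    rng = random.Random(seed)
    assignments = {}
    for start in range(0, len(posts), block_size):
        block = posts[start:start + block_size]
        # half of each block receives the sticky comment (treatment = 1)
        labels = [1] * (len(block) // 2) + [0] * (len(block) - len(block) // 2)
        rng.shuffle(labels)
        for post, label in zip(block, labels):
            assignments[post] = label
    return assignments

# AMA and non-AMA posts are randomized separately, as in the experiment
non_ama = [f"post_{i}" for i in range(20)]
assignment = assign_in_blocks(non_ama)
print(sum(assignment.values()))  # 10 of 20 posts treated: blocks stay balanced
```

Blocking guarantees that treatment and control groups stay balanced over time, which is what makes the within-group comparisons below valid.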
Because any thread had an equal chance to receive a sticky comment, we are able to make a causal inference about the effect of the sticky comment on the outcomes we care about:
We also monitored many other variables about a post, including the number of minutes that each post spent on the top page for the subreddit, as well as reddit as a whole.
Here is the sticky comment we posted to non-AMA threads:
Here is the sticky comment we posted to AMA threads, one that r/science was already using:
We ran the experiment for 2210 non-AMA posts and 24 AMA posts, and then stopped. Looking back, I found bot software glitches affecting two different posts and removed the two blocks of 10 observations that contained them, leaving a total of 2214 posts in the analysis. The final analysis includes 20385 newcomer comments, roughly 30% of all 68414 comments during that period.
The statistical methods are fully described in the experiment's pre-analysis plan at osf.io/jhkcf/. Here are the model results.
I tested the effect of sticky comments on newcomer rule-compliance using a logistic regression model (Model 5). The details of the model are reported below the table. Notice that the number of threads is 830 in this model: since the model looks at individual newcomer comments, it excludes threads where no newcomers posted. Statisticians will notice that I used a random-intercepts model on the thread ID to account for the fact that we assigned sticky comments to threads rather than to individual commenters.
library(lme4)    # provides glmer() for mixed-effects logistic regression
library(texreg)  # provides htmlreg() for the model comparison tables

ccv1 <- glmer(visible ~ 1 + (1 | link_id), data = newcomer.comments, family = binomial, nAGQ=2)
ccv2 <- glmer(visible ~ post.visible + (1 | link_id), data = newcomer.comments, family = binomial, nAGQ=2)
ccv3 <- glmer(visible ~ post.visible + post.ama + (1 | link_id), data = newcomer.comments, family = binomial, nAGQ=2)
ccv4 <- glmer(visible ~ post.visible + post.ama + post.sub.top.minutes.ln + (1 | link_id), data = newcomer.comments, family = binomial, nAGQ=2)
ccv5 <- glmer(visible ~ post.visible + post.ama + post.sub.top.minutes.ln + post.treatment + (1 | link_id), data = newcomer.comments, family = binomial, nAGQ=2)
htmlreg(list(ccv1, ccv2, ccv3, ccv4, ccv5), include.deviance = TRUE)
| | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 |
|---|---|---|---|---|---|
| (Intercept) | 0.29*** | -0.10 | -0.11 | 0.24 | 0.02 |
| | (0.08) | (0.13) | (0.13) | (0.14) | (0.16) |
| post.visibleTrue | | 0.61*** | 0.59*** | 1.06*** | 1.09*** |
| | | (0.17) | (0.17) | (0.19) | (0.18) |
| post.amaTRUE | | | 0.21 | 0.74* | 0.79* |
| | | | (0.37) | (0.36) | (0.36) |
| post.sub.top.minutes.ln | | | | -0.12*** | -0.12*** |
| | | | | (0.02) | (0.02) |
| post.treatment | | | | | 0.44** |
| | | | | | (0.16) |
| AIC | 23775.05 | 23763.91 | 23765.59 | 23723.28 | 23717.64 |
| BIC | 23790.90 | 23787.68 | 23797.28 | 23762.90 | 23765.18 |
| Deviance | 22477.45 | 22473.98 | 22476.06 | 22480.89 | 22489.44 |
| Log Likelihood | -11885.53 | -11878.96 | -11878.79 | -11856.64 | -11852.82 |
| Num. obs. | 20385 | 20385 | 20385 | 20385 | 20385 |
| Num. groups: link_id | 830 | 830 | 830 | 830 | 830 |
| Variance: link_id.(Intercept) | 2.72 | 2.69 | 2.67 | 2.48 | 2.40 |
| Variance: Residual | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |

***p < 0.001; **p < 0.01; *p < 0.05
## Generalized linear mixed model fit by maximum likelihood (Adaptive
## Gauss-Hermite Quadrature, nAGQ = 2) [glmerMod]
## Family: binomial ( logit )
## Formula: visible ~ post.visible + post.ama + post.sub.top.minutes.ln +
## post.treatment + (1 | link_id)
## Data: newcomer.comments
##
## AIC BIC logLik deviance df.resid
## 23717.6 23765.2 -11852.8 23705.6 20379
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -9.4160 -0.9027 -0.0923 0.8778 7.7786
##
## Random effects:
## Groups Name Variance Std.Dev.
## link_id (Intercept) 2.401 1.549
## Number of obs: 20385, groups: link_id, 830
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.01852 0.16166 0.115 0.90880
## post.visibleTrue 1.08907 0.18388 5.923 3.17e-09 ***
## post.amaTRUE 0.79147 0.35940 2.202 0.02765 *
## post.sub.top.minutes.ln -0.11916 0.01777 -6.704 2.02e-11 ***
## post.treatment 0.43841 0.15774 2.779 0.00545 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) pst.vT p.TRUE ps....
## post.vsblTr -0.468
## post.amTRUE 0.028 -0.057
## pst.sb.tp.. -0.297 -0.425 -0.204
## post.trtmnt -0.500 0.072 0.050 -0.052
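To interpret these log-odds estimates, they can be passed through the inverse-logit link. Here is a rough back-of-the-envelope sketch (the headline 7.3 percentage point estimate averages predicted probabilities over the whole sample, so this single-point illustration will not reproduce it exactly):

```python
import math

def inv_logit(x):
    """Convert log-odds to a probability."""
    return 1 / (1 + math.exp(-x))

# Rounded Model 5 estimates from the table above
intercept = 0.02
treatment = 0.44   # post.treatment log-odds coefficient

p_control = inv_logit(intercept)
p_treated = inv_logit(intercept + treatment)
print(round(p_control, 3), round(p_treated, 3))   # 0.505 0.613
print(round(math.exp(treatment), 2))              # odds ratio ≈ 1.55
```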
I tested the effect of sticky comments on the number of newcomers using a zero-inflated Poisson count model (Model 3). The zero-inflated model accounts for the large number of posts with zero newcomer comments: out of all 2214 posts, only 821 had at least one newcomer comment. To slightly over-simplify, the zero-inflated model first predicts which posts get zero newcomer comments, and then, among the remaining subset, predicts the number of comments. Here is the final set of models. Since the model is trying to detect differences in the number of comments, I include many “regression adjustment covariates” to estimate the effect once we account for other differences. For example, some topics simply get more attention than others; that’s why the model includes a predictor for each individual flair category. You can see the experiment effect in the “TREAT” row near the bottom of the table. That is the number I then convert to the final “incidence rate ratio.”
library(pscl)  # provides zeroinfl() for zero-inflated count models

zcm1 <- zeroinfl(newcomer.comments ~ 1 | visible + post.sub.top.minutes.ln, data=exs.posts, dist = "poisson")
zcm2 <- zeroinfl(newcomer.comments ~ visible + post.ama + post.flair + post.sub.top.minutes.ln + weekend + post.hour + I(post.hour^2) | visible + post.sub.top.minutes.ln, data=exs.posts, dist = "poisson")
zcm3 <- zeroinfl(newcomer.comments ~ visible + post.ama + post.flair + post.sub.top.minutes.ln + weekend + post.hour + I(post.hour^2) + TREAT | visible + post.sub.top.minutes.ln, data=exs.posts, dist = "poisson")
htmlreg(list(zcm1, zcm2, zcm3), include.deviance = TRUE)
| | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| Count model: (Intercept) | 3.21*** | -2.65*** | -2.76*** |
| | (0.01) | (0.14) | (0.14) |
| Zero model: (Intercept) | 0.83*** | -2.73*** | -2.68*** |
| | (0.07) | (0.39) | (0.39) |
| Zero model: visibleTrue | 0.37*** | 1.02** | 1.01** |
| | (0.10) | (0.33) | (0.33) |
| Zero model: post.sub.top.minutes.ln | -0.23*** | 0.07** | 0.07* |
| | (0.01) | (0.03) | (0.03) |
| Count model: visibleTrue | | -0.58*** | -0.57*** |
| | | (0.02) | (0.02) |
| Count model: post.amaTRUE | | 0.19*** | 0.26*** |
| | | (0.03) | (0.03) |
| Count model: post.flairanimalsci | | 0.80*** | 0.82*** |
| | | (0.14) | (0.14) |
| Count model: post.flairanthro | | 1.07*** | 1.02*** |
| | | (0.15) | (0.15) |
| Count model: post.flairastro | | -0.39* | -0.36* |
| | | (0.16) | (0.16) |
| Count model: post.flairbio | | -0.35* | -0.40** |
| | | (0.14) | (0.15) |
| Count model: post.flaircancer | | 0.57*** | 0.56*** |
| | | (0.16) | (0.16) |
| Count model: post.flairchem | | 0.16 | 0.19 |
| | | (0.15) | (0.15) |
| Count model: post.flaircompsci | | -0.63*** | -0.60*** |
| | | (0.17) | (0.17) |
| Count model: post.flairearthsci | | 0.52*** | 0.57*** |
| | | (0.15) | (0.15) |
| Count model: post.flaireng | | -0.02 | -0.04 |
| | | (0.15) | (0.15) |
| Count model: post.flairenv | | -0.45** | -0.41** |
| | | (0.15) | (0.15) |
| Count model: post.flairepi | | 0.83*** | 0.89*** |
| | | (0.15) | (0.15) |
| Count model: post.flairgeo | | 0.33* | 0.34* |
| | | (0.16) | (0.16) |
| Count model: post.flairhealth | | 1.38*** | 1.38*** |
| | | (0.14) | (0.14) |
| Count model: post.flairmath | | 1.21*** | 1.14*** |
| | | (0.17) | (0.17) |
| Count model: post.flairmed | | 0.92*** | 0.91*** |
| | | (0.14) | (0.14) |
| Count model: post.flairmeta | | 0.67*** | 0.74*** |
| | | (0.15) | (0.15) |
| Count model: post.flairnano | | -0.21 | -0.08 |
| | | (0.17) | (0.17) |
| Count model: post.flairneuro | | 0.21 | 0.14 |
| | | (0.14) | (0.14) |
| Count model: post.flairpaleo | | 0.31* | 0.38* |
| | | (0.15) | (0.15) |
| Count model: post.flairphysics | | 0.83*** | 0.94*** |
| | | (0.15) | (0.15) |
| Count model: post.flairpsych | | 0.04 | 0.04 |
| | | (0.14) | (0.14) |
| Count model: post.flairsoc | | 1.44*** | 1.44*** |
| | | (0.14) | (0.14) |
| Count model: post.sub.top.minutes.ln | | 0.53*** | 0.52*** |
| | | (0.00) | (0.00) |
| Count model: weekendTrue | | 0.17*** | 0.23*** |
| | | (0.02) | (0.02) |
| Count model: post.hour | | 0.23*** | 0.22*** |
| | | (0.01) | (0.01) |
| Count model: I(post.hour^2) | | -0.01*** | -0.01*** |
| | | (0.00) | (0.00) |
| Count model: TREAT | | | 0.32*** |
| | | | (0.01) |
| AIC | 88772.86 | 40939.89 | 40465.64 |
| Log Likelihood | -44382.43 | -20437.95 | -20199.82 |
| Num. obs. | 2214 | 2214 | 2214 |

***p < 0.001; **p < 0.01; *p < 0.05
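The “incidence rate ratio” conversion mentioned above is simply the exponential of the Poisson coefficient. A quick check with the rounded TREAT coefficient from Model 3 (the reported 38.1% figure uses the unrounded estimate):

```python
import math

treat_coef = 0.32  # Count model: TREAT, Model 3 in the table above
irr = math.exp(treat_coef)
print(round(irr, 3))                                    # 1.377
print(f"{(irr - 1) * 100:.1f}% increase in newcomer comments")  # 37.7%
```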
When I chart the number of newcomer comments, we see something very odd: it looks like the effect may have been different for non-AMA and AMA threads. Since I used block randomization, it is possible to compare AMAs and non-AMAs.
I didn’t anticipate this possibility at all, so it’s not part of the pre-analysis plan. But when I add an interaction term to the model to tease out any differences (Model 2), I see that posting the rules to a non-AMA thread caused a 59% increase in the incidence rate of newcomer comments, while in AMA threads sticky comments caused a 65.5% decrease on average, the opposite outcome. This table shows the two models next to each other:
zcm4 <- zeroinfl(newcomer.comments ~ visible + post.ama + post.flair + post.sub.top.minutes.ln + weekend + post.hour + I(post.hour^2) + TREAT + post.ama:TREAT | visible + post.sub.top.minutes.ln, data=exs.posts, dist = "poisson")
htmlreg(list(zcm3, zcm4), include.deviance = TRUE)
| | Model 1 | Model 2 |
|---|---|---|
| Count model: (Intercept) | -2.76*** | -2.84*** |
| | (0.14) | (0.14) |
| Count model: visibleTrue | -0.57*** | -0.58*** |
| | (0.02) | (0.02) |
| Count model: post.amaTRUE | 0.26*** | 0.83*** |
| | (0.03) | (0.03) |
| Count model: post.flairanimalsci | 0.82*** | 0.92*** |
| | (0.14) | (0.14) |
| Count model: post.flairanthro | 1.02*** | 1.03*** |
| | (0.15) | (0.15) |
| Count model: post.flairastro | -0.36* | -0.32* |
| | (0.16) | (0.16) |
| Count model: post.flairbio | -0.40** | -0.24 |
| | (0.15) | (0.15) |
| Count model: post.flaircancer | 0.56*** | 0.58*** |
| | (0.16) | (0.16) |
| Count model: post.flairchem | 0.19 | 0.08 |
| | (0.15) | (0.15) |
| Count model: post.flaircompsci | -0.60*** | -0.75*** |
| | (0.17) | (0.17) |
| Count model: post.flairearthsci | 0.57*** | 0.62*** |
| | (0.15) | (0.15) |
| Count model: post.flaireng | -0.04 | 0.02 |
| | (0.15) | (0.15) |
| Count model: post.flairenv | -0.41** | -0.32* |
| | (0.15) | (0.15) |
| Count model: post.flairepi | 0.89*** | 0.95*** |
| | (0.15) | (0.15) |
| Count model: post.flairgeo | 0.34* | 0.37* |
| | (0.16) | (0.16) |
| Count model: post.flairhealth | 1.38*** | 1.41*** |
| | (0.14) | (0.14) |
| Count model: post.flairmath | 1.14*** | 1.14*** |
| | (0.17) | (0.17) |
| Count model: post.flairmed | 0.91*** | 0.88*** |
| | (0.14) | (0.14) |
| Count model: post.flairmeta | 0.74*** | 0.38* |
| | (0.15) | (0.15) |
| Count model: post.flairnano | -0.08 | -0.00 |
| | (0.17) | (0.17) |
| Count model: post.flairneuro | 0.14 | 0.15 |
| | (0.14) | (0.14) |
| Count model: post.flairpaleo | 0.38* | 0.45** |
| | (0.15) | (0.15) |
| Count model: post.flairphysics | 0.94*** | 1.01*** |
| | (0.15) | (0.15) |
| Count model: post.flairpsych | 0.04 | 0.07 |
| | (0.14) | (0.15) |
| Count model: post.flairsoc | 1.44*** | 1.39*** |
| | (0.14) | (0.14) |
| Count model: post.sub.top.minutes.ln | 0.52*** | 0.52*** |
| | (0.00) | (0.01) |
| Count model: weekendTrue | 0.23*** | 0.26*** |
| | (0.02) | (0.02) |
| Count model: post.hour | 0.22*** | 0.22*** |
| | (0.01) | (0.01) |
| Count model: I(post.hour^2) | -0.01*** | -0.01*** |
| | (0.00) | (0.00) |
| Count model: TREAT | 0.32*** | 0.46*** |
| | (0.01) | (0.02) |
| Zero model: (Intercept) | -2.68*** | -2.54*** |
| | (0.39) | (0.38) |
| Zero model: visibleTrue | 1.01** | 0.95** |
| | (0.33) | (0.32) |
| Zero model: post.sub.top.minutes.ln | 0.07* | 0.06* |
| | (0.03) | (0.03) |
| Count model: post.amaTRUE:TREAT | | -1.53*** |
| | | (0.06) |
| AIC | 40465.64 | 39722.94 |
| Log Likelihood | -20199.82 | -19827.47 |
| Num. obs. | 2214 | 2214 |

***p < 0.001; **p < 0.01; *p < 0.05
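The opposite effects fall out of the interaction term: for AMA posts, the TREAT coefficient and the post.amaTRUE:TREAT interaction are summed before exponentiating. Using the rounded Model 2 coefficients (the reported 59% and 65.5% figures use the unrounded estimates):

```python
import math

treat = 0.46          # Count model: TREAT
interaction = -1.53   # Count model: post.amaTRUE:TREAT

irr_non_ama = math.exp(treat)
irr_ama = math.exp(treat + interaction)
print(f"non-AMA: {(irr_non_ama - 1) * 100:+.1f}%")  # +58.4%
print(f"AMA:     {(irr_ama - 1) * 100:+.1f}%")      # -65.7%
```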
Another possible problem is that AMAs and non-AMAs might need to be modeled differently. AMAs never get zero comments, so the distribution of the dependent variable is very different for them. They are never removed by moderators, and they never happen on weekends, so some of the regression adjustment covariates don’t apply in their case. Furthermore, there are only 24 AMAs in the sample, so they may be swamped by the non-AMA threads in the pooled result. Those differences are very clear in the following log-transformed chart of newcomer comments per post:
To confirm whether posting the rules to AMAs had a negative effect on the number of newcomers, I fit a series of negative binomial models, removing covariates that simply don’t apply to AMAs. In this result (Model 3), I find that posting sticky comments to AMAs caused a 65.1% reduction in the number of newcomer comments, an estimate very close to the Poisson model’s result.
library(MASS)  # provides glm.nb() for negative binomial regression

nb1 <- glm.nb(newcomer.comments ~ 1, data=subset(exs.posts, post.ama==TRUE))
nb2 <- glm.nb(newcomer.comments ~ visible + post.sub.top.minutes.ln, data=subset(exs.posts, post.ama==TRUE))
nb3 <- glm.nb(newcomer.comments ~ visible + post.sub.top.minutes.ln + TREAT, data=subset(exs.posts, post.ama==TRUE))
htmlreg(list(nb1, nb2, nb3), include.deviance = TRUE)
| | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| (Intercept) | 4.48*** | -42.45*** | -24.43* |
| | (0.25) | (11.67) | (12.08) |
| visibleTrue | | 0.45 | -0.40 |
| | | (1.09) | (1.08) |
| post.sub.top.minutes.ln | | 4.12*** | 2.63* |
| | | (1.03) | (1.04) |
| TREAT | | | -1.05* |
| | | | (0.48) |
| AIC | 264.85 | 260.03 | 258.59 |
| BIC | 267.20 | 264.74 | 264.48 |
| Log Likelihood | -130.42 | -126.02 | -124.29 |
| Deviance | 28.59 | 27.23 | 27.00 |
| Num. obs. | 24 | 24 | 24 |

***p < 0.001; **p < 0.01; *p < 0.05
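Exponentiating the rounded TREAT coefficient from Model 3 shows why the negative binomial estimate is close to the Poisson result:

```python
import math

nb_treat = -1.05  # TREAT coefficient, negative binomial Model 3 above
print(f"{(math.exp(nb_treat) - 1) * 100:.1f}% change in newcomer comments")  # -65.0%
```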
There are several reasons why we might see a negative effect in AMAs and a positive one elsewhere. Here are some possibilities:
If subreddits are interested, further experiments could allow us to compare the performance of different kinds of messages.
The main hypotheses focused on the behavior of newcomers, but what was the overall outcome across all commenters?
First, was there an effect on rule-compliance across all commenters? Yes. Across r/science, the sticky comment had a positive 2.2 percentage point effect on the chance of any comment being accepted by moderators.
occv1 <- glmer(visible ~ 1 + (1 | link_id), data = exs.comments, family = binomial, nAGQ=2)
occv5 <- glmer(visible ~ post.visible + post.ama + post.sub.top.minutes.ln + post.treatment + (1 | link_id), data = exs.comments, family = binomial, nAGQ=2)
htmlreg(list(occv1, occv5), include.deviance = TRUE)
| | Model 1 | Model 2 |
|---|---|---|
| (Intercept) | 1.40*** | 1.58*** |
| | (0.06) | (0.11) |
| post.visibleTrue | | 0.45*** |
| | | (0.12) |
| post.amaTRUE | | 0.77* |
| | | (0.34) |
| post.sub.top.minutes.ln | | -0.16*** |
| | | (0.01) |
| post.treatment | | 0.23* |
| | | (0.11) |
| AIC | 75003.99 | 74846.05 |
| BIC | 75022.27 | 74900.88 |
| Deviance | 72263.32 | 72255.36 |
| Log Likelihood | -37500.00 | -37417.02 |
| Num. obs. | 68717 | 68717 |
| Num. groups: link_id | 1872 | 1872 |
| Variance: link_id.(Intercept) | 2.64 | 2.44 |
| Variance: Residual | 1.00 | 1.00 |

***p < 0.001; **p < 0.01; *p < 0.05
Next, was there an effect on the number of commenters overall? In a zero-inflated Poisson regression model, posting a sticky comment to non-AMA threads increased the incidence rate of comments by 57.1%, while posting sticky comments to AMA threads reduced the incidence rate of comments by 54.1%, on average across r/science.
ozcm1 <- zeroinfl(num.comments ~ 1 | visible + post.sub.top.minutes.ln, data=exs.posts, dist = "poisson")
ozcm2 <- zeroinfl(num.comments ~ visible + post.ama + post.flair + post.sub.top.minutes.ln + weekend + post.hour + I(post.hour^2) | visible + post.sub.top.minutes.ln, data=exs.posts, dist = "poisson")
ozcm3 <- zeroinfl(num.comments ~ visible + post.ama + post.flair + post.sub.top.minutes.ln + weekend + post.hour + I(post.hour^2) + TREAT | visible + post.sub.top.minutes.ln, data=exs.posts, dist = "poisson")
ozcm4 <- zeroinfl(num.comments ~ visible + post.ama + post.flair + post.sub.top.minutes.ln + weekend + post.hour + I(post.hour^2) + TREAT + post.ama:TREAT | visible + post.sub.top.minutes.ln, data=exs.posts, dist = "poisson")
htmlreg(list(ozcm1, ozcm4), include.deviance = TRUE)
| | Model 1 | Model 2 |
|---|---|---|
| Count model: (Intercept) | 3.62*** | -0.68*** |
| | (0.00) | (0.05) |
| Zero model: (Intercept) | -2.78*** | -5.12*** |
| | (0.14) | (0.54) |
| Zero model: visibleTrue | 2.12*** | 1.99*** |
| | (0.16) | (0.59) |
| Zero model: post.sub.top.minutes.ln | -0.18*** | 0.07 |
| | (0.02) | (0.04) |
| Count model: visibleTrue | | -0.77*** |
| | | (0.01) |
| Count model: post.amaTRUE | | 0.75*** |
| | | (0.02) |
| Count model: post.flairanimalsci | | 0.37*** |
| | | (0.05) |
| Count model: post.flairanthro | | 0.45*** |
| | | (0.06) |
| Count model: post.flairastro | | -0.66*** |
| | | (0.06) |
| Count model: post.flairbio | | -0.56*** |
| | | (0.05) |
| Count model: post.flaircancer | | 0.04 |
| | | (0.07) |
| Count model: post.flairchem | | -0.19** |
| | | (0.06) |
| Count model: post.flaircompsci | | -1.03*** |
| | | (0.07) |
| Count model: post.flairearthsci | | 0.27*** |
| | | (0.06) |
| Count model: post.flaireng | | -0.29*** |
| | | (0.05) |
| Count model: post.flairenv | | -0.58*** |
| | | (0.05) |
| Count model: post.flairepi | | 0.57*** |
| | | (0.06) |
| Count model: post.flairgeo | | 0.40*** |
| | | (0.06) |
| Count model: post.flairhealth | | 0.95*** |
| | | (0.05) |
| Count model: post.flairmath | | 0.51*** |
| | | (0.07) |
| Count model: post.flairmed | | 0.39*** |
| | | (0.05) |
| Count model: post.flairmeta | | 0.07 |
| | | (0.06) |
| Count model: post.flairnano | | -0.34*** |
| | | (0.07) |
| Count model: post.flairneuro | | -0.34*** |
| | | (0.05) |
| Count model: post.flairpaleo | | -0.33*** |
| | | (0.06) |
| Count model: post.flairphysics | | 0.52*** |
| | | (0.06) |
| Count model: post.flairpsych | | -0.31*** |
| | | (0.05) |
| Count model: post.flairsoc | | 1.03*** |
| | | (0.05) |
| Count model: post.sub.top.minutes.ln | | 0.48*** |
| | | (0.00) |
| Count model: weekendTrue | | 0.22*** |
| | | (0.01) |
| Count model: post.hour | | 0.20*** |
| | | (0.00) |
| Count model: I(post.hour^2) | | -0.01*** |
| | | (0.00) |
| Count model: TREAT | | 0.45*** |
| | | (0.01) |
| Count model: post.amaTRUE:TREAT | | -1.23*** |
| | | (0.03) |
| AIC | 365986.19 | 139646.84 |
| Log Likelihood | -182989.10 | -69789.42 |
| Num. obs. | 2214 | 2214 |

***p < 0.001; **p < 0.01; *p < 0.05
While the above models tell us about the behavior of commenters, they don’t indicate whether sticky comments increase or decrease the overall amount of moderation work. In a zero-inflated Poisson regression model, posting a sticky comment increased the incidence rate of all comment removals by 36.1% in non-AMA posts and decreased it by 28.6% in AMA posts, on average across r/science. The model results are below. Note that this result includes all actions by AutoModerator; I would need to recalculate the dependent variable to answer this question for human moderators alone.
ozcrm1 <- zeroinfl(num.comments.removed ~ 1 | visible + post.sub.top.minutes.ln, data=exs.posts, dist = "poisson")
ozcrm2 <- zeroinfl(num.comments.removed ~ visible + post.ama + post.flair + post.sub.top.minutes.ln + weekend + post.hour + I(post.hour^2) | visible + post.sub.top.minutes.ln, data=exs.posts, dist = "poisson")
ozcrm3 <- zeroinfl(num.comments.removed ~ visible + post.ama + post.flair + post.sub.top.minutes.ln + weekend + post.hour + I(post.hour^2) + TREAT | visible + post.sub.top.minutes.ln, data=exs.posts, dist = "poisson")
ozcrm4 <- zeroinfl(num.comments.removed ~ visible + post.ama + post.flair + post.sub.top.minutes.ln + weekend + post.hour + I(post.hour^2) + TREAT + post.ama:TREAT | visible + post.sub.top.minutes.ln, data=exs.posts, dist = "poisson")
htmlreg(list(ozcrm1, ozcrm4), include.deviance = TRUE)
| | Model 1 | Model 2 |
|---|---|---|
| Count model: (Intercept) | 3.42*** | -3.06*** |
| | (0.01) | (0.16) |
| Zero model: (Intercept) | 0.70*** | -2.41*** |
| | (0.07) | (0.36) |
| Zero model: visibleTrue | 0.56*** | 0.90** |
| | (0.10) | (0.32) |
| Zero model: post.sub.top.minutes.ln | -0.25*** | 0.04 |
| | (0.01) | (0.03) |
| Count model: visibleTrue | | -0.83*** |
| | | (0.02) |
| Count model: post.amaTRUE | | -0.07 |
| | | (0.04) |
| Count model: post.flairanimalsci | | 1.45*** |
| | | (0.16) |
| Count model: post.flairanthro | | 1.29*** |
| | | (0.17) |
| Count model: post.flairastro | | -0.09 |
| | | (0.18) |
| Count model: post.flairbio | | 0.43** |
| | | (0.17) |
| Count model: post.flaircancer | | 1.45*** |
| | | (0.18) |
| Count model: post.flairchem | | 0.62*** |
| | | (0.17) |
| Count model: post.flaircompsci | | -0.55** |
| | | (0.20) |
| Count model: post.flairearthsci | | 0.86*** |
| | | (0.17) |
| Count model: post.flaireng | | 0.43* |
| | | (0.17) |
| Count model: post.flairenv | | 0.13 |
| | | (0.17) |
| Count model: post.flairepi | | 1.15*** |
| | | (0.17) |
| Count model: post.flairgeo | | 0.98*** |
| | | (0.18) |
| Count model: post.flairhealth | | 2.02*** |
| | | (0.16) |
| Count model: post.flairmath | | 1.61*** |
| | | (0.18) |
| Count model: post.flairmed | | 1.61*** |
| | | (0.16) |
| Count model: post.flairmeta | | 1.51*** |
| | | (0.17) |
| Count model: post.flairnano | | 0.42* |
| | | (0.18) |
| Count model: post.flairneuro | | 0.48** |
| | | (0.17) |
| Count model: post.flairpaleo | | 0.69*** |
| | | (0.17) |
| Count model: post.flairphysics | | 1.43*** |
| | | (0.17) |
| Count model: post.flairpsych | | 0.85*** |
| | | (0.16) |
| Count model: post.flairsoc | | 2.05*** |
| | | (0.16) |
| Count model: post.sub.top.minutes.ln | | 0.54*** |
| | | (0.00) |
| Count model: weekendTrue | | 0.23*** |
| | | (0.02) |
| Count model: post.hour | | 0.21*** |
| | | (0.01) |
| Count model: I(post.hour^2) | | -0.01*** |
| | | (0.00) |
| Count model: TREAT | | 0.31*** |
| | | (0.01) |
| Count model: post.amaTRUE:TREAT | | -0.65*** |
| | | (0.06) |
| AIC | 116854.33 | 53505.14 |
| Log Likelihood | -58423.16 | -26718.57 |
| Num. obs. | 2214 | 2214 |

***p < 0.001; **p < 0.01; *p < 0.05
Many people have made this experiment possible. Merry Mou wrote much of the CivilServant code along with me. Betsy Paluck and Donald Green offered feedback on the pre-analysis plan and modeling approach. Ethan Zuckerman, my advisor, has encouraged and supported this work throughout. Michael Bernstein suggested the second hypothesis on the number of commenters. r/science suggested the sticky comment experiment in the first place, and moderators also offered detailed feedback and suggestions on the experiment procedures throughout. Thanks everyone!