This assignment is based on an assignment created by German Rodriguez. Topics covered include:
For this assignment you will be extending the analysis in this article:
Greene and Schaffer (1992) “Leave to Appeal and Leave to Commence Judicial Review in Canada’s Refugee-Determination System: Is the Process Fair?” International Journal of Refugee Law, 4.1: 71-83.
Here’s how the authors describe the paper: “The purpose of this study is to determine whether refugees and other persons who are applying for leave to appeal a decision of the Convention Refugee Determination Division of the Canadian Immigration and Refugee Board, a decision of the Appeal Division of the Immigration and Refugee Board, or to commence an action for judicial review have a relatively equal chance to convince the Federal Court of Appeal of the merits of their applications.”"
The data and documentation were made avaialble by John Fox. I have stored a copy of the data in Greene.txt in the subfolder data.
library(dplyr)
##
## Attaching package: 'dplyr'
##
## The following object is masked from 'package:stats':
##
## filter
##
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
data <- read.table("data/Greene.txt", header=TRUE)
data <- tbl_df(data)
Before doing anything else, just play with the data to get a sense of it.
# convert decision to a number rather a factor
data$appeal_granted <- as.numeric(data$decision=="yes")
summarize(data, yes_rate = mean(appeal_granted))
## Source: local data frame [1 x 1]
##
## yes_rate
## 1 0.296875
library(ggplot2)
p <- ggplot(data, aes(judge))
p + geom_bar(stat="bin")
data_to_plot <- group_by(data, judge) %>%
summarize(yes_rate = mean(appeal_granted)) %>%
arrange(desc(yes_rate))
data_to_plot$judge <- reorder(data_to_plot$judge, data_to_plot$yes_rate)
p <- ggplot(data_to_plot, aes(x = yes_rate, y = judge))
p + geom_point()
data$judge <- relevel(data$judge, ref="Iacobucci")
fit <- glm(appeal_granted ~ judge, data = data, family = "binomial")
library(broom)
tidy(fit)
## term estimate std.error statistic p.value
## 1 (Intercept) -2.1594842 0.6097150 -3.5417930 0.0003974172
## 2 judgeDesjardins 2.2464956 0.6774021 3.3163401 0.0009120477
## 3 judgeHeald 1.3385037 0.7089861 1.8879124 0.0590377139
## 4 judgeHugessen 0.7323679 0.6892646 1.0625352 0.2879927768
## 5 judgeMacGuigan 1.0224057 0.6704056 1.5250554 0.1272452851
## 6 judgeMahoney 1.8912203 0.7123896 2.6547555 0.0079365903
## 7 judgeMarceau 2.5649493 0.7337704 3.4955748 0.0004730419
## 8 judgePratte 0.3677248 0.7524605 0.4886965 0.6250566009
## 9 judgeStone 1.0200500 0.7326337 1.3923056 0.1638298693
## 10 judgeUrie 1.9771627 0.8593131 2.3008641 0.0213993135
fit_tidy <- tidy(fit)
predicted_probability <- 1 / (1 + exp(-(fit_tidy[fit_tidy$term=="(Intercept)", "estimate"] + fit_tidy[fit_tidy$term=="judgeMarceau", "estimate"])))
print(paste("Predicted probability for Judge Marceau is:", predicted_probability))
## [1] "Predicted probability for Judge Marceau is: 0.599999999999999"
data$location <- relevel(data$location, ref="other")
data$language <- relevel(data$language, ref="French")
fit <- glm(appeal_granted ~ judge + rater + location + language, data = data, family = "binomial")
tidy(fit)
## term estimate std.error statistic p.value
## 1 (Intercept) -3.1099344 0.8821732 -3.5253103 4.229871e-04
## 2 judgeDesjardins 2.5118507 0.7134964 3.5204812 4.307645e-04
## 3 judgeHeald 1.3288574 0.7348309 1.8083852 7.054657e-02
## 4 judgeHugessen 1.2371875 0.7559655 1.6365660 1.017212e-01
## 5 judgeMacGuigan 1.3108339 0.7030106 1.8646005 6.223742e-02
## 6 judgeMahoney 1.7326906 0.7400300 2.3413788 1.921267e-02
## 7 judgeMarceau 3.4303647 0.8025491 4.2743364 1.917074e-05
## 8 judgePratte 0.6970888 0.7892341 0.8832471 3.771028e-01
## 9 judgeStone 0.9768491 0.7582761 1.2882500 1.976590e-01
## 10 judgeUrie 2.3313476 0.8969069 2.5993195 9.340879e-03
## 11 rateryes 1.4247201 0.2610852 5.4569166 4.844737e-08
## 12 locationMontreal -0.4276074 0.6158915 -0.6942903 4.875002e-01
## 13 locationToronto 0.1029737 0.3672029 0.2804273 7.791497e-01
## 14 languageEnglish 0.3268456 0.5580564 0.5856856 5.580868e-01
fit_tidy <- tidy(fit)
predictor_value <- fit_tidy[fit_tidy$term=="(Intercept)", "estimate"] +
fit_tidy[fit_tidy$term=="rateryes", "estimate"] +
fit_tidy[fit_tidy$term=="judgeMarceau", "estimate"] +
fit_tidy[fit_tidy$term=="locationToronto", "estimate"] +
fit_tidy[fit_tidy$term=="languageEnglish", "estimate"]
predicted_probability <- 1 / (1 + exp(-(predictor_value)))
print(paste("The predicted probability for this case is:", predicted_probability))
## [1] "The predicted probability for this case is: 0.897979154560357"
## R version 3.1.2 (2014-10-31)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] broom_0.3.5 ggplot2_1.0.0 dplyr_0.4.1
##
## loaded via a namespace (and not attached):
## [1] assertthat_0.1 colorspace_1.2-4 DBI_0.3.1 digest_0.6.8
## [5] evaluate_0.5.5 formatR_1.0 grid_3.1.2 gtable_0.1.2
## [9] htmltools_0.2.6 knitr_1.9 labeling_0.3 lazyeval_0.1.10
## [13] magrittr_1.5 MASS_7.3-40 mnormt_1.5-1 munsell_0.4.2
## [17] parallel_3.1.2 plyr_1.8.1 proto_0.3-10 psych_1.5.1
## [21] Rcpp_0.11.4 reshape2_1.4.1 rmarkdown_0.5.1 scales_0.2.4
## [25] stringr_0.6.2 tidyr_0.2.0 tools_3.1.2 yaml_2.1.13