Last updated: 2018-06-18
workflowr checks: ✔ R Markdown file: up-to-date
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
✔ Environment: empty
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.
✔ Seed: set.seed(1)
The command set.seed(1) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
✔ Session information: recorded
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
✔ Repository version: 14f28d4
Relevant files can be committed with wflow_publish or wflow_git_commit. workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Unstaged changes:
Modified: analysis/hettablesim
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | 6314ce0 | Gao Wang | 2018-06-16 | Relabel ‘test’ to ‘strong’ in data and code |
html | 6eee6a9 | Peter Carbonetto | 2018-06-06 | Updated the webpages for a bunch of R Markdown files after minor revisions. |
Rmd | 2b0db9b | Peter Carbonetto | 2018-06-06 | Misc. revisions to Rmd files. |
Rmd | 5222572 | Peter Carbonetto | 2018-06-06 | Some misc. updates to the R Markdown files. |
html | 48f7ba8 | Peter Carbonetto | 2018-06-06 | Created new webpage for HeterogeneityTables analysis. |
Rmd | 9079466 | Peter Carbonetto | 2018-06-06 | wflow_publish(c(“gtex.Rmd”, “HeterogeneityTables.Rmd”)) |
Rmd | 35ca901 | Peter Carbonetto | 2018-06-06 | Revised data/results loading steps in HeterogeneityTables.Rmd. |
Rmd | 4625ae8 | Peter Carbonetto | 2018-06-06 | Renamed HeterogeneityTables analysis files. |
html | 4625ae8 | Peter Carbonetto | 2018-06-06 | Renamed HeterogeneityTables analysis files. |
Here we summarize overall sharing of effects by sign and by magnitude. Compare the table at the bottom of this page against Table 2 in the manuscript.
Because a major feature of these data is that brain tissues generally show more similar effects than non-brain tissues, we also compute results separately from subsets of brain and non-brain tissues.
First, we load some functions defined for mash analyses.
source("../code/normfuncs.R")
This is the threshold used to determine which genes have at least one significant effect across tissues.
thresh <- 0.05
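As a small illustration of how this threshold is applied in the calculations further down, the sketch below uses a made-up matrix of local false sign rates. The name `lfsr.toy` and its values are hypothetical and are not GTEx results. It counts the tissues in which each eQTL is significant and keeps the rows with at least one such tissue.

```r
# Toy local false sign rates: 3 hypothetical eQTLs (rows) x 4 tissues (columns).
thresh   <- 0.05
lfsr.toy <- matrix(c(0.01, 0.20, 0.60, 0.04,
                     0.30, 0.50, 0.90, 0.10,
                     0.05, 0.06, 0.70, 0.80),
                   nrow = 3, byrow = TRUE)

# Count the tissues in which each eQTL has lfsr at or below the threshold.
nsig.toy <- rowSums(lfsr.toy <= thresh)

# Rows with at least one significant effect are retained.
keep <- nsig.toy > 0
```

Here the second row has no entry at or below the threshold, so it would be dropped from the sharing calculations.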
Load some GTEx summary statistics, as well as some of the results generated from the mash analysis of the GTEx data.
out <- readRDS("../data/MatrixEQTLSumStats.Portable.Z.rds")
maxbeta <- out$strong.b
maxz <- out$strong.z
standard.error <- out$strong.s
out <- readRDS(paste("../output/MatrixEQTLSumStats.Portable.Z.coved.K3.P3",
"lite.single.expanded.V1.posterior.rds",sep = "."))
pm.mash <- out$posterior.means
pm.mash.beta <- pm.mash*standard.error
lfsr <- out$lfsr
lfsr[lfsr < 0] <- 0
tissue.names <- as.character(read.table("../data/abbreviate.names.txt")[,2])
colnames(lfsr) <- tissue.names
Load the results generated from the mash analysis of the GTEx data after removing the data from the brain tissues.
lfsr.nobrain <- read.table("../output/nobrainlfsr.txt")[,-1]
colnames(lfsr.nobrain) <- tissue.names[-c(7:16)]
pm.mash.nobrain <-
as.matrix(read.table("../output/nobrainposterior.means.txt")[,-1]) *
standard.error[,-c(7:16)]
Load the results generated from the mash analysis of the GTEx data for the brain tissues only.
lfsr.brain.only <- read.table("../output/brainonlylfsr.txt")[,-1]
colnames(lfsr.brain.only) <- tissue.names[c(7:16)]
pm.mash.brain.only <-
as.matrix(read.table("../output/brainonlyposterior.means.txt")[,-1]) *
standard.error[,c(7:16)]
Compute the amount of eQTL sharing by sign, in all tissues, and separately in brain and non-brain tissues. “Sharing by sign” requires that the effect has the same sign as the strongest effect among tissues.
sigmat <- (lfsr<=thresh)
nsig <- rowSums(sigmat)
signall <- mean(het.norm(pm.mash.beta[nsig>0,])>0)
sigmat <- (lfsr[,-c(7:16)]<=thresh)
nsig <- rowSums(sigmat)
signall.nobrain <- mean(het.norm(pm.mash.beta[nsig>0,-c(7:16)])>0)
sigmat <- (lfsr[,c(7:16)]<=thresh)
nsig <- rowSums(sigmat)
signall.brainonly <- mean(het.norm(pm.mash.beta[nsig>0,c(7:16)])>0)
sigmat <- (lfsr.nobrain<=thresh)
nsig <- rowSums(sigmat)
signnobrain <- mean(het.norm(pm.mash.nobrain[nsig>0,])>0)
sigmat <- (lfsr.brain.only<=thresh)
nsig <- rowSums(sigmat)
signbrainonly <- mean(het.norm(pm.mash.brain.only[nsig>0,])>0)
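The function `het.norm` is defined in `normfuncs.R`, which is not shown on this page. The sketch below assumes that it divides each row by that row's entry of largest absolute value, which is consistent with the definition of sharing by sign given above. The function `het.norm.sketch` and the matrix `pm.toy` are hypothetical stand-ins, not the actual implementation or data.

```r
# Assumed behaviour of het.norm: scale each row by its strongest effect,
# so the strongest effect becomes 1 and the others become signed ratios.
het.norm.sketch <- function (effects)
  t(apply(effects, 1, function (x) x / x[which.max(abs(x))]))

# Toy posterior mean effects: 2 hypothetical eQTLs (rows) x 4 tissues.
pm.toy <- matrix(c( 2.0,  1.5, -0.4,  0.2,
                   -1.0, -0.8, -0.3,  0.1),
                 nrow = 2, byrow = TRUE)

normalized <- het.norm.sketch(pm.toy)

# A tissue "shares by sign" when its normalized effect is positive, that is,
# when it has the same sign as the strongest effect in that row.
sign.sharing <- mean(normalized > 0)
```

Under this assumption, the strongest tissue always counts as shared with itself, so the proportion can never be zero.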
Compute the amount of sharing by magnitude, in all tissues, and separately in brain and non-brain tissues. “Sharing by magnitude” requires that the effect also is within a factor of 2 of the strongest effect.
sigmat <- (lfsr<=thresh)
nsig <- rowSums(sigmat)
magall <- mean(het.norm(pm.mash.beta[nsig>0,])>0.5)
sigmat <- (lfsr[,-c(7:16)]<=thresh)
nsig <- rowSums(sigmat)
magall.excludingbrain <- mean(het.norm(pm.mash.beta[nsig>0,-c(7:16)]) > 0.5)
sigmat <- (lfsr[,c(7:16)]<=thresh)
nsig <- rowSums(sigmat)
magall.brainonly <- mean(het.norm(pm.mash.beta[nsig>0,c(7:16)]) > 0.5)
sigmat <- (lfsr.nobrain<=thresh)
nsig <- rowSums(sigmat)
magnobrain <- mean(het.norm(pm.mash.nobrain[nsig>0,]) > 0.5)
sigmat <- (lfsr.brain.only<=thresh)
nsig <- rowSums(sigmat)
magbrain <- mean(het.norm(pm.mash.brain.only[nsig>0,]) > 0.5)
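To make the 0.5 cutoff concrete, the sketch below works through a single hypothetical eQTL. It assumes that `het.norm` (defined in `normfuncs.R`, not shown here) returns the ratio of each effect to the strongest effect in the row; the vector `effects` is made up for illustration.

```r
# Effects of one hypothetical eQTL in five tissues.
effects <- c(1.2, 0.9, 0.5, -0.7, 0.1)

# Ratio of each effect to the strongest effect (assumed to be what het.norm
# computes for each row).
ratio <- effects / effects[which.max(abs(effects))]

# Same sign as the strongest effect.
shared.by.sign <- ratio > 0

# Same sign, and within a factor of 2 of the strongest effect.
shared.by.magnitude <- ratio > 0.5
```

The fourth tissue has an effect larger than half the strongest effect in absolute value, but with the opposite sign, so it counts as shared neither by sign nor by magnitude. Every tissue shared by magnitude is also shared by sign.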
Summarize these calculations in a single table. The numbers in parentheses are obtained by the secondary mash analyses on the brain-only and non-brain tissue subsets.
round(matrix(rbind(c(signall,signall.nobrain,signnobrain,
signall.brainonly,signbrainonly),
c(magall,magall.excludingbrain,magnobrain,
magall.brainonly,magbrain)),
nrow = 2,ncol = 5,
dimnames = list(c("shared by sign","shared by magnitude"),
c("all tissues","non-brain","(non-brain)",
"brain","(brain)"))),
digits = 3)
#                     all tissues non-brain (non-brain) brain (brain)
# shared by sign            0.850     0.849       0.882 0.959   0.984
# shared by magnitude       0.359     0.398       0.445 0.764   0.859
The results confirm extensive eQTL sharing among tissues, particularly among the brain tissues; sharing in sign is approximately 85% or higher in all cases, and is as high as 96% among the brain tissues (98% in the brain-only mash analysis).
Sharing in magnitude is inevitably lower, because sharing in magnitude implies sharing in sign. Overall, on average 36% of tissues show an effect within a factor of 2 of the strongest effect at each top eQTL.
However, within brain tissues this number increases to 76%. That is, not only do eQTLs tend to be shared among the brain tissues, but the effect sizes tend to be quite homogeneous.
sessionInfo()
# R version 3.4.3 (2017-11-30)
# Platform: x86_64-apple-darwin15.6.0 (64-bit)
# Running under: macOS High Sierra 10.13.5
#
# Matrix products: default
# BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
# LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
#
# locale:
# [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
#
# loaded via a namespace (and not attached):
# [1] workflowr_1.0.1.9000 Rcpp_0.12.17 digest_0.6.15
# [4] rprojroot_1.3-2 R.methodsS3_1.7.1 backports_1.1.2
# [7] git2r_0.21.0 magrittr_1.5 evaluate_0.10.1
# [10] stringi_1.1.7 whisker_0.3-2 R.oo_1.21.0
# [13] R.utils_2.6.0 rmarkdown_1.9 tools_3.4.3
# [16] stringr_1.3.0 yaml_2.1.18 compiler_3.4.3
# [19] htmltools_0.3.6 knitr_1.20
This reproducible R Markdown analysis was created with workflowr 1.0.1.9000