<!DOCTYPE html>

<html xmlns="http://www.w3.org/1999/xhtml">

<head>

<meta charset="utf-8" />
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="pandoc" />


<meta name="author" content="Donghyung Lee" />

<meta name="date" content="2018-07-31" />

<title>Reviewer 2 Simulation</title>

<script src="site_libs/jquery-1.11.3/jquery.min.js"></script>
<meta name="viewport" content="width=device-width, initial-scale=1" />
<link href="site_libs/bootstrap-3.3.5/css/united.min.css" rel="stylesheet" />
<script src="site_libs/bootstrap-3.3.5/js/bootstrap.min.js"></script>
<script src="site_libs/bootstrap-3.3.5/shim/html5shiv.min.js"></script>
<script src="site_libs/bootstrap-3.3.5/shim/respond.min.js"></script>
<script src="site_libs/navigation-1.1/tabsets.js"></script>
<link href="site_libs/highlightjs-9.12.0/textmate.css" rel="stylesheet" />
<script src="site_libs/highlightjs-9.12.0/highlight.js"></script>
<link href="site_libs/font-awesome-4.5.0/css/font-awesome.min.css" rel="stylesheet" />

<style type="text/css">code{white-space: pre;}</style>
<style type="text/css">
  pre:not([class]) {
    background-color: white;
  }
</style>
<script type="text/javascript">
if (window.hljs) {
  hljs.configure({languages: []});
  hljs.initHighlightingOnLoad();
  if (document.readyState && document.readyState === "complete") {
    window.setTimeout(function() { hljs.initHighlighting(); }, 0);
  }
}
</script>



<style type="text/css">
h1 {
  font-size: 34px;
}
h1.title {
  font-size: 38px;
}
h2 {
  font-size: 30px;
}
h3 {
  font-size: 24px;
}
h4 {
  font-size: 18px;
}
h5 {
  font-size: 16px;
}
h6 {
  font-size: 12px;
}
.table th:not([align]) {
  text-align: left;
}
</style>


</head>

<body>

<style type = "text/css">
.main-container {
  max-width: 940px;
  margin-left: auto;
  margin-right: auto;
}
code {
  color: inherit;
  background-color: rgba(0, 0, 0, 0.04);
}
img {
  max-width:100%;
  height: auto;
}
.tabbed-pane {
  padding-top: 12px;
}
button.code-folding-btn:focus {
  outline: none;
}
</style>


<style type="text/css">
/* padding for bootstrap navbar */
body {
  padding-top: 51px;
  padding-bottom: 40px;
}
/* offset scroll position for anchor links (for fixed navbar)  */
.section h1 {
  padding-top: 56px;
  margin-top: -56px;
}

.section h2 {
  padding-top: 56px;
  margin-top: -56px;
}
.section h3 {
  padding-top: 56px;
  margin-top: -56px;
}
.section h4 {
  padding-top: 56px;
  margin-top: -56px;
}
.section h5 {
  padding-top: 56px;
  margin-top: -56px;
}
.section h6 {
  padding-top: 56px;
  margin-top: -56px;
}
</style>

<script>
// manage active state of menu based on current page
$(document).ready(function () {
  // active menu anchor
  href = window.location.pathname
  href = href.substr(href.lastIndexOf('/') + 1)
  if (href === "")
    href = "index.html";
  var menuAnchor = $('a[href="' + href + '"]');

  // mark it active
  menuAnchor.parent().addClass('active');

  // if it's got a parent navbar menu mark it active as well
  menuAnchor.closest('li.dropdown').addClass('active');
});
</script>


<div class="container-fluid main-container">

<!-- tabsets -->
<script>
$(document).ready(function () {
  window.buildTabsets("TOC");
});
</script>

<!-- code folding -->






<div class="navbar navbar-default  navbar-fixed-top" role="navigation">
  <div class="container">
    <div class="navbar-header">
      <button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar">
        <span class="icon-bar"></span>
        <span class="icon-bar"></span>
        <span class="icon-bar"></span>
      </button>
      <a class="navbar-brand" href="index.html">iasvaExamples</a>
    </div>
    <div id="navbar" class="navbar-collapse collapse">
      <ul class="nav navbar-nav">
        <li>
  <a href="index.html">Home</a>
</li>
<li>
  <a href="about.html">About</a>
</li>
<li>
  <a href="license.html">License</a>
</li>
      </ul>
      <ul class="nav navbar-nav navbar-right">
        <li>
  <a href="https://github.com/dleelab/iasvaExamples">
    <span class="fa fa-github"></span>
     
  </a>
</li>
      </ul>
    </div><!--/.nav-collapse -->
  </div><!--/.container -->
</div><!--/.navbar -->

<!-- Add a small amount of space between sections. -->
<style type="text/css">
div.section {
  padding-top: 12px;
}
</style>

<div class="fluid-row" id="header">



<h1 class="title toc-ignore">Reviewer 2 Simulation</h1>
<h4 class="author"><em>Donghyung Lee</em></h4>
<h4 class="date"><em>2018-07-31</em></h4>

</div>

<div id="TOC">
<ul>
<li><a href="#simulation-1">Simulation 1</a></li>
<li><a href="#compare-to-just-naively-taking-the-first-pc-of-the-residual-matrix.">Compare to just naively taking the first PC of the residual matrix.</a></li>
<li><a href="#compare-to-sva">Compare to SVA</a></li>
<li><a href="#simulation-2">Simulation 2</a></li>
<li><a href="#corrected-simulation-1-with-low-correlation-btw-known-and-hidden-factors">Corrected Simulation 1 with low correlation btw known and hidden factors</a></li>
<li><a href="#corrected-simulation-1-with-high-correlation-btw-known-and-hidden-factors">Corrected Simulation 1 with high correlation btw known and hidden factors</a></li>
<li><a href="#session-information">Session information</a></li>
</ul>
</div>

<p><strong>Last updated:</strong> 2018-07-31</p>
<strong>workflowr checks:</strong> <small>(Click a bullet for more information)</small>
<ul>
<li>
<p><details> <summary> <strong style="color:red;">✖</strong> <strong>R Markdown file:</strong> uncommitted changes </summary> The R Markdown is untracked by Git. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run <code>wflow_publish</code> to commit the R Markdown file and build the HTML.</p>
</details>
</li>
<li>
<p><details> <summary> <strong style="color:blue;">✔</strong> <strong>Environment:</strong> empty </summary></p>
<p>Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.</p>
</details>
</li>
<li>
<p><details> <summary> <strong style="color:blue;">✔</strong> <strong>Seed:</strong> <code>set.seed(20180731)</code> </summary></p>
<p>The command <code>set.seed(20180731)</code> was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.</p>
</details>
</li>
<li>
<p><details> <summary> <strong style="color:blue;">✔</strong> <strong>Session information:</strong> recorded </summary></p>
<p>Great job! Recording the operating system, R version, and package versions is critical for reproducibility.</p>
</details>
</li>
<li>
<p><details> <summary> <strong style="color:blue;">✔</strong> <strong>Repository version:</strong> <a href="https://github.com/dleelab/iasvaExamples/tree/e0cbded44bacfcefd286ef233b3a1ac52325a78c" target="_blank">e0cbded</a> </summary></p>
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated. <br><br> Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use <code>wflow_publish</code> or <code>wflow_git_commit</code>). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
<pre><code>
Ignored files:
    Ignored:    .DS_Store
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    data/.DS_Store
    Ignored:    inst/.DS_Store
    Ignored:    inst/doc/.DS_Store
    Ignored:    vignettes/.DS_Store

Untracked files:
    Untracked:  analysis/Brain_scRNASeq_neuron_vs_oligodendrocyte_single_run.Rmd
    Untracked:  analysis/detecting_hidden_heterogeneity.Rmd
    Untracked:  analysis/hidden_heterogeneity_glioblastoma.Rmd
    Untracked:  analysis/reviewer2_sim.Rmd
    Untracked:  analysis/scRNASeq_simulation.Rmd
    Untracked:  analysis/sim_study_KF_HF_genes_overlap.Rmd
    Untracked:  analysis/tSNE_post_IA-SVA_3celltypes.Rmd
    Untracked:  analysis/tSNE_post_IA-SVA_Xin_Islets.Rmd

Unstaged changes:
    Modified:   analysis/_site.yml
    Modified:   analysis/index.Rmd

</code></pre>
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes. </details>
</li>
</ul>
<hr />
<div id="simulation-1" class="section level2">
<h2>Simulation 1</h2>
<pre class="r"><code># Setting up the known and unknown factors. 
set.seed(10000) 
known &lt;- runif(1000) 
unknown &lt;- runif(1000, 0, 0.1) # weak unknown factor 

# Generating Poisson count data from log-normal means. 
ngenes &lt;- 5000 
known.effect &lt;- rnorm(ngenes, sd=2) 
unknown.effect &lt;- rnorm(ngenes, sd=2) 
mat &lt;- outer(known.effect, known) + 
outer(unknown.effect, unknown) + 
2 # to get decent-sized counts 

counts &lt;- matrix(rpois(length(mat), lambda=2^mat), nrow=ngenes) 
library(SummarizedExperiment) 
se &lt;- SummarizedExperiment(list(counts=counts)) 

# Running IA-SVA, version 0.99.3. 
library(iasva) 
design &lt;- model.matrix(~known) 
res &lt;- iasva(se, design, num.sv=5, permute = FALSE) 
# IA-SVA running...
# 
# SV 1 Detected!
# 
# SV 2 Detected!
# 
# SV 3 Detected!
# 
# SV 4 Detected!
# 
# SV 5 Detected!
# 
# # of significant surrogate variables: 5
plot(res$sv[,1], unknown) </code></pre>
<p><img src="figure/reviewer2_sim.Rmd/sim1-1.png" width="672" style="display: block; margin: auto;" /></p>
<pre class="r"><code>
cor(res$sv[,1], unknown) # these SVs are not the unknown factor (correlation ~= 0) 
# [1] 0.05699341
cor(res$sv[,2], unknown) 
# [1] -0.1317231
cor(res$sv[,3], unknown) 
# [1] 0.06301151
cor(res$sv[,4], unknown) 
# [1] -0.08444932
cor(res$sv[,5], unknown) 
# [1] 0.1239618

cor(res$sv[,1], known) # ... but are instead the known factor (correlation ~= +/-1) 
# [1] -0.9988192
cor(res$sv[,2], known) 
# [1] 0.9971117
cor(res$sv[,3], known) 
# [1] -0.9558875
cor(res$sv[,4], known) 
# [1] 0.9982102
cor(res$sv[,5], known) 
# [1] -0.9934994</code></pre>
</div>
<div id="compare-to-just-naively-taking-the-first-pc-of-the-residual-matrix." class="section level2">
<h2>Compare to just naively taking the first PC of the residual matrix.</h2>
<pre class="r"><code>library(limma) 
# 
# Attaching package: &#39;limma&#39;
# The following object is masked from &#39;package:BiocGenerics&#39;:
# 
#     plotMA
resid &lt;- removeBatchEffect(log(assay(se)+1), covariates=known) 
pr.out &lt;- prcomp(t(resid), rank.=1) 
plot(pr.out$x[,1], unknown) </code></pre>
<p><img src="figure/reviewer2_sim.Rmd/firstPC_residual-1.png" width="672" style="display: block; margin: auto;" /></p>
<pre class="r"><code>cor(pr.out$x[,1], unknown) # close to (-)1 
# [1] 0.9766822
cor(pr.out$x[,1], known) # close to zero. 
# [1] 2.35573e-17</code></pre>
</div>
<div id="compare-to-sva" class="section level2">
<h2>Compare to SVA</h2>
<pre class="r"><code>library(sva)
mod1 &lt;- model.matrix(~known)
mod0 &lt;- cbind(mod1[,1])
sva.res = svaseq(counts,mod1,mod0, n.sv=5)$sv
# Number of significant surrogate variables is:  5 
# Iteration (out of 5 ):1  2  3  4  5
plot(sva.res[,1], unknown)</code></pre>
<p><img src="figure/reviewer2_sim.Rmd/existing_methods-1.png" width="672" style="display: block; margin: auto;" /></p>
<pre class="r"><code>plot(sva.res[,1], known)</code></pre>
<p><img src="figure/reviewer2_sim.Rmd/existing_methods-2.png" width="672" style="display: block; margin: auto;" /></p>
<pre class="r"><code>cor(sva.res[,1], unknown) # -0.55
# [1] -0.5522043
cor(sva.res[,1], known) # close to zero. 
# [1] -0.04603016</code></pre>
</div>
<div id="simulation-2" class="section level2">
<h2>Simulation 2</h2>
<pre class="r"><code>set.seed(10000) 
known &lt;- runif(1000) 
unknown &lt;- runif(1000) 

ngenes &lt;- 5000 
known.effect &lt;- rnorm(ngenes, sd=2) 
unknown.effect &lt;- rnorm(ngenes, sd=2) 
mat &lt;- outer(known.effect, known) + 
outer(unknown.effect, unknown) + 
2 # to get decent-sized counts 

counts &lt;- matrix(rpois(length(mat), lambda=2^mat), nrow=ngenes) 
library(SummarizedExperiment) 
se &lt;- SummarizedExperiment(list(counts=counts)) 

library(iasva) 
design &lt;- model.matrix(~known) 
res &lt;- iasva(se, design, num.sv=5, permute = FALSE) 
# IA-SVA running...
# 
# SV 1 Detected!
# 
# SV 2 Detected!
# 
# SV 3 Detected!
# 
# SV 4 Detected!
# 
# SV 5 Detected!
# 
# # of significant surrogate variables: 5

cor(res$sv[,1], unknown) # first SV is the unknown factor. 
# [1] -0.9960557
cor(res$sv[,2], unknown) 
# [1] 0.7637275
cor(res$sv[,3], unknown) 
# [1] 0.7564822
cor(res$sv[,4], unknown) 
# [1] -0.7629373
cor(res$sv[,5], unknown) 
# [1] 0.4760279

cor(res$sv[,1], known) 
# [1] -0.004959888
cor(res$sv[,2], known) # but SVs 2-5 are correlated with the known factor... 
# [1] -0.7014209
cor(res$sv[,3], known) 
# [1] -0.7076734
cor(res$sv[,4], known) 
# [1] -0.476617
cor(res$sv[,5], known) 
# [1] -0.8898334


# Compare to SVA
library(sva)
mod1 &lt;- model.matrix(~known)
mod0 &lt;- cbind(mod1[,1])
sva.res = svaseq(counts,mod1,mod0, n.sv=3)$sv
# Number of significant surrogate variables is:  3 
# Iteration (out of 5 ):1  2  3  4  5
plot(sva.res[,1], unknown)</code></pre>
<p><img src="figure/reviewer2_sim.Rmd/sim2-1.png" width="672" style="display: block; margin: auto;" /></p>
<pre class="r"><code>plot(sva.res[,1], known)</code></pre>
<p><img src="figure/reviewer2_sim.Rmd/sim2-2.png" width="672" style="display: block; margin: auto;" /></p>
<pre class="r"><code>cor(sva.res[,1], unknown) # close to (-)1 
# [1] 0.989786
cor(sva.res[,2], unknown)
# [1] 0.03346006
cor(sva.res[,3], unknown) 
# [1] -0.01770404

cor(sva.res[,1], known) # close to zero. 
# [1] 0.0007217181
cor(sva.res[,2], known)
# [1] -0.04076436
cor(sva.res[,3], known)
# [1] -0.04114449</code></pre>
</div>
<div id="corrected-simulation-1-with-low-correlation-btw-known-and-hidden-factors" class="section level2">
<h2>Corrected Simulation 1 with low correlation btw known and hidden factors</h2>
<pre class="r"><code># Setting up the known and unknown factors. 
set.seed(10000) 
sample.size &lt;- 100
ngenes &lt;- 5000
# CORRECTEION: Here, we simulate factors with two levels (0 or 1).
factor.prop &lt;- 0.5
flip.prob &lt;- 0.5 ## to make no correlation btw known and hidden factors
known &lt;- c(rep(-1,each=sample.size*factor.prop),rep(1,each=sample.size-(sample.size*factor.prop)))
coinflip = rbinom(sample.size,size=1,prob=flip.prob)
unknown = known*coinflip + -known*(1-coinflip)
cor(known, unknown)
# [1] -0.06030227

# Generating Poisson count data from log-normal means. 
ngenes &lt;- 5000 
known.effect &lt;- rnorm(ngenes, sd=2) 
unknown.effect &lt;- rnorm(ngenes, sd=2) ## CORRECTION: to make factor with weach effect size, we use 1 as SD here.

mat &lt;- outer(known.effect, known) + outer(unknown.effect, unknown) + 2 # to get decent-sized counts 

counts &lt;- matrix(rpois(length(mat), lambda=2^mat), nrow=ngenes)
se &lt;- SummarizedExperiment(assays=counts) 

# Running IA-SVA, version 0.99.3. 
library(iasva) 
design &lt;- model.matrix(~known) 
## CORRECTION: IASVA doesn&#39;t accept the constant term of design matrix, so we use design[,-1] instead of design
res &lt;- iasva(se, as.matrix(design[,-1]), num.sv=5, permute=FALSE) #num.p=20, threads=8) 
# IA-SVA running...
# 
# SV 1 Detected!
# 
# SV 2 Detected!
# 
# SV 3 Detected!
# 
# SV 4 Detected!
# 
# SV 5 Detected!
# 
# # of significant surrogate variables: 5
plot(res$sv[,1], unknown) </code></pre>
<p><img src="figure/reviewer2_sim.Rmd/sim1_corrected-1.png" width="672" style="display: block; margin: auto;" /></p>
<pre class="r"><code>
cor(res$sv[,1], unknown) # these SVs are not the unknown factor (correlation ~= 0) 
# [1] -0.9978671
cor(res$sv[,2], unknown) 
# [1] -0.8506371
cor(res$sv[,3], unknown) 
# [1] -0.9022145
cor(res$sv[,4], unknown) 
# [1] -0.8897027
cor(res$sv[,5], unknown) 
# [1] 0.7309927

cor(res$sv[,1], known) # ... but are instead the known factor (correlation ~= +/-1) 
# [1] 0.1251865
cor(res$sv[,2], known) 
# [1] 0.5759605
cor(res$sv[,3], known) 
# [1] 0.4847508
cor(res$sv[,4], known) 
# [1] 0.5043688
cor(res$sv[,5], known) 
# [1] -0.7206372

# Compare to just naively taking the first PC of the residual matrix. 
library(limma) 
resid &lt;- removeBatchEffect(log(assay(se)+1), covariates=known) 
pr.out &lt;- prcomp(t(resid), rank.=1) 
plot(pr.out$x[,1], unknown) </code></pre>
<p><img src="figure/reviewer2_sim.Rmd/sim1_corrected-2.png" width="672" style="display: block; margin: auto;" /></p>
<pre class="r"><code>cor(pr.out$x[,1], unknown) # close to (-)1 
# [1] -0.9981718
cor(pr.out$x[,1], known) # close to zero. 
# [1] -7.096324e-18


# Compare to SVA
library(sva)
mod1 &lt;- model.matrix(~known)
mod0 &lt;- cbind(mod1[,1])
sva.res = svaseq(counts,mod1,mod0, n.sv=3)$sv
# Number of significant surrogate variables is:  3 
# Iteration (out of 5 ):1  2  3  4  5
plot(sva.res[,1], unknown)</code></pre>
<p><img src="figure/reviewer2_sim.Rmd/sim1_corrected-3.png" width="672" style="display: block; margin: auto;" /></p>
<pre class="r"><code>plot(sva.res[,1], known)</code></pre>
<p><img src="figure/reviewer2_sim.Rmd/sim1_corrected-4.png" width="672" style="display: block; margin: auto;" /></p>
<pre class="r"><code>cor(sva.res[,1], unknown) # close to (-)1 
# [1] 0.9976468
cor(sva.res[,1], known) # close to zero. 
# [1] 0.005334052</code></pre>
</div>
<div id="corrected-simulation-1-with-high-correlation-btw-known-and-hidden-factors" class="section level2">
<h2>Corrected Simulation 1 with high correlation btw known and hidden factors</h2>
<pre class="r"><code># Setting up the known and unknown factors. 
set.seed(10000) 
sample.size &lt;- 100 ##CORRECTION, to reduce the computational burden, we used smaller sample size.
ngenes &lt;- 5000
# CORRECTEION: Here, we simulate factors with two levels (0 or 1).
factor.prop &lt;- 0.5
flip.prob &lt;- 0.8 ## to make no correlation btw known and hidden factors
known &lt;- c(rep(-1,each=sample.size*factor.prop),rep(1,each=sample.size-(sample.size*factor.prop)))
coinflip = rbinom(sample.size,size=1,prob=flip.prob)
unknown = known*coinflip + -known*(1-coinflip)
cor(known, unknown)
# [1] 0.6940221

# Generating Poisson count data from log-normal means. 
ngenes &lt;- 5000 
known.effect &lt;- rnorm(ngenes, sd=2) 
unknown.effect &lt;- rnorm(ngenes, sd=2) ## CORRECTION: to make factor with weach effect size, we use 1 as SD here.

mat &lt;- outer(known.effect, known) + outer(unknown.effect, unknown) + 2 # to get decent-sized counts 

counts &lt;- matrix(rpois(length(mat), lambda=2^mat), nrow=ngenes)
se &lt;- SummarizedExperiment(assays=counts) 

# Running IA-SVA, version 0.99.3. 
library(iasva) 
design &lt;- model.matrix(~known) 
## CORRECTION: IASVA doesn&#39;t accept the constant term of design matrix, so we use design[,-1] instead of design
res &lt;- iasva(se, as.matrix(design[,-1]), num.sv=5, permute=FALSE) #num.p=20, threads=8) 
# IA-SVA running...
# 
# SV 1 Detected!
# 
# SV 2 Detected!
# 
# SV 3 Detected!
# 
# SV 4 Detected!
# 
# SV 5 Detected!
# 
# # of significant surrogate variables: 5

plot(res$sv[,1], unknown) </code></pre>
<p><img src="figure/reviewer2_sim.Rmd/sim1_corrected_high_corr-1.png" width="672" style="display: block; margin: auto;" /></p>
<pre class="r"><code>cor(res$sv[,1], unknown) # accurately estimate the hidden factor (correlation ~= +/-1) 
# [1] 0.9787781
cor(res$sv[,2], unknown) 
# [1] 0.9238274
cor(res$sv[,3], unknown) 
# [1] 0.9272298
cor(res$sv[,4], unknown) 
# [1] 0.9283517
cor(res$sv[,5], unknown) 
# [1] -0.8907946

cor(res$sv[,1], known) # ... but are instead the known factor (correlation ~= +/-1) 
# [1] 0.8263641
cor(res$sv[,2], known) 
# [1] 0.9164418
cor(res$sv[,3], known) 
# [1] 0.91305
cor(res$sv[,4], known) 
# [1] 0.9116705
cor(res$sv[,5], known) 
# [1] -0.9448956

res1 &lt;- iasva(se, as.matrix(design[,-1]), threads=8) 
# IA-SVA running...
# 
# SV 1 Detected!
# 
# SV 2 Detected!
# 
# # of significant surrogate variables: 2
plot(res1$sv[,1], unknown) </code></pre>
<p><img src="figure/reviewer2_sim.Rmd/sim1_corrected_high_corr-2.png" width="672" style="display: block; margin: auto;" /></p>
<pre class="r"><code>cor(res1$sv[,1], unknown) # accurately estimate the hidden factor (correlation ~= +/-1) 
# [1] 0.9787781
cor(res1$sv[,2], unknown) # 
# [1] -0.9238274

# Compare to just naively taking the first PC of the residual matrix. 
library(limma) 
resid &lt;- removeBatchEffect(log(assay(se)+1), covariates=known) 
pr.out &lt;- prcomp(t(resid), rank.=1) 
plot(pr.out$x[,1], unknown) </code></pre>
<p><img src="figure/reviewer2_sim.Rmd/sim1_corrected_high_corr-3.png" width="672" style="display: block; margin: auto;" /></p>
<pre class="r"><code>cor(pr.out$x[,1], unknown) # close to (-)1 
# [1] 0.719475
cor(pr.out$x[,1], known) # close to zero. 
# [1] 3.376059e-16


# Compare to SVA
library(sva)
mod1 &lt;- model.matrix(~known)
mod0 &lt;- cbind(mod1[,1])
sva.res = svaseq(counts,mod1,mod0, n.sv=3)$sv
# Number of significant surrogate variables is:  3 
# Iteration (out of 5 ):1  2  3  4  5
plot(sva.res[,1], unknown)</code></pre>
<p><img src="figure/reviewer2_sim.Rmd/sim1_corrected_high_corr-4.png" width="672" style="display: block; margin: auto;" /></p>
<pre class="r"><code>plot(sva.res[,1], known)</code></pre>
<p><img src="figure/reviewer2_sim.Rmd/sim1_corrected_high_corr-5.png" width="672" style="display: block; margin: auto;" /></p>
<pre class="r"><code>cor(sva.res[,1], unknown) # close to (-)1 
# [1] 0.7275183
cor(sva.res[,1], known) # close to zero. 
# [1] 0.01369329</code></pre>
</div>
<div id="session-information" class="section level2">
<h2>Session information</h2>
<pre class="r"><code>sessionInfo()
# R version 3.5.0 (2018-04-23)
# Platform: x86_64-apple-darwin15.6.0 (64-bit)
# Running under: macOS Sierra 10.12.6
# 
# Matrix products: default
# BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
# LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
# 
# locale:
# [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
# 
# attached base packages:
# [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
# [8] methods   base     
# 
# other attached packages:
#  [1] limma_3.36.2                SummarizedExperiment_1.10.1
#  [3] DelayedArray_0.6.1          matrixStats_0.53.1         
#  [5] Biobase_2.40.0              GenomicRanges_1.32.3       
#  [7] GenomeInfoDb_1.16.0         IRanges_2.14.10            
#  [9] S4Vectors_0.18.3            BiocGenerics_0.26.0        
# [11] sva_3.28.0                  BiocParallel_1.14.2        
# [13] genefilter_1.62.0           mgcv_1.8-23                
# [15] nlme_3.1-137                iasva_0.99.3               
# 
# loaded via a namespace (and not attached):
#  [1] Rcpp_0.12.17           compiler_3.5.0         git2r_0.21.0          
#  [4] workflowr_1.0.1        XVector_0.20.0         R.methodsS3_1.7.1     
#  [7] R.utils_2.6.0          bitops_1.0-6           tools_3.5.0           
# [10] zlibbioc_1.26.0        bit_1.1-14             digest_0.6.15         
# [13] memoise_1.1.0          RSQLite_2.1.1          annotate_1.58.0       
# [16] evaluate_0.10.1        lattice_0.20-35        Matrix_1.2-14         
# [19] DBI_1.0.0              yaml_2.1.19            GenomeInfoDbData_1.1.0
# [22] stringr_1.3.1          knitr_1.20             cluster_2.0.7-1       
# [25] bit64_0.9-7            rprojroot_1.3-2        grid_3.5.0            
# [28] AnnotationDbi_1.42.1   survival_2.42-3        XML_3.98-1.11         
# [31] rmarkdown_1.9          irlba_2.3.2            blob_1.1.1            
# [34] magrittr_1.5           whisker_0.3-2          splines_3.5.0         
# [37] backports_1.1.2        htmltools_0.3.6        xtable_1.8-2          
# [40] stringi_1.2.2          RCurl_1.95-4.10        R.oo_1.22.0</code></pre>
</div>

<!-- Adjust MathJax settings so that all math formulae are shown using
TeX fonts only; see
http://docs.mathjax.org/en/latest/configuration.html.  This will make
the presentation more consistent at the cost of the webpage sometimes
taking slightly longer to load. Note that this only works because the
footer is added to webpages before the MathJax javascript. -->
<script type="text/x-mathjax-config">
  MathJax.Hub.Config({
    "HTML-CSS": { availableFonts: ["TeX"] }
  });
</script>

<hr>
<p>
  This reproducible <a href="http://rmarkdown.rstudio.com">R Markdown</a>
  analysis was created with
  <a href="https://github.com/jdblischak/workflowr">workflowr</a> 1.0.1
</p>
<hr>



</div>

<script>

// add bootstrap table styles to pandoc tables
function bootstrapStylePandocTables() {
  $('tr.header').parent('thead').parent('table').addClass('table table-condensed');
}
$(document).ready(function () {
  bootstrapStylePandocTables();
});


</script>

<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
  (function () {
    var script = document.createElement("script");
    script.type = "text/javascript";
    script.src  = "https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
    document.getElementsByTagName("head")[0].appendChild(script);
  })();
</script>

</body>
</html>