Last updated: 2018-11-05
workflowr checks: (Click a bullet for more information) ✔ R Markdown file: up-to-date
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
✔ Environment: empty
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
✔ Seed:
set.seed(20180714)
The command set.seed(20180714)
was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
✔ Session information: recorded
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
✔ Repository version: 7e20469
wflow_publish
or wflow_git_commit
). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .DS_Store
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: docs/.DS_Store
Ignored: docs/figure/.DS_Store
Untracked files:
Untracked: analysis/gd_notes.Rmd
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
Let \(Y\) have zeroes where data is missing and let \(Z\) be an indicator matrix with ones where data is non-missing and zeroes where data is missing. Write
\[ \begin{aligned} R^2 &= Y^2 - 2 Y \odot (EL) (EF)' + Z \odot ((EL) (EF)')^2 + Z \odot (EL^2) (EF^2)' - Z \odot (EL)^2 ((EF)^2)' \\ &= Y^2 - 2 Y \odot (EL) (EF)' + Z \odot (EL^2) (EF^2)' \end{aligned} \]
Let \(m\) be the number of non-missing entries in \(Y\). We can write
\[\begin{aligned} \sigma^2 = \frac{1}{m} ( \text{sum}(Y^2) - 2 \cdot \text{sum}((EL) \odot Y (EF)) + \text{sum}((EL^2) \odot Z (EF^2)) ) \end{aligned}\]
Now let \(m\) be an \(n\)-vector, where \(m_i\) denotes the number of non-missing entries in row \(Y_{i \bullet}\). \(\sigma^2\) is now an \(n\)-vector:
\[\begin{aligned} \sigma^2 = (1 / m) \odot (\text{rowSums}(Y^2) - 2 \cdot \text{rowSums}((EL) \odot Y (EF)) + \text{rowSums}((EL^2) \odot Z (EF^2)) \end{aligned}\]
With \(m\) a \(p\)-vector, where \(m_j\) denotes the number of non-missing entries in column \(Y_{\bullet j}\):
\[\begin{aligned} \sigma^2 = (1 / m) \odot (\text{colSums}(Y^2) - 2 \cdot \text{colSums}(t(EF) \odot (t(EL))Y) + \text{colSums}(t(EF^2) \odot (t(EL^2))Z)) \end{aligned}\]
(I write it in this form to avoid transposing \(Y\).)
I each case, the most expensive operations are the two multiplications of an \(n \times p\) matrix by a \(p \times k\) matrix. The one involving \(Z\) is very cheap if there is no missing data. The old implementation required three multiplications of a \(n \times k\) by a \(k \times p\) matrix. More importantly, though, the new implementation does not require the formation of any new \(n \times p\) matrices.
In the course of optimizing a single factor/loading pair, one can exploit the fact that only one column in each of \(EL\), \(EF\), \(EL^2\), and \(EF^2\) change.
Instead of updating loading 1, then factor 1, then loading 2, etc., one can much more efficiently update all loadings (potentially in parallel), then all factors.
s2
sThe old algorithm calculates s2 = 1/(tau %*% f$EF2[, k])
, where the entries of tau
corresponding to missing data have been set to zero.
Let \(S\) be the matrix of s2
s, with the \(i\)th column giving the s2
s for the \(i\)th loading. Then for var_type = constant
or var_type = by_row
, \(S\) may be calculated as \[ S = \sigma^2 / Z(EF^2), \] where “\(/\)” is elementwise for var_type = by_row
. Again the calculation is simplified if there is no missing data: here, \(Z(EF^2)\) is simply a matrix where each row is the \(k\)-vector \(\text{colSums}(EF^2)\).
For var_type = by_column
, \[ S = 1 / Z(\tau \odot_b EF^2), \] where \(\odot_b\) denotes that broadcasting is to be performed (in the sense that R
uses it).
This is essentially the same as before, but tau
does not need to be stored as a matrix.
x
sThe old algorithm calculates x = ((Rk * tau) %*% f$EF[, k]) * s2
, which requires storing and updating Rk
. For var_type = constant
and var_type = by_row
, one may write:
\[ X_{\bullet k} = \tau \odot (Y - \sum_{i: i \ne k} Z \odot (EL)_{\bullet i}(EF)_{\bullet i}') (EF)_{\bullet k} \odot S_{\bullet k}. \]
Note that \((Z \odot (EL)_{\bullet i} (EF)_{\bullet i}') (EF)_{\bullet k}\) can be written as \[ (EL)_{\bullet i} \odot Z ((EF)_{\bullet i} \odot (EF)_{\bullet k}) \] so \[\left(\sum_{i: i \ne k} Z \odot (EL)_{\bullet i}(EF)_{\bullet i}'\right) (EF)_{\bullet k} = \text{rowSums} \left((EL)_{\bullet -k} \odot Z((EF)_{\bullet k} \odot_b (EF)_{\bullet -k}) \right).\]
Again, the performance savings are not likely to be substantial unless there is no missing data, but this method does not require the formation of any new \(n \times p\) matrices.
For var_type = by_column
, write \[ X_{\bullet k} = (Y - \sum_{i: i \ne k} Z \odot (EL)_{\bullet i}(EF)_{\bullet i}') (\tau \odot (EF)_{\bullet k}) \odot S_{\bullet k}. \]
The above could also be implemented as follows:
Calculate the \(n \times k\) matrix \(W = (Y - Z \odot (EL) (EF)')(EF)\). When there is missing data, this does require the temporary formation of a new \(n \times p\) matrix, but it does not need to be stored.
\((Z \odot (EL)_{\bullet k} (EF)_{\bullet k}') (EF)_{\bullet k}\) can be written as \[ (EL)_{\bullet k} \odot Z ((EF)_{\bullet k}^2). \] Form the \(n \times k\) matrix \(U = (EL) \odot Z(EF)^2\) in one go.
For \(k = 1\), calculate \[ X_{\bullet 1} = \tau \odot (W_{\bullet 1} + U_{\bullet 1}) \odot S_{\bullet 1}\] and solve the EBNM problem. This changes \((EL)_{\bullet k}\), so \(W\) needs to be updated:
\[\begin{aligned} W^{\text{new}} &= W^{\text{old}} + \left( Z \odot ((EL)_{\bullet k}^{\text{old}} - (EL)_{\bullet k}^{\text{new}}) (EF)_{\bullet k}' \right) (EF) \\ &= W^{\text{old}} + ((EL)_{\bullet k}^{\text{old}} - (EL)_{\bullet k}^{\text{new}}) \odot Z((EF)_{\bullet k} \odot_b (EF)), \end{aligned}\]
Then repeat for \(k = 2, 3, \ldots\)
But if one simply omits the update of of \(W\), parallelization is simple. If the updates of \((EL)\) are small, then this omission might be justified.
Factor updates can easily be obtained by reversing the roles of, on the one hand, \(EL\) and \(EF\) and, on the other, var_type = by_row
and var_type = by_column
.
TODO
With this implementation, the only \(n \times p\) matrices that ever need to be handled are \(Y\) and, if data is missing, \(Z\) (and \(Z\) can be stored as a matrix of integers). So it should be possible to fit a flash object using only slightly more memory than that required to store the data itself.
sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] workflowr_1.0.1 Rcpp_0.12.19 digest_0.6.15
[4] rprojroot_1.3-2 R.methodsS3_1.7.1 backports_1.1.2
[7] git2r_0.21.0 magrittr_1.5 evaluate_0.10.1
[10] stringi_1.2.4 whisker_0.3-2 R.oo_1.21.0
[13] R.utils_2.6.0 rmarkdown_1.8 tools_3.4.3
[16] stringr_1.3.0 yaml_2.1.17 compiler_3.4.3
[19] htmltools_0.3.6 knitr_1.20
This reproducible R Markdown analysis was created with workflowr 1.0.1