Knockoff on Small Signals

Last updated: 2018-02-07

Code version: 7f18ee7

Introduction

In the simulations of the Knockoff paper, each \(\beta_j\) is either \(0\) or \(A\). Here we replicate those results and investigate how well Knockoff deals with small signals.

In the following simulations, we always have \(n = 3000\) and \(p = 1000\). The design matrix \(X_{n \times p}\) has independent columns simulated from \(N(0, 1)\) and then normalized so that \(\|X_j\|_2^2 \equiv 1\). Given \(\beta\), the response is \(Y_n \sim N(X_{n\times p}\beta_p, I_n)\). We consider three scenarios for the \(p = 1000\) coefficients \(\beta_j\).

Scenario 1: \(950\) are \(0\), \(25\) are \(3.5\), and \(25\) are \(-3.5\). (This replicates a data point in Fig. 3 of the Knockoff paper.)
Scenario 2: \(900\) are \(0\), and \(100\) are drawn from \((N(0, 5.19^2) \wedge 3.5) \vee (-3.5)\), that is, drawn from \(N(0, 5.19^2)\) and then truncated to \([-3.5, 3.5]\). The standard deviation is chosen so that, on average, \(50\) of the draws fall outside \([-3.5, 3.5]\) and are therefore set to \(3.5\) or \(-3.5\).
Scenario 3: The same as Scenario 2, except that \(800\) are \(0\) and \(200\) are drawn from \((N(0, 3.04^2) \wedge 3.5) \vee (-3.5)\), so that on average \(50\) of them are still equal to \(3.5\) or \(-3.5\).
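The setup above can be sketched in base R as follows. This is a minimal illustration rather than the code behind the reported results: the helper name `make_beta`, the seed, and the choice to place signals at uniformly random positions are assumptions made here. The sketch also checks that the two standard deviations give roughly \(50\) truncated signals on average.

```r
set.seed(1)  # arbitrary seed, only so this sketch is reproducible

n <- 3000
p <- 1000
A <- 3.5

# Hypothetical helper: n_signal coefficients drawn from N(0, sigma^2) and
# clipped to [-A, A]; all other coefficients are zero.
# Assumption: signal positions are chosen uniformly at random.
make_beta <- function(p, n_signal, sigma, A = 3.5) {
  beta <- numeric(p)
  idx <- sample(p, n_signal)
  beta[idx] <- pmax(pmin(rnorm(n_signal, sd = sigma), A), -A)
  beta
}

# Scenario 1: 25 coefficients equal to A, 25 equal to -A, 950 zeros
beta1 <- numeric(p)
beta1[sample(p, 50)] <- rep(c(A, -A), each = 25)

# Scenarios 2 and 3: clipped normal signals
beta2 <- make_beta(p, 100, 5.19)
beta3 <- make_beta(p, 200, 3.04)

# Expected number of draws clipped to +/- A in Scenarios 2 and 3
expected_large <- c(
  scenario2 = 100 * 2 * pnorm(-A / 5.19),
  scenario3 = 200 * 2 * pnorm(-A / 3.04)
)

# Design matrix: independent N(0, 1) entries, columns rescaled to unit norm
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
X <- sweep(X, 2, sqrt(colSums(X^2)), "/")

# Response with standard normal noise, here for Scenario 2
y <- drop(X %*% beta2) + rnorm(n)
```

With unit-norm columns and unit noise variance, each least-squares estimate has standard deviation close to \(1\), so a coefficient of \(3.5\) corresponds roughly to a \(z\)-score of \(3.5\).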

Scenario 1: 50 large signals, no small signals, 950 zeroes.

Scenario 2: 50 large signals, 50 small signals, 900 zeroes.

Scenario 3: 50 large signals, 150 small signals, 800 zeroes.

Scenario	FDP.BH	FDP.Knockoff	Power.BH	Power.Knockoff	Power.Large.BH	Power.Large.Knockoff	Power.Small.BH	Power.Small.Knockoff
1	0.0914	0.0623	0.4198	0.4398	0.4198	0.4398	NA	NA
2	0.0880	0.0458	0.2970	0.2623	0.4748	0.4132	0.1210	0.1126
3	0.0754	0.0342	0.2216	0.1641	0.5206	0.3929	0.1236	0.0886
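The columns of the table can be read with the help of a small sketch. The function below computes the false discovery proportion and the power for one set of selected variables; `evaluate_selection` is a hypothetical name, and treating "large" as \(|\beta_j| = 3.5\) and "small" as \(0 < |\beta_j| < 3.5\) is an assumption inferred from the scenario descriptions. The reported numbers are presumably averages of such quantities over repeated simulations.

```r
# Hypothetical helper: FDP and power for one selection.
#   selected: integer indices of the selected variables
#   beta:     true coefficient vector
#   A:        magnitude that defines a "large" signal (assumed to be 3.5)
evaluate_selection <- function(selected, beta, A = 3.5) {
  is_null  <- beta == 0
  is_large <- abs(beta) >= A
  is_small <- !is_null & !is_large
  sel <- seq_along(beta) %in% selected

  # Fraction of a signal group that is selected; NA if the group is empty
  power <- function(mask) if (any(mask)) sum(sel & mask) / sum(mask) else NA

  c(
    FDP         = sum(sel & is_null) / max(1, sum(sel)),
    Power       = power(!is_null),
    Power.Large = power(is_large),
    Power.Small = power(is_small)
  )
}

# Toy usage: six variables, three of them selected
beta_toy <- c(0, 0, 3.5, -3.5, 1, 0)
res_toy <- evaluate_selection(c(1, 3, 5), beta_toy)
```

In Scenario 1 there are no small signals, so the small-signal group is empty and the corresponding power is `NA`, as in the first row of the table.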

Fixed \(X\) simulations

Scenario 1: 50 large signals, no small signals, 950 zeroes.

Scenario 2: 50 large signals, 50 small signals, 900 zeroes.

Scenario 3: 50 large signals, 150 small signals, 800 zeroes.

Scenario	FDP.BH	FDP.Knockoff	FDP.Knockoff.Plus	Power.BH	Power.Knockoff	Power.Knockoff.Plus	Power.Large.BH	Power.Large.Knockoff	Power.Large.Knockoff.Plus	Power.Small.BH	Power.Small.Knockoff	Power.Small.Knockoff.Plus
1	0.0783	0.0645	0.0399	0.4290	0.5292	0.3730	0.4290	0.5292	0.3730	NA	NA	NA
2	0.0901	0.0605	0.0389	0.2927	0.3260	0.2307	0.4592	0.5194	0.3684	0.1262	0.1326	0.0930
3	0.0767	0.0457	0.0330	0.2406	0.2118	0.1637	0.5300	0.4908	0.3916	0.1441	0.1187	0.0877
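The Knockoff and Knockoff.Plus columns presumably differ only in the data-dependent threshold applied to the knockoff statistics \(W_j\). As a sketch under that assumption, the threshold from the Knockoff paper is the smallest \(t\) among the nonzero \(|W_j|\) such that \((\text{offset} + \#\{j: W_j \le -t\}) / \max(1, \#\{j: W_j \ge t\}) \le q\), with offset \(0\) for Knockoff and \(1\) for Knockoff+. The function name `knockoff_threshold` and the default \(q = 0.1\) are illustrative choices, not taken from this page.

```r
# Hypothetical helper: knockoff selection threshold.
#   W:      vector of knockoff statistics (large positive values favor a signal)
#   q:      target FDR level (0.1 is an assumed default)
#   offset: 0 for the Knockoff threshold, 1 for the Knockoff+ threshold
knockoff_threshold <- function(W, q = 0.1, offset = 1) {
  # Candidate thresholds: the distinct nonzero magnitudes of W, in increasing order
  ts <- sort(unique(abs(W[W != 0])))
  ratio <- vapply(
    ts,
    function(t) (offset + sum(W <= -t)) / max(1, sum(W >= t)),
    numeric(1)
  )
  ok <- which(ratio <= q)
  if (length(ok) > 0) ts[ok[1]] else Inf
}

# Toy usage: the same statistics under both thresholds
W_toy <- c(5, 4, 3, 2, -1)
T_knockoff      <- knockoff_threshold(W_toy, q = 0.2, offset = 0)
T_knockoff_plus <- knockoff_threshold(W_toy, q = 0.2, offset = 1)

# Variables with W_j at or above the threshold are selected
selected_knockoff      <- which(W_toy >= T_knockoff)
selected_knockoff_plus <- which(W_toy >= T_knockoff_plus)
```

Because the offset enlarges the numerator, the Knockoff+ threshold is never smaller than the Knockoff threshold, which is consistent with the lower FDP and lower power reported for Knockoff.Plus in every row.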

Session information

sessionInfo()

R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.2

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggplot2_2.2.1 knitr_1.19   

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.14     magrittr_1.5     munsell_0.4.3    colorspace_1.3-2
 [5] rlang_0.1.6      stringr_1.2.0    highr_0.6        plyr_1.8.4      
 [9] tools_3.4.3      grid_3.4.3       gtable_0.2.0     git2r_0.21.0    
[13] htmltools_0.3.6  yaml_2.1.16      lazyeval_0.2.1   rprojroot_1.3-2 
[17] digest_0.6.14    tibble_1.4.1     evaluate_0.10.1  rmarkdown_1.8   
[21] labeling_0.3     stringi_1.1.6    compiler_3.4.3   pillar_1.0.1    
[25] scales_0.5.0     backports_1.1.2


`Knockoff` on Small Signals

Lei Sun

2018-02-05
