Missing values using mash:
To obtain reasonable posterior means, we need to use the EE model. In the EZ model, the posterior means are computed on the z-score scale and then multiplied back by the standard errors, and this step causes the problem: the missing data have large standard errors, so it produces huge posterior means.
With missing values, the covariance structure learned by the model is sometimes weird: the weights do not shrink to zero.
Suppose some rows of the data are entirely missing. Because those missing values have large standard errors, the EE model ignores the information at the missing positions. In contrast, the EZ model cannot distinguish near-zero z scores caused by small observed effects from those caused by large standard errors.
A large number of conditions is the main cause of the weird weights. See LargeR.
A small sample size could be the other reason. Increasing the sample size could improve the estimated weights, although decreasing it may also yield the correct weights. When the number of conditions is large, we need more data to provide information. When the sample size is small, the model may not be stable and the weights may not be reliable. See Sample Size.
However, in Miss Whole Row, for the EE model with R = 60, deleting the missing values results in non-zero weights. I expected the result from the data containing missing values to be similar to the result from the data with the missing values deleted, because the missing rows contain almost no information.
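The contrast between the EE and EZ models described in this entry can be illustrated with a toy calculation. The sketch below is not mashr itself: it assumes only two conditions, a single fixed prior covariance `U` (rather than a fitted mixture), and a missing value encoded as an observed effect of 0 with a standard error of 1e6. All names and numbers are hypothetical and chosen for illustration only.

```python
# Toy two-condition illustration (NOT mashr): one fixed, assumed prior
# covariance U is used instead of a fitted mixture of covariances.

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def posterior_mean(U, x, var):
    """Posterior mean of b when b ~ N(0, U) and x | b ~ N(b, diag(var))."""
    total = [[U[0][0] + var[0], U[0][1]],
             [U[1][0], U[1][1] + var[1]]]
    return matvec(U, matvec(inv2(total), x))

U = [[1.0, 0.9], [0.9, 1.0]]  # assumed prior covariance (hypothetical)
bhat = [3.0, 0.0]             # condition 2 is missing: effect set to 0
shat = [1.0, 1e6]             # ... and its standard error set very large

# EE: model bhat directly, with error variances shat^2.
pm_ee = posterior_mean(U, bhat, [s * s for s in shat])

# EZ: model z = bhat / shat with unit error variances,
# then multiply the result back by the standard errors.
z = [b / s for b, s in zip(bhat, shat)]
pm_z = posterior_mean(U, z, [1.0, 1.0])
pm_ez = [m * s for m, s in zip(pm_z, shat)]

print("EE posterior mean:", pm_ee)
print("EZ posterior mean:", pm_ez)
```

Under these assumptions the EE posterior mean is roughly (1.5, 1.35): the missing condition borrows a moderate value from the observed one. In the EZ version the z score of 0 at the missing position is treated as a real observation, so the first condition is shrunk further (about 1.12), and multiplying the second component back by its standard error gives a value on the order of 8.5e5.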
The Flash hierarchical model on MovieLens data: Flash_Movie
Estimate cor in mashr
This R Markdown site was created with workflowr