Missing values using mash:
If we want a reasonable posterior mean, we need to use the EE model. In the EZ model, the posterior means are computed on the z-score scale and then multiplied back by the standard errors, which causes the problem: the missing data have large standard errors, so this step produces huge posterior means.
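A minimal sketch of this effect in plain Python, not mash itself. It assumes two conditions, a single hypothetical prior covariance `U` with correlation 0.9, and a missing entry coded as an effect of 0 with standard error 1e6; all names and numbers are illustrative. For simplicity the same `U` is used on both scales.

```python
def solve2(A, y):
    """Solve the 2x2 linear system A x = y by Cramer's rule."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [(d * y[0] - b * y[1]) / det, (a * y[1] - c * y[0]) / det]


def posterior_mean(U, s, bhat):
    """Posterior mean of b when bhat ~ N(b, diag(s^2)) and b ~ N(0, U)."""
    V = [[U[0][0] + s[0] ** 2, U[0][1]],
         [U[1][0], U[1][1] + s[1] ** 2]]
    w = solve2(V, bhat)  # (U + S)^{-1} bhat
    return [U[0][0] * w[0] + U[0][1] * w[1],
            U[1][0] * w[0] + U[1][1] * w[1]]


U = [[1.0, 0.9], [0.9, 1.0]]   # hypothetical prior covariance
bhat = [2.0, 0.0]              # condition 2 is missing: effect coded as 0
s = [1.0, 1e6]                 # ... with a very large standard error

# EE: model the effects directly, using the actual standard errors.
ee = posterior_mean(U, s, bhat)

# EZ: model z = bhat / s with unit standard errors, then multiply back by s.
z = [bhat[j] / s[j] for j in range(2)]
pm_z = posterior_mean(U, [1.0, 1.0], z)
ez = [pm_z[j] * s[j] for j in range(2)]

print("EE posterior mean:", ee)
print("EZ posterior mean:", ez)
```

Under these assumptions the EE posterior mean for the missing condition stays on the scale of the prior, while the EZ posterior mean is a moderate value on the z scale multiplied by 1e6.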
With missing values, the covariance structure learned by the model is sometimes weird. The weights do not shrink to zero.
Suppose some of the rows in the data are entirely missing. Because those missing values carry large standard errors, the EE model ignores the information in the missing positions. In contrast, the EZ model cannot distinguish the near-zero z scores caused by small observed effects from those caused by large standard errors.
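The contrast can be illustrated with a one-dimensional likelihood comparison. This is a sketch under assumed values, not the mash likelihood: a missing entry is coded as an effect of 0 with standard error 1e6, and two hypothetical prior variances (0 and 4) compete to explain it.

```python
import math


def normal_pdf(x, var):
    """Density of N(0, var) evaluated at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2 * math.pi * var)


def likelihood_ratio(x, s, var_small, var_large):
    """Marginal likelihood of x under the small prior variance over the large one."""
    return normal_pdf(x, var_small + s ** 2) / normal_pdf(x, var_large + s ** 2)


bhat, s = 0.0, 1e6   # missing entry: effect 0, huge standard error (assumed coding)

# EE: the huge standard error flattens the likelihood, so no prior is favoured.
ee_ratio = likelihood_ratio(bhat, s, 0.0, 4.0)

# EZ: z = 0 with unit standard error looks like a precisely measured null effect.
ez_ratio = likelihood_ratio(bhat / s, 1.0, 0.0, 4.0)

print("EE likelihood ratio:", ee_ratio)
print("EZ likelihood ratio:", ez_ratio)
```

A ratio near 1 means the entry says nothing about which prior component fits; a ratio well above 1 means the entry is treated as evidence for the null-like component, exactly as a genuinely small observed effect would be.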
The large number of conditions is the main cause of the weird weights: LargeR
The small sample size could be the other reason. Increasing the sample size could improve the estimated weights. However, decreasing it might also yield the correct weights. When the number of conditions is large, we need more data to provide information. When the sample size is small, the model may not be stable and the weights may not be reliable: Sample Size
But in Miss Whole Row, for the EE model with R = 60, deleting the missing values results in non-zero weights. However, I would expect the result from the data containing missing values to be similar to the result from the data with the missing values deleted, because the missing rows contain almost no information.
The Flash hierarchical model on Movie Lens data: Flash_Movie
Mean:
Recover the discarded column: MeanSignalRecover
The estimated covariance in the previous analysis was based on the data including the discarded column.
Another way is to estimate the covariance based on the full data. If we discard the ith column, we then discard the ith row and ith column of the estimated covariance matrices. The mashcommonbaseline method is robust to the choice of the discarded column.
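As a small sketch of the row-and-column deletion described above, in plain Python with a hypothetical helper name and a 0-based index (this is not a function from mash):

```python
def drop_condition(cov, i):
    """Return a copy of a square matrix with row i and column i removed (0-based)."""
    return [
        [value for c, value in enumerate(row) if c != i]
        for r, row in enumerate(cov)
        if r != i
    ]


# Illustrative 3x3 covariance matrix; discard the second condition (index 1).
U_full = [[1.0, 0.5, 0.2],
          [0.5, 2.0, 0.3],
          [0.2, 0.3, 3.0]]
U_reduced = drop_condition(U_full, 1)
print(U_reduced)
```

The same helper would be applied to each estimated covariance matrix in turn.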
When we increase R: MeanSignalR=50