Let \(\{ y_i \}_{i=1}^n\) denote the coverage at position \(i\) for \(n\) loci across the chromosome/genome.
Let \(\{ s_i \}_{i=1}^n\) denote the segment number for each of those locations where \(s_i \in \{ 1, \dots, S \}\) and \(S\) is the total number of segments.
\begin{align} p(y_i | s_i = j, w, \mu, \sigma^2 ) = (1-\rho_{j}) \underbrace{\sum_{c=1}^{C} w_{0,c} f( y_i; \mu_{0,c}, \beta_{0,c} )}_{\text{Common}} + \rho_{j} \underbrace{ \sum_{k=1}^{K} w_{j,k} f( y_i; \mu_{j,k}, \beta_{j,k} ) }_{\text{Segment-specific}} \end{align}where \(f\) is the density function for the Normal distribution, \(\sum_{k=1}^K w_{j,k} = 1\) and \(\sum_{c=1}^{C} w_{0,c} = 1\).
These equations are based on Variational Bayes for 1-dimensional Mixture Models (2000) (gzip download) by W.D. Penny and S.J. Roberts which is itself a 1 dimensional case of the mixture model proposed in A Variational Bayesian Framework for Graphical Models (2000) by Hagai Attias.
Compute the probability segment specific component k is responsible for data point y
\[ \tilde{\gamma}_{i,k}= \tilde{w}_{j,k} \tilde{\beta}_{j,k}^{1/2} exp[ - \frac{1}{2} \bar{\beta}_{j,k} (y_i^2 + m_{j,k}^2 + \nu_{j,k} -2 m_{j,k} y_i) ] \]
\[ log(\tilde{w}_{j,k}) = \Psi(\lambda_{j,k}) - \Psi(\sum_{k=1}^K\lambda_{j,k}) \]
\[ log(\tilde{\beta}_{j,k}) = \Psi(c_{j,k}) + log(b_{j,k}) \]
\[ \bar{\beta}_{j,k} = c_{j,k} b_{j,k} \]
Normalise by sum of all components \[ \gamma_{i,k} = \frac{\tilde{\gamma}_{i,k}}{\sum_{k=1}^K\tilde{\gamma}_{i,k}} \]
Compute the probability component c common to all segments is responsible for data point y
\[ \tilde{\phi}_{i,c}= \tilde{w}_{0,c} \tilde{\beta}_{0,c}^{1/2} exp[ - \frac{1}{2} \bar{\beta}_{0,c} (y_i^2 + m_{0,c}^2 + \nu_{0,c} -2 m_{0,c} y_i) ] \]
\[ log(\tilde{w}_{0,c}) = \Psi(\lambda_{0,c}) - \Psi(\sum_{c=1}^C\lambda_{0,c}) \]
\[ log(\tilde{\beta}_{0,c}) = \Psi(c_{0,c}) + log(b_{0,c}) \]
\[ \bar{\beta}_{0,c} = c_{0,c} b_{0,c} \]
Normalise by sum of all components \[ \phi_{i,c} = \frac{\tilde{\phi}_{i,c}}{\sum_{c=1}^C\tilde{\phi}_{i,c}} \]
Compute probability data point is in common rather than segment sepecific component
\[ \psi_i = \frac{(1-\rho_j) \sum_{c=1}^C\tilde{\phi}_{0,c}}{(1-\rho_j) \sum_{c=1}^C\tilde{\phi}_{0,c} + \rho_j \sum_{k=1}^K\tilde{\gamma}_{i,k}} \]
\[ log(\rho_j) = \Psi(\kappa_j) - \Psi(\sum_{j=1}^J\kappa_j) \]
Proportion of data in component c|common or k|specific for each segment \[ \bar{w}_{j,k} = \frac{\sum_{i:s_i=j} (1-\psi_i) \gamma_{i,k}}{\sum_{i:s_i=j} (1-\psi_i)} \]
\[ \bar{w}_{0,c} = \frac{\sum_{i=1}^n \psi_i \phi_{i,c}}{\sum_{i=1}^n \psi_i} \]
\[ \rho_j = 1 - \frac{\sum_{i:s_i=j} \psi_i}{\sum_{i:s_i=j} 1} \]
number of data points in component c / k for each segment \[ \bar{N}_{j,k} = \bar{w}_{j,k} \sum_{i:s_i=j} (1-\psi_i) \]
\[ \bar{N}_{0,c} = \bar{w}_{0,c} \sum_{i=1}^n \psi_i \]
\[ \bar{N}_{\rho_j} = \rho_j \sum_{i:s_i=j} 1 \]
Data values weighted by probability of each component \[ \bar{y}_{j,k} = \frac{\sum_{i:s_i=j} (1-\psi_i) \gamma_{i,k} y_n}{\sum_{i:s_i=j} (1-\psi_i)} \]
\[ \tilde{y}_{j,k}^2 = \frac{\sum_{i:s_i=j} (1-\psi_i) \gamma_{i,k} y^2_n}{\sum_{i:s_i=j} (1-\psi_i)} \]
\[ \bar{y}_{0,c} = \frac{\sum_{i=1}^n \psi_i \phi_{0,c} y_n}{\sum_{i=1}^n \psi_i} \]
\[ \tilde{y}_{0,c}^2 = \frac{\sum_{i=1}^n \psi_i \phi_{0,c} y^2_n}{\sum_{i=1}^n \psi_i} \]
Mixing weight hyperparameter update \[ \lambda_{j,k} = \bar{N}_{j,k} + \lambda_0 \]
\[ \lambda_{0,c} = \bar{N}_{0,c} + \lambda_0 \]
\[ \kappa_j = \bar{N}_{\rho_j} + \kappa_0 \]
Variance of component c / k for each segment \[ \tilde{\sigma}_{j,k}^2 = \tilde{y}_{j,k}^2 + \bar{w}_{j,k}( m_{j,k}^2 + \nu_{j,k} ) - 2 m_{j,k} \bar{y}_{j,k} \]
\[ \tilde{\sigma}_{0,c}^2 = \tilde{y}_{0,c}^2 + \bar{w}_{0,c}( m_{0,c}^2 + \nu_{0,c} ) - 2 m_{0,c} \bar{y}_{0,c} \]
Precision hyperparameter updates \[ \frac{1}{b_{j,k}} = \frac{\tilde{\sigma}_{j,k}^2}{2} (\sum_{i:s_i=j} (1-\psi_i)) + \frac{1}{b_0} \]
\[ \frac{1}{b_{0,c}} = \frac{\tilde{\sigma}_{0,c}^2}{2} (\sum_{i=1}^n \psi_i) + \frac{1}{b_0} \]
\[ c_{j,k} = \frac{\bar{N}_{j,k}}{2} + c_0 \]
\[ c_{0,c} = \frac{\bar{N}_{0,c}}{2} + c_0 \]
Mean hyperparameter updates \[ m_{data}(j,k) = \bar{y}_{j,k}/\bar{w}_{j,k} \]
\[ m_{data}(0,c) = \bar{y}_{0,c}/\bar{w}_{0,c} \]
\[ t_{data}(j,k) = \bar{N}_{j,k} \bar{\beta}_{j,k} \]
\[ t_{data}(0,c) = \bar{N}_{0,c} \bar{\beta}_{0,c} \]
\[ \tau = \frac{1}{\nu} \]
\[ \tau_{j,k} = \tau_0 + \tau_{data}(j,k) \]
\[ \tau_{0,c} = \tau_0 + \tau_{data}(0,c) \]
\[ m_{j,k} = \frac{\tau_0}{\tau_{j,k}} m_0 + \frac{\tau_{data}(j,k)}{\tau_{j,k}} m_{data}(j,k) \]
\[ m_{0,c} = \frac{\tau_0}{\tau_{0,c}} m_0 + \frac{\tau_{data}(0,c)}{\tau_{0,c}} m_{data}(0,c) \]