5. Veterans in the streets: the phenomenon of the leagues
5.1 The emergence of the veterans' leagues
In this section, we compare the HMM priors for the NMF gains matrix with two other prior models. The first is the exponential distribution prior. The second is the GMM prior, which does not take the temporal prior information of the source signals into account.
Using the exponential distribution with parameter ϕ as a prior for the NMF gains matrix is equivalent to enforcing sparsity on the NMF gains matrix [73]. The sparse NMF cost function is defined as [65,73]
$$C(G) = D_{IS}(V \,\|\, BG) + \phi \sum_{m,n} G_{m,n}, \tag{4.29}$$
where ϕ is the regularization parameter. The gain update rule of G can be found as follows:
$$G \leftarrow G \otimes \frac{B^{T} \frac{V}{(BG)^{2}}}{B^{T} \frac{1}{BG} + \phi}. \tag{4.30}$$
The update rule in Equation (4.30) is found based on maximizing the likelihood of the gains matrix under the exponential prior distribution. We obtained the best results in this experiment when ϕ = 0.0001 for both sources in the training and separation stages.
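The update in Equation (4.30) can be sketched in NumPy as follows. This is a minimal illustration, not the experimental code: it assumes that D_IS denotes the Itakura-Saito divergence (which matches the form of the update), that all divisions and the square are element-wise, and that the bases matrix B is held fixed. The function names, the small constant `eps`, and the random toy data are hypothetical.

```python
import numpy as np

def is_divergence(V, V_hat, eps=1e-12):
    """Itakura-Saito divergence D_IS(V || V_hat), summed over all entries."""
    R = (V + eps) / (V_hat + eps)
    return float(np.sum(R - np.log(R) - 1.0))

def sparse_cost(V, B, G, phi):
    """Cost of Eq. (4.29): divergence plus phi times the sum of the gains."""
    return is_divergence(V, B @ G) + phi * float(np.sum(G))

def update_gains(V, B, G, phi, eps=1e-12):
    """One multiplicative update of G as in Eq. (4.30), with B held fixed."""
    V_hat = B @ G + eps                    # current model of the spectrogram
    numer = B.T @ (V / V_hat**2)           # B^T (V / (BG)^2), element-wise ops
    denom = B.T @ (1.0 / V_hat) + phi      # B^T (1 / (BG)) + phi
    return G * numer / denom

# Toy data (hypothetical sizes): F frequency bins, D bases, N frames.
rng = np.random.default_rng(0)
F, D, N = 20, 4, 30
B = rng.random((F, D)) + 0.1
V = B @ (rng.random((D, N)) + 0.1)         # nonnegative "spectrogram"
G = rng.random((D, N)) + 0.1               # positive initialization
phi = 1e-4                                 # value reported in the text

costs = [sparse_cost(V, B, G, phi)]
for _ in range(50):
    G = update_gains(V, B, G, phi)
    costs.append(sparse_cost(V, B, G, phi))
```

Because the update is multiplicative and every factor is positive, G stays nonnegative as long as it is initialized with positive values. In this toy setting the cost after the iterations is expected to be lower than at initialization.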
The second prior model in this comparison uses GMMs as priors for the gains matrix, as shown in Chapter 3. The NMF solution for the gains matrix is encouraged to increase its log-likelihood under the trained GMM prior as follows:
$$C = D_{IS}(V \,\|\, BG) - R_2(G), \tag{4.31}$$
where R2(G) is the weighted sum of the log-likelihoods of the log-normalized columns of the gains matrix G. R2(G) can be written as follows:
$$R_2(G) = \sum_{z=1}^{2} \eta_z \Gamma(G_z), \tag{4.32}$$
where Γ(G_z) is the log-likelihood of the submatrix G_z for source z. We obtained the best results in this experiment when η = 0.1 in both the training and separation stages. Table 4.2 shows the separation results of using the GMM prior with different numbers of Gaussian components for both sources.
Table 4.2: SNR in dB for the estimated speech signal using GMM prior models

SMR (dB)   GMM, K = 4   GMM, K = 16   GMM, K = 20   GMM, K = 32
-5         3.60         3.64          3.73          3.65
 0         5.81         5.93          5.94          5.90
 5         8.51         8.53          8.53          8.52
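The regularization term in Equation (4.32) can be sketched as below. Several details are assumptions made only for illustration, since they are defined in Chapter 3 and not repeated here: the "log-normalized" columns are taken to be the logarithm of each column divided by its Euclidean norm, the GMMs have diagonal covariances, and the parameter values are random placeholders. All function names are hypothetical.

```python
import numpy as np

def log_normalize_columns(G, eps=1e-12):
    """Assumed normalization: log of each column scaled to unit Euclidean norm."""
    norms = np.linalg.norm(G, axis=0, keepdims=True)
    return np.log(G / (norms + eps) + eps)

def gmm_log_likelihood(X, weights, means, variances):
    """Sum over the columns of X (D x N) of log p(x_n) under a diagonal GMM.

    weights: (K,), means: (K, D), variances: (K, D).
    """
    diff = X.T[:, None, :] - means[None, :, :]                     # (N, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
    log_comp = log_norm[None, :] - 0.5 * np.sum(diff**2 / variances[None], axis=2)
    log_joint = np.log(weights)[None, :] + log_comp                # (N, K)
    m = log_joint.max(axis=1, keepdims=True)                       # log-sum-exp
    log_px = m[:, 0] + np.log(np.sum(np.exp(log_joint - m), axis=1))
    return float(np.sum(log_px))

def prior_reward(G_blocks, gmms, etas):
    """R_2(G): weighted sum over sources of the GMM log-likelihoods, Eq. (4.32)."""
    return sum(
        eta * gmm_log_likelihood(log_normalize_columns(G), *gmm)
        for G, gmm, eta in zip(G_blocks, gmms, etas)
    )

# Placeholder GMMs and gains for two sources (hypothetical sizes and values).
rng = np.random.default_rng(1)
K, D, N = 2, 3, 5
gmms = [
    (np.array([0.4, 0.6]), rng.normal(-1.0, 0.5, (K, D)), np.full((K, D), 0.3))
    for _ in range(2)
]
G_blocks = [rng.random((D, N)) + 0.1 for _ in range(2)]
etas = [0.1, 0.1]                          # eta = 0.1 as reported in the text

r2 = prior_reward(G_blocks, gmms, etas)
r2_scaled = prior_reward([10.0 * G for G in G_blocks], gmms, etas)
```

Under the assumed normalization, rescaling the gains leaves the term unchanged, so `r2` and `r2_scaled` agree up to numerical precision. This is one way to make the prior insensitive to the energy level of the signal.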
Table 4.3 compares HMMs, GMMs, and the sparsity (exponential distribution) prior as gain priors. For the HMM prior we show the results with |Q| = 16 states and K = 4 GMM components. The table shows that the HMM prior gives slightly better results than the GMM prior, because the HMM is able to capture the temporal structure of the source signal while the GMM ignores the dynamic behavior of the signals. Both the HMM and the GMM give better results than the sparsity prior, since the exponential distribution cannot capture either the dynamics or the multi-mode structure of audio signals.
Table 4.3: SNR in dB for the estimated speech signal using different prior models

SMR (dB)   Just NMF   HMM, |Q| = 16, K = 4   GMM, K = 20   Sparsity
-5         2.88       4.07                   3.73          3.06
 0         5.50       6.13                   5.94          5.85
 5         8.36       8.65                   8.53          8.51
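To make the comparison in Table 4.3 easier to read, the short sketch below computes the SNR improvement of each prior over plain NMF. The values are transcribed from the table. The dictionary layout and the helper name are hypothetical.

```python
# SNR values (dB) transcribed from Table 4.3, indexed by the input SMR (dB).
snr_db = {
    "NMF":      {-5: 2.88, 0: 5.50, 5: 8.36},
    "HMM":      {-5: 4.07, 0: 6.13, 5: 8.65},
    "GMM":      {-5: 3.73, 0: 5.94, 5: 8.53},
    "Sparsity": {-5: 3.06, 0: 5.85, 5: 8.51},
}

def gain_over_baseline(method, baseline="NMF"):
    """SNR improvement (dB) of `method` over `baseline` at each SMR."""
    return {
        smr: round(snr_db[method][smr] - snr_db[baseline][smr], 2)
        for smr in snr_db[baseline]
    }

for method in ("HMM", "GMM", "Sparsity"):
    print(method, gain_over_baseline(method))
```

The improvement is largest at the lowest SMR for the HMM and GMM priors (1.19 dB and 0.85 dB at -5 dB) and shrinks as the SMR increases.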
4.6 Conclusion
In this chapter, we introduced a new regularized NMF algorithm for single-channel source separation. The energy-independent HMM prior models were incorporated into the NMF solutions to improve the separation performance. In future work, we will consider supervised training for the prior HMMs.
Regularized NMF using MMSE estimates under GMM priors with online learning for the uncertainties
5.1 Motivations and overview
In Chapters 3 and 4, the gains matrix during the separation stage was guided to follow the prior information by maximizing its likelihood under a trained prior model. The prior model was applied to the NMF solutions without evaluating the actual need for prior information. From the results in Tables 3.1 to 3.5 in Chapter 3, we can see that in many cases, when the desired signal has higher energy than the other sources in the mixed signal, the NMF solution of the gains matrix relies less on the prior information for the desired signal, and vice versa. This means that the need for incorporating prior information in the NMF solution depends on how poor the NMF solution for the gains matrix is without any prior.
In this chapter, we introduce a new strategy for applying the priors to the NMF solutions of the gains (weights) matrix during the separation stage. The new strategy is based on evaluating how much the solution of the NMF gains matrix needs to rely on the prior models. We use Gaussian mixture models (GMMs) to model the prior information about the gains matrix. The NMF solution obtained without priors for the weights matrix of each source during the separation stage can be seen as a deformed image, and its corresponding valid gains matrix needs to be estimated under the GMM prior. The parameters of the deformation operator, which measure the uncertainty of the NMF solution of the weights matrix, are learned directly from the observed mixed signal. The uncertainty in this work measures how far the NMF solution of the weights matrix during the separation stage is from the valid weight patterns modeled by the prior GMM. The learned uncertainties are used with the minimum mean squared error (MMSE) estimator to find the estimate of the valid weights matrix.
The estimated valid weights matrix should also minimize the NMF cost function. To achieve these two goals, a regularized NMF is used that considers the valid weight patterns that can appear in the columns of the weights matrix while decreasing the NMF cost function. The uncertainties within the MMSE estimates of the valid weight combinations are embedded in the regularized NMF cost function for this purpose. The uncertainty measurements play a very important role in this work, as we will show in the next sections. If the uncertainty of the NMF solution of the weights matrix is high, the regularized NMF needs more support from the prior term. If the uncertainty is low, the regularized NMF needs less support from the prior term. Including the uncertainty measurements in the regularization term through the MMSE estimate lets the proposed regularized NMF algorithm decide automatically how much the solution should rely on the prior GMM term. This is the main advantage of the proposed regularized NMF compared to the regularization using the log-likelihood of the GMM prior in previous chapters or other prior distributions [82,84,99].
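The role of the uncertainty can be illustrated with a small sketch of an MMSE estimate under a GMM prior. This is only an illustration of the idea described above, under assumptions that are not taken from the text: the deformation is modeled as additive zero-mean Gaussian noise with diagonal variance `psi`, the GMM has diagonal covariances, and all parameter values are made up. The function name `mmse_estimate` is hypothetical.

```python
import numpy as np

def mmse_estimate(y, weights, means, variances, psi):
    """MMSE estimate of x from y = x + e, with x ~ diagonal GMM, e ~ N(0, psi).

    y: (D,), weights: (K,), means: (K, D), variances: (K, D),
    psi: scalar or (D,) uncertainty (noise variance per dimension).
    """
    total_var = variances + psi                       # variance of y per component
    diff = y[None, :] - means                         # (K, D)
    # Posterior probability of each component given the observation y.
    log_post = np.log(weights) - 0.5 * np.sum(
        np.log(2.0 * np.pi * total_var) + diff**2 / total_var, axis=1
    )
    log_post -= log_post.max()
    gamma = np.exp(log_post)
    gamma /= gamma.sum()
    # Per-component conditional mean: shrink y towards the component mean.
    shrink = variances / total_var                    # in (0, 1)
    cond_means = means + shrink * diff
    return gamma @ cond_means

# Made-up two-component prior and observation.
weights = np.array([0.3, 0.7])
means = np.array([[0.0, 0.0], [2.0, -1.0]])
variances = np.array([[0.5, 0.5], [0.2, 0.4]])
y = np.array([1.5, -0.2])

x_low = mmse_estimate(y, weights, means, variances, psi=1e-8)   # low uncertainty
x_high = mmse_estimate(y, weights, means, variances, psi=1e8)   # high uncertainty
```

With a very small uncertainty the estimate stays close to the observation `y`, so the prior has almost no influence. With a very large uncertainty the estimate approaches the prior mean `weights @ means`, so the solution relies almost entirely on the prior. This mirrors the automatic trade-off described in the text.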