7.1 Summary and Conclusions

The work presented in this study represents promising progress towards solving the cocktail party problem (CPP) within the blind source separation (BSS) framework. The main aim was to contribute novel methods that enhance the performance of independent vector analysis (IVA), in its different versions, when separating speech sources from their observed mixtures in real reverberant environments. The main challenge in blind audio source separation (BASS) is the convolutive mixing of the sources in real room environments. This necessitates conducting the separation in the frequency domain (FD) to avoid the computational complexity of the convolution operation in the time domain.

In Chapter 2, the background theory and fundamentals of convolutive blind source separation (CBSS) were introduced, and previous related work and its limitations were highlighted. Independent component analysis (ICA), a prominent FD-BSS technique, was discussed, which led to the permutation problem in FD-BSS. Then the independent vector analysis (IVA) algorithm, which is based on an improved model of the ICA method that addresses the permutation problem inherent to ICA, was reviewed in its natural gradient form (NG-IVA) and fast fixed-point form (FastIVA). The heart of the IVA method is the multivariate source prior used to model the speech signals, because the non-linear score function that retains the inter-frequency dependency is obtained from the probability density function (PDF) of the source prior.
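To illustrate how a score function can retain inter-frequency dependency, the following minimal sketch assumes the spherical super-Gaussian prior commonly used in the IVA literature, with zero mean and unit scale, whose score is each frequency-bin coefficient divided by the norm taken over all bins. The function name, the `eps` guard and the example frame are hypothetical and are not taken from the thesis.

```python
import math

def super_gaussian_score(s, eps=1e-12):
    """Score of an assumed spherical super-Gaussian prior.

    s: complex frequency-bin coefficients of one source in one time frame.
    Returns s[k] / sqrt(sum_j |s[j]|^2) for every bin k.
    """
    # The norm is taken over ALL frequency bins, so the output at each bin
    # depends on every other bin of the same source.
    norm = math.sqrt(sum(abs(x) ** 2 for x in s))
    # eps (an illustrative guard) avoids division by zero for silent frames.
    return [x / (norm + eps) for x in s]

frame = [3 + 0j, 0 + 4j]          # hypothetical two-bin frame
phi = super_gaussian_score(frame)
```

Because the normalisation is shared across bins, the bins of one source are treated as a single vector rather than as independent scalars, which is the property that lets IVA avoid the per-bin permutation ambiguity of ICA.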

In Chapter 3, techniques and settings related to the implementation and evaluation of speech and blind audio source separation (BASS) systems were outlined. Different experimental setups were described, including information on the datasets for speech sources, the room environments and models, and the performance parameters used in the evaluation criteria. The separation performance of the different algorithms was mainly measured objectively by the signal-to-distortion ratio (SDR) in dB [76] or subjectively by the perceptual evaluation of speech quality (PESQ) (on a scale of 0-4.5) [81], using simulated room impulse responses [71] and real binaural room impulse responses (BRIRs) [74, 75].
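As a rough illustration of what an SDR value in dB expresses, the sketch below uses a simplified energy-ratio form in which everything in the estimate that differs from the reference is counted as distortion. This is an assumption made for illustration and may differ from the exact definition in [76]; the function name and the sample values are hypothetical.

```python
import math

def sdr_db(reference, estimate):
    """Simplified SDR: 10*log10(reference energy / error energy)."""
    signal_energy = sum(r * r for r in reference)
    # Every deviation from the reference is treated as distortion here.
    distortion_energy = sum((e - r) ** 2 for r, e in zip(reference, estimate))
    if distortion_energy == 0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / distortion_energy)

reference = [1.0, -1.0, 1.0, -1.0]   # hypothetical clean source samples
estimate = [1.1, -0.9, 1.1, -0.9]    # hypothetical separated output
value = sdr_db(reference, estimate)  # about 20 dB for this toy example
```

On this logarithmic scale, a higher value means less distortion relative to the target source, so improvements between algorithms are reported as differences in dB.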

The contributions of this thesis satisfy the research objectives outlined in the introduction chapter. The objectives were addressed by introducing new methods to enhance the performance of the IVA algorithm in its various forms. The contributions can be summarised as follows:

1. A new multivariate Student’s t distribution as the source prior for the batch IVA algorithm.

2. A novel energy driven mixed distribution model as a source prior for the batch IVA algorithm.

3. A particular multivariate generalized Gaussian distribution as the source prior for the online IVA algorithm.

4. A novel adaptive learning scheme to improve the performance of the online IVA algorithm.

5. A novel switched source prior technique for the adaptive learning online IVA algorithm.

In Chapter 4, a new multivariate Student’s t distribution was proposed as the source prior for the batch IVA algorithm. A Student’s t PDF can better model certain non-stationary frequency-domain speech signals because of its tail dependency property. The tails of the distribution can be tuned to closely match the generally heavy-tailed distribution of frequency-domain speech signals, which arises from their high-amplitude data points. The chapter first provided an experimental comparison between the batch versions of ICA and IVA. The results demonstrated the poor performance of standard ICA caused by the permutation problem, which IVA addresses directly. Then the separation performance of the IVA algorithm with the new source prior was compared with that of the original super-Gaussian source prior in simulated and real room environments with a variety of settings. The experimental results confirmed that the proposed Student’s t source prior consistently improves the separation performance of the IVA algorithm.
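The effect of the heavy tails on the score function can be sketched as follows. The example assumes a zero-mean multivariate Student’s t density with identity scale matrix, proportional to (1 + ||s||^2 / nu) raised to the power -(nu + K) / 2, where K is the number of frequency bins and nu the degrees of freedom. The exact parameterisation used in the thesis is not given in this summary, so the form below, the default `nu` and the example frames are assumptions, and the score is stated only up to a constant scale.

```python
def student_t_score(s, nu=4.0):
    """Score (up to a constant) of an assumed multivariate Student's t prior.

    s:  complex frequency-bin coefficients of one source in one time frame.
    nu: degrees of freedom; smaller values give heavier tails.
    """
    K = len(s)
    energy = sum(abs(x) ** 2 for x in s)
    # Negative log-density gradient: ((nu + K) / nu) * s_k / (1 + energy / nu)
    gain = ((nu + K) / nu) / (1.0 + energy / nu)
    return [gain * x for x in s]

frame = [3 + 0j, 0 + 4j]                 # hypothetical moderate-amplitude frame
loud_frame = [30 + 0j, 0 + 40j]          # same frame scaled by 10
phi = student_t_score(frame)
phi_large = student_t_score(loud_frame)  # smaller magnitude than phi
```

Under this assumed form the score shrinks as the frame energy becomes very large, so high-amplitude data points influence the update less than they would under a lighter-tailed prior, and `nu` is the parameter that tunes how heavy the tails are.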

Using simulated room impulse responses [71], the average recorded SDR improvement with the new Student’s t source prior was approximately 0.75 dB compared with the original IVA method. In a real, highly reverberant environment [74], the average recorded SDR improvement was approximately 1.31 dB compared with the original IVA method.

This confirms the suitability of the Student’s t distribution for modelling speech signals in real-life scenarios. The subjective study confirmed the improved separation performance of the IVA method with the Student’s t source prior. The average improvement in PESQ score was approximately 0.75.

In Chapter 5, a novel multivariate source prior for the IVA algorithm was introduced. The proposed source prior is a mixture of two distributions instead of a single distribution, namely the original multivariate super-Gaussian distribution and the multivariate Student’s t distribution. Human speech is highly non-stationary, with components of variable amplitude. In the proposed mixed source prior, the Student’s t distribution models the high-amplitude components and the original super-Gaussian distribution models the lower-amplitude components of the speech signal. First, equal weights were assigned to the original super-Gaussian distribution and the Student’s t distribution in the mixed source prior. The prior was then further enhanced with an energy-driven scheme that adjusts the weight of each distribution according to the normalised energy of the observed mixtures in the frequency-domain blocks of a clique-based dependency model. As a result, the mixed source prior was able to adapt to the different statistical properties of speech signals.
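The energy-driven weighting idea can be sketched as follows. This is only an illustration under several assumptions: the weight of the Student’s t component is taken to be the block energy divided by the largest block energy, the two score functions are combined linearly with that weight (a simplification, since the exact score of a mixture density is not in general a fixed linear combination of the component scores), and non-overlapping blocks stand in for the cliques. None of these choices, names or values is confirmed by the thesis summary.

```python
import math

def super_gaussian_score(s, eps=1e-12):
    """Score of an assumed spherical super-Gaussian prior."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in s))
    return [x / (norm + eps) for x in s]

def student_t_score(s, nu=4.0):
    """Score (up to a constant) of an assumed multivariate Student's t prior."""
    energy = sum(abs(x) ** 2 for x in s)
    gain = ((nu + len(s)) / nu) / (1.0 + energy / nu)
    return [gain * x for x in s]

def normalised_block_energies(blocks):
    """Energy of each frequency block divided by the largest block energy."""
    energies = [sum(abs(x) ** 2 for x in block) for block in blocks]
    peak = max(energies)
    if peak == 0:
        return [0.0] * len(blocks)
    return [e / peak for e in energies]

def mixed_score(block, weight, nu=4.0):
    """Assumed linear blend: weight on Student's t, (1 - weight) on super-Gaussian."""
    heavy = student_t_score(block, nu)
    light = super_gaussian_score(block)
    return [weight * t + (1.0 - weight) * g for t, g in zip(heavy, light)]

# Two hypothetical frequency blocks: one high-energy, one low-energy.
blocks = [[3 + 0j, 0 + 4j], [0.3 + 0j, 0 + 0.4j]]
weights = normalised_block_energies(blocks)
scores = [mixed_score(b, w) for b, w in zip(blocks, weights)]
```

In this sketch the high-energy block is handled almost entirely by the Student’s t score and the low-energy block almost entirely by the super-Gaussian score, mirroring the division of roles described above; setting every weight to 0.5 would correspond to the equal-weight variant.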

The fixed mixed source prior was adopted for the IVA and FastIVA algorithms and compared with the original single super-Gaussian source prior. Detailed experimental studies using simulated [71] and real room environments [74] with different reverberation times confirmed a consistent improvement in separation performance for IVA based on the fixed mixed source prior. Table 7.1 shows the approximate av-
