
In Chapter 5, a mixture of the Student’s t distribution and the original super Gaussian distribution is proposed as a source prior for the IVA algorithms. The weights of the two distributions in the mixed source prior are adapted according to the energy of the observed mixture signals. Moreover, an overlapped chain type structure is used to model the dependency within the frequency bins, to achieve robust and improved separation performance with the IVA algorithms.

• Objective 3: to derive and evaluate the expectation maximization (EM) framework to obtain a new form of IVA algorithm which explicitly adapts the source prior according to the measured signal properties.

Chapter 6 addresses this objective by exploiting the EM framework for the original IVA algorithm. The complete EM framework based IVA algorithm is derived. This new framework adapts the source prior according to the properties of the measured signals and therefore enhances the robustness and the separation performance of the IVA algorithm.

• Objective 4: to perform extensive evaluation studies with real speech and room impulse response measurements to confirm the performance gains of the methods proposed in objectives 1, 2 and 3.

In Chapters 4, 5 and 6 the IVA algorithms are evaluated with real speech signals and real room impulse responses, which demonstrates the separation performance of the proposed algorithms in real room environments.

1.4 Thesis Outline

The thesis is organised as follows:

Chapter 2 provides an introduction to the frequency domain blind source separation (BSS) problem. A synopsis of independent component analysis (ICA) is included and its advantages and limitations are discussed in the context of the frequency domain BSS problem. The natural gradient independent vector analysis (IVA) algorithm is introduced in order to solve the permutation problem of the frequency domain ICA algorithm. Finally, the fast version of the IVA algorithm (FastIVA) is discussed, which improves the convergence speed of the original IVA method.

Chapter 3 describes the experimental settings that are used to evaluate the different algorithms throughout this thesis. The real room impulse responses are discussed along with the details of the real room settings that are used in the experiments. Furthermore, the performance measures that are used to quantify the separation performance of the algorithms are discussed.

Chapter 4 studies a new multivariate Student’s t source prior for the different versions of the IVA algorithm. The source prior is used to derive the nonlinear score function for the IVA method and is therefore critical to the performance of the algorithm. The multivariate Student’s t distribution is proposed as a source prior for both the IVA and the FastIVA methods, and the separation performance and convergence speed of the IVA and the FastIVA algorithms with the new source prior are compared with those obtained with the original super Gaussian source prior.
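To illustrate how a source prior determines the nonlinear score function, the following minimal sketch compares the score of the original super Gaussian prior with that of a multivariate Student’s t prior for one source vector spanning K frequency bins. The density forms, the degrees of freedom value nu = 4 and the function names are assumptions made for illustration only; the exact expressions and parameter choices are those derived in Chapter 4.

```python
import math


def super_gaussian_score(source_vector):
    """Score for the assumed prior p(s) ∝ exp(-||s||_2): s_k / ||s||_2."""
    norm = math.sqrt(sum(abs(s) ** 2 for s in source_vector))
    norm = max(norm, 1e-12)  # guard against division by zero on silent frames
    return [s / norm for s in source_vector]


def student_t_score(source_vector, nu=4.0):
    """Score for the assumed prior p(s) ∝ (1 + ||s||^2 / nu)^(-(nu + K) / 2).

    Taking the negative derivative of log p with respect to s_k gives,
    up to a constant factor, ((nu + K) / nu) * s_k / (1 + ||s||^2 / nu).
    The value of nu is a hypothetical choice, not taken from the thesis.
    """
    num_bins = len(source_vector)
    energy = sum(abs(s) ** 2 for s in source_vector)
    scale = (nu + num_bins) / nu / (1.0 + energy / nu)
    return [scale * s for s in source_vector]


# One source at one time frame, observed over K = 3 frequency bins.
frame = [1 + 1j, 0.5 - 0.5j, -2j]
print(super_gaussian_score(frame))
print(student_t_score(frame, nu=4.0))
```

Under these assumptions both scores couple all frequency bins through the norm of the whole source vector. They differ in their response to amplitude: the super Gaussian score always has unit norm, whereas the norm of the Student’s t score shrinks for high amplitude frames.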

Chapter 5 introduces a mixed source prior for the IVA algorithm. A convex combination of the Student’s t and the original super Gaussian source priors is adopted as the source prior for the IVA and the FastIVA methods, to better model speech signals. Furthermore, an energy driven version of the mixed source prior is proposed, which adapts the weights of the two distributions according to the energy of the observed mixture signals. Moreover, an overlapped clique (block) structure is adopted to model the dependency within the frequency bins. This new energy driven source prior is used for both the IVA and the FastIVA methods. The separation performance of both algorithms is tested in different reverberant room environments, where the new prior consistently improves the separation performance of both versions of the IVA algorithm.
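The idea of an energy driven convex combination can be sketched as follows. This is an illustrative simplification, not the derivation of Chapter 5: it assumes that the mixed prior is applied as a weighted sum of the two score functions, and the mapping from frame energy to weight (energy_weights), the value nu = 4 and all function names are hypothetical.

```python
import math


def super_gaussian_score(s):
    """Score for the assumed prior p(s) ∝ exp(-||s||_2)."""
    norm = max(math.sqrt(sum(abs(x) ** 2 for x in s)), 1e-12)
    return [x / norm for x in s]


def student_t_score(s, nu=4.0):
    """Score for the assumed Student's t prior (nu is a hypothetical choice)."""
    energy = sum(abs(x) ** 2 for x in s)
    scale = (nu + len(s)) / nu / (1.0 + energy / nu)
    return [scale * x for x in s]


def energy_weights(mixture_frames):
    """HYPOTHETICAL mapping: frame energy normalised by the peak frame energy.

    Returns one weight in [0, 1] per frame; the thesis may use a different rule.
    """
    energies = [sum(abs(x) ** 2 for x in frame) for frame in mixture_frames]
    peak = max(energies)
    if peak == 0.0:
        return [0.0] * len(energies)
    return [e / peak for e in energies]


def mixed_score(s, weight, nu=4.0):
    """Convex combination: weight * Student's t + (1 - weight) * super Gaussian."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    t_part = student_t_score(s, nu)
    g_part = super_gaussian_score(s)
    return [weight * a + (1.0 - weight) * b for a, b in zip(t_part, g_part)]


# Two time frames of an observed mixture over K = 2 frequency bins.
frames = [[1 + 0j, 1j], [2 + 0j, 0j]]
weights = energy_weights(frames)
for frame, w in zip(frames, weights):
    print(w, mixed_score(frame, w))
```

In this sketch, frames with higher energy lean more heavily on the Student’s t score. Both the direction and the form of this mapping are assumptions of the example.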

Chapter 6 describes an efficient implementation of the expectation maximization (EM) framework for the IVA algorithm. Instead of a single distribution source prior for the IVA algorithm, the Student’s t mixture model (SMM) is adopted as a source prior for the IVA method. It enables the source prior to adapt to different mixture signals, and therefore the proposed source prior can properly model the nonlinear dependency structure within speech signals. An efficient EM algorithm is derived to estimate the parameters of the SMM source prior. The proposed method is tested with real room impulse responses and the experimental results confirm the advantage of the proposed method.

BACKGROUND AND RELEVANT LITERATURE REVIEW

2.1 Introduction

The process of automated separation of acoustic sources from the measured mixtures is known as acoustic blind source separation (BSS). A typical application of blind source separation is the cocktail party problem: the process of focusing on one particular acoustic source of interest in the presence of multiple sound sources [1]. Human beings can easily pay attention to one speaker in the presence of multiple active speakers; however, it is much more difficult to replicate the same ability in machines. In the past few decades, a great deal of research has been conducted to study different aspects of the cocktail party problem. This research includes the study of the geometry of the microphone array [42], room impulse response identification [43], localisation of speech sources [17] and statistical estimation of speech sources. Independent Component Analysis (ICA) is one of the fundamental techniques to
