As mentioned before, the aim of these experiments is to investigate the impact of different types of noise at various signal-to-noise ratios (SNRs). Figure 4.2 illustrates how accuracy, measured by equal error rate (EER), degrades as the SNR decreases for each type of noise using the SALU-AC database. The X axis represents the SNR (in dB), ranging from clean signals (usually greater than 25 dB) down to 0 dB (where the speech and noise levels are equal), and each bar in the figure represents a different type of noisy speech (Cafeteria Babble, Interior Car, Interior Train, Street, and White noise).
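The noisy test signals described above can be emulated by scaling a noise recording so that the mixture reaches a chosen SNR. The following is a minimal sketch, assuming power-based SNR computed over the whole utterance; the function names `mix_at_snr` and `measured_snr_db` are hypothetical and are not taken from the study, which may have used a different mixing procedure (for example, one based on active speech level).

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the mixture has the requested SNR in dB.

    Assumption: SNR is the ratio of mean signal power to mean noise power
    over the whole utterance, and the noise has non-zero power.
    """
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)
    # Repeat or trim the noise so that it covers the whole utterance.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[:len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # snr_db = 10*log10(p_speech / (gain^2 * p_noise)), solved for gain.
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise

def measured_snr_db(speech, noisy):
    """Recover the SNR of a mixture when the clean speech is known."""
    speech = np.asarray(speech, dtype=float)
    residual = np.asarray(noisy, dtype=float) - speech
    return 10.0 * np.log10(np.mean(speech ** 2) / np.mean(residual ** 2))
```

At 0 dB the computed gain makes the noise power equal to the speech power, which matches the description of the 0 dB condition above.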

Robust Speaker Recognition in presence of non-trivial environmental noise

It is clear that the various noises have different impacts on the accuracy of speaker recognition (SR). Interior Moving Car noise, for instance, has only a slight effect on SR accuracy (0% and 0.14% EER at 20 and 15 dB respectively) compared with other types of noise such as White and Train, which have a higher impact (2.9% and 6.6% EER at 20 and 15 dB for White noise, and 1.15% and 1.7% EER for Train noise at the same SNRs). On the other hand, Cafeteria Speech Babble noise has a greater effect on accuracy at 0 dB, with 29.6% EER, than Interior Moving Train noise, while at the other SNRs the effect of Interior Moving Train noise is higher. Figure 4.3 shows the DET graph for the different types of noise at 10 dB SNR. As can be seen, the effect of car noise on the false positive rate (FPR, i.e. the false acceptance rate, FAR) and the false negative rate (FNR, i.e. the false rejection rate, FRR) is lower, while white noise has a higher effect on both than the other types of noise. Furthermore, the interior moving train, street, and cafeteria babble noises also have a high effect on accuracy, especially on the false positive rate.
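Since the results are reported as EER, a brief illustration of how such a figure can be obtained from verification scores may be useful. This is a minimal sketch, assuming that higher scores indicate a better match and that a trial is accepted when its score is at or above the threshold; the function name `equal_error_rate` is hypothetical, and the study may have used a different estimator (for example, one that interpolates between thresholds).

```python
import numpy as np

def equal_error_rate(genuine, impostor):
    """Approximate the EER from genuine and impostor trial scores.

    FAR is the fraction of impostor scores accepted (score >= threshold).
    FRR is the fraction of genuine scores rejected (score < threshold).
    The EER is taken at the candidate threshold where the two are closest.
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    # Every observed score is a candidate threshold.
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    far = np.array([np.mean(impostor >= t) for t in thresholds])
    frr = np.array([np.mean(genuine < t) for t in thresholds])
    i = int(np.argmin(np.abs(far - frr)))
    return float((far[i] + frr[i]) / 2.0)
```

With perfectly separated scores the function returns 0, which corresponds to the 0% EER case reported for car noise at 20 dB.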

Figure 4.4: Speaker Recognition Performance for different noises for TIMIT

When TIMIT was used, white noise still has a higher impact on accuracy than the other kinds of noise, while car noise still has a limited effect by comparison (Figure 4.4). Cafeteria Babble and Interior Moving Train noise have approximately the same effect across all SNRs except at 5 dB, where the effect of Cafeteria Babble is higher. It can also be seen that Interior Moving Car and Street noise have the same EER at the higher SNRs (0.45% and 1.6% EER at 20 dB and 15 dB respectively), while the effect of Street noise becomes higher at 10, 5, and 0 dB (i.e. as the SNR decreases). Figure 4.5 shows a DET graph for these types of noise at 15 dB.

Figure 4.5: DET Graph for 15 dB SNR for Different types of noise in TIMIT

Again, white noise showed the greatest effect on both FPR and FNR; Babble and Train noise have approximately the same effect on FPR, and car noise has the least impact on SR for both FPR and FNR. Overall, these experiments make it clear that the impact on speaker recognition accuracy varies from one type of noise to another. Furthermore, the level of noise in the signal (i.e. the SNR) also plays a major role in SR performance. In the next chapter, the use of noisy speech data at different SNRs in the enrolment phase is investigated, together with how much the robustness of speaker recognition can be improved as a result.
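For readers unfamiliar with DET graphs, the points of such a curve can be computed from the same genuine and impostor scores used for the EER. The following is a minimal sketch, assuming the usual DET convention of plotting FNR against FPR on a normal deviate (probit) scale; the function name `det_points` and the clipping constant `eps` are hypothetical choices made for illustration and do not come from the study.

```python
from statistics import NormalDist

import numpy as np

def det_points(genuine, impostor, eps=1e-6):
    """Return an array of (probit(FPR), probit(FNR)) pairs, one per threshold.

    FPR is the fraction of impostor scores accepted (score >= threshold).
    FNR is the fraction of genuine scores rejected (score < threshold).
    Rates are clipped to [eps, 1 - eps] because the probit of 0 or 1
    is infinite.
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    probit = NormalDist().inv_cdf  # inverse CDF of the standard normal
    points = []
    for t in thresholds:
        fpr = float(np.mean(impostor >= t))
        fnr = float(np.mean(genuine < t))
        fpr = min(max(fpr, eps), 1.0 - eps)
        fnr = min(max(fnr, eps), 1.0 - eps)
        points.append((probit(fpr), probit(fnr)))
    return np.array(points)
```

A curve that lies closer to the lower-left corner indicates lower error rates of both kinds, which is how the car noise condition would appear relative to white noise in the figures above.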

4.3 Chapter Summary

In this chapter, the GMM-UBM based speaker recognition system adopted in the study has been reviewed, with an explanation of how this type of system works. An implementation of such a system was then used to investigate the impact of different types of noisy speech at different SNRs. The results show that the effect on speaker recognition performance varies with the type of noise. Clearly, the signal-to-noise ratio plays the main role in the robustness of speaker recognition. Furthermore, the SNR value plays a major role in finding the threshold at which speaker recognition performance is affected by additive noise. In the next chapters, different approaches are employed to improve the robustness of speaker recognition in environmental noise.
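As a reminder of how a GMM-UBM system produces a verification score, the sketch below computes the average per-frame log-likelihood ratio between a speaker model and the universal background model. It assumes diagonal-covariance GMMs and omits feature extraction and the adaptation of the speaker model from the UBM; the function names `gmm_loglik` and `llr_score` are hypothetical, and the details of the system used in the study may differ.

```python
import numpy as np

def gmm_loglik(frames, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D) feature vectors; weights: (M,);
    means and variances: (M, D).
    """
    frames = np.asarray(frames, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    diff = frames[:, None, :] - means[None, :, :]            # (T, M, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    log_weighted = log_comp + np.log(weights)                # (T, M)
    # Log-sum-exp over mixture components for numerical stability.
    peak = log_weighted.max(axis=1, keepdims=True)
    total = peak + np.log(np.sum(np.exp(log_weighted - peak),
                                 axis=1, keepdims=True))
    return total[:, 0]

def llr_score(frames, speaker, ubm):
    """Average log-likelihood ratio of the speaker model against the UBM.

    speaker and ubm are (weights, means, variances) tuples.
    """
    return float(np.mean(gmm_loglik(frames, *speaker)
                         - gmm_loglik(frames, *ubm)))
```

A positive score means the frames are better explained by the speaker model than by the background model; comparing the score with a threshold gives the accept or reject decision from which FAR, FRR, and EER are derived.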

IMPROVING THE ROBUSTNESS OF SPEAKER RECOGNITION IN NOISY CONDITIONS VIA TRAINING

Chapter Overview

As mentioned before, environmental noise is known to be one of the greatest challenges in speaker recognition, since it significantly compromises system reliability through channel mismatch. The previous chapter investigated the effect of different types of noise and different SNRs on the robustness of speaker recognition. This chapter describes attempts to improve robustness by accounting for possible channel mismatch, i.e. by using noisy speech to create the training models. Validation testing was carried out in emulated noisy conditions with a controlled signal-to-noise ratio. Section 5.1 describes how the Gaussian mixture model (GMM) and the Gaussian mixture model-universal background model (GMM-UBM) are employed for modelling and classification. Section 5.2 then presents experiments investigating the robustness of GMM and GMM-UBM with limited data. Finally, the experiments using noisy data in the enrolment phase, and their results, are described in Section 5.3.
