
When people are asked how they manage to listen to only one of the sounds in a mixture, they usually reply: “I just listen to one and try not to be distracted by the others” [Wang and Brown, 2006]. This answer presupposes a number of operations that the human brain has already carried out before the listener can focus on the desired sound. The acoustic signal reaching our ears comprises sound waves originating from multiple sources, together with their reflections from surfaces in the environment. However, a person listening to a particular speaker does not need to reject background noise or competing voices explicitly; the brain does it automatically. This problem was formally stated and named the ‘cocktail party problem’ in [Cherry, 1953]:

“One of our most important faculties is our ability to listen to, and follow, one speaker in the presence of others...we may call it the cocktail party problem. No machine has yet been constructed to do just that”.

The design of machines able to listen to sounds in the same way that humans do has been a very active line of research for many years. With the rapid growth of digital systems in recent decades, it has become one of the most interesting lines of research. One of the main problems in this area is the enhancement of degraded speech signals. Speech enhancement aims to improve speech quality and intelligibility by introducing some kind of technology between the desired speech source and the human ear. This enhancement is necessary because the desired speech source is mixed with other sound sources emitting energy at the same time, which can be noise, music, or even other speech sources. When those sources are in an enclosed space, reverberation also degrades the quality of the received signal. The aforementioned ‘technology’ is composed of a single microphone or a set of microphones (microphone array), a system that enhances the signals gathered by these microphones, and one or more loudspeakers that reproduce the enhanced signal for the human ear. In the current digital age, the speech enhancement system is based on a digital signal processor (DSP), which runs signal processing algorithms that address the problem at hand. Figure 1.1 shows an overview of the described speech enhancement system.
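The idea of a desired source contaminated by a competing source can be made concrete with a small numerical sketch. The code below is only an illustration under stated assumptions, not a method taken from this thesis: the function names (`power`, `snr_db`, `mix_at_snr`), the 200 Hz sine used as a stand-in for speech, the Gaussian noise, and the 5 dB target are all hypothetical choices. Reverberation is ignored, so the mixture is a plain sample-wise sum.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a signal given as a list of floats."""
    return sum(v * v for v in x) / len(x)


def snr_db(speech, noise):
    """Signal-to-noise ratio in dB between two component signals."""
    return 10.0 * math.log10(power(speech) / power(noise))


def mix_at_snr(speech, noise, target_snr_db):
    """Scale the noise so the speech-to-noise power ratio equals
    target_snr_db, then add it to the speech sample by sample."""
    ratio = 10.0 ** (target_snr_db / 10.0)
    gain = math.sqrt(power(speech) / (power(noise) * ratio))
    scaled = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled)]
    return mixture, scaled


# Hypothetical test signals: one second at an assumed 8 kHz sampling rate.
fs = 8000
rng = random.Random(0)
speech = [math.sin(2.0 * math.pi * 200.0 * t / fs) for t in range(fs)]
noise = [rng.gauss(0.0, 1.0) for _ in range(fs)]

mixture, scaled_noise = mix_at_snr(speech, noise, 5.0)
snr_before = snr_db(speech, scaled_noise)

# Applying the same gain to the whole mixture scales both components
# equally, so the SNR does not change.
g = 4.0
snr_after = snr_db([g * s for s in speech], [g * n for n in scaled_noise])
```

In this sketch `snr_before` and `snr_after` coincide, which shows why a gain alone cannot separate the desired source from the contamination: any improvement has to come from processing that treats the two components differently.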

There are different applications where speech enhancement plays an important role. In some cases, we are interested in recovering only one source with good quality, removing all the remaining sources. In other cases, we are interested in recovering all the different sources. Additionally, some applications require real-time processing, which increases the complexity of the problem.

[Figure 1.1: Speech enhancement system. Block labels: microphones, contaminated input signal, DSP (speech enhancement algorithms), enhanced output signal, loudspeakers.]

Some applications of speech enhancement that can be found in daily life are:

• Hearing aids. Hearing loss affects a significant percentage of people, and this figure is increasing due to growing exposure to excessive noise in daily life. One of the main problems for hearing-impaired people is reduced speech intelligibility in noisy environments, which is mainly caused by the loss of temporal and spectral resolution in the auditory system of the impaired ear. Hearing aids that only provide amplification do not solve the problem, because they amplify both speech and noise. Besides compensating for acoustic loss, the DSPs of modern digital hearing aids include speech enhancement algorithms, as well as algorithms for echo cancellation and automatic sound classification.

• Hands-free communication systems. In recent years, the demand for hands-free communication in vehicles and teleconference systems has drastically increased the research and development of this kind of device. The success of these systems relies on the quality of the acquired speech, which is contaminated by different types of noise and interference. Consequently, the signals acquired by the microphones of the system are usually enhanced before being transmitted through the communication channel.

• Automatic speech recognition (ASR). Much progress has been made in ASR in recent years. Smartphones, computers and smart TVs are only some examples of current technologies that include ASR. The probability of successful recognition strongly depends on the quality of the acquired signal, and the performance of ASR systems degrades rapidly in the presence of noise. This makes a prior speech enhancement stage necessary for ASR systems.

• Recording systems. Audio recordings have many applications, such as security, automatic music transcription, audio information retrieval and electronic surveillance. One desirable operation on these recordings is to recover the original sources with high quality, separating the different audio sources and removing background noise.

This thesis deals with the problems of sound source separation, noise reduction and speech source enumeration. The reduction of the computational complexity of speech enhancement algorithms is also studied for implementation in hearing aids. Both systems with a single microphone and systems with a microphone array are studied. The former case is more challenging because of the limited information available from a single microphone. The latter is more interesting since the use of a microphone array adds spatial information, which gives rise to a wider range of algorithms. These problems have been studied for many years, but they remain open due to their complexity.

In a first approach, this thesis focuses on sound source separation algorithms and on the identification of the number of speakers, without considering computational restrictions. After that, the study focuses on low-cost algorithms for speech enhancement in hearing aids. These systems must work in real time but have very low computational capacity, because their limited battery life restricts the power consumption. Hence, the computational cost of the signal processing algorithms developed for hearing aids must be low, which means that these algorithms must be relatively simple in order to run in real time on this type of device.

An important part of this study is focused on binaural hearing aids, a recent topic of research. In binaural systems, the hearing-impaired person wears a device in each ear, and these devices exchange information with each other. For aesthetic reasons, it is desirable to connect them with a wireless link, which increases the power consumption. This wireless data link gives rise to a new problem: reducing the information exchanged between the two devices without excessively degrading the performance of the binaural enhancement algorithm.

The remainder of this chapter contains a description of the problems to solve in digital hearing aids, a comprehensive review of the state of the art in this research field, and the main goals of this thesis. The chapter ends with a description of the structure of this thesis.