There are two broad categories of applications for supervised neural networks:
1. Classification.
2. Function approximation.
Classification involves separating the data into discrete groups or classes. An example of a classification task would be determining the colour (e.g. red, green or blue) of an object from a digital image. Function approximation, or regression, involves learning a mapping from the inputs to continuous output values, and is the most appropriate methodology for learning continuous acoustic parameters from speech and music signals.
It is usual when working with ANNs to split the data into multiple data sets. Training is carried out on one set called, appropriately, the training set. One feature of a neural network is its ability to adapt to new situations: when a correctly trained network is presented with new data from the same global population as the training set, it can still perform the regression or classification task. This is known as generalisation. The ANN uses features of the data set, learned empirically during training, to classify the data. To test the network’s ability to generalise, another data set, called the test set, is fed through the network without updating the weights. By comparing the error on the test set with the error on the training set, the generalisation ability of the ANN can be quantified.
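The comparison of training and test error described above can be sketched as follows. This is a minimal illustration, not the procedure used in the thesis: a least-squares straight line stands in for the trained ANN so that the example needs only the Python standard library, and the names `split_data`, `fit_line` and `total_squared_error`, the synthetic data and the 25% test fraction are all assumptions made for the example.

```python
import random


def split_data(samples, test_fraction=0.25, seed=0):
    """Shuffle the samples and split them into a training set and a test set."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(samples):
    """Least-squares fit of t = a*x + b (a stand-in for ANN training)."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_t = sum(t for _, t in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in samples)
    a = sxt / sxx
    b = mean_t - a * mean_x
    return lambda x: a * x + b


def total_squared_error(predict, samples):
    """Sum of squared differences between predictions and targets."""
    return sum((predict(x) - t) ** 2 for x, t in samples)


# Synthetic data (hypothetical): a noisy straight line.
noise = random.Random(1)
data = [(i / 10, 2.0 * (i / 10) + 1.0 + noise.gauss(0, 0.1)) for i in range(100)]

train_set, test_set = split_data(data)
model = fit_line(train_set)  # only the training set is used for fitting

# Mean squared error per sample, so that sets of different size are comparable.
train_mse = total_squared_error(model, train_set) / len(train_set)
test_mse = total_squared_error(model, test_set) / len(test_set)
print("training error per sample:", train_mse)
print("test error per sample:    ", test_mse)
```

When the two per-sample errors are of similar size, the model generalises to unseen data from the same population; a test error much larger than the training error would suggest that it does not.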
Divergence between the total squared error on the training set and that on the test set indicates overtraining. Overtraining occurs when the training set is not representative of the global population of possible data inputs. It can also occur when the network is larger than the problem requires. In either case, the training process finds a way to minimise the error that does not reflect the underlying relationships between features in the data set. For example, in the colour classification task, suppose the network is taught to classify the colour of a series of pictures of fruit and the training set yields perfect results, i.e. the banana is classified as yellow, the apple as red and the pear as green. When presented with an orange, however, the ANN classifies it as red because it is the same shape as the apple.
To prevent overtraining, a further, separate data set is defined, called the validation set. During training, after all the training data has been passed through the network and the weights updated, the validation set is fed through the network and the error recorded, but the weights are not updated. By comparing the errors on the training and validation sets, overtraining can be detected: the validation error begins to rise while the training error continues to fall.
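One common way to act on the validation error is early stopping: keep the weights that gave the lowest validation error and halt once that error has stopped improving. The sketch below is an assumption about how such a loop might look, not the training routine used in the thesis. The function `train_with_early_stopping`, its `patience` parameter and the one-weight toy problem are hypothetical and chosen only so the behaviour is easy to follow.

```python
import copy


def train_with_early_stopping(weights, train_epoch, validation_error,
                              max_epochs=1000, patience=10):
    """Train until the validation error has not improved for `patience` epochs.

    train_epoch(weights)      -> new weights after one pass over the training set
    validation_error(weights) -> error on the validation set (no weight update)
    """
    best_weights = copy.deepcopy(weights)
    best_error = validation_error(weights)
    best_epoch = 0
    history = [best_error]
    for epoch in range(1, max_epochs + 1):
        weights = train_epoch(weights)
        error = validation_error(weights)
        history.append(error)
        if error < best_error:
            # Validation error is still falling: remember these weights.
            best_error, best_epoch = error, epoch
            best_weights = copy.deepcopy(weights)
        elif epoch - best_epoch >= patience:
            # Validation error has been rising or flat for too long: stop.
            break
    return best_weights, best_epoch, history


# Toy problem (hypothetical): a single weight that training pulls towards 1.0,
# while the validation set is best served by a value of 0.8.
def toy_train_epoch(w):
    return [w[0] + 0.1 * (1.0 - w[0])]


def toy_validation_error(w):
    return (w[0] - 0.8) ** 2


best_w, best_epoch, history = train_with_early_stopping(
    [0.0], toy_train_epoch, toy_validation_error, patience=5)
print("stopped after", len(history) - 1, "epochs; best epoch was", best_epoch)
```

In the toy problem the training step keeps moving the weight towards its own optimum, so the validation error first falls and then rises; the loop returns the weights from the epoch at which the validation error was lowest.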
5.3 Discussions
This chapter has introduced concepts relating to optimisation and neural networks. Optimisation adjusts the parameters of a model of a physical system according to some cost function, yielding the optimum parameters for that model given some input data. Supervised artificial neural networks are similar in that training involves the minimisation of some cost function; however, the ANN method does not have a specific model of the system being measured. Features of the system are learnt empirically as the ANN is trained, and this adaptability makes the ANN very good at performing complex non-linear mappings.
There are various ‘off the shelf’ algorithms for performing optimisation and training ANNs. MATLAB [66] by MathWorks provides a number of toolboxes with which optimisation and neural network training can be carried out. These methods are optimised for speed and memory consumption and have been used throughout the thesis to carry out neural network training and optimisation tasks.
6 ROOM ACOUSTIC PARAMETER ESTIMATION USING ARTIFICIAL NEURAL NETWORKS
This chapter discusses the development of a machine learning method, based on previous work by Li [2] and Cox et al. [5], for estimating room acoustic parameters from speech and music using Artificial Neural Networks. ANNs with a large number of inputs are difficult to train. Generally, ANNs are limited to fewer than 200 inputs [2]; therefore, a preprocessing stage must be employed to reduce the quantity of data inherent in reverberated speech and music signals before they are fed into the ANN. Reverberation is known to smooth signal envelopes, an effect similar to low-pass filtering. By estimating the transfer characteristics of this low-pass filtering operation, the acoustic parameters can be estimated. An envelope spectrum detector [18] is used to compress the data prior to the machine learning stage. Speech envelope spectra are known to be relatively stable features of speech signals, and Houtgast [19], Li [2] and Cox et al. [5] have shown that reverberation time and noise levels can be estimated using them.
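To illustrate how an envelope spectrum compresses a long signal into a handful of ANN inputs, the following sketch rectifies a signal, smooths it with a one-pole low-pass filter to obtain the envelope, and evaluates the envelope's spectrum at a few modulation frequencies. It is not the detector of [18]: the rectify-and-smooth envelope, the 50 Hz smoothing cutoff, the chosen modulation frequencies, the normalisation by the mean envelope and the synthetic test signal are all assumptions made for the example.

```python
import cmath
import math


def envelope(signal, fs, cutoff_hz=50.0):
    """Full-wave rectify the signal and smooth it with a one-pole low-pass filter."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    env, state = [], 0.0
    for sample in signal:
        state += alpha * (abs(sample) - state)
        env.append(state)
    return env


def envelope_spectrum(env, fs, mod_freqs):
    """Envelope spectrum magnitude at each modulation frequency,
    normalised by the mean envelope so it approximates a modulation depth."""
    total = sum(env)
    spectrum = []
    for f in mod_freqs:
        acc = sum(e * cmath.exp(-2j * math.pi * f * n / fs)
                  for n, e in enumerate(env))
        spectrum.append(2.0 * abs(acc) / total)
    return spectrum


# Synthetic test signal (hypothetical): a 1 kHz tone whose amplitude is
# modulated at 4 Hz with a depth of 0.8, sampled at 8 kHz for 2 seconds.
fs = 8000
signal = [(1.0 + 0.8 * math.cos(2 * math.pi * 4 * n / fs))
          * math.sin(2 * math.pi * 1000 * n / fs)
          for n in range(2 * fs)]

mod_freqs = [1.0, 2.0, 4.0, 8.0, 16.0]
spec = envelope_spectrum(envelope(signal, fs), fs, mod_freqs)
for f, value in zip(mod_freqs, spec):
    print(f"{f:5.1f} Hz: {value:.3f}")
```

Here 16,000 samples are reduced to five values, with the largest at the 4 Hz modulation frequency. Reverberation would be expected to lower such modulation depths, more so at higher modulation frequencies, which is the low-pass behaviour the ANN is trained to interpret.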
Developments in this chapter focus on extending the method developed by Li et al. for speech so that music signals can also be used in estimations. The chapter details the development of a multi-band envelope spectrum detector to account for the uneven spectrum of music signals. Training such a system requires a large database of realistic room impulse responses (RIRs). Previously, Li et al. used stochastically generated impulse responses; in this thesis, geometric room prediction software is used to generate the training examples, as described in Chapter 4. As the machine learning method is supervised and learns from these examples, a more realistic RIR database will yield a more accurate measurement system.