Chapter 5 : Optimisation and Artificial Neural Networks
Later in this thesis, a method is defined for estimating acoustic parameters from received signals using a model of sound decay; this is called the maximum likelihood method. This method uses the reverberant decays between speech utterances and formulates a likelihood function that incorporates the sound decay model. The inputs to the likelihood function are the received reverberant sound and all of the model parameters. As the parameters are unknown, the likelihood function must be ‘optimised’ to find the set of model parameters (e.g. decay time) most likely to have generated the decay. For a likelihood function, this is the parameter set that produces the maximum function value over all possible parameter values. Locating the optimum of a complicated, multi-dimensional function is a difficult problem. It is a separate area of research in which many researchers are searching for faster and more accurate methods of optimisation. There is no ultimate optimisation algorithm, and each method has its pros and cons. A good overview of the many facets of optimisation is given in [56].
It is important to note that maximising a function is equivalent to minimising the negative of the function. In the case of maximum likelihood optimisation, as presented later in this thesis, the optimisation minimises the negative likelihood value.
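The equivalence can be checked numerically with a minimal sketch. The Gaussian model, the data values and the use of the log-likelihood are assumptions made purely for illustration; they are not the sound decay model developed later in the thesis.

```python
import math

def log_likelihood(mu, data, sigma=1.0):
    """Gaussian log-likelihood of `data` for a candidate mean `mu` (toy model)."""
    n = len(data)
    return (-0.5 * n * math.log(2 * math.pi * sigma**2)
            - sum((x - mu) ** 2 for x in data) / (2 * sigma**2))

def negative_log_likelihood(mu, data, sigma=1.0):
    """The quantity a minimiser would work with."""
    return -log_likelihood(mu, data, sigma)

data = [1.8, 2.1, 2.4, 1.9, 2.3]                 # hypothetical observations
candidates = [i / 100 for i in range(0, 401)]    # candidate means 0.00 to 4.00

# Maximising the function and minimising its negative select the same parameter.
best_by_max = max(candidates, key=lambda mu: log_likelihood(mu, data))
best_by_min = min(candidates, key=lambda mu: negative_log_likelihood(mu, data))
print(best_by_max, best_by_min)
```

Both searches return the sample mean of the data, which is the maximum likelihood estimate of the mean for this toy model.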
5.1.1 Optimisation overview
A multi-dimensional function, such as a likelihood function, can be thought of as the landscape of an undiscovered world. Upon arriving, how does one find the deepest valley? One could travel to every point in that world, measuring the height at each. This, however, may take an extremely long time (this is analogous to the direct search methods described in 5.1.3). Alternatively, one could start walking downhill and continue until there is nowhere to go except up (this is analogous to the gradient search methods described in 5.1.4). However, there is no guarantee that this valley is ‘globally’ the deepest; it may merely be a small ‘local’ valley. More intelligent methods of searching could be employed, such as looking into the distance, using rough initial estimates of the landscape, or perhaps using satellite technology to glean a rough overall impression of the landscape.
The problem described in the previous paragraph is analogous to the investigator’s search for the global minimum of a function. Search methods can often get stuck in local minima of the function; to prevent this, algorithms must use more intelligent search methods. One may wish to evaluate the function at every set of parameter values (grid co-ordinates), but this is untenable: the parameter space may be very large, and the grid size grows exponentially with the number of variables (the dimensionality of the function). For instance, if the function is N-dimensional and 100 steps are required to explore each dimension, the number of function evaluations scales as 100^N.
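To make the growth concrete: a full grid with 100 steps along each of N dimensions contains 100^N points. The short sketch below simply tabulates that count; the function name `grid_evaluations` is introduced here for illustration only.

```python
def grid_evaluations(steps_per_dimension, dimensions):
    """Number of function evaluations needed to cover a full grid."""
    return steps_per_dimension ** dimensions

for n in (1, 2, 3, 6):
    print(f"N = {n}: {grid_evaluations(100, n):,} evaluations")
```

Each additional parameter multiplies the cost by 100, which is why an exhaustive search is only practical for low-dimensional problems or coarse grids.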
5.1.2 Types of optimisation algorithms
There are a number of types of optimisation method, two of which are direct search methods and gradient search methods.
5.1.3 Direct search methods
Direct search methods use only function evaluations. Different direct search algorithms explore the search space in different ways to find the solution. A simple, commonly used search method is the brute force method, where the function is evaluated over a large grid of parameter values. This can be very computationally expensive, but it has the advantage that the whole parameter space is known to have been explored. The random walk method is another direct search method. First a starting point is chosen and the function is evaluated at that point; then a new location is chosen by adding a randomly generated value to the parameter(s). The two locations are compared and the better one is taken as the new position. Next a new random direction is generated and the process is repeated. This method is not reliable in practice, as successful global optimisation requires a careful choice of the statistics (mean and standard deviation) of the random walk direction. Another direct search method, known as the pattern search algorithm [57] (or the Hooke-Jeeves algorithm), chooses a starting point and then evaluates the function at trial points in the positive and negative directions along each of the N axes (N being the number of dimensions). The algorithm then moves to the best of these locations and repeats the process. If no better solution is found, the operation is repeated with a smaller step size. In general, direct methods are not as efficient at finding a local optimum as methods which use gradient information, but they are often used in hybrid global optimisation methods [82].
The brute force approach can be particularly useful when the parameter search space is limited and the required accuracy of the estimates is finite. A grid of parameters spanning the search space, with a resolution set by the required accuracy, can be small enough to evaluate efficiently while still indicating the optimal solution to within the specified accuracy. In this thesis a brute force method is used to provide rough estimates for parameters prior to determining the local minimum with a more accurate method. Other properties of the function can be useful in reducing the search space; for example, functions may be symmetrical, as is the case for the likelihood function developed later. This effectively halves the search space.
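Two of the direct search ideas above can be sketched in a few lines. The test function, grid and step sizes are hypothetical. Note also that `compass_search` implements only the exploratory axis moves with step halving described in the text; it omits the additional pattern move of the full Hooke-Jeeves algorithm, so it should be read as a simplified stand-in rather than the method of [57].

```python
import itertools

def objective(p):
    """Hypothetical 2-D test function with its minimum at (1.3, -2.4)."""
    x, y = p
    return (x - 1.3) ** 2 + (y + 2.4) ** 2

def brute_force(f, axes):
    """Evaluate f at every grid point and return the best point."""
    return min(itertools.product(*axes), key=f)

def compass_search(f, start, step=0.5, min_step=1e-6):
    """Try +/- step along every axis, move to the best trial point if it
    improves on the current point, otherwise halve the step size."""
    point, best = list(start), f(start)
    while step > min_step:
        trials = []
        for i in range(len(point)):
            for delta in (step, -step):
                trial = list(point)
                trial[i] += delta
                trials.append(trial)
        candidate = min(trials, key=f)
        value = f(candidate)
        if value < best:
            point, best = candidate, value
        else:
            step /= 2.0
    return point, best

axis = [float(i) for i in range(-5, 6)]          # coarse grid, spacing 1
coarse = brute_force(objective, [axis, axis])    # rough estimate from the grid
refined, value = compass_search(objective, coarse)
print(coarse, refined, value)
```

The grid stage only uses function values at 121 points, and the local stage then refines the best grid point without needing any derivative information.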
5.1.4 Gradient search methods
Where derivatives of the function with respect to the parameters are available, gradient search methods can be used. First and second order derivatives contain information about the local slope and curvature of the function, and a search direction can be chosen accordingly. Examples of gradient search methods include the well-known steepest descent method and the Newton-Raphson method. In general they are very efficient at determining a local extremum: they choose the locally optimal search direction at every iteration, but can only ever guarantee convergence to a local extremum. Gradient search methods will be utilised in this thesis to provide quick convergence to a local optimum. As previously mentioned, the starting point for the optimisation will be determined from a brute force evaluation of the function at a number of chosen grid points. This helps to avoid local optima and to find the optimal solution for a given parameter search space.
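The local nature of gradient methods can be illustrated with a minimal sketch. The quartic test function, the fixed step size and the tolerances are assumptions chosen for illustration; the function has a local minimum near x = 1.13 and a deeper, global minimum near x = -1.30.

```python
def f(x):
    """Hypothetical 1-D test function with two minima."""
    return x**4 - 3 * x**2 + x

def df(x):
    """First derivative of f."""
    return 4 * x**3 - 6 * x + 1

def d2f(x):
    """Second derivative of f."""
    return 12 * x**2 - 6

def steepest_descent(x, rate=0.01, tol=1e-10, max_iter=10_000):
    """Step against the gradient with a fixed step size."""
    for _ in range(max_iter):
        g = df(x)
        if abs(g) < tol:
            break
        x -= rate * g
    return x

def newton_raphson(x, tol=1e-10, max_iter=50):
    """Use curvature to scale the step: x <- x - f'(x) / f''(x)."""
    for _ in range(max_iter):
        g = df(x)
        if abs(g) < tol:
            break
        x -= g / d2f(x)
    return x

from_right = steepest_descent(2.0)    # ends in the local minimum
from_left = steepest_descent(-2.0)    # ends in the global minimum
newton_right = newton_raphson(2.0)
print(from_right, from_left, newton_right)
```

Both methods started at x = 2 settle in the shallower valley, which is why a sensible starting point, such as one taken from a coarse grid, matters.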
5.1.5 Simulated Annealing
Simulated Annealing (SA) [58] is a scheme to avoid getting stuck at local optima and can be applied to many optimisation methods. SA allows the search method to choose ‘up-hill’ directions, i.e. less optimal solutions. The probability of such a move being accepted decreases as the difference between the current function value and the function value at the ‘up-hill’ position increases. It also decreases with a temperature parameter, whose value is lowered over time. In other words, after a reasonable length of time SA forces the search algorithm to look more locally. The algorithm is called Simulated Annealing because it is analogous to the controlled cooling of a molten material to increase crystal size, i.e. to reach a minimum energy configuration. Whilst simulated annealing methods are global optimisation methods, there is no way of determining whether they have sampled enough of the parameter space to include the global minimum.
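A minimal sketch of the idea follows. It assumes the commonly used Metropolis acceptance rule, in which an up-hill move of size delta is accepted with probability exp(-delta / T), together with a geometric cooling schedule. The test function, step size and schedule are illustrative choices and are not taken from [58].

```python
import math
import random

def f(x):
    """Hypothetical test function with a local and a global minimum."""
    return x**4 - 3 * x**2 + x

def simulated_annealing(f, x0, step=0.5, t0=2.0, cooling=0.995,
                        iterations=5000, seed=0):
    rng = random.Random(seed)          # seeded so that runs are repeatable
    x, fx = x0, f(x0)
    best_x, best_f = x, fx
    temperature = t0
    for _ in range(iterations):
        candidate = x + rng.uniform(-step, step)
        fc = f(candidate)
        delta = fc - fx
        # Down-hill moves are always accepted; up-hill moves are accepted
        # with a probability that shrinks as delta grows and as T falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            x, fx = candidate, fc
            if fx < best_f:
                best_x, best_f = x, fx
        temperature *= cooling         # gradual cooling
    return best_x, best_f

best_x, best_f = simulated_annealing(f, x0=2.0)
print(best_x, best_f)
```

Early on, the high temperature lets the search climb out of the valley it started in; as the temperature falls the search behaves increasingly like a purely local method. Nothing in the procedure certifies that the returned point is the global minimum.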
5.1.6 Evolutionary Algorithms
Evolutionary Algorithms are a set of optimisation algorithms that use a Darwinian selection process to choose better solutions. After a starting point is defined (initialisation, or birth in this case!), the algorithms use mutation and recombination to generate new estimates, and a selection process keeps the fittest results. These processes are repeated until some stopping criterion is reached. Examples of evolutionary algorithms include Genetic Algorithms, Evolutionary Strategies, Differential Evolution and Evolutionary Programming. As with all global optimisation methods, evolutionary methods have no real convergence criteria: it is not possible to state when the global optimum has been found. It is also thought that this methodology may not be particularly computationally efficient, as the number of function evaluations may be large. Because a large number of optimisations is required, the speed of the optimisation is very important.
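The initialisation, recombination, mutation and selection steps can be sketched as follows. This is a deliberately simple illustrative scheme and not any one of the named algorithms; the test function, the averaging recombination, the Gaussian mutation and all parameter values are assumptions.

```python
import random

def f(x):
    """Hypothetical test function with a local and a global minimum."""
    return x**4 - 3 * x**2 + x

def evolve(f, low, high, population_size=20, generations=100,
           sigma=0.3, seed=0):
    rng = random.Random(seed)
    # Initialisation: a random population inside the search range.
    population = [rng.uniform(low, high) for _ in range(population_size)]
    for _generation in range(generations):
        offspring = []
        for _child in range(population_size):
            parent_a, parent_b = rng.sample(population, 2)
            child = 0.5 * (parent_a + parent_b)      # recombination
            child += rng.gauss(0.0, sigma)           # mutation
            offspring.append(min(max(child, low), high))
        # Selection: keep the fittest of parents and offspring.
        population = sorted(population + offspring, key=f)[:population_size]
    return min(population, key=f)

best = evolve(f, -3.0, 3.0)
print(best, f(best))
```

With these settings each run costs roughly 100 x 20 new candidate points plus the evaluations made during sorting, which illustrates the concern about the number of function evaluations, and the fixed generation count is a stopping rule rather than a proof of convergence.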
5.1.7 Constrained optimisation
A further distinction between types of optimisation is whether there are any bounds on the parameter values: the optimisation is either constrained or unconstrained. Constraints can be placed on parameters to ensure that the solution complies with physical reality. An example of such a constraint, in the case of reflection decay estimation, would be ensuring that the model cannot allow a build-up of sound to occur when no sound source is present. Price [59] comments that functions involving constraints can often be intractable, as they have many local solutions and interacting mixed-type variables. Despite this, constrained optimisation will be utilised to prevent the search algorithm from yielding unrealistic solutions. The parameters will be constrained so that only realistic sound decays can be considered as solutions. It is hoped that constraining the algorithm to physical reality will yield fewer problematic parameter estimates.
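One simple way to impose bounds is to project each iterate back into the allowed range, as in the sketch below. The quadratic cost, the bounds and the decay-rate interpretation are hypothetical: the cost is built so that its unconstrained minimum lies at a negative decay rate (a build-up of sound), which the bound k >= 0 rules out. The thesis does not state which constrained method it uses, so this is only one possible approach.

```python
def cost_gradient(k):
    """Gradient of the hypothetical cost (k + 0.5)**2, minimised at k = -0.5."""
    return 2.0 * (k + 0.5)

def projected_gradient_descent(grad, x0, lower, upper, rate=0.1, iterations=200):
    """Gradient descent in which every iterate is clipped to [lower, upper]."""
    x = min(max(x0, lower), upper)
    for _ in range(iterations):
        x = x - rate * grad(x)
        x = min(max(x, lower), upper)    # project back onto the bounds
    return x

unconstrained = projected_gradient_descent(cost_gradient, 3.0, -10.0, 10.0)
constrained = projected_gradient_descent(cost_gradient, 3.0, 0.0, 10.0)
print(unconstrained, constrained)
```

With loose bounds the search converges to the non-physical value -0.5; with the bound k >= 0 it stops at the boundary, the closest physically admissible solution.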
5.1.8 Choice of algorithm
The choice of algorithm is dictated by the type of problem that needs to be solved. The search space for the globally optimal model parameters is finite because the parameter space is defined by the range of possible decay behaviours that may be encountered in real rooms. For example, reverberation times greater than approximately 15 s are generally not realisable due to air and surface absorption. This places a lower limit on the possible rate of decay that may be encountered. This steers the choice of algorithm towards constrained optimisation, to ensure the model parameter estimates fall within the bounds of the physically realisable. Additionally, the desired parameter estimation accuracy is not infinitesimal but is related to the subjective difference limens. Therefore a brute force approach is adopted to provide a quick, coarse parameter estimate prior to performing gradient-based constrained parameter optimisation. This combination of techniques can help ensure that the parameter estimates will be optimal to within a specified accuracy.
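The two-stage strategy can be sketched on a toy problem. Everything specific here is an assumption for illustration: the single-exponential decay envelope, the noiseless synthetic data, the squared-error cost (used in place of the negative likelihood), the bounds on the decay constant, and the backtracking projected gradient refinement with a finite-difference gradient. None of it is the thesis's actual model or optimiser.

```python
import math

# Hypothetical noiseless decay envelope sampled every 50 ms.
TIMES = [0.05 * i for i in range(40)]
TRUE_TAU = 0.73
OBSERVED = [math.exp(-t / TRUE_TAU) for t in TIMES]
TAU_MIN, TAU_MAX = 0.1, 15.0    # assumed bounds on the decay constant, seconds

def cost(tau):
    """Sum of squared errors between observed and modelled decay."""
    return sum((y - math.exp(-t / tau)) ** 2 for t, y in zip(TIMES, OBSERVED))

def clip(x, low, high):
    return min(max(x, low), high)

def coarse_grid(f, low, high, points):
    """Brute force stage: best point on an evenly spaced grid."""
    spacing = (high - low) / (points - 1)
    return min((low + i * spacing for i in range(points)), key=f)

def refine(f, x, low, high, h=1e-6, max_iter=100):
    """Gradient stage: projected descent with a backtracking step size."""
    for _ in range(max_iter):
        gradient = (f(x + h) - f(x - h)) / (2 * h)    # central difference
        fx, step = f(x), 1.0
        while step > 1e-12:
            trial = clip(x - step * gradient, low, high)
            # Accept the step only if it gives a sufficient decrease.
            if f(trial) <= fx - 0.5 * gradient * (x - trial):
                break
            step /= 2
        else:
            return x            # no acceptable step: treat x as converged
        if trial == x:
            return x            # stationary, or pinned at a bound
        x = trial
    return x

rough = coarse_grid(cost, TAU_MIN, TAU_MAX, points=150)   # grid spacing 0.1 s
estimate = refine(cost, rough, TAU_MIN, TAU_MAX)
print(rough, estimate)
```

The grid spacing plays the role of the required coarse accuracy, and the bounded refinement then sharpens the estimate without leaving the physically admissible range.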