CHAPTER III: METHODOLOGY
3.6. STRUCTURAL DESIGN OF ELEMENTS
3.6.1. DESIGN OF THE UPPER DOME
Once an image texture feature vector is computed for a sample image, it needs to be processed by a machine learning method in order to assign a particular class to the sample region [31]. In this thesis, an artificial neural network (ANN) system is used as the machine learning method to classify the sample regions of a side-scan sonar image. This method was chosen because it proved to be very accurate and, while the training process is slow, the classification process is fast and easily parallelizable.²
An ANN is a system that is built on layers of neurons and weighted connections between neurons of consecutive layers [54]. Figure 3.6 provides a visual explanation, where the input data (from the feature vector) is passed to the neurons in the input layer. The neurons in the input layer pass on their information to the neurons in the hidden layer through the connections between layers.³ Each connection between neurons carries a weight that controls the amount of influence one neuron has on the neuron in the next layer. Figure 3.7 presents an example of a single connection between two neurons, $i$ and $j$, in their corresponding layers, $k-1$ and $k$ respectively. The output, or activation, of neuron $j$ is represented by $a_j^k$ and it is influenced by the activation of neuron $i$ in the previous layer, represented by $a_i^{k-1}$. The influence of activation $a_i^{k-1}$ on $a_j^k$ is weighted by $w_{ij}^k$. Although the connection between neurons in figure 3.7 appears rather straightforward, the full network example in figure 3.6 shows that the activation $a_j^k$ is influenced by all the neurons in the previous layer.
Figure 3.7 only examines one connection between neurons of different layers. Because a neuron is connected to all the neurons of the previous layer, the activation of that neuron is derived from the weighted sum of all activation values of neurons from the previous layer. To approximate the true activation of a biological neuron, the weighted sum of activation levels from the previous layer is passed into a sigmoid function. Equations 3.2 and 3.3 demonstrate how the activation, $a_j^k$, of a neuron is determined based on neurons from the previous layer.
$$a_j^k = \sigma\left(\sum_i w_{ij}^k \, a_i^{k-1}\right) \tag{3.2}$$
$$\sigma(z) = \frac{1}{1 + e^{-z}} \tag{3.3}$$
Figure 3.6: Example of an artificial neural network. The input data is passed through the layers of the system by weighted connections. The output data is a function of the input data and the weighted connections between neurons of each layer.
Figure 3.7: Example of a single connection between two neurons. The activation, $a_j^k$, of neuron $j$ in layer $k$ is influenced by the activation, $a_i^{k-1}$, of neuron $i$ in layer $k-1$. The influence of this connection is weighted by $w_{ij}^k$.
² However, parallelizing the ANN is left as future work.
³ Although figure 3.6 only demonstrates a single hidden layer, an ANN design can easily extend to multiple hidden layers.
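As an illustration of equations (3.2) and (3.3), the forward pass can be sketched in a few lines of Python. This is a minimal sketch and not the implementation used in this thesis: the function names, the nested-list weight layout, and the small example network are all assumptions made for the example, and no bias terms are included because equation (3.2) does not contain any.

```python
import math

def sigmoid(z):
    """Logistic sigmoid of equation (3.3)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_activations(weights, prev_activations):
    """Equation (3.2) for one layer.

    weights[j][i] is the (hypothetical) storage of the weight on the
    connection from neuron i in layer k-1 to neuron j in layer k.
    """
    return [
        sigmoid(sum(w * a for w, a in zip(row, prev_activations)))
        for row in weights
    ]

def forward(network, inputs):
    """Propagate an input feature vector through every layer in turn."""
    activations = inputs
    for weights in network:
        activations = layer_activations(weights, activations)
    return activations

# Hypothetical network: 2 inputs, 2 hidden neurons, 1 output neuron.
network = [
    [[0.5, -0.5], [1.0, 1.0]],  # input layer -> hidden layer
    [[1.0, -1.0]],              # hidden layer -> output layer
]
output = forward(network, [0.0, 0.0])
```

With an all-zero input, both hidden neurons receive a weighted sum of zero and so have an activation of 0.5. The two output weights then cancel, and the single output activation is also 0.5.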
Computing output data (classification values in this case) by propagating input values through the sigmoid functions and weighted connections is quite fast and simple. However, the process of determining the optimal weights for the connections is more complicated. Like most optimization methods in machine learning, determining the optimal values for the degrees of freedom (e.g. the connection weights) is achieved by minimizing some cost function. For an ANN, this usually implies minimizing some variation of the following:
$$C = \frac{1}{2} \sum_n \sum_j \left\| y_{n,j} - a^{\mathrm{output}}_{n,j} \right\|^2, \tag{3.4}$$
where $y_{n,j}$ is a known output training value for training case $n$ and output dimension $j$, and $a^{\mathrm{output}}_{n,j}$ is the activation at the output of the ANN based on $y_{n,j}$'s corresponding input training values. If the network output activation matches the training data, the cost function will be zero and the network will be in an optimal state. Unfortunately, finding the ideal parameters for the connection weights is challenging because $a^{\mathrm{output}}_{n,j}$ is a non-linear function of the weights (due to the sigmoid function in (3.3)). The non-linearity means that solving $\frac{\partial C}{\partial w_{ij}^k} = 0$ to find the optimal connection weights is a non-trivial problem.
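The cost in equation (3.4) is straightforward to evaluate once the output activations are known. The following minimal Python sketch assumes the targets and outputs are stored as lists indexed first by training case $n$ and then by output dimension $j$. The function name and the numerical values are invented for illustration only.

```python
def cost(targets, outputs):
    """Equation (3.4): half the sum of squared differences over all
    training cases n and output dimensions j."""
    return 0.5 * sum(
        (y - a) ** 2
        for y_n, a_n in zip(targets, outputs)
        for y, a in zip(y_n, a_n)
    )

# Hypothetical training targets for two cases and two output neurons.
targets = [[1.0, 0.0], [0.0, 1.0]]

# Outputs that match the targets exactly give a cost of zero.
perfect = [[1.0, 0.0], [0.0, 1.0]]

# Outputs that only approximate the targets give a positive cost.
imperfect = [[0.8, 0.1], [0.3, 0.6]]

perfect_cost = cost(targets, perfect)
imperfect_cost = cost(targets, imperfect)
```

For the imperfect outputs, the squared differences are 0.04, 0.01, 0.09 and 0.16, so the cost is approximately 0.15.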
A method called backwards propagation of errors, or backpropagation, was introduced to compute the partial derivatives of the cost function within the network [55]. This technique initially assigns random values to the connection weights of the ANN and computes output activation values from training data. The error between the ANN output and the training data is passed backwards into the network from the outputs, and this backpropagated error is used to evaluate the partial derivative of the cost function with respect to each connection weight. The weight values are then updated with an optimization method such as gradient descent. This process is repeated until the system converges to a local minimum.
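The procedure described above can be sketched as follows, under several assumptions that are not taken from this thesis: the toy training set (a logical OR with a constant third input that stands in for a bias), the layer sizes, the learning rate, the number of epochs, and the use of plain batch gradient descent are all chosen only to keep the example small. The error terms follow from applying the chain rule to equations (3.2) to (3.4).

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(network, inputs):
    """Return the activations of every layer, starting with the inputs."""
    activations = [inputs]
    for weights in network:
        prev = activations[-1]
        activations.append(
            [sigmoid(sum(w * a for w, a in zip(row, prev))) for row in weights]
        )
    return activations

def gradients(network, inputs, target):
    """Backpropagate the output error to obtain dC/dw for one training case."""
    activations = forward(network, inputs)
    # Output error term: (a - y) * sigma'(z), where sigma'(z) = a * (1 - a).
    delta = [(a - y) * a * (1.0 - a) for a, y in zip(activations[-1], target)]
    grads = [None] * len(network)
    for k in range(len(network) - 1, -1, -1):
        prev = activations[k]
        grads[k] = [[d * a for a in prev] for d in delta]
        if k > 0:
            # Pass the error backwards through the weights of this layer.
            delta = [
                sum(network[k][j][i] * delta[j] for j in range(len(delta)))
                * prev[i] * (1.0 - prev[i])
                for i in range(len(prev))
            ]
    return grads

def total_cost(network, data):
    """Cost of equation (3.4) over the whole training set."""
    return 0.5 * sum(
        (y - a) ** 2
        for x, t in data
        for y, a in zip(t, forward(network, x)[-1])
    )

def train(network, data, rate=0.5, epochs=2000):
    """Batch gradient descent: sum the gradients over all cases, then step."""
    for _ in range(epochs):
        total = [[[0.0] * len(row) for row in layer] for layer in network]
        for x, t in data:
            for k, layer in enumerate(gradients(network, x, t)):
                for j, row in enumerate(layer):
                    for i, g in enumerate(row):
                        total[k][j][i] += g
        for k, layer in enumerate(network):
            for j, row in enumerate(layer):
                for i in range(len(row)):
                    row[i] -= rate * total[k][j][i]
    return network

# Hypothetical training set: logical OR, with a constant 1.0 as third input.
data = [([0.0, 0.0, 1.0], [0.0]), ([0.0, 1.0, 1.0], [1.0]),
        ([1.0, 0.0, 1.0], [1.0]), ([1.0, 1.0, 1.0], [1.0])]

rng = random.Random(0)
# Random initial weights for a 3-3-1 network.
network = [[[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)],
           [[rng.uniform(-1.0, 1.0) for _ in range(3)]]]

initial_cost = total_cost(network, data)
train(network, data)
final_cost = total_cost(network, data)
```

Comparing `initial_cost` with `final_cost` shows how far gradient descent has reduced the cost from the random starting point. The minimum it approaches depends on the initial weights.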
To alleviate the problem of encountering local minima, the entire process of backpropagation is repeated many times, each with a new set of randomized initial connection weights. A cross-validation scheme is used to determine the set of connection weights that achieves the lowest activation output error with respect to the training data. Once the ANN is constructed, new image texture feature vectors are passed into the system and a classification value is represented by the activation output of each output layer neuron. As an example, a sandy sample region of a side-scan sonar image should yield an activation output close to 1.0 for the sandy output neuron and activation outputs close to 0.0 for all other output neurons in the network. In fact, the colour coding of the side-scan image in figure 1.4 is taken directly from the output values of the ANN and converted into RGB values as a visual representation.
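The repeated training from randomized initial weights can be sketched as below. The cross-validation scheme is not detailed here, so this example substitutes a simple hold-out validation set. The synthetic two-class data, the network size, the learning rate, the number of restarts, and the per-sample weight updates are likewise assumptions made only for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w_hidden, w_out, x):
    """Forward pass through one hidden layer and a single output neuron."""
    hidden = [sigmoid(sum(w * a for w, a in zip(row, x))) for row in w_hidden]
    return hidden, sigmoid(sum(w * h for w, h in zip(w_out, hidden)))

def error(w_hidden, w_out, data):
    """Squared-error cost of equation (3.4) on a data set."""
    return 0.5 * sum((y - predict(w_hidden, w_out, x)[1]) ** 2 for x, y in data)

def train(data, rng, n_hidden=3, rate=0.5, epochs=200):
    """Backpropagation with per-sample gradient steps from random weights."""
    n_in = len(data[0][0])
    w_hidden = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    for _ in range(epochs):
        for x, y in data:
            hidden, out = predict(w_hidden, w_out, x)
            d_out = (out - y) * out * (1.0 - out)
            # Hidden error terms use the output weights before they change.
            d_hidden = [w_out[j] * d_out * hidden[j] * (1.0 - hidden[j])
                        for j in range(n_hidden)]
            for j in range(n_hidden):
                w_out[j] -= rate * d_out * hidden[j]
                for i in range(n_in):
                    w_hidden[j][i] -= rate * d_hidden[j] * x[i]
    return w_hidden, w_out

def make_data(rng, n):
    """Hypothetical two-class data: class 1.0 when x1 + x2 > 1, else 0.0.
    The constant third input stands in for a bias."""
    data = []
    for _ in range(n):
        x1, x2 = rng.random(), rng.random()
        data.append(([x1, x2, 1.0], 1.0 if x1 + x2 > 1.0 else 0.0))
    return data

data_rng = random.Random(0)
training = make_data(data_rng, 40)
validation = make_data(data_rng, 20)

# Train from several random starting points and keep the weights with
# the lowest error on the held-out validation set.
results = []
for seed in range(5):
    weights = train(training, random.Random(seed))
    results.append((error(*weights, validation), seed, weights))

best_error, best_seed, best_weights = min(results, key=lambda r: r[0])
```

Each restart may settle in a different local minimum. Selecting by validation error rather than training error avoids favouring a set of weights that merely fits the training cases.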
In general, because many ANN libraries exist (e.g. MATLAB’s Neural Network Toolbox), training the connection weights with backpropagation can be treated as a black box method. A user only has to choose the network architecture (the number of hidden layers and the number of neurons in each layer) and provide the training data inputs and outputs.