The basic elements of a neural network are artificial neurons, which are linked together to form either a single layer or multiple layers. The type of neural network determines the kind of neuron employed. A neuron is a simple virtual device that receives a number of inputs, either as raw data or as outputs from preceding neurons. Each neuron sums all of its inputs and applies a (usually nonlinear) transfer function to generate an output. The output value is either a final model prediction or one of the inputs to other neurons (SPSS, Inc., 2010).
The structure of a neural network is composed of many neurons connected in a systematic way. The most common neural network structure consists of three basic layers: (1) an input layer of input neurons, where external information (the independent variables in statistics) is received; (2) one or more hidden layers that perform the internal processing on the information received from the input layer; and (3) an output layer of output neurons, where information is transmitted outside of the neural network (the dependent variable in statistics).
These layers are fully interconnected with each other. That is, each neuron in the input layer is connected to every neuron in the hidden layer, and each neuron in the hidden layer is connected to every neuron in the output layer. Each connection has an associated weight, which determines the strength of one neuron's influence on another. Each weight may be either positive or negative. Positive weights indicate reinforcement and negative weights are associated with inhibition (Irwin et al., 1995). As shown in Figure 3.3, predictions are generated by the information flow from the input layer via the processing layer (i.e., hidden layer) to the output layer.
Figure 3.3: Basic neuron model
[Diagram: inputs X1, X2, …, Xi feed neuron j, which computes the sum hj and the output Oj = g(hj).]
Source: Based on Irwin et al. (1995), p. 3; Brockett et al. (1994); and Udo (1993), modified by the author
Figure 3.3 shows a neural network structure with inputs (X1, X2, …, Xi) connected to neuron j with weights (W1j, W2j, …, Wij) on each connection. After multiplying each input signal by its associated weight, the neuron sums all of the weighted input signals. The resulting net input (hj) passes through a transfer (activation) function, g(hj), which is normally non-linear, to produce the final output Oj.
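The computation performed by neuron j can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used by any particular software package: the function names, the input values and the weights are hypothetical, the logistic function is assumed as one possible choice for g, and no bias term is included because none appears in Figure 3.3.

```python
import math


def logistic(h):
    """Logistic (sigmoid) function, assumed here as the transfer function g."""
    return 1.0 / (1.0 + math.exp(-h))


def neuron_output(inputs, weights, transfer=logistic):
    """Return (h_j, O_j) for a single neuron j.

    h_j is the weighted sum of the inputs and O_j = g(h_j).
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Multiply each input signal by its weight, then add the results.
    h_j = sum(x * w for x, w in zip(inputs, weights))
    return h_j, transfer(h_j)


# Hypothetical inputs X1..X3 and weights W1j..W3j, for illustration only.
x = [1.0, 0.5, -1.0]
w = [0.4, -0.2, 0.1]
h_j, o_j = neuron_output(x, w)
print(h_j, o_j)
```

A positive weight (0.4) raises the net input and a negative weight (-0.2) lowers it, which mirrors the reinforcement and inhibition roles described for the connection weights.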
3.3.6.3 Multilayer perceptron
PASW® Modeler 14 offers two types of neural networks: the multilayer perceptron (MLP) and the radial basis function network. In this thesis, MLP is employed because of the categorical nature of the dependent variable. MLP is one of the most frequently used neural network models: it is applied in approximately 95% of the reported neural network business application studies, mainly for prediction, classification and modelling (Wong et al., 1997). MLP is used to solve problems that involve learning the relationship between a set of inputs and a known output. It is a feed-forward neural network with up to two hidden layers, and a supervised learning network that permits weights to be learned from experience, based on empirical observations of the object of interest (Rumelhart et al., 1986; Salchenberger et al., 1992). That is, a non-linear function can be approximated by training the supervised network on the given input-output pairs.
An MLP network is a function of one or more independent variables whose weights are chosen to minimise the prediction error of the dependent variable. Training an MLP involves minimising an error function based on the generalised delta rule using a back-propagation algorithm (SPSS, Inc., 2010). Back-propagation is the most popular neural network training algorithm; it calculates the gradient of the network, that is, the first derivatives of the error function with respect to each network weight (Fausett, 1994; Lee et al., 2005; Patterson, 1996).
The calculation of neural network weights is known as the training process. In the training process, the input data are fed forward through the network to generate a prediction from the output layer. The network compares the predicted output to the actual output and calculates the error. To improve the overall predictive accuracy, the difference between the actual output and the predicted output (i.e., the error) is propagated backward through the network to adjust and update the connection weights. This process is repeated until either the error function is sufficiently close to zero or the default number of iterations is reached.
Figure 3.4 shows an example of MLP feed-forward architecture. The architecture consists of three main layers: (1) the input layer, which contains one neuron for each input variable (Xi); (2) the last layer, the output layer, which has one neuron (Y); and (3) the interior layer(s), called the middle or hidden layer(s), which have three neurons in this architecture (Hj). The flow of data is from left to right, with the inputs (Xi) passed through the connecting weights to the hidden layer of neurons and subsequently to the output layer.
Figure 3.4: MLP feed-forward architecture (one hidden layer)
[Diagram: input layer (X1, X2, X3, …, Xi) connected by weights Wij to the hidden layer (Hj), which is connected by weights Wj to the output layer (Y).]
Source: Modified by the author from Erbas and Stefanou (2009), Fletcher and Goss (1993), Lee et al. (2005), Limsombunchai et al. (2005), Sermpinis et al. (2012), Smith and Gupta (2000) and Udo (1993).
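The training process described above (feed forward, measure the error, propagate it backward, update the weights, repeat) can be sketched for an architecture like the one in Figure 3.4. This is a minimal illustration under stated assumptions, not the algorithm as implemented in PASW Modeler: it assumes logistic transfer functions, a squared-error function, per-observation gradient-descent updates and no bias terms. The function names, learning rate, stopping tolerance and the two-observation data set are hypothetical.

```python
import math
import random


def logistic(h):
    return 1.0 / (1.0 + math.exp(-h))


def forward(x, w_ih, w_ho):
    """Feed the inputs forward; return the hidden activations and the output."""
    hidden = [logistic(sum(w_ih[i][j] * x[i] for i in range(len(x))))
              for j in range(len(w_ho))]
    y = logistic(sum(w_ho[j] * hidden[j] for j in range(len(w_ho))))
    return hidden, y


def train(data, n_hidden=3, rate=0.5, max_iter=2000, tol=1e-3, seed=0):
    """Train a one-hidden-layer MLP by back-propagation (illustrative only)."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    # w_ih[i][j]: input node i -> hidden node j; w_ho[j]: hidden node j -> output.
    w_ih = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
            for _ in range(n_in)]
    w_ho = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    history = []  # squared error accumulated over each pass through the data
    for _ in range(max_iter):
        total_error = 0.0
        for x, target in data:
            hidden, y = forward(x, w_ih, w_ho)
            err = target - y
            total_error += 0.5 * err * err
            # Error signal at the output neuron (uses the logistic derivative).
            delta_o = err * y * (1.0 - y)
            # Error signals propagated back to the hidden neurons,
            # computed before the hidden-to-output weights are changed.
            delta_h = [delta_o * w_ho[j] * hidden[j] * (1.0 - hidden[j])
                       for j in range(n_hidden)]
            for j in range(n_hidden):
                w_ho[j] += rate * delta_o * hidden[j]
                for i in range(n_in):
                    w_ih[i][j] += rate * delta_h[j] * x[i]
        history.append(total_error)
        if total_error < tol:  # error sufficiently close to zero
            break
    return w_ih, w_ho, history


# Hypothetical data: two observations with a binary dependent variable.
data = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)]
w_ih, w_ho, history = train(data)
_, y_a = forward([1.0, 0.0], w_ih, w_ho)
_, y_b = forward([0.0, 1.0], w_ih, w_ho)
print(history[0], history[-1], y_a, y_b)
```

The loop stops on whichever condition is met first, the error tolerance or the iteration limit, which corresponds to the two stopping rules mentioned in the text.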
The MLP feed-forward function for one hidden layer is given by:
$$Y = F\left[\sum_{j} W_j \, F\left(\sum_{i} W_{ij} X_i\right)\right] \qquad (10)$$
where $Y$ = the output of the network; the outer $F$ = the logistic (sigmoid) transfer function, $F(x) = 1/(1 + e^{-x})$, for the output layer; $W_j$ = the connection weights from the hidden layer (node j) to the output layer; the inner $F$ = the logistic transfer function for the hidden layer; $W_{ij}$ = the connection weights from the input layer (node i) to the hidden layer (node j); and $X_i$ = the input variable for node i (Brown and Mues, 2012; Erbas and Stefanou, 2009; Limsombunchai et al., 2005; Salchenberger et al., 1992).
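The feed-forward function for one hidden layer can be evaluated directly in code. The following is a minimal sketch, assuming the logistic function at both layers and no bias terms; the names `mlp_forward`, `w_ij` and `w_j` and all numeric values are hypothetical and chosen only so that the result is easy to check by hand.

```python
import math


def logistic(x):
    """Logistic (sigmoid) transfer function F."""
    return 1.0 / (1.0 + math.exp(-x))


def mlp_forward(x, w_ij, w_j):
    """Evaluate Y = F[ sum_j W_j * F( sum_i W_ij * X_i ) ].

    x    : inputs X_i
    w_ij : w_ij[i][j] is the weight from input node i to hidden node j
    w_j  : w_j[j] is the weight from hidden node j to the output node
    """
    hidden = []
    for j in range(len(w_j)):
        net_j = sum(w_ij[i][j] * x[i] for i in range(len(x)))  # inner sum over i
        hidden.append(logistic(net_j))                         # inner F
    net_out = sum(w_j[j] * hidden[j] for j in range(len(w_j)))  # outer sum over j
    return logistic(net_out)                                    # outer F


# Hypothetical check: with all input-to-hidden weights at zero, every hidden
# neuron outputs F(0) = 0.5, so the output net input is 0.5 + 0.5 - 1.0 = 0.
x = [0.2, -0.4, 0.9]
w_ij = [[0.0, 0.0, 0.0] for _ in range(3)]
w_j = [1.0, 1.0, -2.0]
y = mlp_forward(x, w_ij, w_j)
print(y)  # 0.5
```

Because the logistic function maps any real number into the interval (0, 1), the output Y can be read as a score for a binary dependent variable.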