If tK >= x1 then
    Display message i1
Else if tK >= x2 then
    Display message i2
.
.
Else if tK >= xn then
    Display message in
End if
UpdateSession:
    tK ← tK + 1
3. Send pre-processed update to neural network
4. If update requirement is satisfied
       Then Stop
       Else Goto 2
(1997), “the reason for avoiding target values of 0 and 1 is that sigmoid units cannot produce these output values given finite weights. If we attempt to train the network to fit target values of exactly 0 and 1, gradient descent will force the weights to grow without bound. On the other hand, values of 0.1 and 0.9 are achievable using a sigmoid unit with finite weights.” The same procedure was then repeated for each hidden node and for all hidden layers. The net effect of all these procedures is then passed through an activation function at the output nodes and thus represents the ANN solution for the fed samples. This output will most likely deviate from the target solution because the interconnection weights are randomly initialized, so the probability of obtaining a converged solution in the first iteration alone is extremely small.
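The point about target values can be checked numerically. The following is a minimal sketch, not part of the implemented system; the function names are illustrative assumptions. It computes the net input a sigmoid unit would need in order to output a given target, showing that 0.9 needs only a modest net input while targets approaching 1 need ever larger ones.

```python
import math

def sigmoid(z):
    """Logistic activation: maps any finite net input into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def required_net_input(target):
    """Inverse sigmoid (logit): the net input needed for the unit to output `target`."""
    if not 0.0 < target < 1.0:
        raise ValueError("outputs of exactly 0 or 1 need an unbounded net input")
    return math.log(target / (1.0 - target))

# The required net input grows without bound as the target approaches 1.
for target in (0.9, 0.99, 0.999999):
    z = required_net_input(target)
    print(f"target={target}: net input={z:.3f}, check sigmoid={sigmoid(z):.6f}")
```

Since the net input is a weighted sum of bounded activations, a net input that grows without bound implies weights that grow without bound, which is the behaviour the quotation warns against.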
However, in the backward sweep, the difference between the targeted output and the most recent output of the ANN is used to adjust the interconnection weights. Repeated forward and backward sweeps will eventually allow the solution output by the FEBPLA of the ANN to converge to, or agree with, the target value within an allowable tolerance, which can be prespecified. Although the process described above may appear opaque at first glance, given the long execution time and the many epochs or iterations needed to ensure convergence, it is transparent, and neural network methods are not “black box” models. As pointed out by Bhadeshia (1999), neural networks are sometimes incorrectly described as “opaque” systems. A lot depends on the computational ability, strength and efficiency of the host computer in handling huge data sets, and on the analysis provided by the implementing models to complete the convergence in record time (Zhou, Chawla, Jin & Williams, 2015). Topping et al. (1998) suggested either single-pattern training (the most commonly used method) or batch training for weight adaptation, with the latter more appropriate for parallel processing. Since ANNs are opaque to direct validation using other methods of data mining and are increasingly being used in artificial intelligence for modeling complex systems that involve global optimization, care must be taken that errors during model design and uncertainty in human inputs and the environment are eliminated as much as possible (Gopal, 2009; Jakeman, Voinov, Rizzoli & Chen, 2008).
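The forward and backward sweeps described above can be sketched for a single training pattern as follows. This is a simplified illustration under stated assumptions, not the FEBPLA implementation: it omits bias terms, uses one hidden layer, and the layer sizes, learning rate, tolerance and sample values are all hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Forward sweep: inputs -> hidden activations -> output activations."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
    return hidden, output

def backward(x, target, hidden, output, w_hidden, w_out, eta):
    """Backward sweep: propagate the output error and adjust both weight layers in place."""
    delta_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, output)]
    delta_hid = [
        h * (1.0 - h) * sum(delta_out[k] * w_out[k][j] for k in range(len(w_out)))
        for j, h in enumerate(hidden)
    ]
    for k, row in enumerate(w_out):
        for j in range(len(row)):
            row[j] += eta * delta_out[k] * hidden[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] += eta * delta_hid[j] * x[i]

def train(x, target, n_hidden=3, eta=0.5, tolerance=1e-4, max_epochs=10000, seed=0):
    """Repeat both sweeps until the squared error falls within the prespecified tolerance."""
    rng = random.Random(seed)
    w_hidden = [[rng.random() for _ in x] for _ in range(n_hidden)]
    w_out = [[rng.random() for _ in range(n_hidden)] for _ in target]
    errors = []
    for _ in range(max_epochs):
        hidden, output = forward(x, w_hidden, w_out)
        error = 0.5 * sum((t - o) ** 2 for t, o in zip(target, output))
        errors.append(error)
        if error < tolerance:
            break
        backward(x, target, hidden, output, w_hidden, w_out, eta)
    return output, errors

# Hypothetical pattern with targets of 0.9 and 0.1 rather than 1 and 0.
output, errors = train(x=[1.0, 0.0, 1.0], target=[0.9, 0.1])
print(f"epochs={len(errors)}, first error={errors[0]:.4f}, last error={errors[-1]:.6f}")
```

Here the weights are updated after every presentation of the pattern, which corresponds to single-pattern training; a batch variant would accumulate the weight changes over all patterns before applying them.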
The superiority of ANNs over other popular methods of AI, such as knowledge-based systems (KBS), lies in the ANN's ability to learn from examples. This is similar to a human learning from experience and through trial and error. The basic difference between ANNs and KBSs is that ANNs do not require a knowledge base or its equivalent during problem solving. ANNs only require a number of solved problems (input and output sets) in order to train the network and produce a validated set of weights that can quickly be used to sort new data. However, KBSs and ANNs can be combined so as to take advantage of their respective strengths in solving specific problems (Krishnamoorthy & Rajeev, 1996).
This reason motivated the use of ANN modeling over knowledge-based or case-based reasoning (CBR) in this work.
A summary of the various aspects of implementing an ANN is given in Table 3.2.
Table 3.2 Analysis of the effects of various values of design parameters on ANN training convergence and generalization.
| Design parameter | Too high or too large | Too low or too small |
|---|---|---|
| Number of hidden nodes (NHN) | Overfitting ANN (no generalization) | Underfitting (ANN unable to obtain the underlying rules embedded in the data) |
| Learning rate (η) | Unstable ANN (weights) that oscillates about the optimal solution | Slow training but stable optimal solution convergence |
| Momentum coefficient (µ) | Reduces risk of local minima. Speeds up training. Increased risk of overshooting the solution | Suppressed effect of momentum leading to increased risk of potential entrapment in local minima. Slows training |
| Number of training cycles | Good recalling ANN (i.e., ANN memorization of data) and bad generalization to untrained data | Produces ANN that is incapable of representing the data |
| Size of training subset (NTRN) | ANN with good recalling and generalization | ANN unable to fully explain the problem and also generalize |
| Size of test subset (NTST) | Ability to confirm ANN generalization capability | Inadequate confirmation of ANN generalization capability |
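The learning rate and momentum rows of Table 3.2 can be illustrated on a toy problem. The sketch below is an assumption-laden example and not the training routine used in this work: it minimizes the one-dimensional error E(w) = w², whose optimum is w = 0, and the values chosen for η and µ are hypothetical.

```python
def descend(eta, mu=0.0, w0=1.0, steps=50):
    """Minimize E(w) = w**2 by gradient descent with momentum; return the weight path."""
    w, velocity = w0, 0.0
    path = [w]
    for _ in range(steps):
        grad = 2.0 * w                         # dE/dw for E(w) = w**2
        velocity = mu * velocity - eta * grad  # momentum reuses part of the previous step
        w += velocity
        path.append(w)
    return path

settings = [
    ("eta too high", 1.1, 0.0),       # each step overshoots: w flips sign and grows
    ("eta too low", 0.001, 0.0),      # stable but barely moves in 50 steps
    ("moderate eta", 0.1, 0.0),
    ("moderate eta + momentum", 0.1, 0.5),
]
for label, eta, mu in settings:
    final = descend(eta, mu)[-1]
    print(f"{label:>24}: eta={eta}, mu={mu}, |w| after 50 steps = {abs(final):.3e}")
```

With η = 1.1 the weight oscillates about the optimum with growing amplitude, with η = 0.001 it is still close to its starting value after 50 steps, and adding a momentum coefficient of 0.5 to η = 0.1 reaches the optimum faster than η = 0.1 alone.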
Figure 3.16 shows the connection patterns for the inputs, weights and outputs of the ANN implemented for this work. The ANN is fully connected, meaning that every node in a given layer is connected through a weight (wij) to every node in the next layer, although not to other nodes in the same layer. At initialization, these weights are randomly set to values between 0 and 1. The network contains thirty input units in the input layer, three nodes in the hidden layer, where weighting is implemented, and five output nodes, which correspond to the outputs of the fuzzy inferred diseases. For a full illustration of the designed ANN with all its weights, nodes and bias, see Figure 5.1 in Appendix A.
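The described topology can be sketched as follows. This is an illustrative reconstruction, not the implemented network: the layer sizes and the (0, 1) weight range come from the text, while the sigmoid activation on both layers, the treatment of the bias as an extra weight on a constant input of 1.0, the random seed and the placeholder inputs are assumptions.

```python
import math
import random

N_INPUT, N_HIDDEN, N_OUTPUT = 30, 3, 5  # layer sizes stated in the text

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def random_weight(rng):
    """Draw from the open interval (0, 1); random() alone could return exactly 0."""
    value = rng.random()
    while value == 0.0:
        value = rng.random()
    return value

def init_layer(n_nodes, n_inputs, rng):
    """One weight row per receiving node; the extra last entry is that node's bias weight."""
    return [[random_weight(rng) for _ in range(n_inputs + 1)] for _ in range(n_nodes)]

def layer_output(values, weights):
    """Weighted sum of the previous layer plus bias (constant input 1.0), then sigmoid."""
    extended = values + [1.0]
    return [sigmoid(sum(w * v for w, v in zip(row, extended))) for row in weights]

rng = random.Random(42)  # fixed seed only so the sketch is reproducible
w_hidden = init_layer(N_HIDDEN, N_INPUT, rng)
w_output = init_layer(N_OUTPUT, N_HIDDEN, rng)

inputs = [rng.random() for _ in range(N_INPUT)]  # placeholder for normalized inputs
outputs = layer_output(layer_output(inputs, w_hidden), w_output)

n_connections = N_INPUT * N_HIDDEN + N_HIDDEN * N_OUTPUT
print(f"{n_connections} connection weights, {N_HIDDEN + N_OUTPUT} bias weights")
print("untrained outputs:", [round(o, 3) for o in outputs])
```

Full connectivity between adjacent layers gives 30 × 3 + 3 × 5 = 105 connection weights, plus one bias weight per hidden and output node under the assumption made here.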
Figure 3.16: Connection patterns for input neurons, bias and output for the ANN
A converging value of 0.9 on any of the outputs indicates the presence of the corresponding disease: Malaria, Influenza flu, EVD, Lassa fever or Marburg.
Further analysis can help show the effect of new symptom(s) on the present diagnosis and the similarity with already existing and diagnosed conditions. This is done by means of the k-nearest neighbor algorithm after normalization of the outputs, thus allowing for classification of new disease strains or new symptoms.
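A k-nearest neighbor classification over normalized output vectors can be sketched as below. This is a generic illustration of the technique, not the routine used in this work: the Euclidean distance, the value of k, the majority vote and all sample vectors are hypothetical assumptions.

```python
import math
from collections import Counter

def knn_classify(query, labeled_vectors, k=3):
    """Return the majority label among the k stored vectors closest to `query`."""
    ranked = sorted(labeled_vectors, key=lambda item: math.dist(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical normalized ANN outputs for already diagnosed cases.
known_cases = [
    ([0.90, 0.10, 0.10, 0.10, 0.10], "Malaria"),
    ([0.85, 0.15, 0.10, 0.10, 0.10], "Malaria"),
    ([0.10, 0.90, 0.10, 0.10, 0.10], "Influenza flu"),
    ([0.15, 0.88, 0.10, 0.12, 0.10], "Influenza flu"),
    ([0.10, 0.10, 0.90, 0.10, 0.10], "EVD"),
]

# Hypothetical output vector obtained after new symptoms were supplied.
new_case = [0.20, 0.80, 0.15, 0.10, 0.10]
print(knn_classify(new_case, known_cases, k=3))
```

A new output vector that lies far from every stored case would be a candidate for a new disease strain; deciding how far is far enough needs a distance threshold, which this sketch does not attempt to set.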
Table 3.3 shows a sample of the normalized initial inputs for training the ANN for a case of Influenza flu, i.e. without new symptoms.
The GUI for the FEBPLA, showing preliminary execution of the ANN, is shown in Figure 3.17. An adjustable learning rate of 0.1 was initially chosen, with randomized weight values strictly between 0 and 1, i.e. 0 < W < 1.
[Figure 3.17: GUI for the FEBPLA. Panel labels: Inputs from fuzzy processing; Outputs to converge to; Hidden layer; Results; Optimally trained outputs]
Though the ANN cannot in practice converge exactly to the values of the taught or learned outputs, it continues to fine-tune its solution, moving closer to the actual outputs in each iteration, until an optimal solution is attained; beyond that point, further iterations overshoot optimality and produce errors. The resulting optimal convergence point is critical to the generalization of the ANN to new data. New symptom inputs, as shown in Table 3.4, can be fed to the designed ANN and their effect on the present diagnosis investigated.
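One common way to locate such an optimal point is to monitor the error at each iteration and stop once it has stopped improving. The sketch below assumes this approach for illustration only; the source does not state how the optimum was detected, and the `patience` parameter and the error values are hypothetical.

```python
def best_epoch(errors, patience=3):
    """Scan per-epoch errors; stop after `patience` consecutive epochs without improvement."""
    best_index, best_error, waited = None, float("inf"), 0
    for epoch, error in enumerate(errors):
        if error < best_error:
            best_index, best_error, waited = epoch, error, 0
        else:
            waited += 1
            if waited >= patience:
                break  # the error has been rising: training has passed the optimum
    return best_index, best_error

# Hypothetical error curve: improves up to epoch 4, then overshoots.
errors = [0.40, 0.22, 0.11, 0.06, 0.04, 0.05, 0.07, 0.10, 0.03]
print(best_epoch(errors))  # -> (4, 0.04)
```

In practice the weights recorded at the best epoch would be kept as the trained network. Measuring the error on data not used for training would tie the stopping point more directly to generalization, though that too is an assumption here.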
Table 3.4 A sample of the normalized initial inputs for training the ANN for a case of Influenza flu with the last three inputs supplied (i.e. inputs 28 to 30).