

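Let:

$f(\boldsymbol{\beta}) = \dfrac{\partial l(\boldsymbol{\beta})}{\partial \beta_k}$;

$f'(\boldsymbol{\beta}) = \dfrac{\partial^2 l(\boldsymbol{\beta})}{\partial \beta_k \, \partial \beta_{k'}}$ (i.e. the second-order partial derivatives of Eq. 3 with respect to each $\beta_k$);

Initiate:

Generate $\boldsymbol{\beta}^{(j)} = \{\beta_k^{(j)}\}$ for $j = 0$ by guessing initial values for each $\beta_k$;

Repeat until convergence over $j = 0, 1, 2, \ldots$:

Set $\boldsymbol{\beta}^{(j+1)} = \boldsymbol{\beta}^{(j)} - \left[f'(\boldsymbol{\beta}^{(j)})\right]^{-1} f(\boldsymbol{\beta}^{(j)})$;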
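The iteration can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Eq. 1.1 is taken to be the logistic function and Eq. 3 the Bernoulli log-likelihood (neither is reproduced in this section), the model has one intercept and one slope so that the 2x2 second-derivative matrix can be inverted by hand, and the function names and toy data are hypothetical. The step used is the standard Newton-Raphson one, which subtracts the inverse second-derivative matrix times the gradient.

```python
import math

def sigmoid(z):
    # Logistic function, assumed here to correspond to Eq. 1.1.
    return 1.0 / (1.0 + math.exp(-z))

def newton_raphson_logistic(xs, ys, tol=1e-10, max_iter=50):
    """Fit P(y=1|x) = sigmoid(b0 + b1*x) by Newton-Raphson (hypothetical helper)."""
    b0, b1 = 0.0, 0.0  # guessed initial values
    for _ in range(max_iter):
        g0 = g1 = 0.0           # gradient f(beta)
        h00 = h01 = h11 = 0.0   # second-derivative matrix f'(beta)
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            g0 += y - p
            g1 += (y - p) * x
            w = p * (1.0 - p)
            h00 -= w
            h01 -= w * x
            h11 -= w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: [f'(beta)]^{-1} f(beta), using the closed-form 2x2 inverse.
        s0 = (h11 * g0 - h01 * g1) / det
        s1 = (-h01 * g0 + h00 * g1) / det
        b0, b1 = b0 - s0, b1 - s1
        if max(abs(s0), abs(s1)) < tol:
            break
    return b0, b1

# Toy data in which the two classes overlap, so a finite estimate exists.
b0, b1 = newton_raphson_logistic([0, 1, 2, 3, 4, 5], [0, 0, 1, 0, 1, 1])
```

The classes in the toy data overlap deliberately: with perfectly separable data the likelihood has no finite maximum and the iteration would not converge.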

Making class predictions is relatively simple once the coefficient vector 𝜷 is estimated. Given an input observation, x, the predicted probability that y = 1 for the observation is calculated using Eq. 1.1.

The final step is to decide on a threshold value for 𝑃(𝑦 = 1|𝑥), at or above which the output is predicted as 1. A popular choice is 0.5, i.e. the prediction is 𝑦 = 1 if 𝑃(𝑦 = 1|𝑥) ≥ 0.5. This threshold favours neither class and therefore provides a natural starting point for testing model performance.
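As a minimal sketch of this prediction step, assuming Eq. 1.1 is the logistic function applied to the linear combination of the inputs (the function names and coefficient values below are hypothetical):

```python
import math

def predict_proba(beta, x):
    # Assumed form of Eq. 1.1: logistic function of the linear predictor.
    z = sum(b * xi for b, xi in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-z))

def predict_class(beta, x, threshold=0.5):
    # Predict 1 when the estimated probability reaches the threshold.
    return 1 if predict_proba(beta, x) >= threshold else 0

beta = [-2.0, 1.0]         # illustrative coefficients: intercept and one slope
observation = [1.0, 3.0]   # leading 1.0 multiplies the intercept
label = predict_class(beta, observation)
```

Raising or lowering `threshold` trades false positives against false negatives, which is why 0.5 is treated only as a starting point.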

3.2.4.2 Artificial Neural Networks

An Artificial Neural Network (ANN) is a biologically inspired ML model that mimics the way in which the brain processes information. The brain contains numerous basic information-processing cells called neurons. Each neuron is connected to a large number of other neurons by synapses and axons. In very basic terms, a neuron receives information, in the form of impulses, from other neurons via synapses. The impulses are combined in the neuron, and if the combination excites (activates) the neuron, it in turn sends an impulse through its axon to the neurons connected to it [40], [41]. This structure is illustrated in Figure 11 [40].

Figure 11: Illustration of two connected neurons, Adapted from [40]

In 1943 a mathematical model of how neurons process information was proposed, known as the McCulloch-Pitts neuron [37]. The model is illustrated in Figure 12 [37], which shows a single neuron connected to M other neurons that supply binary inputs x1, x2, x3, . . ., xM. The strength of an input is determined by the conductivity of the connecting synapse, which is represented by the weights w1, w2, w3, . . ., wM. The neuron aggregates the inputs multiplied by their weights and passes the result to a step function, referred to as the activation function, with threshold theta. If the result is greater than theta the neuron fires, producing an output of 1; otherwise it produces an output of 0 [37].

Figure 12: McCulloch-Pitts neuron model, adapted from [37]
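The neuron described above can be sketched as a weighted sum followed by a step function. The function name and the AND-gate weights are illustrative choices, not taken from [37]:

```python
def mcculloch_pitts(inputs, weights, theta):
    # Aggregate the weighted binary inputs, then apply the step activation.
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > theta else 0

# With unit weights and theta = 1.5 the neuron behaves like a logical AND:
# it fires only when both inputs are 1.
and_outputs = [mcculloch_pitts([a, b], [1, 1], 1.5) for a in (0, 1) for b in (0, 1)]
```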

Clearly a single neuron like the one illustrated in Figure 12 is not capable of modelling complex functions. It can be shown that a set of such neurons connected in parallel is only capable of modelling linear discriminants (lines, planes or hyperplanes, depending on the number of inputs), which severely limits their utility. However, ANNs that are capable of modelling complex functions are created by connecting layers of neurons together; in other words, the output of one layer of neurons serves as the input of a subsequent layer. In this way, the input variables are transformed into increasingly complex features that the neural network can use to make complex decisions. In fact, the universal approximation theorem states that an ANN with a single hidden layer containing a finite number of hidden neurons and a suitable activation function can approximate any continuous function on a bounded input domain to arbitrary accuracy [37].
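The effect of layering can be illustrated with the exclusive-or (XOR) function, which no single threshold neuron can represent because its two classes are not linearly separable. The sketch below uses hand-picked, illustrative weights and thresholds, and hypothetical function names: two hidden neurons compute OR and AND, and an output neuron combines them.

```python
def step_neuron(inputs, weights, theta):
    # Threshold neuron: fires when the weighted sum exceeds theta.
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > theta else 0

def xor_network(x1, x2):
    # Hidden layer: one neuron detects "at least one input on" (OR),
    # the other detects "both inputs on" (AND).
    h_or = step_neuron([x1, x2], [1, 1], 0.5)
    h_and = step_neuron([x1, x2], [1, 1], 1.5)
    # Output layer: fires for OR-but-not-AND, which is XOR.
    return step_neuron([h_or, h_and], [1, -1], 0.5)

truth_table = {(a, b): xor_network(a, b) for a in (0, 1) for b in (0, 1)}
```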

The architecture of an ANN, as well as the choice of activation function and training algorithm, plays a crucial role in determining its performance [41]. A practically inexhaustible number of combinations of these configurations has been proposed in the literature. Therefore, this review will focus on a very popular ANN schema, namely a fully connected, feedforward ANN with one input layer, one hidden layer and one output layer. The input and hidden layers will each have one bias neuron. The activation function of the hidden and output layer neurons will be the logistic function (Eq. 1.1), and the back propagation algorithm will be used to train the model. Such an ANN is illustrated in Figure 13. This ANN schema was selected because various authors in the field of ML refer to it when discussing ANNs for classification purposes, which is particularly relevant to the focal case study of this project [37], [38], [40], [41]. Furthermore, this relatively simple ANN schema includes the various elements of an ANN model, which means that its selection does not limit the detail with which this model type can be discussed.

An ANN model is trained by adjusting the weights connecting the neurons. The challenge is to determine what effect a change in one of the weights in the first layer will have on the prediction of the output nodes and, moreover, how the weights must be adjusted to improve the accuracy of the model’s predictions. The back propagation algorithm provides a means by which the derivative of the prediction error can be obtained with respect to any weight in the network. Once this derivative is calculated, gradient descent can be used to adjust the weights in a way that reduces the error.

Figure 13: Feedforward ANN
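A single back propagation and gradient descent update for a network of this shape might look as follows. This is a sketch under assumptions that the text does not fix: the error is taken to be the squared error 0.5(output - y)^2, the hidden bias node is placed last, and the function names, learning rate and weight values are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w1, w2):
    # x: M inputs (including the bias input 1); w1: M x H; w2: H + 1 weights.
    H = len(w2) - 1
    hidden = [sigmoid(sum(x[m] * w1[m][h] for m in range(len(x)))) for h in range(H)]
    hidden_b = hidden + [1.0]  # append the hidden-layer bias node
    out = sigmoid(sum(hidden_b[j] * w2[j] for j in range(H + 1)))
    return hidden_b, out

def backprop_gradients(x, y, w1, w2):
    # Derivatives of the assumed error 0.5 * (out - y)**2 w.r.t. every weight.
    hidden_b, out = forward(x, w1, w2)
    H = len(w2) - 1
    delta_out = (out - y) * out * (1.0 - out)
    grad_w2 = [delta_out * hidden_b[j] for j in range(H + 1)]
    # Propagate the output error back through w2 to each hidden neuron.
    delta_hidden = [delta_out * w2[h] * hidden_b[h] * (1.0 - hidden_b[h]) for h in range(H)]
    grad_w1 = [[delta_hidden[h] * x[m] for h in range(H)] for m in range(len(x))]
    return grad_w1, grad_w2

def gradient_descent_step(x, y, w1, w2, lr=0.5):
    # Move every weight a small step against its error derivative.
    grad_w1, grad_w2 = backprop_gradients(x, y, w1, w2)
    new_w1 = [[w1[m][h] - lr * grad_w1[m][h] for h in range(len(w1[m]))] for m in range(len(w1))]
    new_w2 = [w2[j] - lr * grad_w2[j] for j in range(len(w2))]
    return new_w1, new_w2

x, y = [1.0, 0.5, -1.0], 1.0                     # one observation (bias input first)
w1 = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.2]]      # M = 3, H = 2, guessed values
w2 = [0.2, -0.3, 0.1]                            # H + 1 = 3, guessed values
w1, w2 = gradient_descent_step(x, y, w1, w2)
```

Repeating the step over all observations, for many passes, is what training amounts to in this schema; a different error function would change only the expression for `delta_out`.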

Some notation must be clarified to describe how an ANN like the one depicted in Figure 13 generates predictions for input values, and how the model is trained using the back propagation algorithm. Consider an N×M matrix, x, made up of N input observations, each with M-1 features and one bias feature with value 1. Then, let y be a vector of length N containing the observed binary output value for each of the N inputs. The dimensions of the network are as follows. The input layer consists of M nodes, one for each of the M input features (including the bias feature). The hidden layer contains H nodes and one bias node. The output layer consists of a single node, which produces a value between 0 and 1 that is rounded off to produce a binary output. Let w(1) be the M×H weight matrix connecting the input layer to the hidden layer, and let w(2) be the (H+1)×1 weight matrix connecting the complete hidden layer to the output layer. Both weight matrices are initialised with guessed values.
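To make these dimensions concrete, the following sketch passes every row of x through a network with an M×H matrix w(1) and an (H+1)×1 matrix w(2). It assumes logistic activations in both layers, places the hidden bias node last, and interprets "rounded off" as predicting 1 when the output is at least 0.5; the function names and weight values are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def ann_predict(x, w1, w2):
    """Return (output, binary label) for each of the N rows of x."""
    M, H = len(w1), len(w1[0])
    results = []
    for row in x:
        # Hidden layer: H logistic neurons fed by all M inputs.
        hidden = [sigmoid(sum(row[m] * w1[m][h] for m in range(M))) for h in range(H)]
        hidden.append(1.0)  # hidden bias node, giving H + 1 values
        # Output layer: a single logistic neuron.
        out = sigmoid(sum(hidden[j] * w2[j] for j in range(H + 1)))
        results.append((out, 1 if out >= 0.5 else 0))
    return results

# N = 2 observations, M = 3 (two features plus a bias feature of 1), H = 2.
x = [[0.2, 0.7, 1.0], [0.9, 0.1, 1.0]]
w1 = [[0.5, -0.4], [0.3, 0.8], [0.1, -0.2]]  # M x H, guessed values
w2 = [1.2, -0.7, 0.05]                        # H + 1 weights, guessed values
predictions = ann_predict(x, w1, w2)
```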

The ANN produces predictions according to the following algorithm:

Algorithm 2: ANN Prediction Generation
