
Figure 5.2 An example of a multilayer perceptron (MLP). The bias parameters in the first layer are shown as weights from an extra input having a fixed value of $x_{n+1} = 1$. Similarly, the bias parameters in the second layer are shown as weights from an extra hidden unit with activation again fixed at 1.

The activation of hidden unit $j$ is then obtained by transforming the linear sum in Eq. [5.5] using an activation function $g(\cdot)$ to give

$z_j = g(a_j)$ [5.6]

In the case of MLP networks, the most usual choice for the activation function is that of a logistic sigmoid function:

$g(a) = \frac{1}{1 + e^{-a}}$ [5.7]

which has the advantage, compared to the step function, that it is continuously differentiable.
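
As a minimal sketch of the logistic sigmoid in Eq. [5.7] and of the differentiability it provides, the following Python functions may help. The names `logistic_sigmoid` and `logistic_sigmoid_derivative` are illustrative assumptions, not taken from the text; the derivative identity $g'(a) = g(a)(1 - g(a))$ is a standard property of this function.

```python
import math


def logistic_sigmoid(a: float) -> float:
    """Logistic sigmoid g(a) = 1 / (1 + exp(-a)).

    The two branches are algebraically identical; splitting on the sign of a
    only avoids calling exp() on a large positive argument.
    """
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def logistic_sigmoid_derivative(a: float) -> float:
    """Derivative g'(a) = g(a) * (1 - g(a)), which exists for every real a."""
    g = logistic_sigmoid(a)
    return g * (1.0 - g)
```

Unlike a step function, whose derivative is zero everywhere except at the jump, this derivative is non-zero for moderate values of $a$, which is what gradient-based training relies on.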

The outputs of the network are obtained by transforming the activations of the hidden units using a second layer of processing elements. Thus, for each output unit $k$, a linear combination of the outputs of the hidden units can be constructed as:

$a_k = \sum_{j=1}^{M} w_{kj}^{(2)} z_j + w_{k,M+1}^{(2)}$ [5.8]

Chapter 5_______________________________________________________ Vascular Image Analysis

The activation of the $k$th output unit is then obtained by transforming this linear combination using a non-linear activation function, to give:

$y_k = \tilde{g}(a_k)$ [5.9]

Here we have used the notation $\tilde{g}(\cdot)$ for the activation function of the output units to emphasise that this need not be the same function as used for the hidden units.

If the bias is absorbed into the weights, and Eq. [5.5] is combined with [5.6], [5.8] and [5.9], an explicit expression is obtained for the complete function represented by the network architecture in Fig. 5.2 in the form:

$y_k = \tilde{g}\left( \sum_{j=1}^{M+1} w_{kj}^{(2)} \, g\left( \sum_{i=1}^{n+1} w_{ji}^{(1)} x_i \right) \right)$ [5.10]

Equation [5.10] represents the transformation of the input variables by two successive single-layer networks. It is clear that this class of networks can be extended by considering further successive transformations of the same general kind, corresponding to networks with extra layers of weights.
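
The two-stage transformation in Eq. [5.10] can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `mlp_forward`, the row-per-unit weight layout, the convention that the last weight in each row is the absorbed bias, and the identity function as default output activation are all choices made here, not details given in the text.

```python
import math


def sigmoid(a):
    """Logistic sigmoid used for the hidden units."""
    return 1.0 / (1.0 + math.exp(-a))


def mlp_forward(x, w1, w2, g=sigmoid, g_out=lambda a: a):
    """Forward pass of a two-layer perceptron with biases absorbed into the weights.

    x     : list of n input values
    w1    : M rows of n + 1 first-layer weights (last entry of each row is the bias)
    w2    : K rows of M + 1 second-layer weights (last entry of each row is the bias)
    g     : hidden-unit activation function
    g_out : output-unit activation function (identity by default, an assumption)
    """
    x_ext = list(x) + [1.0]  # extra input fixed at 1 carries the first-layer bias
    z = [g(sum(w * xi for w, xi in zip(row, x_ext))) for row in w1]
    z_ext = z + [1.0]        # extra hidden unit fixed at 1 carries the second-layer bias
    return [g_out(sum(w * zj for w, zj in zip(row, z_ext))) for row in w2]
```

Adding a further layer of weights would amount to repeating the same "extend with 1, weighted sum, activation" step once more on the outputs.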

5.4 The Kohonen Network

Most networks of practical interest are trained by presentation of cases for which the network outputs are known. For example, the MLP calls for supervised training, where for each sample input contained in the training set the desired outputs are also known. The inputs and outputs are both presented to the network, and the network learns to associate them using special supervised training algorithms such as backpropagation and conjugate gradient (more details on supervised training techniques can be found in [Bishop, 1995]). However, it is also possible to use neural networks to discover patterns in data without relying on other information. This is mainly practical in applications for which "correct" outputs are not known, or when the learning process needs to be assessed. There are also other applications where it is difficult to judge which input corresponds to the correct output, and the network is used to provide an objective solution. The process by which a NN can discover the correspondence between input vectors and output classes without prior knowledge (i.e. without being provided beforehand with a sample containing the correct outputs) is called unsupervised training.

One of the most famous NN models geared toward unsupervised training is the Kohonen network [Kohonen, 1982], named after its inventor, Teuvo Kohonen. It relies on a type of learning called competitive learning. In most other network models, all neurons adjust their weights in response to a training presentation. In competitive learning, the neurons compete for the privilege of learning. Only one, or at most a few, neurons are allowed to adjust their weights in response to a presentation.

The Kohonen network is essentially a two-layer network. Because special normalization is required, some people call it a three-layer network, with the layer following the input layer being a normalization layer. However, in NN theory the normalization step is usually considered nothing more than input processing, not deserving designation as a dedicated layer.

There are $M$ neurons in the neural layer and each has a parametric weight vector $v^{(m)}$ of dimension $N$: $v^{(m)} = (v_1^{(m)}, v_2^{(m)}, \ldots, v_N^{(m)})$, where $m = 1, \ldots, M$. $N$ is also the dimension of the input vectors $x^{(q)}$, where $q = 1, \ldots, Q$ and $Q$ is the total number of input vectors. Figure 5.3 displays a Kohonen network, which is also called a self-organising feature map (SOFM).

Each of the Kohonen neurons operates in a very simple manner. Its output is equal to its net input. In other words, the output of a neuron is a weighted sum of its inputs. There is neither an activation function nor a bias term applied. In particular, the output is computed as:

$y_m = \sum_{n=1}^{N} v_n^{(m)} x_n^{(q)}$ [5.11]

A Kohonen network is virtually always used as a classifier. After the weights in the above equation are computed by training, an unknown case is presented to the network. All output activations are found. The output neuron having maximum activation is considered the winner, thus determining the class to which the case belongs. In contrast to the MLP, the Kohonen network has no hidden layer, and it is strictly linear in response. Most practical networks of this type have a very large number of output neurons, each of which focuses on a single pattern. Finally, as will be shown in the next section, the utility of the Kohonen network lies mainly in its fairly rapid training and its ease of interpretation.

Figure 5.3 The Kohonen network.
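
The classification step described above (normalise the input, compute every neuron's weighted sum, pick the maximum) might be sketched in Python as below. The function names `normalise`, `kohonen_outputs` and `classify` are hypothetical, and the sketch assumes that normalization simply means scaling to unit Euclidean length and that the weight vectors passed in are already trained.

```python
import math


def normalise(vec):
    """Scale a vector to unit Euclidean length (assumed form of the normalization step)."""
    norm = math.sqrt(sum(c * c for c in vec))
    if norm == 0.0:
        raise ValueError("cannot normalise a zero vector")
    return [c / norm for c in vec]


def kohonen_outputs(x, weights):
    """Output of each neuron: the weighted sum of its inputs, with no bias or activation."""
    return [sum(v_n * x_n for v_n, x_n in zip(v, x)) for v in weights]


def classify(x, weights):
    """Return the index of the neuron with maximum activation for input x."""
    outputs = kohonen_outputs(normalise(x), weights)
    return max(range(len(outputs)), key=outputs.__getitem__)
```

For example, with two neurons whose weight vectors are `[1.0, 0.0]` and `[0.0, 1.0]`, the call `classify([0.2, 0.9], weights)` selects the second neuron (index 1), because the input points mostly along the second axis.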

5.4.1 Training the Kohonen Network

The Kohonen network is trained in a manner quite different from most other networks. It employs competitive learning [Masters, 1993]. For each training presentation, output neurons compete with each other, and only the winner (and its neighbours in some variations) is allowed to learn. The form of competition is very straightforward. Each output neuron is connected to the normalised inputs by a vector of $N$ weights. These weight vectors are normalised to unit length, as are the inputs. Specifically, one input vector $x^{(q)}$ is selected from the sample and put into the network, and the squared distances between $x^{(q)}$ and each $v^{(m)}$, $m = 1, \ldots, M$, are computed by:


$D_{qm} = \lVert x^{(q)} - v^{(m)} \rVert^2 = \sum_{n=1}^{N} \left( x_n^{(q)} - v_n^{(m)} \right)^2$ [5.12]

The minimum distance is then determined to obtain the neuron $m^*$ that is the winner over the other neurons. The winner neuron's weights are updated in such a way that it will react to this particular presentation even more strongly next time, thus strengthening its winning position. Eventually, the weight vectors will converge to a point of stability, after which further training presentations do not significantly change them. Once this point is reached, the training process is deemed complete. New cases can then be presented to the network, and an unknown vector is classified by finding the maximally activated neuron.
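
The competition step can be sketched in Python as a search for the smallest squared distance of Eq. [5.12]. The names `squared_distance` and `find_winner` are illustrative assumptions. Note that when both the input and the weight vectors have unit length, $\lVert x - v \rVert^2 = 2 - 2\,x \cdot v$, so the neuron with minimum distance is also the one with maximum activation, which is why the same winner is found at classification time.

```python
def squared_distance(x, v):
    """Squared Euclidean distance between an input vector and a weight vector."""
    return sum((x_n - v_n) ** 2 for x_n, v_n in zip(x, v))


def find_winner(x, weights):
    """Return the index m* of the neuron whose weight vector is closest to x."""
    distances = [squared_distance(x, v) for v in weights]
    return min(range(len(distances)), key=distances.__getitem__)
```

If two neurons are exactly equidistant, this sketch returns the one with the lower index; the text does not specify a tie-breaking rule.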

5.4.2 Updating the Weights

The weights of the Kohonen network are updated according to one of two popular methods. In the winner-take-all strategy, a fraction of the difference between the input vector and the winning neuron's weight vector is added to that weight vector according to the updating rule:

$v^{(m^*)} = v^{(m^*)} + \alpha \left( x^{(q)} - v^{(m^*)} \right)$ [5.13]

As can easily be concluded from the previous formula, the weight vector of the winner is pushed slightly in the direction of the data vector, whereas all other neurons keep their previous values. The term $\alpha$ is the step gain (or learning rate). If $\alpha$ is large, learning occurs quickly, but if it is too large learning can become unstable and errors may even increase. It must always be much less than 1, typically 0.4. In some applications very good results are obtained by decreasing it slowly as training progresses. However, this is not as important when corrections are accumulated across the entire training set as it is when weights are updated for each case (e.g. backpropagation training).
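
A single winner-take-all update following Eq. [5.13] might look like the Python sketch below. The function name `winner_take_all_step` and the in-place list-of-lists weight layout are assumptions made for illustration; the default `alpha=0.4` is the typical value quoted above. The sketch updates the weights after each case and does not renormalise the winner afterwards, which a full implementation may need to do to keep the weight vectors at unit length.

```python
def winner_take_all_step(x, weights, alpha=0.4):
    """Apply one winner-take-all update in place and return the winner's index.

    x       : input vector (list of N floats)
    weights : list of M weight vectors, each a list of N floats
    alpha   : step gain (learning rate)
    """
    # Competition: the winner is the neuron with the smallest squared distance to x.
    distances = [sum((x_n - v_n) ** 2 for x_n, v_n in zip(x, v)) for v in weights]
    winner = min(range(len(weights)), key=distances.__getitem__)
    # Learning: move only the winner's weights a fraction alpha towards x.
    weights[winner] = [v_n + alpha * (x_n - v_n)
                       for v_n, x_n in zip(weights[winner], x)]
    return winner
```

With `alpha=1.0` the winner would jump all the way onto the input vector; smaller values make each presentation shift it only part of the way, which is the gradual strengthening described above.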
