
*Corresponding author.

E-mail address:[email protected] (G. Joya).

Hopfield neural networks for optimization: study of the different dynamics

G. Joya*, M.A. Atencia, F. Sandoval

Departamento de Tecnología Electrónica, E.T.S.I. Telecomunicación, Universidad de Málaga, Campus de Teatinos, 29071 Málaga, Spain

Departamento de Lenguajes y Ciencias de la Computación, E.T.S.I. Informática, Universidad de Málaga, Campus de Teatinos, 29071 Málaga, Spain

Received 20 December 1998; accepted 30 October 2000

Abstract

In this paper the application of arbitrary order Hopfield-like neural networks to optimization problems is studied. These networks are classified in three categories according to their dynamics, and the energy function of each category is made explicit. The main problems affecting practical applications of these networks are brought to light: (a) incoherence between the network dynamics and the associated energy function; (b) error due to the discrete simulation on a digital computer of the continuous dynamics equations; (c) existence of local minima; (d) dependence of convergence on the coefficients weighting the cost function terms. The effect of these problems on each network is analysed and simulated, and possible solutions are indicated. Finally, the so-called continuous dynamics II is dealt with, proving that the integral term in the energy function is bounded, in contrast with Hopfield's statement, and proposing an efficient local minima avoidance strategy. Experimental results are obtained by solving Diophantine equation, Hamiltonian cycle and k-colorability problems. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: High-order Hopfield neural networks; Energy function; Optimization problem; Discrete and continuous dynamics; Local minima escape algorithms

1. Introduction

The neural paradigm initially proposed by Hopfield as an associative memory [6,7], either in its original version (first order Hopfield networks [8]) or in its high order generalized version [18,17,14,11], has since been widely used for the solution of optimization problems. The definition of a network within this paradigm implies fixing two key characteristics which allow it to be used to solve optimization problems: its activation dynamics and an associated energy function which decreases as the network spontaneously evolves. The applied methodology may be summarized in the following way [5]: given an optimization problem, find the cost function that describes it, and design a Hopfield neural network whose energy function reaches its minima at the same points as the cost function, so that the stable configurations of the network correspond to solutions of the problem.

0925-2312/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0925-2312(01)00337-X

This method faces some application problems that limit both the networks that can be used and the results that may be obtained. We think that these problems must be studied, either analytically or experimentally, to avoid the lack of rigour in the application of Hopfield networks that is present in many published papers on this subject, which do not explicitly take these limitations into account. These problems may be grouped into four kinds:

(a) Many described applications do not establish a coherent correspondence between the network dynamics and the energy function associated with that network. Using a network of analog neurons with continuous dynamics and associating with it an energy function that corresponds to a network of discrete neurons is a common situation [8,22].

(b) The energy function is forced to decrease only if the network evolves according to its dynamics equations. If these equations are continuous (differential equations), they cannot be strictly represented by means of a computer simulation. In other words, the simulation implies the discretization of these equations (difference equations) [21], so that the bigger the simulation step, the more the real network behaviour differs from the theoretical one.

(c) The energy function of a Hopfield network has many local minima. Consequently, the network will probably reach an equilibrium state that does not correspond to a solution of the problem. The search for evolution strategies that move the network out of local minima and take it to a global minimum is an important task in this field.

(d) The convergence toward a minimum that is a solution of the optimization problem is strongly conditioned, in the case of networks with discrete dynamics, by the value of the coefficients that weight the different terms of the cost function. As far as we know, there is no general analytical methodology to find adequate values for these coefficients; instead, this search is carried out for a particular problem and for a concrete type of neuron [15] or a concrete kind of constraints [4].

In this paper we perform a detailed study of problems (a)-(c) and we experimentally show that problem (d) does not affect networks with continuous dynamics II when an evolution strategy is used. Thus, in reference to the first problem, we explicitly classify the existing dynamics as well as the energy that corresponds to each one. For the second problem, we obtain experimental results concerning the effect on the network evolution of the discretization due to the computer simulation. In reference to the third problem, we study the mathematical properties of the different terms appearing in the energy function originally described in [7], which will be called here sums term and integral term, and we prove that the integral term is bounded, in contradiction to Hopfield's statements; this study provides us with a new strategy for network evolution, based upon a nonheuristic criterion. Our experimental results show that both the traditional and the new evolution strategies make the continuous Hopfield neural network largely independent of the value of the coefficients that weight the terms of the cost function (fourth problem).

Our study is restricted to those optimization problems that fulfill the following requirements (most optimization problems solved with the Hopfield paradigm fulfill them [8,12,10,15-17,19]):

(1) The solution may be expressed by a set of n variables $x = (x_1, x_2, \ldots, x_n)$, each taking one of two discrete values. That is, $x_i \in \{-1, 1\}\ \forall i$ or $x_i \in \{0, 1\}\ \forall i$.

(2) The cost function may be described by an arbitrary order multinomial expression in the variables x:

$$f(x) = \sum_{j=0}^{r} \; \sum_{(i_1 i_2 \ldots i_{j+1}) \in C_n^{j+1}} a_{i_1 i_2 \ldots i_{j+1}}\, x_{i_1} x_{i_2} \cdots x_{i_{j+1}} + C. \qquad (1)$$

These conditions force the following properties of the neural networks:

(1) As the variables $x_i$ must be discrete, neurons must have a discrete activation function ($s_i \in \{-1, 1\}\ \forall i$ or $s_i \in \{0, 1\}\ \forall i$) or a continuous activation function ($s_i \in [-1, 1]\ \forall i$), provided that the equilibrium state is reached at one of the extremes of the interval.

(2) The cost function has no term raised to a power greater than one in each $x_i$, because $x_i^k = x_i$ if $x_i \in \{0, 1\}$, or $x_i^k = 1$ (even k) and $x_i^k = x_i$ (odd k) if $x_i \in \{-1, 1\}$. Consequently, the network has no self connections. This approximation can also be used for networks with continuous neurons, assuming that, in this case, the energy function will only coincide with the cost function when the neurons reach their extreme values.

(3) Each term of the cost function contains a different combination of variables, so that the symmetrical connections in the associated neural network are the same at every order.
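Property (2) can be illustrated with a short sketch. The representation used below (a dictionary mapping index tuples to coefficients) and the helper names `reduce_powers`, `reduce_terms` and `cost` are assumptions made only for this illustration and are not part of the original formulation. The sketch collapses repeated variables in each term and checks that the cost function keeps the same value at every vertex of the bipolar hypercube.

```python
from collections import Counter
from itertools import product


def reduce_powers(indices, bipolar=True):
    """Collapse repeated indices of one term using x^k = x or x^k = 1.

    bipolar=True  : x in {-1, 1}, so even powers vanish and odd powers give x.
    bipolar=False : x in {0, 1}, so every power k >= 1 gives x.
    """
    counts = Counter(indices)
    if bipolar:
        kept = [i for i, k in counts.items() if k % 2 == 1]
    else:
        kept = list(counts)
    return tuple(sorted(kept))


def reduce_terms(terms, bipolar=True):
    """Rewrite a cost function so that no variable appears twice in a term."""
    reduced = Counter()
    for indices, coeff in terms.items():
        reduced[reduce_powers(indices, bipolar)] += coeff
    return dict(reduced)


def cost(terms, x):
    """Evaluate f(x) as the sum of coeff * product of x[i]; the key () is the constant C."""
    total = 0.0
    for indices, coeff in terms.items():
        value = coeff
        for i in indices:
            value *= x[i]
        total += value
    return total


# Hypothetical example: f(x) = 2*x0^2*x1 - x1*x2 + 3*x2^2
raw = {(0, 0, 1): 2.0, (1, 2): -1.0, (2, 2): 3.0}
reduced = reduce_terms(raw, bipolar=True)
same_on_vertices = all(
    cost(raw, x) == cost(reduced, x) for x in product([-1, 1], repeat=3)
)
```

In this example the squared factors disappear, so the reduced function has no repeated index in any term, which is what allows a network without self connections to represent it.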

The rest of this paper has the following structure: in Sections 2-4 we present three possible dynamics of the Hopfield network evolution (discrete, continuous I, continuous II) whose associated energy function has been proved to exist. In each case we study its behaviour with respect to the problems mentioned above. For continuous dynamics II (presented by Hopfield in [7]), we show that the integral term is bounded for any value of $s_i$ and we study the impossibility of a neuron with a sigmoid-like activation function to reach its extreme values (vertices, edges or sides of the hypercube in the configuration space). For this task, we analyse and compare the variation [...] results to design a new evolution strategy for the avoidance of local minima. In Section 6 the main conclusions are summarized. To obtain experimental results, three different problems have been solved with Hopfield neural networks. The first one is the diophantine equation problem (DEP) [3,10]. This problem is represented by a monolithic cost function, in the sense that it does not admit different values for the coefficients weighting each one of its terms. The second one is the Hamiltonian cycle problem (HCP), which is represented here by a cost function that is a particular case of that obtained for the travelling salesman problem (TSP) [5]. The third one is the k-colorability problem (kCP) [19], where an example map with 25 regions, which can be coloured with a minimum of 5 colours [20], is solved. In the second and third cases, the robustness of the traditional and new evolution strategies against the variation of the weighting coefficients is shown. These problems are schematically described in the appendices.

2. Hopfield networks with discrete activation function and discrete dynamics

In this section we study the high order generalization of the network presented in [6] with an asynchronous activation dynamics, independent for each neuron, represented by Eqs. (2) and (3). It has the energy function shown in Eq. (4).

$$s_i(t) = \mathrm{sgn}(u_i(t)), \qquad (2)$$

$$u_i = \sum_{j=1}^{q} \; \sum_{\substack{(i_1 i_2 \ldots i_j) \in C_n^j \\ i \notin (i_1 i_2 \ldots i_j)}} T_{i i_1 i_2 \ldots i_j}\, s_{i_1} s_{i_2} \cdots s_{i_j} - I_i, \qquad (3)$$

$$E(s) = -\sum_{j=1}^{q} \sum_{i} \sum_{\substack{(i_1 i_2 \ldots i_j) \in C_n^j \\ i \notin (i_1 i_2 \ldots i_j)}} T_{i i_1 i_2 \ldots i_j}\, s_i s_{i_1} s_{i_2} \cdots s_{i_j} + \sum_{i} I_i s_i + C, \qquad (4)$$

where n is the number of neurons, q is the order of the network (q = 1 is the original Hopfield network), $T_{i i_1 i_2 \ldots i_j}$ is the weight of the jth order connection from neurons $i_1, i_2, \ldots, i_j$ to neuron i, $I_i$ is the bias of neuron i, $u_i$ is the input potential to neuron i, and $s_i$ is the state of neuron i, which is a step-like function of $u_i$.

This kind of network is not affected by problems (a) and (b), because the energy function is exactly the same as the function f(x) and the discrete dynamics may be easily simulated on a computer, by means of a random activation of the network neurons. However, problem (c) affects it severely, because the state configurations of these networks are restricted, in the state space, to the vertices of a hypercube with edges of length 1 and centre $(\tfrac{1}{2}, \tfrac{1}{2}, \ldots, \tfrac{1}{2})$ in the binary case, or with edges of length 2 and centre $(0, 0, \ldots, 0)$ in the bipolar case. Consequently, if a network with n neurons is in a particular state s (associated with vertex $v_s$), it can only evolve towards a configuration s' associated with one of the n neighbour vertices of $v_s$. Applying this dynamics to optimization problems produces very poor results for the following reasons:

• High probability of evolution to local minima. A network in a state s may only evolve if one of its n neighbour states has a lower energy. Otherwise, the network does not evolve even when there are other, farther states with lower energy.

• The evolution is too dependent on the sequence of activated neurons. The smallest possible movement in the state space is the distance between two neighbour vertices. Thus, starting from the same initial state, two possible final states may be very different, after several activations, depending on the sequence of activated neurons. Fig. 1a shows this fact using a trivial network with three neurons, that is, a three-dimensional state space: if the network starts from state A, it will arrive at state B or C depending on which neuron is activated first. Fig. 1b shows all possible evolution trajectories of this network (descending direction) starting from a particular state.

Fig. 1. (a) Representation of possible states for a Hopfield neural network with three neurons. Numbers indicate the energy associated with each state. If the network starts from state A, it will arrive at state B or C depending on which neuron is activated first. (b) Possible evolution trajectories of this network starting from any particular state (descending direction).

This kind of dynamics has been used, as will be shown later, to solve our example problems, and the simulation results are shown in Tables 2-4.
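A minimal sketch of this asynchronous dynamics for the first order case (q = 1) is given below. The function names, the random example weights and the 1/2 factor in the energy (the usual first order normalization, which compensates for each symmetric pair being counted twice) are assumptions of this sketch, not a description of the simulator used by the authors.

```python
import random


def potential(T, I, s, i):
    """Input potential u_i = sum_{j != i} T[i][j] * s[j] - I[i] (first order case)."""
    return sum(T[i][j] * s[j] for j in range(len(s)) if j != i) - I[i]


def energy(T, I, s):
    """First order energy: -1/2 * sum_{i != j} T_ij s_i s_j + sum_i I_i s_i."""
    n = len(s)
    pairs = sum(T[i][j] * s[i] * s[j] for i in range(n) for j in range(n) if i != j)
    return -0.5 * pairs + sum(I[i] * s[i] for i in range(n))


def run_discrete(T, I, s0, rng, max_sweeps=100):
    """Activate neurons one at a time, in random order, until no neuron changes."""
    s = list(s0)
    for _ in range(max_sweeps):
        order = list(range(len(s)))
        rng.shuffle(order)
        changed = False
        for i in order:
            u = potential(T, I, s, i)
            new = s[i] if u == 0 else (1 if u > 0 else -1)  # keep the state when u_i = 0
            if new != s[i]:
                s[i] = new
                changed = True
        if not changed:
            break
    return s


# Hypothetical 6-neuron network with symmetric weights and no self connections.
rng = random.Random(0)
n = 6
T = [[0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        T[i][j] = T[j][i] = rng.choice([-2, -1, 1, 2])
I = [rng.choice([-1, 0, 1]) for _ in range(n)]
s0 = [rng.choice([-1, 1]) for _ in range(n)]
s_final = run_discrete(T, I, s0, rng)
```

The final state is only guaranteed to be stable with respect to single-neuron changes, that is, it is a minimum among its n neighbour vertices, which is exactly the local-minimum limitation discussed above.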

3. Hopfield networks with continuous activation function and continuous dynamics I

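An arbitrary order Hopfield feedback neural network with continuous activation function $g(u_i)$ (g being a sigmoid or hyperbolic tangent function) whose evolution dynamics is described by Eqs. (5) and (6) has the energy function shown in Eq. (7):

$$s_i(t) = g(u_i(t)), \qquad (5)$$

$$\frac{du_i}{dt} = \sum_{j=1}^{q} \; \sum_{\substack{(i_1 i_2 \ldots i_j) \in C_n^j \\ i \notin (i_1 i_2 \ldots i_j)}} T_{i i_1 i_2 \ldots i_j}\, s_{i_1} s_{i_2} \cdots s_{i_j} - I_i, \qquad (6)$$

$$E(s) = -\sum_{j=1}^{q} \sum_{i} \sum_{\substack{(i_1 i_2 \ldots i_j) \in C_n^j \\ i \notin (i_1 i_2 \ldots i_j)}} T_{i i_1 i_2 \ldots i_j}\, s_i s_{i_1} s_{i_2} \cdots s_{i_j} + \sum_{i} I_i s_i + C. \qquad (7)$$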

Table 1
Simulation results for a Hopfield network with continuous dynamics I

Δt      Average number of simulation steps    Percentage of correct solutions (global minima) (%)    Average number of simulation steps for correct solutions
1       191                                   6                                                      1983
0.1     6856                                  40                                                     13,198
0.01    51,389                                100                                                    51,389

The proof for a first order network is found in [1] and the high order generalization is immediate.

The imposed condition of no self connections (condition 2 for the cost function, as seen in Eq. (1) in Section 1) implies a linear dependence of the energy function on every $s_i$. In other words, the partial derivative of the energy with respect to each $s_i$ does not depend on $s_i$. Thus, minima of the energy function are reached only at the extremes of the definition interval of $s_i$ ($\{-1, 1\}$ or $\{0, 1\}$). This fact allows for the application of this kind of network to optimization problems as described in Section 1, because there are no local minima in the interior of the hypercube of states.
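The argument can be written out in one line. As a sketch (the symbols $A_i$ and $B_i$ are introduced here only for illustration and do not appear in the original), with no self connections the energy, viewed as a function of a single $s_i \in [-1, 1]$ with all other states held fixed, is affine:

```latex
E(s) = A_i + B_i\, s_i ,
\qquad
\frac{\partial E}{\partial s_i} = B_i ,
\qquad
\min_{s_i \in [-1,1]} E = A_i - |B_i| ,
```

where $A_i$ and $B_i$ depend only on the states $s_k$ with $k \neq i$. Whenever $B_i \neq 0$ the minimum along that coordinate is attained at $s_i = -\mathrm{sgn}(B_i)$, that is, on a face of the hypercube and never strictly inside it.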

Problem (a) does not appear here because the expression of the energy for this dynamics is the same as in the discrete case, so that f(x) (the cost function) may be directly associated with E(s). For problem (b), the proof of the existence of an energy function is valid for a continuous dynamics, but the simulation of these networks by computer implies a numerical integration of Eq. (6); that is, the network evolves according to a discrete dynamics that is not guaranteed to produce a decrease of the energy function expressed in (7). Thus, the discrete dynamics is described by

$$\frac{\Delta u_i}{\Delta t} = \sum_{j=1}^{q} \; \sum_{\substack{(i_1 i_2 \ldots i_j) \in C_n^j \\ i \notin (i_1 i_2 \ldots i_j)}} T_{i i_1 i_2 \ldots i_j}\, s_{i_1} s_{i_2} \cdots s_{i_j} - I_i, \qquad (8)$$

so that the input potential to neuron i at simulation step k + 1 is defined with respect to the potential at step k as follows:

$$u_i(k+1) = u_i(k) + \Delta t \left( \sum_{j=1}^{q} \; \sum_{\substack{(i_1 i_2 \ldots i_j) \in C_n^j \\ i \notin (i_1 i_2 \ldots i_j)}} T_{i i_1 i_2 \ldots i_j}\, s_{i_1} s_{i_2} \cdots s_{i_j} - I_i \right). \qquad (9)$$

The greater $\Delta t$, the farther the simulated evolution trajectory of the network from the theoretical trajectory according to Eq. (6) and the greater the probability of falling into a local minimum (problem (c)). A network with this dynamics has been simulated to solve the diophantine equation problem with several values of $\Delta t$. Table 1 shows, for each $\Delta t$, the average number of simulation steps, the percentage of correct solutions (global minima) and the average number of steps for the simulations that reached a global minimum.

It can be seen that the frequency of falling into local minima increases with $\Delta t$, while the number of simulation steps decreases as $\Delta t$ grows. Moreover, the longer the trajectory of the network evolution, the better the solution reached, statistically speaking.
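The discretized update of Eq. (9) can be sketched as follows for a first order network with g = tanh. The function names and the two-neuron example are hypothetical and serve only to show where the step size enters; this is not the environment used for the reported simulations.

```python
import math


def euler_step(T, I, u, dt):
    """One step of Eq. (9) for a first order network: u(k+1) = u(k) + dt * (T s - I)."""
    n = len(u)
    s = [math.tanh(ui) for ui in u]  # Eq. (5) with g = tanh
    return [
        u[i] + dt * (sum(T[i][j] * s[j] for j in range(n) if j != i) - I[i])
        for i in range(n)
    ]


def simulate(T, I, u0, dt, steps):
    """Iterate Eq. (9) and return the list of states s(k) = tanh(u(k))."""
    u = list(u0)
    trajectory = [[math.tanh(ui) for ui in u]]
    for _ in range(steps):
        u = euler_step(T, I, u, dt)
        trajectory.append([math.tanh(ui) for ui in u])
    return trajectory


# Hypothetical two-neuron example.
T = [[0.0, 1.0], [1.0, 0.0]]
I = [0.0, 0.0]
u1 = euler_step(T, I, [0.5, -0.5], dt=0.1)
traj = simulate(T, I, [0.5, -0.5], dt=0.1, steps=50)
```

A smaller `dt` follows the continuous trajectory of Eq. (6) more closely at the price of more steps, which is the trade-off reported in Table 1.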

4. Hopfield neural network with continuous activation function and continuous dynamics II

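An arbitrary order Hopfield feedback neural network with continuous activation function $g(u_i/\beta)$ (g being a sigmoid or hyperbolic tangent function), the evolution dynamics of which is described by Eqs. (10) and (11), has the energy function shown in Eq. (12):

$$s_i(t) = g\!\left(\frac{u_i(t)}{\beta}\right), \qquad (10)$$

$$\frac{du_i}{dt} = -u_i + \sum_{j=1}^{q} \; \sum_{\substack{(i_1 i_2 \ldots i_j) \in C_n^j \\ i \notin (i_1 i_2 \ldots i_j)}} T_{i i_1 i_2 \ldots i_j}\, s_{i_1} s_{i_2} \cdots s_{i_j} - I_i, \qquad (11)$$

$$E(s) = -\sum_{j=1}^{q} \sum_{i} \sum_{\substack{(i_1 i_2 \ldots i_j) \in C_n^j \\ i \notin (i_1 i_2 \ldots i_j)}} T_{i i_1 i_2 \ldots i_j}\, s_i s_{i_1} s_{i_2} \cdots s_{i_j} + \sum_{i} I_i s_i + \sum_{i} \int_0^{s_i} g^{-1}(s)\, ds + C. \qquad (12)$$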

The proof for a first order network is found in [7]. The same proof process can be used to find the high order generalization because Eq. (12) has no variable $s_i$ raised to a power greater than one. The parameter that describes the slope of the activation function in [7] is $\lambda = 1/\beta$. In this paper, the inverse parameter, $\beta$, is used because the application of this dynamics to the optimization problems described in Section 1 requires the evolution of the activation function towards a step-like function, which is obtained by moving $\lambda$ towards $\infty$ or $\beta$ towards 0. The second case is easier to simulate on a computer.

Due to the existence of the integral term, the energy function cannot be equated to the cost function f(x). This dynamics is often applied without considering this fact, that is, removing the integral term (problem (a)). This is an erroneous procedure, because in this case the minima of the two functions are no longer the same. Only in the case $\beta = 0$ ($\lambda = \infty$) are the functions E and f the same, but this is the discrete neuron case described in Section 2.

Unlike the dynamics described in Section 3, the integral term causes the existence of stable states (minima of the energy function) of the network at points inside the hypercube in the state space. Moreover, Hopfield states that this kind of network cannot reach a stable state at the vertices, edges or sides of the hypercube. This fact, which in principle would make these networks useless for solving optimization problems like those in Section 1, is presented as a consequence of the integral term increasing without bound when $s_i$ approaches $\pm 1$. Literally: "The integral is zero for $V_i = 0$ and positive otherwise, getting very large as $V_i$ approaches $\pm 1$ because of the slowness with which $g(V)$ approaches its asymptotes" [7].

We think that Hopfield's statement is erroneous, and we show next that, using the hyperbolic tangent function, the integral term remains bounded. (A similar result may be obtained for a sigmoid activation function.)

Theorem 1 (Joya [9]). Given an arbitrary order Hopfield network whose neurons have an activation function $s_i = g(u_i/\beta) = \tanh(u_i/\beta)$, with $\beta > 0$, every integral term in its energy function remains bounded by the value $\beta \ln 2$, that is

$$0 \le \int_0^{s_i} g^{-1}(s)\, ds < \beta \ln 2. \qquad (13)$$

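Proof. The hyperbolic tangent function may be expressed as

$$g(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} = \frac{e^{2x} - 1}{e^{2x} + 1}, \qquad (14)$$

and solving for x, we have

$$x = \tfrac{1}{2}\big(\ln(1 + g(x)) - \ln(1 - g(x))\big), \qquad (15)$$

so the inverse function $g^{-1}$ may be expressed as

$$g^{-1}(x) = \tfrac{1}{2}\big(\ln(1 + x) - \ln(1 - x)\big). \qquad (16)$$

Since $s_i = g(u_i/\beta)$, the inverse of the activation function with respect to $u_i$ carries an additional factor $\beta$. The term $\int \ln(1 + ax)\, dx$ may be integrated by parts by making $u = \ln(1 + ax)$, $v = (1 + ax)/a$, obtaining

$$\int_0^{s_i} g^{-1}(s)\, ds = \frac{\beta}{2}\big[(1 + s_i)\ln(1 + s_i) + (1 - s_i)\ln(1 - s_i)\big]. \qquad (17)$$

This expression is zero when $s_i = 0$; when $s_i$ tends to 1 we have

$$\lim_{s_i \to 1} \int_0^{s_i} g^{-1}(s)\, ds = \frac{\beta}{2}\Big[\lim_{s_i \to 1} (1 + s_i)\ln(1 + s_i) + \lim_{s_i \to 1} (1 - s_i)\ln(1 - s_i)\Big]. \qquad (18)$$

The first limit on the right-hand side of (18) is $2 \ln 2$, and the second may be calculated using L'Hôpital's rule:

$$\lim_{s_i \to 1} (1 - s_i)\ln(1 - s_i) = \lim_{s_i \to 1} \frac{\ln(1 - s_i)}{1/(1 - s_i)} = \lim_{s_i \to 1} -(1 - s_i) = 0. \qquad (19)$$

Hence the limit in (18) equals $\beta \ln 2$. Doing the same when $s_i$ tends to −1, we obtain

$$\lim_{s_i \to -1} \int_0^{s_i} g^{-1}(s)\, ds = \beta \ln 2. \qquad \square \quad (20)$$

Fig. 2. (a) Representation of the integral term obtained from Eq. (17). (b) Representation of the integral term as presented by Hopfield.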

Corollary 2. If $\beta$ is the same for every neuron, the sum $I_\beta$ of all integral terms in E is bounded: $0 \le I_\beta \le n\beta \ln 2$.

Fig. 2 shows the graphic representation of the integral term as presented by Hopfield and the representation obtained from expression (17).
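The bound can also be checked numerically. The sketch below compares the closed form of Eq. (17) with a midpoint-rule estimate of the integral of $\beta\,\mathrm{artanh}(s)$, which is the inverse of $s = \tanh(u/\beta)$ with respect to $u$. The function names and the value $\beta = 0.5$ are choices made for this illustration only.

```python
import math


def integral_term(s, beta):
    """Closed form of Eq. (17): (beta/2) * [(1+s) ln(1+s) + (1-s) ln(1-s)]."""
    def xlogx(x):
        return x * math.log(x) if x > 0.0 else 0.0  # x ln x tends to 0 as x -> 0
    return 0.5 * beta * (xlogx(1.0 + s) + xlogx(1.0 - s))


def integral_numeric(s, beta, steps=10_000):
    """Midpoint-rule estimate of the integral of beta * artanh(x) from 0 to s."""
    h = s / steps
    return sum(beta * math.atanh((k + 0.5) * h) for k in range(steps)) * h


beta = 0.5
closed = integral_term(0.9, beta)
numeric = integral_numeric(0.9, beta)
at_vertex = integral_term(1.0, beta)  # limiting value, beta * ln 2
```

For every $|s_i| < 1$ the value stays strictly below $\beta \ln 2$, and the limiting value at $s_i = \pm 1$ is finite, in agreement with Theorem 1.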

It is worth noting that Hopfield's final statement, the impossibility for an analog neuron to reach its extreme values, is strictly true, because, given any neuron i, there is always a contour in the state space around the extreme value $s_i = 1$ (or $s_i = -1$) where the energy is lower than that of $s_i = 1$ (or $s_i = -1$). This fact can be explained by analysing the behaviour of the different terms that compose the energy function (Eq. (12)): the sums term and the integral term. By construction, the sums term decreases when the output value of a neuron approaches its extreme value (+1 or −1) if this value corresponds to a minimum of the cost function. The decrease of the sums term may be represented, in a general way, as a straight line as in Fig. 3, because the partial derivative of the sums term with respect to $s_i$ does not depend on $s_i$ in the case of null self-connections, or depends linearly on it in the case of non-null self-connections. On the other hand, the function describing the variation of the integral term $I_\beta$ grows asymptotically to infinity when the output value approaches 1 (or −1) (see Fig. 3), because its partial derivative with respect to $s_i$ is $dI_\beta/ds_i = (\beta/2)\ln((1 + s_i)/(1 - s_i))$. In these circumstances, there always exists a cross point of both curves of variation (Fig. 3), beyond which the approach of $s_i$ to 1 (or −1) is impossible, since this would impose a positive variation [...] much as wanted, by scaling the coefficients weighting the sums term. If the distance between the cross point and the extreme value is less than the simulation precision, this extreme value will appear as a final result.

Fig. 3. Graphical representation of the variation of the sums and integral terms with respect to $s_i$. Each straight line represents the function "decreasing of the sums term" with a different coefficient (in the case of null self-weights, one would have horizontal lines). The asymptotic curve represents the function "increasing of the integral term".
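The position of the cross point can be made explicit under a simplifying assumption that is not stated in the original: suppose that, with null self-connections, the sums term decreases towards $s_i = 1$ with a constant slope of magnitude $c > 0$ (the symbol $c$ is introduced here only for illustration). Equating that slope with the derivative of the integral term gives the cross point $s_i^{*}$:

```latex
c = \frac{\beta}{2}\,\ln\frac{1 + s_i^{*}}{1 - s_i^{*}}
\;\Longrightarrow\;
s_i^{*} = \tanh\!\left(\frac{c}{\beta}\right),
\qquad
1 - s_i^{*} = \frac{2}{e^{2c/\beta} + 1}.
```

Under this assumption the gap $1 - s_i^{*}$ is positive for every finite $c/\beta$, but it shrinks exponentially as the coefficients grow or as $\beta$ decreases, so in a finite-precision simulation the neuron can appear to sit exactly at its extreme value.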

Problem (b), produced by time discretization in a computer simulation, also affects this kind of network, but its effect is difficult to quantify due to the existence of local minima inside the hypercube (problem (c)), as mentioned above. The evolution strategies [17,15], which must be used in optimization problems to move the network out of these minima, mask the effect of time discretization. The following section focuses on these strategies and presents a new one that uses the boundedness of the integral term shown above.

5. Evolution strategy in a network with continuous dynamics II

Once the bound of the integral term has been obtained, the way to apply a network with continuous neurons and dynamics II to optimization problems is analysed. In this case, while $\beta > 0$, the function f, which describes the problem and must be minimized, cannot be associated with the whole energy function of the network (Eq. (12)), but with just a part of it: the part that includes the discrete sums term. Consequently, equilibrium states of the network, which are minima of the whole energy function, are not solutions to the problem. Moreover, every $s_i$ of the solution must be discrete, so a continuous network cannot find a solution due to the existence of minima inside the hypercube. The way to avoid these problems and make the network converge to a solution state is to decrease $\beta$ along the simulation, theoretically until $\beta = 0$; at this [...] strategy reminds us of simulated annealing (SA) [13] used in discrete dynamics: both modify the parameter that determines the slope of a sigmoid-like function until it is converted into a step-like function. The difference lies in the fact that the function in SA is the probability of a neuron changing from −1 to 1 (or vice versa) even if the energy increases, while, in our case, the modified function is the neuron activation function itself. In other words, this strategy is actually equivalent to using many different neural networks, one for each $\beta$ used. For any fixed $\beta$, no matter how small, the hyperbolic tangent function $g(u_i/\beta)$ remains bounded. Indeed, when $\beta \approx 0$, although $u_i/\beta$ is large, $g(u_i/\beta)$ is almost insensitive to changes of $u_i$, because $g'(u_i/\beta) \approx 0$. Therefore, each one of these networks, with a particular $\beta$, has its own energy function and, consequently, it is stable.

As far as we know, no algorithm has been described for the variation of $\beta$ along the simulation with a justification other than heuristic. In general, the procedure used may be described as follows: starting from a high enough $\beta$, let the network evolve until a stable state (which is not a solution) is reached, then multiply $\beta$ by a factor less than 1, let the network evolve again up to a new stable state, and so on; the process ends when $\beta$ becomes zero and, at this moment, the stable state reached should be a global minimum of the cost function. As the evolution strategy is meant to decrease $\beta$, the question arises whether a small initial $\beta$ will obtain the same result with a faster convergence. However, a network with small $\beta$ is prone to falling into local minima, because the abrupt energy surface does not allow for the existence of trajectories that escape from local minima. This fact is easily observed in the limit case, $\beta = 0$. Then, the network dynamics becomes discrete, and local minima are a serious problem, as we have already studied in Section 2. To summarize, a small initial $\beta$ is equivalent to a discrete dynamics with an arbitrary initial state, probably far from a solution. However, if the initial $\beta$ is large, when $\beta$ approaches 0 the system becomes equivalent to a discrete network with an initial state that is probably near a solution.
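The traditional procedure described above can be sketched for a first order network as follows. The Euler integration, the stability test based on the size of the last update, the lower limit `beta_min` and all function names are assumptions of this sketch; the source only specifies the overall loop (evolve to a stable state, multiply $\beta$ by a factor below 1, repeat).

```python
import math


def evolve(T, I, u, beta, dt=0.1, tol=1e-6, max_steps=10_000):
    """Euler integration of Eq. (11), du_i/dt = -u_i + sum_j T_ij s_j - I_i,
    with s_j = tanh(u_j / beta), until the largest update is below tol."""
    n = len(u)
    u = list(u)
    for _ in range(max_steps):
        s = [math.tanh(uj / beta) for uj in u]
        du = [
            dt * (-u[i] + sum(T[i][j] * s[j] for j in range(n) if j != i) - I[i])
            for i in range(n)
        ]
        u = [u[i] + du[i] for i in range(n)]
        if max(abs(d) for d in du) < tol:
            break
    return u


def anneal(T, I, u0, beta0=2000.0, factor=0.8, beta_min=1e-3):
    """Traditional strategy: evolve to a stable state, shrink beta, repeat."""
    u, beta = list(u0), beta0
    while beta >= beta_min:
        u = evolve(T, I, u, beta)
        beta *= factor
    # As beta -> 0, tanh(u / beta) becomes a step function of u.
    return [1 if ui >= 0 else -1 for ui in u]


# Hypothetical two-neuron network whose sums term is minimal when s_0 = s_1.
T = [[0.0, 1.0], [1.0, 0.0]]
I = [0.0, 0.0]
state = anneal(T, I, [0.1, 0.2])
```

A stability test based only on the update size can stop at an unstable equilibrium (here the origin, once $\beta < 1$), so a careful simulator would need a more robust criterion; the sketch keeps the simple test to stay close to the description in the text.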

A new evolution strategy is proposed below, based upon the property of the integral term being bounded by the quantity $C_{I\beta} = n\beta \ln 2$ (as shown in Theorem 1). This strategy is oriented to reducing the number of evolution steps, and may be justified by the following reasoning: the cost function is associated just with the discrete sums term of the energy function (Eq. (12)), so the solution to the problem corresponds to a state where this term has a minimum value $E_0$; on the other hand, the integral term of the energy function (Eq. (12)) reaches its maximum value $C_{I\beta}$ at the vertices of the hypercube, i.e., $s_i = \pm 1\ \forall i$; so that, if a solution vertex could be reached for a $\beta \neq 0$, the associated energy would be $E = E_0 + C_{I\beta}$. Then, if for a particular $\beta$ the network evolves until reaching a state of energy value $E_\beta$, which verifies

$$E_\beta < E_0 + C_{I\beta}, \qquad (21)$$

it means that the network is in a state with a smaller energy than that of a vertex; consequently, if the network is now left to evolve, it will move farther from that vertex, i.e., from a possible solution, since the network evolves in the decreasing direction of the energy. For the network to evolve in the direction approaching the vertex, the parameter $\beta$ must be decreased until reaching a new value [...] verifies

$$E_\beta > E_0 + C_{I\beta}. \qquad (22)$$

That is, the state falls out of the hypersphere with centre at $E_0$ and radius $C_{I\beta}$ (see Fig. 4).

Fig. 4. Schematic representation of the state space. Each circle represents the set of states whose energy is between the minimal value $E_0$ and the value corresponding to a vertex, $E_0 + C_{I\beta}$.

To sum up, this strategy does not need to wait for the network to reach a local minimum inside the hypercube to carry out the change of $\beta$; instead, this change is made as soon as the evolution is proved to tend to that minimum.
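The decision rule of the new strategy can be sketched as a small helper. The sketch assumes that the minimum value of the sums term, written `e0` below, is known in advance, and that the simulator supplies the current energy `e_beta` and a stability flag; these inputs and the function names are assumptions made for illustration.

```python
import math


def integral_bound(n, beta):
    """C_{I beta} = n * beta * ln 2, the value of the integral term at a vertex."""
    return n * beta * math.log(2.0)


def below_vertex_energy(e_beta, e0, n, beta):
    """Condition (21): the current energy is lower than that of a solution vertex."""
    return e_beta < e0 + integral_bound(n, beta)


def next_beta(beta, e_beta, e0, n, stable, factor=0.8):
    """Decrease beta when the network is stable or when condition (21) holds."""
    if stable or below_vertex_energy(e_beta, e0, n, beta):
        return beta * factor
    return beta
```

Compared with the traditional strategy, the only change is the extra trigger: $\beta$ is reduced as soon as condition (21) is detected, instead of waiting for a stable state.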

Using the environment for design and simulation of high order artificial neural networks developed in [2], the problems DEP, HCP and kCP have been studied.

5.1. Simulation of the diophantine equation problem

For the diophantine equation problem (see Appendix A), the following experiments have been carried out:

• Simulation of a discrete network. In this case, neurons have a step-like activation function, so that $s_i \in \{-1, 1\}$ (Table 2).

• Simulation of a network with continuous neurons and the traditional evolution strategy. In this case, neurons have $\tanh(u_i/\beta)$ as an activation function, so that $s_i \in [-1, 1]$. Starting from a high enough $\beta$ (2000), the network is left to evolve until it reaches a stable state for each $\beta$. At this moment, $\beta$ is decreased (multiplying it by 0.8), and the evolution starts again. See Table 2.

• Simulation of a network with continuous neurons and the evolution strategy based on the bound of the integral term (new strategy). The network has the same characteristics as in the former case, but now the network is left to evolve either until it reaches a stable state or until the condition in Eq. (21) is fulfilled. Then, $\beta$ is decreased (multiplying it by 0.8), and the evolution starts again. See Table 2.

Table 2
Simulation results for the evolution of a high order Hopfield ANN oriented to solving a diophantine equation

                                                     Discrete network    Continuous network (traditional strategy)    Continuous network (new strategy)
Initial β                                            -                   2000                                         2000
Decreasing factor of β                               -                   0.8                                          0.8
Percentage of correct solutions (global minima)     38%                 100%                                         100%
Average number of simulation steps                   10                  750                                          68

5.2. Simulation of the Hamiltonian cycle problem

For the Hamiltonian cycle problem (see Appendix B) similar experiments have been made. However, this case is more complex, due to the existence of coefficients in the cost function whose values must be assigned to obtain the energy function and, consequently, the connection weights. As mentioned above, there is no general analytical method to obtain these coefficients; they are obtained heuristically or by a problem-specific analysis. The simulation is intended to determine the influence of the choice of coefficients on the success of the different network dynamics. Thus, we repeat the same simulations as for the DEP, studying a discrete network, a continuous network with the traditional strategy and a continuous network with the new strategy, and repeating the simulations with several choices of coefficients. The results are presented in Table 3, which shows in each cell the percentage of correct solutions reached and the average number of simulation steps for each network and particular set of coefficients.

5.3. Simulation of the k-colorability problem

As in the HCP, the different dynamics were applied to a k-colorability problem (see Appendix C). For the purpose of testing the independence of the new evolution strategy with respect to the parameters of the problem, an exhaustive set of coefficient values was selected. For each dynamics and each value of the coefficients, the network evolution was simulated 100 times. The discrete network never reached a correct solution, while the continuous dynamics II, both with the traditional evolution strategy and with the new one, always succeeded in obtaining a correct colouring. The results, shown in Table 4, indicate that the discrete dynamics falls into a local minimum after a few steps (approximately 56), while the new strategy increases the rate of evolution when compared to the traditional strategy.

Table 3
Simulation results for the evolution of a Hopfield ANN oriented to solving the HCP. For each set of cost function coefficients and each dynamics, the percentage of correct solutions and the average number of iterations are shown

Coefficient values               Discrete network    Continuous network (traditional strategy)    Continuous network (new strategy)
G = 1, D = 0.7, V = H = 0.7      0% / 38             100% / 48,843                                100% / 1190
G = 0.7, D = 0.7, V = H = 0.7    0% / 27             100% / 23,355                                100% / 1765
G = 1, D = 0.3, V = H = 0.7      0% / 30             100% / 17,070                                100% / 1475
G = 1, D = 0.7, V = H = 0.6      0% / 32             100% / 30,139                                100% / 611
G = 1, D = 0.7, V = H = 0.4      0% / 30             0% / 50,000                                  0% / 700
G = 0.6, D = 0.7, V = H = 0.7    0% / 31             100% / 20,635                                100% / 1365

Table 4
Average number of iterations up to an energy minimum of a Hopfield ANN oriented to solving the kCP, with 25 regions and five colours. The discrete dynamics never reaches a correct solution

Coefficient values    Discrete network    Continuous network (traditional strategy)    Continuous network (new strategy)
a = 1, b = 1          56                  26,108                                       16,045
a = 1, b = 2          56                  26,544                                       16,008
a = 1, b = 3          57                  26,699                                       16,146
a = 1, b = 4          56                  26,390                                       16,045
a = 1, b = 5          55                  26,247                                       16,008
a = 1, b = 6          56                  26,556                                       16,146
a = 1, b = 7          57                  26,714                                       16,045
a = 1, b = 8          56                  26,364                                       16,008
a = 1, b = 9          56                  26,208                                       16,146
a = 2, b = 1          55                  26,081                                       16,130
a = 3, b = 1          56                  26,794                                       16,297
a = 4, b = 1          57                  26,613                                       16,263
a = 5, b = 1          55                  26,357                                       16,313
a = 6, b = 1          56                  26,081                                       15,965
a = 7, b = 1          56                  26,794                                       16,141
a = 8, b = 1          57                  26,613                                       16,074

From the analysis of Tables 2–4, some conclusions may be drawn:

• The probability of falling into a local minimum is very high in the discrete case.
• The average number of steps is greatly reduced by means of the evolution strategy based upon the bound of the integral term.
• The percentage of correct solutions obtained shows a low dependence on the chosen coefficients, both for the traditional strategy and the new one.
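The reduction in the number of steps can be quantified directly from the tabulated averages. The following is a minimal sketch that only re-uses the iteration counts of Tables 3 and 4; the function names are illustrative and are not part of the original simulations.

```python
# Average iteration counts transcribed from Tables 3 and 4 (continuous network).
# Each pair is (traditional strategy, new strategy).
TABLE3_HCP = [
    (48843, 1190),
    (23355, 1765),
    (17070, 1475),
    (30139, 611),
    (20635, 1365),
]  # the V = H = 0.4 row is left out: neither strategy found a correct solution

TABLE4_KCP = [
    (26108, 16045), (26544, 16008), (26699, 16146), (26390, 16045),
    (26247, 16008), (26556, 16146), (26714, 16045), (26364, 16008),
    (26208, 16146), (26081, 16130), (26794, 16297), (26613, 16263),
    (26357, 16313), (26081, 15965), (26794, 16141), (26613, 16074),
]


def speedups(rows):
    """Per-row ratio of traditional to new-strategy iterations."""
    return [traditional / new for traditional, new in rows]


def mean_ratio(rows):
    """Ratio between the mean iteration counts of the two strategies."""
    traditional_mean = sum(t for t, _ in rows) / len(rows)
    new_mean = sum(n for _, n in rows) / len(rows)
    return traditional_mean / new_mean


if __name__ == "__main__":
    print("HCP speedups per coefficient set:",
          [round(r, 1) for r in speedups(TABLE3_HCP)])
    print("kCP ratio of mean iterations:", round(mean_ratio(TABLE4_KCP), 2))
```

On these figures the gain is more than an order of magnitude for the HCP and considerably smaller for the kCP, which agrees with the qualitative statement above.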

6. Conclusions

This paper has been dedicated to studying Hopfield-like networks when applied to optimization problems. We think that this study was necessary, given the large number of published papers that apply this model to particular problems without explicitly considering the problems that arise when a simulated realization departs from the theoretical principles. A classification of these networks has been carried out according to their activation dynamics, specifying the form of the function that has been demonstrated to be an energy function in each case. This classification is useful to bring to light the problems mentioned, which may be classified in four categories:

(a) Incoherence between the network dynamics and the associated energy function.
(b) Error due to discretization of the continuous dynamics equations caused by simulation on a digital computer.
(c) Existence of local minima.
(d) Correct solutions very dependent on the cost function coefficients.

The way these problems affect each dynamics is analysed by means of simulations of three particular problems: solution of a Diophantine equation problem (DEP), a Hamiltonian cycle problem (HCP), and a k-colorability problem (kCP). Networks with a discrete dynamics are mainly affected by problem (c), as the probability of falling into local minima has been shown to be too high. In the case of networks with continuous dynamics I, the main problem is (b): the larger the discretization step $\Delta t$, the larger the distance from the evolution trajectory of the network to the theoretically predicted trajectory. We show that if $\Delta t$ is small enough, this distance is insignificant, thus ensuring that a global minimum is reached. However, this implies a greater computational time. In the case of networks with continuous dynamics II, the main problems are (a) and (c), due to the existence of local minima inside the hypercube. Theorem 1 proves that the integral term that appears in the energy function of these networks is bounded, $0 \le I_\beta \le n \ln 2$, in contrast with Hopfield's statement, which predicts an asymptotic increase of this term. This result supports the development of a new strategy for the avoidance of local minima, which is compared to the traditional one, showing a greater efficiency in terms of the number of simulation steps.
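The value $n \ln 2$ of the bound can be checked numerically under one common choice of activation. The sketch below assumes a unit-gain hyperbolic tangent, $s = \tanh(u)$, so that the contribution of each neuron to the integral term is $\int_0^{s} \operatorname{artanh}(v)\,dv$; this normalization is an assumption made for illustration, and Theorem 1 in the paper may be stated with a different gain factor.

```python
import math


def integral_term(s):
    """Closed form of the integral of artanh(v) from 0 to s, for -1 <= s <= 1.

    The antiderivative is 0.5 * ((1 + s) * ln(1 + s) + (1 - s) * ln(1 - s)).
    """
    total = 0.0
    for sign in (1.0, -1.0):
        w = 1.0 + sign * s
        if w > 0.0:  # w * ln(w) tends to 0 as w tends to 0
            total += 0.5 * w * math.log(w)
    return total


def integral_term_numeric(s, steps=100_000):
    """Midpoint-rule evaluation of the same integral, used as a cross-check."""
    h = s / steps
    return sum(math.atanh((k + 0.5) * h) for k in range(steps)) * h


def network_integral(states):
    """Sum of the per-neuron integral terms for a vector of outputs."""
    return sum(integral_term(s) for s in states)


if __name__ == "__main__":
    states = [-1.0, -0.3, 0.0, 0.8, 1.0]
    print(network_integral(states), "<=", len(states) * math.log(2))
```

Each neuron contributes between 0 (at $s = 0$) and $\ln 2$ (at $s = \pm 1$), so under this assumption the sum over $n$ neurons stays within $[0, n \ln 2]$.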

Acknowledgements

This work has been partially supported by the Spanish Comisión Interministerial de Ciencia y Tecnología (CICYT), Projects No. TIC95-0589 and TIC98-0562. The authors wish to thank Ian Johnstone and Araceli Ruiz-López for their linguistic revision of the English manuscript. The careful reading and useful suggestions of the reviewers are gratefully acknowledged.

Appendix A. Diophantine equation problem

This example problem consists of the solution of the following Diophantine equation [10,3]:

$$a x^2 + b y = c, \qquad a = 1,\ b = 3,\ c = 37. \tag{A.1}$$

It may be solved with a third-order Hopfield network with 10 neurons. The cost function is

$$f(x, y) = (a x^2 + b y - c)^2. \tag{A.2}$$

Solutions are obtained as

$$x = \sum_{i=0}^{m-1} 2^{i}\,\frac{s_i + 1}{2}, \qquad y = \sum_{i=m}^{n-1} 2^{i-m}\,\frac{s_i + 1}{2}. \tag{A.3}$$
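The decoding of (A.3) and the cost function can be illustrated with a short sketch. It assumes the quadratic form $a x^2 + b y = c$ with a squared cost, which is what the fourth-order weights of this appendix suggest, and bipolar states $s_i \in \{-1, +1\}$. The split of the 10 neurons into `M = 5` bits for $x$ and 5 bits for $y$ is hypothetical, since the value of $m$ is not stated here, and the exhaustive search merely stands in for the network dynamics.

```python
from itertools import product

A, B, C = 1, 3, 37   # coefficients from (A.1)
N = 10               # number of neurons, as stated in the text
M = 5                # hypothetical number of neurons encoding x


def decode(states, m=M):
    """Map bipolar states s_i in {-1, +1} to the integers (x, y) as in (A.3)."""
    bits = [(s + 1) // 2 for s in states]
    x = sum(bit << i for i, bit in enumerate(bits[:m]))
    y = sum(bit << i for i, bit in enumerate(bits[m:]))
    return x, y


def cost(states):
    """Squared residual of the equation: zero exactly at a solution."""
    x, y = decode(states)
    return (A * x**2 + B * y - C) ** 2


def zero_cost_solutions(n=N):
    """Enumerate all 2**n states and collect the (x, y) pairs with zero cost."""
    found = set()
    for states in product((-1, 1), repeat=n):
        if cost(states) == 0:
            found.add(decode(states))
    return sorted(found)


if __name__ == "__main__":
    print(zero_cost_solutions())
```

With this encoding the global minima of the cost function are exactly the non-negative integer solutions that fit in the chosen number of bits.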

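The values of the connection weights are

$$I_i = -\Big[(1-\delta_i)\,4aK_1 2^{i-1}\big(a(3K_2 + K_1^2 - 2\cdot 2^{2(i-1)}) - c + bC_1\big) + \delta_i\,2b\,2^{i-m-1}\big(bC_1 - c + a(K_1^2 + K_2)\big)\Big], \tag{A.4}$$

$$T_{ij} = -2!\Big[(1-\delta_i)(1-\delta_j)\,2a\,2^{i+j-2}\big(bC_1 - c + a(3K_1^2 + 3K_2 - 2\cdot 2^{2(i-1)} - 2\cdot 2^{2(j-1)})\big) + \delta_i\delta_j\,b^2\,2^{i+j-2m-2} + \big((1-\delta_i)\delta_j + (1-\delta_j)\delta_i\big)\,2abK_1 2^{i+j-m-2}\Big], \tag{A.5}$$

$$T_{ijk} = -3!\Big[(1-\delta_i)(1-\delta_j)(1-\delta_k)\,4a^2K_1 2^{i+j+k-3} + \big((1-\delta_i)(1-\delta_j)\delta_k + (1-\delta_i)(1-\delta_k)\delta_j + (1-\delta_j)(1-\delta_k)\delta_i\big)\,\tfrac{2}{3}\,ab\,2^{i+j+k-m-3}\Big], \tag{A.6}$$

$$T_{ijkl} = -4!\Big[(1-\delta_i)(1-\delta_j)(1-\delta_k)(1-\delta_l)\,a^2\,2^{i+j+k+l-4}\Big], \tag{A.7}$$

$$\delta_i = \begin{cases} 0 & \text{if } i < m, \\ 1 & \text{if } i \ge m, \end{cases} \qquad K_p = \sum_{i=0}^{m-1} 2^{p(i-1)}, \qquad C_p = \sum_{i=m}^{n-1} 2^{p(i-m-1)}. \tag{A.8}$$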

Appendix B. Hamiltonian cycle problem

This example problem may be described as follows: given a graph that is not fully connected, find, if one exists, a closed cycle that passes through every node once and only once. It is a particular case of the travelling salesman problem (TSP), in which the distance $d_{xy}$ between two nodes $x$ and $y$ is 0 if they are connected, and 1 otherwise.

The cost function is obtained from the function presented for the TSP [5]:

$$E = \frac{H}{2}\sum_{x}\sum_{i \ne j} s_{xi}s_{xj} + \frac{V}{2}\sum_{i}\sum_{x \ne y} s_{xi}s_{yi} - G\sum_{x}\sum_{i} s_{xi} + \frac{D}{2}\sum_{x}\sum_{y}\sum_{i} d_{xy}\,s_{xi}\,(s_{y,i+1} + s_{y,i-1}). \tag{B.1}$$

The values of the connection weights are

$$I_{xi} = -G, \tag{B.2}$$

$$T_{xi,yj} = -H\,\delta_{xy}(1-\delta_{ij}) - V\,\delta_{ij}(1-\delta_{xy}) - D\,d_{xy}(1-\delta_{xy})(\delta_{i+1,j} + \delta_{i-1,j}), \tag{B.3}$$

where $\delta_{xy}$ is the Kronecker delta.
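To make the role of each term of (B.1) concrete, the energy can be evaluated directly on a candidate state. This is a minimal sketch under stated assumptions: states are taken as $s_{xi} \in \{0, 1\}$, with $s_{xi} = 1$ meaning that node $x$ occupies position $i$ of the cycle, position indices wrap around cyclically, and the helper names `tour_to_state` and `distance_matrix` are illustrative only. The default coefficients are those of the first row of Table 3.

```python
def hcp_energy(s, d, G=1.0, D=0.7, V=0.7, H=0.7):
    """Evaluate (B.1) for s[x][i] in {0, 1}: node x occupies tour position i."""
    n = len(s)
    # Penalty for a node appearing in more than one position.
    row = sum(s[x][i] * s[x][j]
              for x in range(n) for i in range(n) for j in range(n) if i != j)
    # Penalty for two nodes sharing the same position.
    col = sum(s[x][i] * s[y][i]
              for i in range(n) for x in range(n) for y in range(n) if x != y)
    # Reward for every active neuron.
    active = sum(s[x][i] for x in range(n) for i in range(n))
    # Penalty for consecutive nodes that are not connected (d = 1).
    path = sum(d[x][y] * s[x][i] * (s[y][(i + 1) % n] + s[y][(i - 1) % n])
               for x in range(n) for y in range(n) for i in range(n))
    return H / 2 * row + V / 2 * col - G * active + D / 2 * path


def tour_to_state(tour):
    """Permutation-matrix state for a visiting order such as [0, 1, 2, 3]."""
    n = len(tour)
    s = [[0] * n for _ in range(n)]
    for position, node in enumerate(tour):
        s[node][position] = 1
    return s


def distance_matrix(n, edges):
    """d[x][y] = 0 for connected nodes and 1 otherwise, as defined above."""
    d = [[1] * n for _ in range(n)]
    for x, y in edges:
        d[x][y] = d[y][x] = 0
    return d


if __name__ == "__main__":
    d = distance_matrix(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
    print(hcp_energy(tour_to_state([0, 1, 2, 3]), d))  # valid Hamiltonian cycle
    print(hcp_energy(tour_to_state([0, 2, 1, 3]), d))  # uses two missing edges
```

Under these assumptions a valid Hamiltonian cycle on $n$ nodes has energy $-Gn$, and each step of the cycle that uses a missing edge raises the energy by $D$.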

Appendix C. k-colorability problem

In this problem, a map consisting of $r$ regions must be coloured so that any two contiguous regions have different colours, while the total number $k$ of colours used is kept to a minimum. It can be considered an optimization problem, and may be solved by a Hopfield network [19] with $r \cdot k$ neurons arranged as $r$ rows and $k$ columns. In this neuron array, each row represents a region and each neuron in a row represents one possible colour of that region. Thus, if the network reaches a correct solution, exactly one neuron in each row should be on. This network has the following energy function:

$$E = \frac{a}{2}\sum_{x=1}^{r}\Big(\sum_{i=1}^{k} s_{xi} - 1\Big)^{2} + b\sum_{x=1}^{r}\sum_{\substack{y=1 \\ y \ne x}}^{r}\sum_{i=1}^{k} d_{xy}\,s_{xi}\,s_{yi}, \tag{C.1}$$

where the first term favours the states with exactly one activated neuron in each row, and the second term penalizes the same colouring in contiguous regions. For this neural network, the values of the connection weights are

$$I_{xi} = a, \tag{C.2}$$

$$T_{xi,yj} = -\delta_{xy}\,a - 2b\,(1-\delta_{xy})\,\delta_{ij}\,d_{xy}, \tag{C.3}$$

where $\delta_{xy}$ is the Kronecker delta and $d_{xy}$ is 1 if $x$ and $y$ represent two contiguous regions, and 0 otherwise.
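The two terms of (C.1) can be evaluated on a small map to see how they behave. The following is a sketch under the assumption that states are $s_{xi} \in \{0, 1\}$, with $s_{xi} = 1$ meaning that region $x$ takes colour $i$; the helper `colouring_to_state` and the three-region example map are illustrative and do not come from the experiments.

```python
def kcp_energy(s, d, a=1.0, b=1.0):
    """Evaluate (C.1) for s[x][i] in {0, 1}: region x takes colour i."""
    r, k = len(s), len(s[0])
    # First term: each row should contain exactly one active neuron.
    one_colour = sum((sum(s[x]) - 1) ** 2 for x in range(r))
    # Second term: contiguous regions (d = 1) sharing a colour are penalized.
    clashes = sum(d[x][y] * s[x][i] * s[y][i]
                  for x in range(r) for y in range(r) if y != x
                  for i in range(k))
    return a / 2 * one_colour + b * clashes


def colouring_to_state(colours, k):
    """One-hot rows for a list of colour indices, one entry per region."""
    return [[1 if i == c else 0 for i in range(k)] for c in colours]


if __name__ == "__main__":
    # Three regions in a row: 0 touches 1, and 1 touches 2.
    d = [[0, 1, 0],
         [1, 0, 1],
         [0, 1, 0]]
    print(kcp_energy(colouring_to_state([0, 1, 0], 2), d))  # proper colouring
    print(kcp_energy(colouring_to_state([0, 0, 1], 2), d))  # regions 0 and 1 clash
```

Under this convention the energy is zero exactly for proper colourings, and every clash between contiguous regions is counted twice because the double sum runs over ordered pairs.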

References

[1] S. Abe, Theories on the Hopfield neural networks, IEEE International Joint Conference on Neural Networks, I, 557–564, 1989.
[2] M.A. Atencia, An arbitrary order neural networks design and simulation environment, Master's Thesis, Departamento de Tecnología Electrónica, Universidad de Málaga, 1997 (in Spanish).
[3] M. Garey, D. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W.H. Freeman and Company, New York, 1979.
[4] A.H. Gee, R.W. Prager, Polyhedral combinatorics and neural networks, Neural Comput. 6 (1) (1994) 161–180.
[5] J. Hertz, A. Krogh, R. Palmer, Introduction to the Theory of Neural Computation, Addison-Wesley, Reading, MA, 1991.
[6] J. Hopfield, Neural networks and physical systems with emergent collective computational abilities, Proc. Natl. Acad. Sci. USA 79 (1982) 2554–2558.
[7] J. Hopfield, Neurons with graded response have collective computational properties like those of two-state neurons, Proc. Natl. Acad. Sci. USA 81 (1984) 3088–3092.
[8] J. Hopfield, D. Tank, 'Neural' computation of decisions in optimization problems, Biol. Cybernet. 52 (1985) 141–152.
[9] G. Joya, Contributions of high order artificial neural networks to the design of autonomous systems, Ph.D. Thesis, Departamento de Tecnología Electrónica, Universidad de Málaga, 1997.
[10] G. Joya, M.A. Atencia, F. Sandoval, Application of high-order Hopfield neural networks to the solution of diophantine equations, in: Artificial Neural Networks, Lecture Notes in Computer Science, Vol. 540, Springer-Verlag, Berlin-Heidelberg, 1991.
[11] G. Joya, M.A. Atencia, F. Sandoval, Associating arbitrary-order energy functions to an artificial neural network, implications concerning the resolution of optimization problems, Neurocomputing 14 (1997) 139–156.
[12] K.H. Kim, C.H. Lee, B.Y. Kim, H.Y. Hwang, Neural optimization network for minimum-via layer assignment, Neurocomputing 3 (1991) 15–27.
[13] S. Kirkpatrick, C.D. Gelatt Jr., M.P. Vecchi, Optimization by simulated annealing, Science 220 (1983) 671–680.
[14] Y. Kobuchi, State evaluation functions and Lyapunov functions for neural networks, Neural Networks 4 (1991) 505–510.
[15] S. Mehta, L. Fulop, An analog neural network to solve the hamiltonian cycle problem, Neural Networks 6 (1993) 839–881.
[16] J. Ortega, A. Prieto, A. Lloris, F. Pelayo, Generalized Hopfield network for concurrent testing, IEEE Trans. Comput. 42 (8) (1993) 898–912.
[17] T. Samad, P. Harper, High-order Hopfield and Tank optimization networks, Parallel Comput. 16 (1990) 287–292.
[18] T.J. Sejnowski, Higher-Order Boltzmann Machines, American Institute of Physics, 1986, pp. 398–403.
[19] Y. Takefuji, K.C. Lee, Artificial neural networks for four-coloring map problems and k-colorability problems, IEEE Trans. Circuits Systems 38 (3) (1991) 326–333.
[20] M. Trick, Operations research page. http://mat.gsia.cmu.edu.
[21] F.-S. Tsung, G. Cottrell, Learning in recurrent finite difference networks, Int. J. Neural Systems 6 (3) (1995) 249–256.
[22] G. Wilson, G. Pawley, On the stability of the travelling salesman problem algorithm of Hopfield and Tank, Biol. Cybernet. 58 (1988) 63–70.

Gonzalo Joya was born in Granada (Spain) in 1960. He obtained his degree in Physics from the University of Granada (Spain) and the Ph.D. from the University of Málaga (Spain). He is a Professor in the Department of Tecnología Electrónica of the University of Málaga. His current fields of interest include the foundations of Artificial Neural Networks and their application to Energy Management Systems, Control and Optimization.

Miguel A. Atencia was born in 1966 in Málaga (Spain). He received the diplomate degree and the MSc degree in Computer Engineering from the University of Málaga. His research interests, which are being developed in his doctoral thesis, include mathematical foundations of recurrent neural networks and their simulation. Since 1988 he has been System Manager at the Hospital "Carlos Haya", as well as at the University of Málaga. He is also Associate Professor at the Department of Lenguajes y Ciencias de la Computación of the University of Málaga. He is the President of AIA (Andalusian Association of Computer Engineers).

Francisco Sandoval was born in Spain in 1947. He received the title of Telecommunication Engineering and the Ph.D. degree from the Technical University of Madrid, Spain, in 1972 and 1980, respectively. From 1972 to 1989 he was engaged in teaching and research in the fields of opto-electronics and integrated circuits in the Universidad Politécnica de Madrid (UPM) as an Assistant Professor and a Lecturer successively. In 1990 he joined the University of Málaga as Full Professor in the Department of Tecnología Electrónica, starting his research on Artificial Neural Networks (ANN). He is currently involved in VLSI design of ANN, and application of ANN to Energy Management Systems and Broad Band Communication.
