Abstract
See the individual abstracts for sections B2.7.1, B2.7.2, B2.7.3, B2.7.4 and B2.7.5.
B2.7.1 Deceptive landscapes Kalyanmoy Deb
Abstract
In order to study the efficacy of evolutionary algorithms, a number of fitness landscapes have been designed and used as test functions. Since the optimal solution(s) of these fitness landscapes are known a priori, controlled experiments can be performed to investigate the convergence properties of evolutionary algorithms. A number of fitness landscapes are discussed in this section. These fitness landscapes are designed either to test some specific properties of the algorithms or to investigate overall working of the algorithms on difficult fitness landscapes.
B2.7.1.1 Introduction
Deceptive landscapes have been mainly studied in the context of genetic algorithms (GAs), although the concept of deceptive landscapes in creating difficult test functions can also be developed for other evolutionary algorithms. The development of deceptive functions lies in the proper understanding of the building block hypothesis. The building block hypothesis suggests that GAs work by combining low-order building blocks to form higher-order building blocks (see Section B2.5). Therefore, if in a function the low-order building blocks do not combine to form higher-order building blocks, GAs may have difficulty in optimizing the function. Deceptive functions are those functions where the low-order building blocks do not combine to form higher-order building blocks: instead they form building blocks for a suboptimal solution. The main motivation behind developing such functions is to create difficult test functions for comparing different implementations of GAs. It is then argued that if a GA succeeds in solving these difficult test functions, it can solve other simpler functions.
A deceptive function usually has at least two optimal solutions—one global and one local. A local optimal solution is the best solution in the neighborhood of the solution, whereas the global optimal solution is the best solution in the entire search space. Thus, the local solution is inferior to the global solution and is usually known as the deceptive solution. However, it has been shown elsewhere (Deb and Goldberg 1992, Whitley 1991) that the deceptive solution can be at most one-bit dissimilar to the local optimal solution in binary functions. A deceptive fitness function is designed by comparing the schemata representing the global optimal solution and the deceptive solution. The comparison is usually performed according to the fitness of the competing schemata. A deceptive function is designed by adjusting the
string fitness values in such a way that the schemata representing the deceptive solution have better fitness than any other schemata including that representing the global optimal solution in a schema partition. It is then argued that, because of the superiority of the deceptive schemata, GAs process them favorably in early generations. Solutions representing these schemata take over the population and GAs may finally find the deceptive solution, instead of the global optimal solution. Thus, the deceptive functions may cause GAs to find a suboptimal solution. Since these fitness landscapes are supposedly difficult for GAs, considerable effort has been spent in designing different deceptive functions and studies have been made to understand how simple GAs can be modified to solve such difficult landscapes (Deb 1991, Goldberg et al 1989, 1990). In the following, we first define deception and then outline simple procedures for creating deceptive functions.
B2.7.1.2 Schema deception
Although researchers in the evolutionary computation (EC) community do not fully agree on the procedure for calculating the schema fitness or on the very definition of deception (Grefenstette 1993), we present here one version of the deception theory. Before we present that definition, two terms, schema fitness and schema partition, must be defined. A schema fitness is defined as the average fitness of all strings representing the schema. Thus, one schema is worse than another schema if the fitness of the former schema is inferior to that of the latter schema. A schema partition is represented by a binary string constructed with f and ∗, where an f represents a fixed position having either a 1 or a 0 (but not both) and a ∗ represents a ‘don’t care’ symbol denoting either a 1 or a 0. A schema partition represents $2^k$ schemata, where $k$ is the number of fixed positions f in the partition. The parameter $k$ is also known as the order of the schema partition. Thus, an order-$k$ schema partition divides the entire search space into $2^k$ distinct and equal regions. For example, in a three-bit binary function, the second-order schema partition (ff∗) represents four schemata: (00∗), (01∗), (10∗), and (11∗). Since each of these schemata represents two distinct strings, the schema partition (ff∗) divides the entire search space into four equal regions. It is clear that a higher-order schema partition divides the search space into exponentially more regions than a lower-order schema partition. In the spirit of the above schema partition definition, it can be concluded that the highest-order (of order $\ell$) schema partition divides the search space into $2^\ell$ regions and each schema represents exactly one of the strings. Of course, the lowest-order (of order zero) schema partition has only one schema which represents all strings in the search space.
Definition B2.7.1. A schema partition is said to be deceptive if the schema containing the deceptive optimal solution is no worse than all other schemata in the partition.
We illustrate a deceptive schema partition in a three-bit function having its global and deceptive solutions at (111) and (000), respectively. According to the above definition of schema partition, the schema partition (ff∗) is deceptive if the fitness of the schema (00∗) is no worse than that of the other three schemata in the schema partition. For maximization problems, this requires the following three relationships to be true:
$F(00{*}) \ge F(01{*})$ (B2.7.1)
$F(00{*}) \ge F(10{*})$ (B2.7.2)
$F(00{*}) \ge F(11{*})$. (B2.7.3)
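The three inequalities can be checked mechanically. The following Python sketch enumerates the schemata of a partition and compares their average fitness values. The helper names and the three-bit fitness values are hypothetical, chosen only for illustration; they do not come from the works cited here.

```python
from itertools import product


def schemata(partition):
    """List the schemata of a partition such as 'ff*' (f = fixed, * = don't care)."""
    fixed = [i for i, c in enumerate(partition) if c == "f"]
    result = []
    for bits in product("01", repeat=len(fixed)):
        schema = list(partition)
        for pos, bit in zip(fixed, bits):
            schema[pos] = bit
        result.append("".join(schema))
    return result


def schema_fitness(schema, fitness):
    """Average fitness of all strings that match the schema."""
    members = [s for s in fitness
               if all(c == "*" or c == b for c, b in zip(schema, s))]
    return sum(fitness[s] for s in members) / len(members)


def is_deceptive_partition(partition, fitness, deceptive):
    """True if the schema containing `deceptive` is no worse than all others."""
    own = "".join(d if c == "f" else "*" for c, d in zip(partition, deceptive))
    best = schema_fitness(own, fitness)
    return all(best >= schema_fitness(s, fitness) for s in schemata(partition))


# Hypothetical three-bit function: global optimum 111, deceptive solution 000.
FITNESS = {"000": 0.9, "001": 0.6, "010": 0.6, "100": 0.6,
           "011": 0.0, "101": 0.0, "110": 0.0, "111": 1.0}

print(schemata("ff*"))
print(is_deceptive_partition("ff*", FITNESS, "000"))
```

With these illustrative values, the partition (ff∗) satisfies all three relationships, while the order-three partition (fff) does not, because the string (111) is strictly better than (000).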
Definition B2.7.2. A function is said to be fully deceptive if all $2^\ell - 2$ (see below) schema partitions are deceptive.
In an $\ell$-bit problem, there are a total of $2^\ell$ schema partitions, two of which (one with all fixed positions and the other with all ∗) cannot be deceptive. Thus, if all other $2^\ell - 2$ schema partitions are deceptive according to the above definition, the function is fully deceptive. Deb and Goldberg (1994) calculated that about $O(4^\ell)$ floating-point operations are required to create a fully deceptive function. A function can also be partially deceptive to a certain order.
Definition B2.7.3. A function is said to be partially deceptive to order $k$ if all schema partitions of order smaller than $k$ are deceptive.
Fitness landscapes
B2.7.1.3 Deceptive functions
Many researchers have created partially and fully deceptive functions from different considerations. Goldberg (1989a) created a three-bit fully deceptive function by explicitly calculating and comparing all schema fitness values. Liepins and Vose (1990) and Whitley (1991) have calculated different fully deceptive functions from intuition. Goldberg (1990) derived a fully deceptive function from low-order Walsh coefficients. Deb and Goldberg (1994) have created fully and partially deceptive trap functions (trap functions were originally introduced by Ackley (1987)) and found sufficient conditions to test and create deceptive functions. Since trap functions are piecewise linear functions and are defined with only a few independent function values, they are easy to manipulate and analyze to create a deceptive function. In the following, we present a fully deceptive trap function.
Trap functions are defined in terms of unitation (the number of 1s in a string). A function of unitation has the same function value for all strings of identical unitation. That is, in a three-bit unitation function, the strings (001), (010), and (100) have the same function value (because all three strings have a unitation of one). Thus, in an $\ell$-bit unitation function there are only $(\ell+1)$ different function values. This reduction in the number of function values (from $2^\ell$ to $(\ell+1)$) has helped researchers to create deceptive functions using unitation functions. A trap function $f(u)$, as a function of unitation $u$, is defined as follows (Ackley 1987):
$$f(u) = \begin{cases} \dfrac{a}{z}\,(z-u) & \text{if } u \le z \\[1ex] \dfrac{b}{\ell-z}\,(u-z) & \text{otherwise} \end{cases} \qquad \text{(B2.7.4)}$$
where $a$ and $b$ are the function values of the deceptive and global optimal solutions, respectively. The trap function is a piecewise linear function that divides the search space into two basins in the unitation space, one leading to the global optimal solution and the other leading to the deceptive solution. Figure B2.7.1 shows a trap function as a function of unitation (left) and as a function of the decoded value of the binary string (right) with $a = 0.6$, $b = 1.0$, and $z = 3$. The parameter $z$ is the slope change location and $u$ is the unitation of a string.
Figure B2.7.1. A four-bit trap function ($a = 0.6$, $b = 1.0$, and $z = 3$) as a function of unitation (left) and as a function of binary strings (right).
Deb and Goldberg (1994) have found that an $\ell$-bit trap function becomes fully deceptive if the following condition is true (for small $\ell$ and $a \approx b$):
$$z \ge \frac{\ell}{2} + \frac{(b-a)}{(b+a)}\,\frac{\ell(\ell-1)}{2}. \qquad \text{(B2.7.5)}$$
The above condition suggests that in a deceptive trap function the slope change location $z$ is closer to $\ell$. In other words, there are more strings in the basin of the deceptive solution than in the basin of the global optimal solution. Using the above condition, we create a six-bit fully deceptive trap function with the strings (000000) and (111111) being the deceptive and global optimal solutions ($a = 0.92$, $b = 1.00$, and $z = 4$):
f(000000)=0.920  f(*00000)=0.805  f(**0000)=0.690  f(***000)=0.575  f(****00)=0.460  f(*****0)=0.367
f(000001)=0.690  f(*00001)=0.575  f(**0001)=0.460  f(***001)=0.345  f(****01)=0.275  f(*****1)=0.274
f(000011)=0.460  f(*00011)=0.345  f(**0011)=0.230  f(***011)=0.206  f(****11)=0.273
f(000111)=0.230  f(*00111)=0.115  f(**0111)=0.182  f(***111)=0.341
f(001111)=0.000  f(*01111)=0.250  f(**1111)=0.500
f(011111)=0.500  f(*11111)=0.750
f(111111)=1.000
The leftmost column shows the seven different function values in a six-bit unitation function and the other columns show the schema fitness values of different schema partitions. In the above function, the deceptive solution has a function value equal to 0.92 and the global solution has a function value equal to 1.00. The string (010100) has a function value equal to 0.460, because all strings of unitation 2 have a function value of 0.460. In functions of unitation, all schemata of a certain order and unitation also have the same fitness. That is, the schema (00*010) has a fitness value equal to 0.575, because this schema is of order five and of unitation one, and all schemata of order five and unitation one have a fitness equal to 0.575. The above schema fitness calculations show that the schema containing the deceptive solution is no worse than any other schema in each partition. For example, for any schema partition of order four, the schema containing the deceptive solution has a fitness equal to 0.690, which is better than that of any other schema in that partition (third column). However, the deceptive string (000000) is not the true optimal solution. Thus, the above schema partition is deceptive. Since all $2^6 - 2 = 62$ schema partitions are deceptive, the above fitness landscape is fully deceptive.
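The schema fitness values of the six-bit example can be recomputed from the trap function alone. The following Python sketch is one way to do so; the function names are illustrative, and the code relies on the unitation symmetry described in the text, so that a schema is identified only by its order and by the number of 1s among its fixed positions.

```python
from math import comb

ELL, A, B, Z = 6, 0.92, 1.00, 4  # parameters of the six-bit example


def trap(u, ell=ELL, a=A, b=B, z=Z):
    """Trap function of unitation u, equation (B2.7.4)."""
    return a / z * (z - u) if u <= z else b / (ell - z) * (u - z)


def schema_fitness(order, ones, ell=ELL):
    """Average fitness of a schema with `order` fixed bits, `ones` of them 1.

    The ell - order free positions add j further 1s in comb(free, j) ways.
    """
    free = ell - order
    total = sum(comb(free, j) * trap(ones + j, ell) for j in range(free + 1))
    return total / 2 ** free


def threshold(ell=ELL, a=A, b=B):
    """Right-hand side of condition (B2.7.5)."""
    return ell / 2 + (b - a) / (b + a) * ell * (ell - 1) / 2


def fully_deceptive(ell=ELL):
    """Check that the all-0s schema is no worse than the rest at every order.

    By unitation symmetry, all partitions of one order share the same schema
    fitness values, so one check per order covers all 2**ell - 2 partitions.
    """
    return all(
        schema_fitness(k, 0, ell) >= schema_fitness(k, ones, ell)
        for k in range(1, ell)
        for ones in range(1, k + 1)
    )


# One row per order (6 down to 1), one entry per unitation of the fixed bits.
for order in range(ELL, 0, -1):
    print(order, [round(schema_fitness(order, u), 3) for u in range(order + 1)])
print(threshold(), fully_deceptive())
```

The printed values should agree with the tabulated ones up to the last printed digit (a few tabulated entries appear to be truncated rather than rounded). For these parameters the right-hand side of condition (B2.7.5) evaluates to $3 + (0.08/1.92) \times 15 = 3.625$, so $z = 4$ satisfies it.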
Although in the above deceptive landscape the string of all 1s is considered to be the globally best string, any other string $c$ can also be the globally best string. In this case, the above function values are assigned to another set of strings obtained by performing a bitwise exclusive-or operation on the above strings with the complement of $c$ (Goldberg 1990).
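The exclusive-or relocation can be sketched in a few lines of Python. The function names and the choice of $c$ = (101100) are hypothetical and serve only as an illustration; the trap parameters are those of the six-bit example.

```python
def relocate_optimum(f, c):
    """Return g with g(x) = f(x XOR complement(c)), so that g(c) = f(all 1s)."""
    mask = "".join("1" if bit == "0" else "0" for bit in c)  # complement of c

    def g(x):
        # Bitwise exclusive-or of x with the mask, on strings of '0'/'1'.
        y = "".join("1" if xb != mb else "0" for xb, mb in zip(x, mask))
        return f(y)

    return g


def trap6(x, a=0.92, b=1.00, z=4):
    """Six-bit trap function of the example, evaluated on a bit string."""
    u = x.count("1")
    return a / z * (z - u) if u <= z else b / (len(x) - z) * (u - z)


g = relocate_optimum(trap6, "101100")
print(g("101100"), g("010011"))
```

In the relocated function the global optimum sits at $c$ and the deceptive solution at the bitwise complement of $c$; because exclusive-or with a fixed mask preserves Hamming distances, the structure of the landscape is otherwise unchanged.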
B2.7.1.4 Sufficient conditions for deception
Deb and Goldberg (1994) have also found sufficient conditions for any arbitrary function to be fully deceptive (assuming that the strings of all 1s and all 0s are global and deceptive solutions, respectively):
primary optimality condition: $f(\ell) > \max[f(0), \max f(1)]$
primary deception condition: $f(0) > \max[\max f(2),\; f(\ell) - (\min f(1) - \max f(\ell-1))]$
secondary deception condition: $\min f(i) \ge \max f(j)$ for $1 \le i \le \lfloor \ell/2 \rfloor$ and $i < j \le \ell - i$
(B2.7.6)
where $\min f(i)$ and $\max f(i)$ are the minimum and maximum function values of all strings having a unitation $i$. A fitness function satisfying the above conditions is guaranteed to be a fully deceptive function; a function not satisfying any of the above conditions may, however, also be deceptive. Nevertheless, Deb and Goldberg (1994) have observed that the above conditions can prove deception in most of the deceptive functions that exist in the GA literature. These sufficient conditions allow a systematic way of creating a deceptive function and a quick way to test deception in any arbitrary function. The number of floating-point operations required to design a fully deceptive function using the above conditions is only $O(\ell^2)$, whereas $O(4^\ell)$ operations are required to create a deceptive function with the consideration of all schema partition deception.
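As an illustration of how conditions (B2.7.6) might be tested on an arbitrary function, the following Python sketch collects the minimum and maximum function value at each unitation and then evaluates the three conditions. The function names are hypothetical. For generality the sketch enumerates all $2^\ell$ strings to find the extremes; for a function of unitation only $\ell + 1$ values would be needed. It assumes $\ell \ge 2$.

```python
from itertools import product


def sufficient_for_full_deception(f, ell):
    """Evaluate conditions (B2.7.6) for f over all ell-bit tuples.

    Assumes all 1s is the global and all 0s the deceptive solution.
    True guarantees full deception; False is inconclusive.
    """
    lo = [float("inf")] * (ell + 1)   # lo[i] = min f over strings of unitation i
    hi = [float("-inf")] * (ell + 1)  # hi[i] = max f over strings of unitation i
    for bits in product((0, 1), repeat=ell):
        u, value = sum(bits), f(bits)
        lo[u], hi[u] = min(lo[u], value), max(hi[u], value)
    f0, f_ell = lo[0], lo[ell]  # single strings, so min equals max
    primary_optimality = f_ell > max(f0, hi[1])
    primary_deception = f0 > max(hi[2], f_ell - (lo[1] - hi[ell - 1]))
    secondary_deception = all(
        lo[i] >= hi[j]
        for i in range(1, ell // 2 + 1)
        for j in range(i + 1, ell - i + 1)
    )
    return primary_optimality and primary_deception and secondary_deception


def trap(bits, a=0.92, b=1.00, z=4):
    """Trap function (B2.7.4) evaluated on a tuple of bits."""
    u, ell = sum(bits), len(bits)
    return a / z * (z - u) if u <= z else b / (ell - z) * (u - z)


print(sufficient_for_full_deception(trap, 6))
print(sufficient_for_full_deception(lambda x: trap(x, z=3), 6))
```

The six-bit trap function with $a = 0.92$, $b = 1.00$, and $z = 4$ passes all three conditions. Moving the slope change location to $z = 3$ makes the primary deception condition fail, which by itself proves nothing, since the conditions are only sufficient.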
B2.7.1.5 Other deceptive functions
Goldberg et al (1992) have also defined multimodal deceptive functions and developed a method to create fully or partially deceptive multimodal functions from low-order Walsh coefficients. Mason (1991) has developed a method to create deceptive functions for nonbinary functions. Kargupta et al (1992) have also suggested a method to create deceptive problems in permutation problems.
The design of deceptive landscapes and subsequent attempts to solve such functions have provided better insights into the working of GAs and helped to develop modified GAs to solve such difficult functions. The messy GA (Deb 1991, Goldberg et al 1989, 1990) is a derivative of such considerations and has been used to solve massively multimodal, deceptive, and highly nonlinear functions in only $O(\ell \log \ell)$ function evaluations, where $\ell$ is the number of binary variables (Goldberg et al 1993). These results are remarkable and set up standards for other competitive algorithms to achieve, but what is yet more remarkable is the development of such efficient algorithms through proper understanding of the complex mechanisms of GAs and their extensions for handling difficult fitness landscapes.
B2.7.2 NK landscapes Lee Altenberg†
Abstract
NK fitness landscapes are stochastically generated fitness functions on bit strings, parameterized (with N genes and K interactions between genes) so as to make them tunably ‘rugged’. Under the genetic operators of bit-flipping mutation or recombination, NK landscapes produce multiple domains of attraction for the evolutionary dynamics. NK landscapes have been used in models of epistatic gene interactions, coevolution, genome growth, and Wright’s shifting balance model of adaptation. Theory for adaptive walks on NK landscapes has been derived, and generalizations that extend beyond Kauffman’s original framework have been utilized.
B2.7.2.1 Introduction
A very short time after the first mathematical models of Darwinian evolution were developed, Sewall Wright (1932) recognized a deep property of population genetic dynamics: when fitness interactions exist between genes, the genetic composition of a population can evolve into multiple domains of attraction. The specific fitness interaction is epistasis, where the effect on fitness from altering one gene depends on the allelic state of other genes (Lush 1935). Epistasis makes it possible for the population to evolve toward different combinations of alleles, depending on its initial genetic composition. (Wright’s framework also included the complication of diploid genetics, which augments the fitness interactions that produce multiple attractors.)
Wright thus found a conceptual link between a microscopic property of organisms—fitness interactions between genes—and a macroscopic property of evolutionary dynamics—multiple population attractors in the space of genotypes. To illustrate this situation, Wright invoked the metaphor of a landscape with multiple peaks, in which a population would evolve by moving uphill until it reached its local fitness peak. This metaphor of the ‘adaptive landscape’ is the general term used to describe multiple domains of attraction in evolutionary dynamics.
Wright was specifically interested in how populations could escape from local fitness peaks to higher ones through stochastic fluctuations in small population subdivisions. His was thus one of the earliest conceptions of a stochastic process for the optimization of multimodal functions.
Stuart Kauffman devised the NK fitness landscape model to explore the way that epistasis controls the ruggedness of an adaptive landscape (Kauffman and Levin 1987, Kauffman 1989). Kauffman wanted to specify a family of fitness functions whose ruggedness could be ‘tuned’ by a single parameter. He did this by building up landscapes from multiple ‘atoms’ of maximal epistasis.
The NK model is a stochastic method for generating fitness functions, $F : \{0,1\}^N \mapsto \Re^+$, on binary strings, $x \in \{0,1\}^N$, where the genotype $x$ consists of $N$ loci, with two possible alleles at each locus $x_i$. (As such, it is an example of a random field model elaborated upon by Stadler and Happel (1995).) It has two basic components: a structure for gene interactions, and a way this structure is used to generate a fitness function for all the possible genotypes.
The gene interaction structure is created as follows: the genotype’s fitness is the average of $N$ fitness components $F_i$ contributed by each locus $i$. Each gene’s fitness component $F_i$ is determined by its own allele, $x_i$, and also the alleles at $K$ other epistatic loci (so $K$ must fall between zero and $N-1$). Thus, the fitness function is:
$$F(x) = \frac{1}{N} \sum_{i=1}^{N} F_i(x_i; x_{i_1}, \ldots, x_{i_K}) \qquad \text{(B2.7.7)}$$
where $\{i_1, \ldots, i_K\} \subset \{1, \ldots, i-1, i+1, \ldots, N\}$. These $K$ other loci could be chosen in any number of ways from the $N$ loci in the genotype. Kauffman investigated two possibilities: adjacent neighborhoods, where the $K$ genes nearest to locus $i$ on the chromosome are chosen; and random neighborhoods, where these $K$ other loci are chosen randomly on the chromosome. In the adjacent neighborhood model, the chromosome is taken to have periodic boundaries, so that the neighborhood wraps around the other end when it is near the terminus.
† The author thanks the Maui High Performance Computing Center for generously hosting him as a visiting researcher.
Epistasis is implemented through a ‘house of cards’ model of fitness effects (Kingman 1978, 1980): whenever an allele is changed at one locus, all of the fitness components with which the locus interacts are changed, without any correlation to their previous values. Thus, a mutation in any one of the genes affecting a particular fitness component is like pulling a card out of a house of cards—it tumbles down and must be rebuilt from scratch, with no information passed on from the previous value.
Kauffman implemented this by generating, for each fitness component, a table of $2^{K+1}$ numbers, one for each possible allelic combination of the $K+1$ loci determining that fitness component. These numbers are independently sampled from a uniform distribution on [0,1). (See section B2.7.2.4 for alternative implementations of this scheme.)
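The table-based construction can be sketched in Python as follows. This is a minimal illustration rather than Kauffman's own implementation: the function name, the seed handling, and in particular the rule used to pick the $K$ ‘nearest’ loci in the adjacent model (alternating right and left of locus $i$, with wrap-around) are assumptions made for this example.

```python
import random


def make_nk_landscape(n, k, neighborhood="adjacent", seed=0):
    """Build an NK fitness function on tuples of n bits (0 <= k <= n - 1)."""
    rng = random.Random(seed)
    neighbors = []
    for i in range(n):
        if neighborhood == "adjacent":
            # Assumed reading of "nearest": offsets +1, -1, +2, -2, ...
            # taken modulo n (periodic boundaries).
            offsets = [(d // 2 + 1) * (1 if d % 2 == 0 else -1) for d in range(k)]
            others = [(i + off) % n for off in offsets]
        else:
            # Random neighborhood: k distinct loci other than i.
            others = rng.sample([j for j in range(n) if j != i], k)
        neighbors.append([i] + others)

    # One table of 2**(k+1) independent uniform [0, 1) numbers per component.
    tables = [[rng.random() for _ in range(2 ** (k + 1))] for _ in range(n)]

    def fitness(x):
        total = 0.0
        for i in range(n):
            index = 0
            for locus in neighbors[i]:
                index = (index << 1) | x[locus]  # pack the K+1 alleles
            total += tables[i][index]
        return total / n  # average of the N fitness components

    return fitness, neighbors


fitness, neighbors = make_nk_landscape(n=8, k=2)
print(neighbors[0], fitness((0, 1, 1, 0, 1, 0, 0, 1)))
```

Because every allelic combination of the $K+1$ loci indexes an independent table entry, changing any one of those alleles selects an unrelated value for the component, which is the ‘house of cards’ behavior described above.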
The consequence of this independent resampling of fitness components is that the fitness function develops conflicting constraints: a mutation at one gene may improve its own fitness component but decrease the fitness component of another gene with which it interacts. Furthermore, if the allele at another interacting locus changes, an allele that had been optimal, given the alleles at the other loci, may no longer be optimal. Thus, epistatic interactions produce ‘frustration’ in trying to optimize all genes simultaneously, a concept borrowed from the field of spin glasses, of which NK landscapes are an example (Anderson 1985).
B2.7.2.2 Evolution on NK landscapes
The definition given by Kauffman for the NK landscape is simply a fitness function on a data structure. The genetic operators that manipulate these data structures in creating variants are not explicitly included in the NK landscape specification. However, nothing can be said about the evolutionary dynamics until the genetic operators are defined. A change in the genetic operator will effectively define a new adaptive landscape (Altenberg 1994a, 1995, Jones 1995a, b). The NK structure was defined with the ‘natural’ operators in mind: bit-flipping mutation, and recombination between strings. The magnitude of mutation and recombination rates also has a fundamental effect on the population dynamics.
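The dependence of the landscape on the operator can be made concrete with a small experiment. The Python sketch below counts local optima of one fixed fitness function under two different neighborhood definitions. For brevity it uses the six-bit trap function of section B2.7.1.3 rather than an NK landscape, and the second operator (one-bit flips plus complementing the whole string) is a hypothetical choice made purely for illustration.

```python
from itertools import product


def trap6(x, a=0.92, b=1.00, z=4):
    """Six-bit trap function from section B2.7.1.3, on a tuple of bits."""
    u = sum(x)
    return a / z * (z - u) if u <= z else b / (len(x) - z) * (u - z)


def bit_flip_neighbors(x):
    """All strings one bit-flip mutation away from x."""
    return [x[:i] + (1 - x[i],) + x[i + 1:] for i in range(len(x))]


def flip_or_complement_neighbors(x):
    """Hypothetical operator: one bit flip, or complement the whole string."""
    return bit_flip_neighbors(x) + [tuple(1 - b for b in x)]


def local_optima(f, n, neighbors):
    """Strings whose fitness is not exceeded by that of any neighbor."""
    return [x for x in product((0, 1), repeat=n)
            if all(f(x) >= f(y) for y in neighbors(x))]


print(local_optima(trap6, 6, bit_flip_neighbors))
print(local_optima(trap6, 6, flip_or_complement_neighbors))
```

Under bit-flipping mutation the function has two local optima, (000000) and (111111). Once the complement move is added, (000000) is no longer a local optimum, because its complement has higher fitness, even though the fitness function itself has not changed.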
One of the main differences between evolutionary algorithms and evolutionary genetics is relative