To evaluate the recombination statement, the Santa Fe Ant Trail problem was chosen [149]. This problem was previously described in Section 4.1.2. For convenience, the function and terminal sets were
136 6. PARALLEL DOMAIN-SPECIFIC OPTIMISATION IN ABM
mol
    defvar m = 0.001
    defvar av vel = {0.0,0.0,0.0}
    defvar count = 1
    queryneighbours
        count = count + 1
    done
    defvar c = compute closestingroup
    defvar origin = {0,0,0}

    select permutation to maximise(count)
        if (distanceto c) < 3 then
            move awayfrom c
        else
            move towards c
        end
        move towards c
        move towards origin
    end

    vel = 0.995 * vel
    pos = pos + vel
end
LISTING 6.5: A spatial ABM program written in the custom DSL, with selective uncertainty.
Parameter Value
Time Steps per Generation 260
Number of Generations 10
Agents per candidate model 64
Number of candidate models 32
Max magnitude of initial velocities 0.16
Max turn velocity 0.1
P(crossover) 0.8
P(mutation) 0.01
Cube dimensions 30x30x30 units
TABLE 6.1: Additional parameters used in the simulation and optimiser.
move towards origin
vel = vel + 0.5 * av vel
move towards c
move towards c
6.4. EXPERIMENTS AND CONVERGENCE RESULTS 137
FIGURE 6.3: A screenshot of the optimiser in the evaluation stage, gathering fitness scores from 16 separate candidate simulations, using the program specified in Listing 6.5 (with agents superimposed). All candidate models are superimposed. The cube in the centre of the image is where all individual agents are initialised.
and T = {Move Forward (M), Turn Right (R), Turn Left (L)}. It is not as straightforward to form a solution to this problem that achieves the full score.
if mytype == 1 then
    defvar fitness = 0
    if timestep == 200 then
        fitness = count food
    end
    select recombination to minimise(fitness)
        if food ahead == 1 then
            move and consume
        else
            turn left
            turn right
        end
    end
end
LISTING 6.7: The initial code given for the Santa Fe Ant Trail model induction attempt.
The MOL code excerpt provided for this is given in Listing 6.7. The only code not shown is a small piece that ensures only one terminal is processed during execution; this prevents the algorithm from expanding rapidly and simply iterating over the entire lattice.
It is perhaps immediately clear that this initial code would not solve the problem, but it provides all but one of the terminals and nonterminals needed to construct a working solution. Two other nonterminals are provided separately within the optimiser. These are the ProgN2 nonterminal and the ProgN3 nonterminal. These two functions simply execute all their arguments sequentially.
Candidates in these populations are represented using the Karva language of Ferreira [64]. Expressions in this language (frequently abbreviated to K-expressions) are often written in the form Q-*+/-abca/a-bbadac. This string encodes a tree, where the symbols Q, +, -, *, / are 2-arity nonterminals and a, b, c, d are terminal symbols. The string is a “genotype”, since not all of its symbols are included in the final “phenotype” (i.e. the executable program). It is interpreted by placing the first symbol at the root and successively adding arguments one by one, from left to right, level by level, until the tree is complete.
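The level-by-level interpretation described above can be sketched in a few lines of Python. This is a minimal illustration and not the optimiser's own decoder: the function names `decode` and `to_sexpr` are hypothetical, and the arities are assumptions based on the ant-trail symbols used in this section (M, L, R as terminals, I and P taking two arguments, Q taking three).

```python
from collections import deque

# Assumed arities for the ant-trail symbols: M, L, R are terminals,
# I (if-food-ahead) and P (ProgN2) take two arguments, Q (ProgN3) takes three.
ARITY = {"M": 0, "L": 0, "R": 0, "I": 2, "P": 2, "Q": 3}


def decode(kexpr, arity=ARITY):
    """Decode a K-expression breadth-first.

    Returns (tree, used): the tree as nested [symbol, children] lists and
    the number of leading symbols that were consumed. Any symbols after
    position `used` are not part of the phenotype.
    """
    root = [kexpr[0], []]
    queue = deque([root])
    pos = 1
    while queue:
        node = queue.popleft()
        # Fill the arguments of the oldest incomplete node, left to right.
        for _ in range(arity[node[0]]):
            child = [kexpr[pos], []]
            pos += 1
            node[1].append(child)
            queue.append(child)
    return root, pos


def to_sexpr(node):
    """Render a decoded tree in prefix form, e.g. I(M,L)."""
    symbol, children = node
    if not children:
        return symbol
    return symbol + "(" + ",".join(to_sexpr(c) for c in children) + ")"
```

For example, `decode("IMLRRR")` consumes only the first three symbols and yields the tree `I(M,L)`; the trailing `RRR` plays no part in the executable program.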
The initial parsing of the code given in Listing 6.7 would occur as normal, but a separate pass constructs the set of nonterminal and terminal symbols from the code given in the select block. The if statement within the block (including its condition) is stored as a 2-arity nonterminal and coded as I0. The I indicates that it is an if statement, and the zero indicates that it is the first if statement encountered. Other if statements with different conditions are stored as I1, I2 and so on. The terminal move and consume is stored as N0 (an anonymous Lua expression), and the other terminals are stored as N1 and N2. For these tests, the statements given in the construct are not used, other than to construct a symbol database. Future work includes decomposing this code in the same fashion and inserting it into the first generation. In the case of this simulation, the following unambiguous symbols will be
used to ease readability: M (move and consume), P (ProgN2), Q (ProgN3), L (turn left), R (turn right), I (IfFoodAhead). The fitness is computed at timestep 200, using the code just before the select block.
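The symbol-collection pass described above could look roughly like the following Python sketch. The tuple representation of statements and the function name `build_symbol_table` are hypothetical, chosen only for illustration; the actual pass operates on the parsed MOL code. Each distinct if condition receives a code I0, I1, ... with arity 2, and every other statement receives a code N0, N1, ... with arity 0.

```python
def build_symbol_table(statements):
    """Collect symbols from a (hypothetical) parsed select block.

    Statements are tuples: ("if", condition, then_body, else_body) or
    ("stmt", source_text). Returns a dict mapping each distinct
    statement to a (code, arity) pair.
    """
    table = {}
    counters = {"I": 0, "N": 0}

    def register(key, prefix, arity):
        # Only the first occurrence of a statement gets a new code.
        if key not in table:
            table[key] = (prefix + str(counters[prefix]), arity)
            counters[prefix] += 1

    def visit(stmt):
        if stmt[0] == "if":
            _, condition, then_body, else_body = stmt
            register(("if", condition), "I", 2)
            for inner in then_body + else_body:
                visit(inner)
        else:
            register(("stmt", stmt[1]), "N", 0)

    for stmt in statements:
        visit(stmt)
    return table


# The select block of Listing 6.7 in the assumed tuple form.
select_block = [
    ("if", "food ahead == 1",
     [("stmt", "move and consume")],
     [("stmt", "turn left"), ("stmt", "turn right")]),
]
symbols = build_symbol_table(select_block)
```

With this input the if statement is coded as I0, and move and consume, turn left and turn right are coded as N0, N1 and N2, matching the naming in the text.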
Together, the constructed function and terminal sets are used to construct random k-expressions, enable simple crossover and mutation operators, and construct typed ASTs for inclusion into typed candidate program trees. The most important purpose of converting code trees into and out of k-expressions is to allow crossover and mutation to take place on the k-expressions.
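To illustrate why operating on k-expressions keeps crossover and mutation simple, the following Python sketch applies both operators directly to fixed-length strings. It is an assumption-laden illustration, not the optimiser's implementation: it follows Ferreira's head/tail convention, in which the tail holds only terminals and has length h(n - 1) + 1 for head length h and maximum arity n, and it uses one-point crossover and point mutation. All function names are hypothetical.

```python
import random

FUNCTIONS = "IPQ"   # nonterminals used in this section (assumed arities 2, 2, 3)
TERMINALS = "MLR"   # terminals used in this section
MAX_ARITY = 3       # arity of ProgN3, assumed from its name


def tail_length(head):
    """Tail length in Ferreira's scheme: h * (n - 1) + 1."""
    return head * (MAX_ARITY - 1) + 1


def random_gene(rng, head):
    """Random gene: any symbol in the head, terminals only in the tail."""
    h = [rng.choice(FUNCTIONS + TERMINALS) for _ in range(head)]
    t = [rng.choice(TERMINALS) for _ in range(tail_length(head))]
    return "".join(h + t)


def one_point_crossover(rng, a, b):
    """Swap the suffixes of two equal-length genes at a random cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def mutate(rng, gene, head, p):
    """Point mutation that never places a nonterminal in the tail."""
    out = []
    for i, symbol in enumerate(gene):
        if rng.random() < p:
            pool = FUNCTIONS + TERMINALS if i < head else TERMINALS
            symbol = rng.choice(pool)
        out.append(symbol)
    return "".join(out)


rng = random.Random(0)
parent_a = random_gene(rng, head=4)
parent_b = random_gene(rng, head=4)
child_a, child_b = one_point_crossover(rng, parent_a, parent_b)
child_a = mutate(rng, child_a, head=4, p=0.1)
```

Because the tail contains only terminals and positions are preserved by both operators, every child still decodes to a complete tree, so no repair step is needed after recombination.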
In order to implement the terminals turn left and turn right, it is necessary to store a direction state variable. This was embedded into the cell state variable (32-bit integer stored in the lattice). Bitmasking functions written in C and used within a MOL extension allowed the agents to retrieve and update these state variables. Storing this kind of information in this manner is not absolutely necessary, but it is done to enable simple compatibility with the parallel code generator at a later stage.
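A direction field packed into a 32-bit cell state can be sketched as follows. The original bitmasking functions were written in C; this Python version is only an illustration, and the bit layout (two low-order bits for the direction) and the encoding 0 = north, 1 = east, 2 = south, 3 = west are assumptions, as are all function names.

```python
DIR_SHIFT = 0                 # assumed position of the direction field
DIR_MASK = 0b11 << DIR_SHIFT  # two bits are enough for four headings
MASK32 = 0xFFFFFFFF           # keep results within a 32-bit cell state


def get_direction(cell):
    """Extract the heading (0..3) from the cell state."""
    return (cell & DIR_MASK) >> DIR_SHIFT


def set_direction(cell, direction):
    """Return the cell state with the heading replaced, other bits untouched."""
    cleared = cell & ~DIR_MASK
    return (cleared | ((direction & 0b11) << DIR_SHIFT)) & MASK32


def turn_left(cell):
    """Rotate the heading one step anticlockwise."""
    return set_direction(cell, (get_direction(cell) - 1) % 4)


def turn_right(cell):
    """Rotate the heading one step clockwise."""
    return set_direction(cell, (get_direction(cell) + 1) % 4)
```

Keeping the heading inside the existing integer means the lattice remains a flat array of 32-bit values, which is the property the text relies on for compatibility with a parallel code generator.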
Additional parameters chosen for this model are shown in Table 6.2.
Parameter Value
Total Generations 10000
Timesteps per Generation 200
Population Count 200
Grid Size 32 by 32
P(mutate) 0.1
P(crossover) 0.8
Program Head Length 17
Total Genotype Length 17∗2 + 1
TABLE 6.2: Parameters used for optimising model structure for the original version of the Santa Fe Ant Trail problem.
Figure 6.6 shows 1000 generations of a sample model optimisation run. In this instance, lower scores indicate fewer food tokens left on the lattice after execution of 200 time steps. The target is to reach zero. As seen in the plot, mean fitness decreases steadily, and the minimum fitness in the population drops significantly at irregular intervals. Finally, at around generation 950, the best possible fitness is achieved. The candidate found is discussed below.
Computing time was approximately two and a half hours, and a program was generated which obtained the best possible score. The program tree is shown in Figure 6.5, and the corresponding k-expression was:
QRPLQMQIPIILLRIIPRLRLRRRRL
This expression does not show intron symbols. It is interesting to note that this program needed 26 symbols, whereas the hand-written solution (IMPLPIPMRRPLPRPIMML) required 19.
At this point it is wise to consider the problem of performance. The Santa Fe Ant Trail problem is of inherently low complexity, since there is only one active agent on the lattice that must execute code. In addition, it is completely deterministic, requiring no averaging to obtain a statistically significant score. Nevertheless, two performance characteristics are considered: the first is the time taken to compile the programs, and the second is the time taken to execute them all for one timestep.
Program tree, one level per line from the root:
PROGN3
RIGHT PROGN2 LEFT
PROGN3 MOVE
PROGN3 IF-FOOD-AHEAD PROGN2
IF-FOOD-AHEAD IF-FOOD-AHEAD LEFT LEFT RIGHT IF-FOOD-AHEAD IF-FOOD-AHEAD
PROGN2 RIGHT LEFT RIGHT LEFT RIGHT RIGHT RIGHT
RIGHT LEFT
FIGURE 6.5: The best candidate generated by the MOL recombination optimiser (executed using the code shown in Listing 6.7). The score this candidate achieved was zero, which is the best possible score.
FIGURE 6.6: An optimisation run showing fitness by generation of the Santa Fe Ant Trail problem using the MOL code shown in Listing 6.7. As seen in this plot, a steady decrease in average fitness is followed by frequent drops in minimum fitness until the best possible score (0) is achieved.