The Genetic Algorithms Manipulation Environment was conceived as the central part of the principal parallel GA project funded by the European Commission. The ESPRIT III Programming Environment for Applications of PArallel GENetic Algorithms (PAPAGENA) aimed at disseminating the use of parallel genetic algorithms in complex optimisation and modelling problems. The PAPAGENA project involved many partners including private companies, universities and research centres from Germany, England, Holland and France. Three applications have been developed in the project in different domains, namely finance, bio-informatics and economic modelling.
The financial application, developed by CAP Volmac and KIQ, provides predictive systems that help financial organisations optimise their decisions in fields such as credit scoring, insurance risk, or marketing expenditure [42]. Genetic programming is used to construct an algebraic formula that can regenerate, and hopefully predict, a series of training values. Populations of candidate formulas created by the GP are scored on the basis of how well they fit the training set and their ability to predict a validation set. This type of problem evolves highly
complex genetic structures represented as parse trees using GAME’s genetic-oriented abstractions. In this context, the algorithm must be able to manipulate operators as well as a large number of possible variables. The operators relate to the functions used to construct the algebraic formulae, whereas the variables relate to the possible data fields (e.g., number of credit cards, house price, etc.). Biological-like operators are then defined to directly manipulate the two possible data types (operators and variables). For example, the crossover operator swaps subtrees at nodes of equivalent data types. Similarly, two distinct forms of mutation are defined to be applied over variables and algebraic operators. Anticipating the system’s usage as a financial modelling tool, research concentrates on inducing algebraic formulae from sets of noisy, possibly incomplete, and even contradictory real-world data.
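The type-restricted crossover described above can be sketched in C++ as follows. This is a minimal illustration and not the financial application's code: the names `Kind`, `Node`, `var`, `op`, `collect` and `typed_crossover` are hypothetical, every operator is assumed to be binary, and GAME's own genetic-oriented abstractions are not used.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <random>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node kinds: the text distinguishes algebraic operators from
// variables (data fields). These names are illustrative, not GAME's own.
enum class Kind { Operator, Variable };

struct Node {
    Kind kind;
    std::string symbol;                           // e.g. "+", "*", "house_price"
    std::vector<std::unique_ptr<Node>> children;  // empty for variables
};

// Build a leaf node holding a variable name.
std::unique_ptr<Node> var(std::string name) {
    auto n = std::make_unique<Node>();
    n->kind = Kind::Variable;
    n->symbol = std::move(name);
    return n;
}

// Build a binary operator node (assumption: all operators take two operands).
std::unique_ptr<Node> op(std::string sym, std::unique_ptr<Node> lhs,
                         std::unique_ptr<Node> rhs) {
    auto n = std::make_unique<Node>();
    n->kind = Kind::Operator;
    n->symbol = std::move(sym);
    n->children.push_back(std::move(lhs));
    n->children.push_back(std::move(rhs));
    return n;
}

// Render a tree as a fully parenthesised infix formula.
std::string to_string(const Node& n) {
    if (n.kind == Kind::Variable) return n.symbol;
    assert(n.children.size() == 2);
    return "(" + to_string(*n.children[0]) + " " + n.symbol + " " +
           to_string(*n.children[1]) + ")";
}

// Collect the owning slots of every node of the requested kind.
void collect(std::unique_ptr<Node>& slot, Kind kind,
             std::vector<std::unique_ptr<Node>*>& out) {
    if (slot->kind == kind) out.push_back(&slot);
    for (auto& child : slot->children) collect(child, kind, out);
}

// Swap one randomly chosen subtree of `a` with one of `b`, both rooted at
// nodes of the same kind. Returns false if either parent has no such node.
template <class Rng>
bool typed_crossover(std::unique_ptr<Node>& a, std::unique_ptr<Node>& b,
                     Kind kind, Rng& rng) {
    std::vector<std::unique_ptr<Node>*> in_a, in_b;
    collect(a, kind, in_a);
    collect(b, kind, in_b);
    if (in_a.empty() || in_b.empty()) return false;
    std::uniform_int_distribution<std::size_t> pick_a(0, in_a.size() - 1);
    std::uniform_int_distribution<std::size_t> pick_b(0, in_b.size() - 1);
    std::swap(*in_a[pick_a(rng)], *in_b[pick_b(rng)]);
    return true;
}
```

Because crossover points are drawn only from nodes of the requested kind, a variable is never exchanged for an operator subtree, which is the restriction the text describes. The two forms of mutation could follow the same pattern, each selecting only nodes of its own kind.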
In the bio-informatics domain, an application has been developed at Brainware GmbH to predict stable protein conformations. This problem has the potential to open up a vast new world of drug design and medical treatments. Parallel GAs are being applied to search for energetically and structurally favourable protein conformations. Parallelism is fundamental to this application due to the amount of data to be processed for each GA generation. It is exploited in three principal modules, namely transformation, evaluation and recombination [74]. The transformation module converts proteins’ descriptions between polar and Cartesian co-ordinates. Cartesian co-ordinates are commonly used to describe the spatial organisation of protein molecules. However, genetic manipulations are easier to implement using polar co-ordinates. The transformation module was therefore introduced in the application to accommodate both requirements. Each protein in the GA population is described using polar co-ordinates, which are then converted to Cartesian co-ordinates before undergoing fitness evaluation. The evaluation module contains the objective function that computes the stability of protein conformations. Protein evaluations are extremely time consuming, requiring the use of a large database of molecules. The database provides the important characteristics of molecules that are used by the objective function to work out the stability of a particular protein conformation. Finally, the recombination module implements the genetic algorithm, which creates new protein conformations by recombining and modifying molecules and spatial organisations of existing proteins. The degree of parallelism required in this application is achieved by implementing these modules as GAME Components and distributing many instances of them among several processors.
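As a rough illustration of what the transformation module does, the sketch below converts between a polar and a Cartesian description of a chain. The exact polar encoding used by the application is not given here, so the sketch assumes the simplest one: each gene is a displacement from the previous point in spherical co-ordinates (length, inclination, azimuth). The types `Polar` and `Cartesian` and all function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Assumed encoding: r = length, theta = inclination from the +z axis,
// phi = azimuth from the +x axis, both angles in radians.
struct Polar { double r, theta, phi; };
struct Cartesian { double x, y, z; };

// Spherical-to-Cartesian conversion of a single displacement.
Cartesian to_cartesian(const Polar& p) {
    assert(p.r >= 0.0);  // precondition: lengths are non-negative
    return { p.r * std::sin(p.theta) * std::cos(p.phi),
             p.r * std::sin(p.theta) * std::sin(p.phi),
             p.r * std::cos(p.theta) };
}

// Inverse conversion; the zero vector maps to all-zero polar values.
Polar to_polar(const Cartesian& c) {
    const double r = std::sqrt(c.x * c.x + c.y * c.y + c.z * c.z);
    if (r == 0.0) return {0.0, 0.0, 0.0};
    return { r, std::acos(c.z / r), std::atan2(c.y, c.x) };
}

// Turn a chain of polar displacements into absolute positions, starting at
// the origin. The result has one more point than there are genes.
std::vector<Cartesian> chain_to_cartesian(const std::vector<Polar>& genes) {
    std::vector<Cartesian> positions;
    positions.reserve(genes.size() + 1);
    Cartesian current{0.0, 0.0, 0.0};
    positions.push_back(current);
    for (const Polar& gene : genes) {
        const Cartesian step = to_cartesian(gene);
        current = {current.x + step.x, current.y + step.y, current.z + step.z};
        positions.push_back(current);
    }
    return positions;
}
```

Under this encoding, perturbing one angle is a natural mutation on the polar form, while the evaluation step receives the absolute positions it needs. That is one plausible reading of why the application keeps both descriptions.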
The economic modelling application, also developed by Brainware in collaboration with IfP, is targeted at simulating a variety of possible scenarios associated with the current economic changes within Eastern Europe [89]. In this case, GAs are used to mimic the behaviour of complex multi-agent systems, subject to a variety of economic and physical constraints. In essence, an artificial economy is created within the computer, modelled in terms of traditional economic theory, evolution, and principle-based engineering [88]. The application uses three GA components running in parallel to evolve three separate “models” of economic agents: the Labour-Market, the Enterprise and the Locational models. Each model is represented by distinct genetic structures using GAME’s genetic-oriented abstractions. The results of each model’s generation are analysed by the Global Economic module, which tries to find the best combination of requirements and features of each model.
This application is expected to help national and local governments by providing a knowledge basis to assist in the formulation and implementation of effective strategies in many sectors, e.g., investment, industrial location, logistics, etc. It has already been adopted by the Brandenburg State in Germany, as a means for modelling and understanding local labour movements, which have risen considerably since the German unification.
Other PAPAGENA partners were TELMAT Informatique from France and the German National Research Center for Computer Science (GMD). TELMAT was responsible for porting GAME onto their transputer-based parallel machines, whereas GMD provided the theoretical foundations and research support for the development of parallel genetic algorithm applications.
4.5. Summary
This chapter presented an overview of the GAME programming environment. It introduced the system’s genetic-oriented abstractions for the representation of diverse problems and a programming model that helps with the creation of portable sequential and parallel applications. GAME’s modular architecture was described, followed by a brief presentation of the system’s five main modules. The design and implementation of three of GAME’s modules - the Virtual Machine, the Parallel Execution Module and the Service and Genetic libraries - that constitute the main subject of the research reported in this thesis, are described in more detail in the next chapters. Finally, a short description of the PAPAGENA project was presented to highlight the importance of the GAME system in the context of a European project aimed at solving real-world problems.
Chapter 5
The Genetic-Oriented Representation
and the Virtual Machine
This chapter reports on the design and implementation of GAME’s Virtual Machine module. It starts by introducing the abstractions and objects that grant GAME the ability to “genetically” represent a broad range of problems. The Virtual Machine, its modules - the Population Manager, the Fitness Evaluator and the Parallel Support - and the VM Application Program Interface are then described.
5.1. Overview
The description of GAME’s genetic-oriented abstractions for problem representation and the Virtual Machine module presented in this chapter focuses more on their design than on their implementation aspects. The objective is to give sufficient information about their design to allow other implementations, possibly even in a different programming language. Nevertheless, C++ class declarations are provided, along with the description of their most important member functions and data.
This chapter starts by explaining the importance of a problem’s representation and how GAME facilitates its manipulation via its genetic-oriented abstractions. The following sections show how data of different types are stored, and how the representation structure is organised. One of the sections discusses the problem of addressing large and deep storage units in the representation and presents the solution adopted.
The modular design of the Virtual Machine is then presented. The VM comprises three modules: the Population Manager, the Fitness Evaluator and the Parallel Support. The Population Manager (PM) is responsible for the execution of genetic manipulation commands; the Fitness Evaluator (FE) embeds the problem-dependent objective function and performs related computation (total, average, etc.); and the Parallel Support module controls the execution of many PM and FE instances on parallel platforms. The VM Application Program Interface, its commands, and the communication objects (VmMsg) that transport them are also described.
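The division of labour between the Population Manager and the Fitness Evaluator can be pictured with the following C++ sketch. It is not GAME's declaration of these classes: the command set, the layout of `VmMsg`, the `Individual` type and every member function shown are assumptions made for illustration, and the Parallel Support module is left out entirely.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <utility>
#include <vector>

using Individual = std::vector<double>;  // hypothetical genome type

// Hypothetical command set and message layout, invented for this sketch.
enum class Command { Insert, Evaluate };

struct VmMsg {
    Command command;
    Individual payload;  // only used by Command::Insert
};

// Fitness Evaluator: embeds the problem-dependent objective function and
// derives related statistics (total, average).
class FitnessEvaluator {
public:
    explicit FitnessEvaluator(std::function<double(const Individual&)> objective)
        : objective_(std::move(objective)) {}

    void evaluate(const std::vector<Individual>& population) {
        scores_.clear();
        for (const auto& ind : population) scores_.push_back(objective_(ind));
    }
    double total() const {
        return std::accumulate(scores_.begin(), scores_.end(), 0.0);
    }
    double average() const {
        assert(!scores_.empty());  // precondition: evaluate() saw a population
        return total() / static_cast<double>(scores_.size());
    }

private:
    std::function<double(const Individual&)> objective_;
    std::vector<double> scores_;
};

// Population Manager: executes genetic manipulation commands.
class PopulationManager {
public:
    void insert(Individual ind) { population_.push_back(std::move(ind)); }
    const std::vector<Individual>& population() const { return population_; }

private:
    std::vector<Individual> population_;
};

// Minimal virtual machine: routes each message to the responsible module.
class VirtualMachine {
public:
    explicit VirtualMachine(FitnessEvaluator fe) : fe_(std::move(fe)) {}

    void handle(const VmMsg& msg) {
        switch (msg.command) {
            case Command::Insert:   pm_.insert(msg.payload); break;
            case Command::Evaluate: fe_.evaluate(pm_.population()); break;
        }
    }
    const FitnessEvaluator& evaluator() const { return fe_; }

private:
    PopulationManager pm_;
    FitnessEvaluator fe_;
};
```

In this sketch only the Fitness Evaluator knows the objective function, so changing the problem means replacing one callable. Driving both modules through message objects is what would let a parallel layer forward the same commands to many PM and FE instances.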