
In order to meet the first requirement discussed above, ie that of providing OR parallel execution, a process is defined as the flow of computation involved in rewriting an expression; it exists until an alternative branch of the expression tree becomes rewritable. At this point the process spawns one offspring process for each alternative node and terminates. If no alternative nodes are encountered during rewriting, the process terminates when it has reduced the expression to its minimum or fixed point (as with the sequential version). Processes can thus be spawning or non spawning, the latter corresponding to the leaves of the solution tree. Where no OR nodes exist, the whole query evaluation takes place in one process.

The spawned processes become candidates for parallel execution: whether they are actually evaluated simultaneously will depend on the architectural considerations and the available computational resources but the model provides for the possibility.

The method of process spawning involves message passing between a parent process and its offspring. Essentially when a process encounters an OR node it creates a message structure for each alternative containing the information required to establish the offspring process. These messages are used to trigger the creation of new processes: in a "real" system some or all of these messages would be transmitted across the communication medium to other processing elements to inaugurate the execution of the new processes. Because of the manner in which processes are defined in the system, message passing is one way, ie parent to children, and there is no reverse communication. The other aspect of communication to note at this stage is that it follows a one to many pattern: one parent needs to communicate with a minimum of two offspring at the same time. Fig.5.2 shows this in diagrammatic form for the example given above.

Fig. 5.2 - Process Representation

[Diagram: a parent process, labelled a(x), sends one message to each of three offspring processes: b(x) and e(x); c(x) and e(x); d(x) and e(x).]
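The spawning step described above can be illustrated with a minimal sketch. It is not the PLL implementation: the class names (Pred, And, Or, SpawnMessage) and the function spawn are hypothetical, and it is assumed that the parent's expression has already been rewritten to the form (b(x) or c(x) or d(x)) and e(x) suggested by Fig.5.2.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass(frozen=True)
class Pred:
    name: str  # e.g. "b" stands for b(x)


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Or:
    alternatives: Tuple["Expr", ...]


Expr = Union[Pred, And, Or]


@dataclass(frozen=True)
class SpawnMessage:
    """Everything an offspring needs to start; nothing is ever sent back."""
    expression: Expr
    bindings: Tuple[Tuple[str, str], ...]  # copied in full, not shared


def spawn(expr: Expr, bindings: Dict[str, str]) -> List[SpawnMessage]:
    """Return one message per alternative, or [] for a non spawning process."""
    env = tuple(sorted(bindings.items()))
    if isinstance(expr, Or):
        return [SpawnMessage(alt, env) for alt in expr.alternatives]
    if isinstance(expr, And) and isinstance(expr.left, Or):
        # Each offspring receives one alternative conjoined with the common part.
        return [SpawnMessage(And(alt, expr.right), env)
                for alt in expr.left.alternatives]
    return []  # no OR node exposed: the process runs to its fixed point


query = And(Or((Pred("b"), Pred("c"), Pred("d"))), Pred("e"))
messages = spawn(query, {"x": "1"})
```

Here the parent produces three messages and then terminates; communication is one way and one to many, as in the model.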

Because of the requirement that processes are fully independent of each other it follows that the messages which inaugurate them must contain all the necessary information for them to start execution. In Chapter 2 the concept of "environment" for a process has been described: for an OR process in Prolog this consists of the current goal list and binding values, in the PLL an expression tree and binding values. This is the point at which models designed specifically for shared memory machines have a considerable advantage as the transfer of the environment from a parent process to its offspring can be achieved by using shared memory rather than a message containing the necessary information. As the aim in defining intercommunicating parallel execution systems usually favours forcing the computation/communication balance in the direction of computation, the question of representing the environment in a message passing system is of prime importance. In order to avoid large communication overheads it is necessary to condense the environmental information into an optimised message format.

As discussed in Chapter 3.1.3.3 the problem with non shared memory systems is that data on the environment which has to be made common to two processes must either be copied or recomputed. The approach taken in this project is that shared memory machines are too limiting for systems which display large potential for parallel execution, such as OR parallel Datalog programs. Hence the computational overheads of copying and/or recomputing the parental environment have to be accepted but reduced as much as possible.

In the PLL system the environment of a process can be regarded as comprising two parts: the expression tree on the evaluation stack, and the binding values in the variable area of the stack as indicated in the binding list. Because the model is designed for a non shared memory system each process operates within its own independent environment. At any stage during query evaluation these two aspects represent the state of the computation and thus information on them must be passed to offspring processes at the time of spawning. Two points in connection with the expression tree need consideration at this stage: first that the expression tree is not in a suitable format to be passed between processes, and a mechanism for representing it in a linear form must be devised. The linear message will need decoding by the recipient process in order for the expression tree to be re-created. This process is analogous to parsing a query. Secondly a method to keep the part of the message which describes the expression tree as small as possible must be found.
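The need for a linear form of the expression tree can be illustrated by a small sketch. The prefix token encoding used here is an assumption made for illustration only; the message format actually adopted is discussed in Chapter 6. The functions encode and decode are hypothetical names, and trees are represented as nested tuples.

```python
def encode(tree):
    """Flatten a tree into a prefix-ordered token list.

    A tree is either ("pred", name) or (op, left, right) with op "and"/"or".
    """
    if tree[0] == "pred":
        return ["pred", tree[1]]
    op, left, right = tree
    return [op] + encode(left) + encode(right)


def decode(tokens):
    """Rebuild the tree from its token list (a small recursive-descent parse)."""
    stream = iter(tokens)

    def parse():
        tag = next(stream)
        if tag == "pred":
            return ("pred", next(stream))
        left = parse()
        right = parse()
        return (tag, left, right)

    tree = parse()
    if next(stream, None) is not None:
        raise ValueError("trailing tokens after a complete expression")
    return tree


tree = ("and", ("or", ("pred", "r"), ("pred", "t")), ("pred", "s"))
message = encode(tree)
```

The recipient recovers the tree with decode(message), which mirrors the observation that decoding is analogous to parsing a query.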

Two methods of cutting down these overheads have been simultaneously employed; one involves the introduction of an optimised message packet which in turn leads to a degree of recomputation. As this is tied to the architectural considerations it will be discussed in Chapter 6.

The second method is to assume that there is a copy of the interpreter plus user defined rules globally available on a read only basis. The most likely implementation of this is to hold a copy of the rewrite interpreter in each processing element. The interpreter consists of the meta or system defined rewrite rules plus the user defined rules which are stored in the rule network as described in Chapter 4.6. At this stage no distinction is made between user defined structures representing base predicates or relations and those which define higher level user "rules", the assumption being that both types of information are immediately available. In a realistically large system it is a reasonable assumption that most of the base predicates would be stored on disk. The memory storage implications for this are discussed in the next chapter. The reason for assuming that each process has an available copy of the interpreter to refer to is that it enables part of the process environment to be described by pointers into the rule network, thus making the process representation more compact. The problem of bindings however still remains. In a model that does not encompass sharing evaluation memory space, binding values have to be included in full in the process creation message. The amount of data that this will involve obviously varies considerably. In systems where heavy reliance is placed on large structured terms it will make for unwieldy communications.
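The effect of a replicated, read only rule network on message size can be sketched as follows. This is an illustration under stated assumptions, not the PLL message format: RULES stands in for the rule network held in every processing element, integer indices stand in for pointers into it, and JSON is used purely as a convenient serialisation.

```python
import json

# Hypothetical stand-in for the rule network replicated in each processing element.
RULES = ("b(x)", "c(x)", "d(x)", "e(x)")


def make_message(goal_ids, bindings):
    """Describe the expression by indices into RULES; bindings travel in full."""
    return json.dumps({"goals": list(goal_ids), "bindings": bindings},
                      sort_keys=True)


def open_message(message):
    """Resolve the indices against the local copy of RULES."""
    data = json.loads(message)
    goals = [RULES[i] for i in data["goals"]]
    return goals, data["bindings"]


small = make_message([0, 3], {"x": "anne"})
large = make_message([0, 3], {"x": "f(" + "g(" * 50 + "anne" + ")" * 51})
goals, bindings = open_message(small)
```

The goal part of the message stays the same size whatever the rules contain, whereas the binding part grows with the size of the bound terms, which is the difficulty noted for large structured terms.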

The process based nature of the system is shown diagrammatically in Fig.5.2; this represents processes at the computational model level. The next step in the design is to move to the second level, ie the implementation of the parallel interpreter, and look at the manner in which the abstract concept of independence of processes can be incorporated into the rewrite rule system. This is discussed in the next section. The final level, ie the mapping of the language system onto a parallel architecture and a simulation of its performance, is the subject of Chapters 6 and 7.

Two important aspects concerning the model can be seen in Fig.5.2. First the independence of processes means that there is no concept of ordering of process execution. As far as the theoretical system is concerned the processes can be evaluated in any order without affecting the validity of the final outcome. In a theoretical parallel system where processes are evaluated as soon as they are created, the effect is comparable with a breadth first search of the solution tree. In a "real" system computational resources are unlikely to be adequate to provide for simultaneous execution of all available processes, and some form of scheduling will be involved. The independence of processes means that different scheduling schemes can be tried out without any worries about the correctness of the system.
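The claim that scheduling does not influence correctness can be illustrated with a toy simulation. The process tree, the solution sets and the function run are all hypothetical; a single queue of pending processes is assumed, served either first in first out (which approximates the breadth first behaviour mentioned above) or last in first out.

```python
from collections import deque

# Hypothetical solution tree: spawning processes list their offspring,
# non spawning processes (leaves) have an empty list.
TREE = {"root": ["p1", "p2"], "p1": ["p3", "p4"], "p2": [], "p3": [], "p4": []}
# Hypothetical answers produced by the leaves; p4 fails and yields nothing.
SOLUTIONS = {"p2": {"x=1"}, "p3": {"x=2"}, "p4": set()}


def run(schedule):
    """Evaluate all processes under the given policy ("fifo" or "lifo")."""
    pending = deque(["root"])
    results, order = set(), []
    while pending:
        proc = pending.popleft() if schedule == "fifo" else pending.pop()
        order.append(proc)
        offspring = TREE[proc]
        if offspring:
            pending.extend(offspring)   # parent spawns and terminates
        else:
            results |= SOLUTIONS[proc]  # leaf contributes its answers
    return results, order


fifo_results, fifo_order = run("fifo")
lifo_results, lifo_order = run("lifo")
```

The two policies visit the processes in different orders but collect the same set of answers, because no process depends on the outcome of another.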

The second feature of the model that the diagram shows is the replicated evaluation of the mutually conjoined expression, ie e(x). This would appear to produce a significant overhead in the amount of computation taking place in the system, although if processes were all being evaluated simultaneously the overall time to produce the query response would not be lengthened. This would seem to indicate that the first approach to OR parallelism as described in Chapter 5.3.1 could produce a more efficient system. In fact the amount of repeated or redundant computation is often less than initially expected. Because of the manner in which AND node rewriting takes place, in the situation where b(x), c(x) or d(x) produce FALSE results, no evaluation of e(x) is attempted. If however b(x), c(x) or d(x) are themselves rewritten to other expressions and produce individual and different bindings, these need to be involved in the rewriting of e(x) from the start. Thus in these two situations there is no unnecessary repeated evaluation of e(x). The occasion where redundant computation does exist is when none of b(x), c(x) or d(x) is further reducible: in that instance e(x) will be evaluated three times under the same environmental conditions.
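The three situations can be made concrete with a small counting sketch. It is a simplification introduced for illustration: each alternative is modelled only by its outcome (None for FALSE, otherwise the bindings it produces), and an evaluation of e(x) is counted as redundant when it repeats an earlier evaluation under identical bindings. The function names are hypothetical.

```python
def run_process(alternative_outcome, e_calls):
    """Model one offspring: rewrite the alternative, then e(x) if still needed."""
    if alternative_outcome is None:
        return  # AND node rewrites to FALSE; e(x) is never attempted
    e_calls.append(tuple(sorted(alternative_outcome.items())))


def redundant_evaluations(outcomes):
    """Count evaluations of e(x) that repeat an identical environment."""
    e_calls = []
    for outcome in outcomes:
        run_process(outcome, e_calls)
    return len(e_calls) - len(set(e_calls))


all_false = redundant_evaluations([None, None, None])
different = redundant_evaluations([{"x": 1}, {"x": 2}, {"x": 3}])
irreducible = redundant_evaluations([{}, {}, {}])
```

Only the last case, where no alternative adds bindings, repeats work: e(x) is evaluated three times in the same environment, so two of those evaluations are redundant.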

Because of these considerations it has been decided that when the new expression tree is set up in the new process the expression representing the alternatives is placed on the left hand arm of the AND node, thus ensuring that the interpreter will attempt to rewrite it first. In the following two cases the spawned processes will have the same expression trees to work on:

s(x) and (r(x) or t(x))

and

(r(x) or t(x)) and s(x)

will both result in these two processes:

r(x) and s(x)

t(x) and s(x)

Because of the manner in which the conjunction rewrite rule works this will ensure that the alternative subexpressions, ie r(x) and t(x) are evaluated first in the two spawned processes (see Chapter 5.4.4.5).
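The placement rule can be sketched as follows. The nested tuple representation and the function offspring_trees are assumptions made for illustration and are not taken from the interpreter itself.

```python
def is_or(node):
    """True for an OR node written as ("or", alt1, alt2, ...)."""
    return isinstance(node, tuple) and node[0] == "or"


def offspring_trees(expr):
    """Build the offspring expression trees for a conjunction ("and", l, r).

    Whichever arm holds the alternatives, each alternative is placed on the
    left hand arm of the new AND node so that it is rewritten first.
    """
    op, left, right = expr
    if op != "and":
        raise ValueError("this sketch only handles a top-level conjunction")
    if is_or(left):
        alternatives, common = left[1:], right
    elif is_or(right):
        alternatives, common = right[1:], left
    else:
        return [expr]  # nothing to spawn
    return [("and", alt, common) for alt in alternatives]


first = offspring_trees(("and", "s(x)", ("or", "r(x)", "t(x)")))
second = offspring_trees(("and", ("or", "r(x)", "t(x)"), "s(x)"))
```

Both orderings of the original conjunction yield the same pair of trees, with r(x) and t(x) respectively on the left.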

5.4. The Implementation of the Parallel Process Model