
Here we assume that the h-refinement approach has been used and that a hierarchy is maintained. In order to illustrate the simple idea behind diffusion algorithms it is convenient to introduce a weighted graph which, following Vidwans et al. ([104]), we call a Weighted Partition Communication Graph (WPCG). This represents the face adjacency of the $\|P\|$ processors being used (processors that share at least one edge of a root element with a given processor are said to be face adjacent to that processor). A WPCG is obtained by having one vertex for every processor and an edge between two vertices if and only if they are face adjacent to each other. The weight $w^N_i$ of the $i$th vertex is determined by the part of the mesh which resides on the $i$th processor, and the weight $w^E_{ij}$ of the edge connecting the $i$th and $j$th processors is equal to the number of leaf-level edges which lie on the interpartition boundary between the two processors. Diffusion methods correspond closely to simple iterative methods for the solution of diffusion problems; indeed, the surplus load can be interpreted as diffusing through the WPCG towards a steady balanced state.
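As a rough illustration of the WPCG as a data structure, the following Python sketch builds the vertex and edge weights from a partition description. The function name `build_wpcg` and both input formats are hypothetical, and the sketch assumes that the vertex weight counts the leaf-level elements owned by each processor.

```python
from collections import defaultdict


def build_wpcg(element_owner, boundary_edges):
    """Build the vertex and edge weights of a WPCG (illustrative sketch).

    element_owner:  dict mapping a leaf element id to its owning processor
                    (hypothetical input format).
    boundary_edges: iterable of (proc_a, proc_b) pairs, one pair per
                    leaf-level edge on an interpartition boundary
                    (hypothetical input format).
    """
    # Assumption: the vertex weight counts leaf-level elements per processor.
    vertex_weight = defaultdict(int)
    for owner in element_owner.values():
        vertex_weight[owner] += 1

    # Edge weight: number of leaf-level edges shared by two processors.
    edge_weight = defaultdict(int)
    for a, b in boundary_edges:
        if a != b:
            edge_weight[(min(a, b), max(a, b))] += 1

    return dict(vertex_weight), dict(edge_weight)


if __name__ == "__main__":
    owners = {0: 0, 1: 0, 2: 1, 3: 2}
    shared = [(0, 1), (1, 0), (1, 2)]
    print(build_wpcg(owners, shared))
```

Storing each edge under the key `(min, max)` keeps the graph undirected, so a boundary edge reported from either side contributes to the same weight.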

2.4.1 Basic Diffusion Method

This iterative approach, which is described in [12] for example, is a very simple and intuitive parallel method for dynamic load balancing. Here for each vertex in the WPCG we transfer an amount of work to each of its neighbours which is proportional to the load difference between them. In general this approach will not provide a balanced solution immediately, so the process has to be iterated a number of times until the load difference between any two processors is smaller than a specified value. In effect this method diffuses the load gradually amongst neighbours. If we denote by $l_i$ the load of the processor $p_i$ then the above basic diffusion method can be described algorithmically by the procedure given in Figure 2.1.
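The procedure of Figure 2.1 can be sketched sequentially in Python as below. This is only an illustration under stated assumptions: the name `basic_diffusion`, the adjacency-list input, the convergence test on neighbouring processors only, and the `max_sweeps` guard are all choices made here, and a real implementation would perform the transfers in parallel.

```python
def basic_diffusion(loads, neighbours, tol=1, max_sweeps=10_000):
    """Sequential sketch of the diffusion loop of Figure 2.1.

    loads:      list of integer loads, loads[i] is the load l_i of p_i.
    neighbours: dict mapping processor i to a list of its neighbours.
    tol:        stop once every pair of neighbours differs by at most tol
                (assumed convergence test; use tol >= 1 for integer loads).
    Returns the balanced loads and the number of sweeps performed.
    """
    loads = list(loads)
    n = len(loads)
    for sweep in range(max_sweeps):
        # Converged when no neighbouring pair differs by more than tol.
        if all(abs(loads[i] - loads[j]) <= tol
               for i in range(n) for j in neighbours[i]):
            return loads, sweep
        for i in range(n):
            for j in neighbours[i]:
                if loads[i] > loads[j]:
                    # Transfer floor((l_i - l_j) / 2) from p_i to p_j.
                    t = (loads[i] - loads[j]) // 2
                    loads[i] -= t
                    loads[j] += t
    return loads, max_sweeps


if __name__ == "__main__":
    chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(basic_diffusion([8, 0, 0, 0], chain))  # ([3, 2, 2, 1], 2)
```

On this chain of four processors the sketch stops at `[3, 2, 2, 1]`: every pair of neighbours differs by at most one unit, yet the two end processors still differ by two, which is a small instance of a purely local test missing a global imbalance.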

The main advantage of this method is that it only needs communication among neighbours (which may also be asynchronous). The main disadvantage is that the convergence can be slow (in the worst case the number of iterations needed to reach a given tolerance is $O(\|P\|^2)$, where $\|P\|$ is the total number of processors ([52])) and the method is neither able to detect a global imbalance nor able to remedy it (see [52] for an example). It may also be noted that a processor $p_i$ essentially acts simultaneously on all its interprocessor communication channels. Even though a machine may have parallel hardware for communication, the communication will often have to be serialised with respect to an individual processor.

In order to avoid these shortcomings we consider another diffusion method, to be called the multi-level diffusion method ([52]).

2.4.2 A Multi-Level Diffusion Method

This is basically a divide-and-conquer type of approach. Let P be the WPCG (see

begin
    while (not converged) do
        for all processors $p_i$ do
            for all $N_i$ neighbours $p_j$ of $p_i$ do
                if $l_i > l_j$
                    transfer $\lfloor (l_i - l_j)/2 \rfloor$ load from $p_i$ to $p_j$
            end for
        end for
    end while
end.

Figure 2.1: Diffusion method.


in the set $P$ at that stage. The change in computational load on processor $p_i$ is denoted by $l_i$. The sum of the load increments $l_i$ of all subproblems $p_i$ in the subset $P_j$ of $P$ is denoted by $L_j$. The procedure balance shown in Figure 2.2 achieves the desired load balance. It is important to note that the bisection step in Figure 2.2 means the following:

- $P_1 \cap P_2 = \emptyset$,
- $P_1 \cup P_2 = P$,
- $\bigl|\, \|P_1\| - \|P_2\| \,\bigr| \le 1$.

It is also important to note that no assumptions on the processor topology are made by the algorithm. Hence the user has the freedom to orient the bisection of the processor sets towards his/her processor topology if this is appropriate. It can easily be seen that the average case time complexity of this algorithm is $O(\log \|P\|)$.
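The transfer amount used in Figure 2.2 can be motivated by a short calculation. As a sketch, write $T$ (a symbol introduced here purely for illustration) for the load moved from $P_2$ to $P_1$, and ask that both subsets end up with the same share per processor, ignoring the rounding:

```latex
\frac{L_1 + T}{\|P_1\|} = \frac{L_2 - T}{\|P_2\|}
\quad\Longrightarrow\quad
T = \frac{L_2 \|P_1\| - L_1 \|P_2\|}{\|P_1\| + \|P_2\|}
```

Taking the integer part of $T$ gives the expression in Figure 2.2. A negative value can be read as load moving in the opposite direction, from $P_1$ to $P_2$.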

The principal drawback of this algorithm is that it is not always possible to bisect a connected graph into two connected subgraphs. Also the condition $\bigl|\, \|P_1\| - \|P_2\| \,\bigr| \le 1$ is too restrictive in the sense that relaxing this condition may improve the quality of the load balancer.

As a matter of fact the dynamic load balancing algorithm presented in forthcoming chapters relaxes this condition in addition to choosing the sorted version of

begin balance(P)
    if $\|P\| = 1$ then return
    bisect $P$ into $P_1$ and $P_2$
    calculate $L_1$ and $L_2$
    transfer $\lfloor (L_2 \|P_1\| - L_1 \|P_2\|) / (\|P_1\| + \|P_2\|) \rfloor$ load from $P_2$ to $P_1$
    balance($P_1$)
    balance($P_2$)
end balance.

Figure 2.2: Multi-level diffusion method.


the Fiedler vector for the purpose of bisections.
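The recursive procedure of Figure 2.2 can be sketched in Python as follows. Several simplifications are assumptions made here and not part of the method as described: the bisection simply splits an ordered list of processors in half (no topology or Fiedler vector is used), the quantity being balanced is a plain integer load per processor, and the helper `_move` decides which individual processors give and receive load, a detail the figure leaves open.

```python
def _move(loads, donors, receiver, amount):
    """Hypothetical helper: take `amount` units greedily from the donor
    processors and place them on a single receiver."""
    for d in donors:
        if amount == 0:
            break
        take = min(loads[d], amount)
        loads[d] -= take
        loads[receiver] += take
        amount -= take


def balance(loads, procs):
    """Sketch of balance(P) from Figure 2.2; modifies `loads` in place.

    loads: list of non-negative integer loads indexed by processor.
    procs: list of processor indices forming the current set P.
    """
    if len(procs) <= 1:
        return
    # Assumed bisection: split the list so the sizes differ by at most one.
    half = len(procs) // 2
    p1, p2 = procs[:half], procs[half:]
    l1 = sum(loads[p] for p in p1)
    l2 = sum(loads[p] for p in p2)
    # floor((L2*|P1| - L1*|P2|) / (|P1| + |P2|)), moved from P2 to P1.
    t = (l2 * len(p1) - l1 * len(p2)) // (len(p1) + len(p2))
    if t > 0:
        _move(loads, p2, p1[0], t)
    elif t < 0:
        # A negative amount is treated as a transfer from P1 to P2.
        _move(loads, p1, p2[0], -t)
    # The two recursive calls are independent of each other.
    balance(loads, p1)
    balance(loads, p2)


if __name__ == "__main__":
    loads = [8, 0, 0, 0]
    balance(loads, [0, 1, 2, 3])
    print(loads)  # [2, 2, 2, 2]
```

Because the two recursive calls touch disjoint processor sets, they could run concurrently, and the recursion depth grows logarithmically with the number of processors.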

2.4.3 Dimension Exchange Method

In [23] Cybenko shows that the basic diffusion algorithm is very slow to converge and therefore proposes an alternative version of the algorithm known as the dimension exchange method. This method is designed specifically with a hypercube architecture in mind.

Let us first define the edge-colouring of a graph $G = (V, E)$. By this we mean that the edges of $G$ are coloured with some minimum number of colours (say $k$) such that no two adjoining edges are of the same colour. A dimension is then defined to be the collection of all edges of the same colour. Let us assume that we have an edge-colouring of the WPCG. Then the dimension exchange method can be described in terms of the procedure shown in Figure 2.3.
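A small Python sketch of this idea on a hypercube is given below. It rests on assumptions made for illustration: colours are numbered from 0 instead of 1, "load balance the two connected processors" is interpreted as splitting the combined integer load of the pair as evenly as possible, and the names `hypercube_colouring` and `dimension_exchange` are hypothetical.

```python
def hypercube_colouring(d):
    """Edge-colouring of a d-dimensional hypercube: the edge joining
    processors i and i XOR 2^c receives colour c."""
    return {(i, i ^ (1 << c)): c
            for i in range(1 << d)
            for c in range(d)
            if i < (i ^ (1 << c))}


def dimension_exchange(loads, edge_colour, num_colours, tol=1, max_sweeps=1000):
    """Sequential sketch of the loop of Figure 2.3.

    edge_colour: dict mapping an edge (i, j) to its colour in 0..num_colours-1.
    Returns the final loads and the number of sweeps over all colours.
    """
    loads = list(loads)
    # A dimension is the set of all edges sharing one colour.
    dimensions = [[e for e, c in edge_colour.items() if c == colour]
                  for colour in range(num_colours)]
    for sweep in range(max_sweeps):
        if all(abs(loads[i] - loads[j]) <= tol for i, j in edge_colour):
            return loads, sweep
        for dimension in dimensions:
            # Edges of one colour never share a vertex, so these pairwise
            # exchanges could all take place at the same time.
            for i, j in dimension:
                total = loads[i] + loads[j]
                loads[j] = total // 2
                loads[i] = total - loads[j]
    return loads, max_sweeps


if __name__ == "__main__":
    colouring = hypercube_colouring(3)
    print(dimension_exchange([8, 0, 0, 0, 0, 0, 0, 0], colouring, 3))
```

With eight processors and all eight units of load initially on processor 0, a single sweep over the three dimensions leaves one unit on every processor.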

Xu and Lau (see [117, 118]) have generalised the dimension exchange method by introducing an exchange parameter and called the new method the generalised dimension exchange method. In their paper they have also analysed its properties and potential efficiency.
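For orientation, the exchange step with an exchange parameter is commonly written as below. The symbol $\lambda$ and the primed notation for the updated loads are assumptions made here, not notation taken from the text above.

```latex
l_i' = (1 - \lambda)\, l_i + \lambda\, l_j , \qquad
l_j' = \lambda\, l_i + (1 - \lambda)\, l_j , \qquad 0 < \lambda < 1
```

The combined load of the pair is preserved, since $l_i' + l_j' = l_i + l_j$, and the choice $\lambda = 1/2$ corresponds to an equal split between the two processors.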

Unfortunately, none of the above-mentioned algorithms takes into account one important factor, namely that the data movement resulting from the load balancing schedule should be kept to a minimum. Also no information is given about

Procedure for processor $i$ ($0 \le i < \|P\|$)
begin
    while (not Terminate)
        for (c = 1; c $\le$ k; c++)
            if there is an incident edge coloured c
                load balance the two connected processors
            end if
        end for
    end while
end procedure.

Figure 2.3: Dimension exchange method.


culates the total weight to be transferred.
