As discussed earlier, mapping of clusters to physical processors is necessary for UNC scheduling when the number of clusters is larger than the number of physical processors. However, the mapping of clusters to processors is a relatively unexplored research topic [121]. In the following we discuss a number of approaches reported in the literature.

After obtaining a schedule with the EZ algorithm, Sarkar [160] used a list-scheduling-based method to map the clusters to physical processors. In the mapping algorithm, each task is considered in turn according to its static level. The task is allocated to the processor that allows its earliest execution; the whole cluster containing the task is then assigned to that processor, and all of its member tasks are marked as assigned. In this scheme, two clusters can be merged onto a single processor, but a cluster is never split. Furthermore, the allocation of channels to communication messages was not considered.
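The cluster-granular assignment described above can be illustrated with a minimal sketch. It is not Sarkar's algorithm as published: the function name and data layout are hypothetical, and "earliest execution" is approximated by the accumulated load on each processor, ignoring precedence constraints and communication delays.

```python
def map_clusters_by_static_level(cost, cluster_of, static_level, num_procs):
    """Map whole clusters to processors, visiting tasks by descending static level.

    cost:         dict task -> computation cost
    cluster_of:   dict task -> cluster id
    static_level: dict task -> static level
    Assumption: a processor's earliest availability is approximated by the
    total load already placed on it (no precedence or communication model).
    """
    members = {}
    for task, cluster in cluster_of.items():
        members.setdefault(cluster, []).append(task)

    proc_load = [0] * num_procs
    proc_of_cluster = {}
    for task in sorted(cost, key=lambda t: -static_level[t]):
        cluster = cluster_of[task]
        if cluster in proc_of_cluster:
            continue  # task was already marked as assigned with its cluster
        # Pick the processor that becomes free first (approximation).
        proc = min(range(num_procs), key=lambda p: proc_load[p])
        proc_of_cluster[cluster] = proc
        # The whole cluster moves together; it is never split.
        proc_load[proc] += sum(cost[m] for m in members[cluster])
    return proc_of_cluster


cost = {"a": 2, "b": 3, "c": 4, "d": 1}
cluster_of = {"a": 0, "b": 0, "c": 1, "d": 2}
static_level = {"a": 10, "b": 5, "c": 8, "d": 3}
mapping = map_clusters_by_static_level(cost, cluster_of, static_level, 2)
```

In this example, clusters 1 and 2 end up on the same processor, while cluster 0 stays intact on the other.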

Kim and Browne [105] also proposed a mapping scheme for the UNC schedules obtained from their LC algorithm. In their scheme, the linear UNC clusters are first merged so that the number of clusters is at most the number of processors. Two clusters are candidates for merging if one can start after the other finishes, or if the member tasks of one cluster can be placed into the idle time slots of the other. Then a dominant request tree (DRT) is constructed from the UNC schedule, which is a cluster graph. The DRT captures the connectivity information of the schedule and is therefore useful for the mapping stage, in which two communicating UNC clusters are mapped to two neighboring processors, if possible. If this connectivity mapping heuristic fails for some clusters, two other heuristics, called perturbation mapping and foster mapping, are invoked. In both mapping strategies, the processor chosen is the one with the most appropriate number of channels among the currently unallocated processors. Finally, to further optimize the mapping, a restricted pairwise exchange step is applied.
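The merge-candidate test can be sketched as follows. This is an illustrative simplification rather than Kim and Browne's procedure: it assumes every task keeps the start and finish times it has in the UNC schedule, so both conditions reduce to checking that no time slot of one cluster overlaps a slot of the other. The function name and the slot representation are hypothetical.

```python
def can_merge(slots_a, slots_b):
    """Return True if two clusters could share one processor.

    slots_a, slots_b: lists of (start, finish) pairs, one per member task.
    Assumption: task times are fixed to those of the UNC schedule, so the
    clusters are mergeable exactly when no two slots overlap. This covers
    both "one starts after the other finishes" and "tasks fit into the
    other cluster's idle slots".
    """
    return all(
        fin_a <= start_b or fin_b <= start_a
        for start_a, fin_a in slots_a
        for start_b, fin_b in slots_b
    )


cluster_a = [(0, 2), (5, 7)]   # idle between time 2 and 5
fits_in_gap = can_merge(cluster_a, [(2, 5)])
runs_after = can_merge(cluster_a, [(7, 9)])
overlaps = can_merge(cluster_a, [(1, 3)])
```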

Wu and Gajski [189] also suggested a mapping scheme for assigning the UNC clusters generated in scheduling to processors. They realized that for best mapping results, a dedicated traffic scheduling algorithm that balances the network traffic should be used. However, traffic scheduling requires flexible-path routing, which incurs higher overhead. Thus, they concluded that if network traffic is not heavy, a simpler algorithm which minimizes total network traffic can be used. The algorithm they used is a heuristic algorithm designed by Hanan and Kurtzberg [83] to minimize the total communication traffic. The algorithm generates an initial assignment by a constructive method and the assignment is then iteratively improved to obtain a better mapping.
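The "initial assignment followed by iterative improvement" pattern can be sketched with a generic pairwise-exchange loop. This is not the Hanan and Kurtzberg procedure itself. It also assumes one cluster per processor, total traffic measured as message volume times hop distance, and a trivial identity assignment in place of a real constructive step; all names are hypothetical.

```python
from itertools import combinations


def total_traffic(assign, traffic, dist):
    """Sum of message volume times hop distance over all communicating pairs."""
    return sum(vol * dist[assign[a]][assign[b]] for (a, b), vol in traffic.items())


def improve_by_exchange(assign, traffic, dist):
    """Swap the processors of two clusters whenever that lowers total traffic."""
    assign = dict(assign)
    best = total_traffic(assign, traffic, dist)
    improved = True
    while improved:
        improved = False
        for a, b in combinations(sorted(assign), 2):
            assign[a], assign[b] = assign[b], assign[a]
            cost = total_traffic(assign, traffic, dist)
            if cost < best:
                best, improved = cost, True
            else:
                assign[a], assign[b] = assign[b], assign[a]  # undo the swap
    return assign, best


# Three processors in a linear chain: hop distances between them.
dist = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
# Cluster 0 talks heavily to cluster 2 and lightly to cluster 1.
traffic = {(0, 2): 10, (0, 1): 1}
initial = {0: 0, 1: 1, 2: 2}  # assumption: identity mapping as the starting point
final, final_cost = improve_by_exchange(initial, traffic, dist)
```

Here the exchange step moves cluster 0 to the middle processor, next to its heavy communication partner.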

Yang and Gerasoulis [191] employed a work profiling method for merging UNC clusters. The merging process proceeds by first sorting the clusters in increasing order of aggregate computational load. Then a load balancing algorithm is invoked to map the clusters to the processors so that every processor has about the same load. To account for the topology of the underlying processor network, the graph of merged clusters is then mapped to the network topology using Bokhari’s algorithm.
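A minimal sketch of the sort-then-balance step follows. The source does not specify the load balancing rule, so placing each cluster on the currently least-loaded processor is an assumption, as are the function and variable names. The subsequent topology mapping with Bokhari's algorithm is not shown.

```python
def balance_clusters(cluster_load, num_procs):
    """Assign clusters to processors, visiting clusters in increasing load order.

    cluster_load: dict cluster id -> aggregate computational load
    Assumption: each cluster goes to the least-loaded processor so far.
    """
    proc_load = [0] * num_procs
    proc_of = {}
    for cluster in sorted(cluster_load, key=cluster_load.get):
        proc = proc_load.index(min(proc_load))  # first least-loaded processor
        proc_of[cluster] = proc
        proc_load[proc] += cluster_load[cluster]
    return proc_of, proc_load


loads = {"A": 1, "B": 2, "C": 3, "D": 4}
proc_of, proc_load = balance_clusters(loads, 2)
```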

Yang, Bic, and Nicolau [195] reported an algorithm for mapping cluster graphs to processor graphs which is suitable for use as the post-processing step for BNP scheduling algorithms. The mapping scheme is not suitable for UNC scheduling because it assumes the scheduling algorithm has already produced a number of clusters which is less than or equal to the number of processors available. The objective of the mapping method is to reduce contention and optimize the schedule length when the clusters are mapped to a topology which is not fully-connected as assumed by the BNP algorithms. The idea of the mapping algorithm is based on determining a set of critical edges, each of which is assigned a single communication link. Substantial improvement over random mapping was obtained in their simulation study.

In a recent study, Liou and Palis [126] investigated the problem of mapping clusters to processors. One of the major objectives of their study was to compare the effectiveness of one-phase scheduling (i.e., BNP scheduling) to that of the two-phase approach (i.e., UNC scheduling followed by cluster mapping). To this end, they proposed a new UNC algorithm called CASS-II (Clustering And Scheduling System II), which was applied to randomly generated task graphs in an experimental study using three cluster mapping schemes, namely, the LB (load-balancing) algorithm, the CTM (communication traffic minimizing) algorithm and the RAND (random) algorithm. The LB algorithm uses processor workload as the criterion for matching clusters to processors. By contrast, the CTM algorithm tries to minimize the communication costs between processors. The RAND algorithm simply makes random choices at each mapping step. To compare the one-phase method with the two-phase method, in one set of test cases the task graphs were scheduled using CASS-II with the three mapping algorithms, while in the other set they were scheduled using the mapping algorithms alone. Liou and Palis found that two-phase scheduling is better than one-phase scheduling in that the utilization of processors in the former is more efficient than in the latter. Furthermore, they found that the LB algorithm finds significantly better schedules than the CTM algorithm.
