The Gamma database needs to be efficient at adding new tuples, searching for individual tuples, and searching for subsets of tuples that share some field values. In the sequential Gamma database, each table is implemented by default as a TreeSet. When a new tuple is moved to the Gamma database, the JStar runtime uses the compareTo method defined in the orderby declaration to check whether the tuple is already present. This method compares only the key fields, and the way each field is compared depends on its data type. For an integer key field, the comparison uses the greater-than (>) and less-than (<) operators; if neither holds, the two values are equal. For String values, the comparison uses the compareTo method of the Java String class. If all key fields are equal, the method returns zero; otherwise, it returns a non-zero value. New tuples are added to the Gamma database; duplicate tuples, for which the comparison returns zero, are discarded.
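
The duplicate check described above can be sketched in plain Java. This is a minimal illustration, not the code that JStar generates: the class names `SampleTuple` and `GammaTableSketch`, the key fields `id` (integer) and `name` (String), and the non-key field `value` are all hypothetical.

```java
import java.util.TreeSet;

// Hypothetical tuple type: `id` and `name` are key fields, `value` is not.
final class SampleTuple implements Comparable<SampleTuple> {
    final int id;
    final String name;
    final double value;

    SampleTuple(int id, String name, double value) {
        this.id = id;
        this.name = name;
        this.value = value;
    }

    // Compares key fields only, in order; returns 0 only if all keys match.
    @Override
    public int compareTo(SampleTuple other) {
        if (id < other.id) return -1;      // integer key: < and > operators
        if (id > other.id) return 1;
        return name.compareTo(other.name); // String key: String.compareTo
    }
}

final class GammaTableSketch {
    // One table, backed by a TreeSet as in the sequential default.
    private final TreeSet<SampleTuple> table = new TreeSet<>();

    // Returns true if the tuple was new and stored, false if it was discarded.
    boolean insert(SampleTuple t) {
        return table.add(t); // TreeSet.add relies on compareTo to detect duplicates
    }

    int size() {
        return table.size();
    }

    public static void main(String[] args) {
        GammaTableSketch gamma = new GammaTableSketch();
        System.out.println(gamma.insert(new SampleTuple(1, "a", 10.0))); // true
        System.out.println(gamma.insert(new SampleTuple(1, "a", 99.0))); // false: same keys
        System.out.println(gamma.insert(new SampleTuple(1, "b", 10.0))); // true: name differs
        System.out.println(gamma.size());                                // 2
    }
}
```

Because `value` is not a key field, the second insert is treated as a duplicate even though its `value` differs from the stored tuple.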

Consider the PvWatts example. The following procedure is used to insert a new PvWatts tuple into its table in the Gamma database. First, the year value of the new tuple is compared with that of one tuple already in the Gamma database. If the year value is greater than that of the other tuple, the method returns positive one (+1). If it is less, the method returns negative one (-1). If the two values are equal, the next key field (the month value) is compared. The comparison continues until a non-zero result is returned or all key fields have been compared. Another tuple in Gamma is then chosen and compared with the new tuple in the same way, until either a comparison returns zero or no candidate tuples remain. If some comparison returns zero, the new tuple is a duplicate and the Gamma database ignores it without taking any action. Otherwise, the Gamma database adds the tuple.
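
The field-by-field comparison for PvWatts can be sketched as follows. This is an illustrative sketch under stated assumptions: the key fields are taken to be only `year` and `month`, the non-key field `power` is hypothetical, and the `byYear` helper shows one possible way to search for tuples with a common field value using a TreeSet range view. None of these names come from JStar itself.

```java
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical PvWatts tuple: key fields assumed to be (year, month);
// `power` is an assumed non-key field.
final class PvWattsTuple implements Comparable<PvWattsTuple> {
    final int year;
    final int month;
    final int power;

    PvWattsTuple(int year, int month, int power) {
        this.year = year;
        this.month = month;
        this.power = power;
    }

    @Override
    public int compareTo(PvWattsTuple other) {
        if (year > other.year) return 1;    // first key field decides if it differs
        if (year < other.year) return -1;
        if (month > other.month) return 1;  // otherwise fall through to the next key
        if (month < other.month) return -1;
        return 0;                           // all key fields equal: duplicate
    }
}

final class PvWattsTableSketch {
    private final TreeSet<PvWattsTuple> table = new TreeSet<>();

    // Adds the tuple unless a stored tuple compares equal on all key fields.
    boolean insert(PvWattsTuple t) {
        return table.add(t);
    }

    // All stored tuples sharing the given year, as a range view over the keys.
    NavigableSet<PvWattsTuple> byYear(int year) {
        PvWattsTuple low = new PvWattsTuple(year, Integer.MIN_VALUE, 0);
        PvWattsTuple high = new PvWattsTuple(year, Integer.MAX_VALUE, 0);
        return table.subSet(low, true, high, true);
    }

    public static void main(String[] args) {
        PvWattsTableSketch gamma = new PvWattsTableSketch();
        gamma.insert(new PvWattsTuple(2012, 1, 100));
        gamma.insert(new PvWattsTuple(2012, 2, 120));
        gamma.insert(new PvWattsTuple(2013, 1, 90));
        System.out.println(gamma.insert(new PvWattsTuple(2012, 1, 999))); // false
        System.out.println(gamma.byYear(2012).size());                    // 2
    }
}
```

Note that a TreeSet does not compare the new tuple with every stored tuple; it follows a path through a balanced tree, so only a logarithmic number of comparisons is needed per insert.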

Chapter 3 Related Work

Parallel computing divides a large computation into many tasks that can be carried out in parallel. Having many computers work on the same problem can shorten the completion time and achieve the same or better performance than a single computer. To provide a large number of computing resources, High Performance Computing (HPC) facilities use either multi-core machines or clusters of small machines linked through a local network or interconnect. Clusters of multi-core machines are currently preferred because they are more affordable and offer more computational power than single-unit machines [26].

HPC parallel programming models use either a distributed or a shared memory design to parallelize computation across multiple processors. The distributed memory programming model distributes tasks and data across the processors, each of which owns a private memory space in which it stores its data locally. If a processor requires non-local data, it must communicate with other processors and move the remote data into its local memory. Referencing remote data takes extra time to find the data location and can therefore lead to unexpected delays. Instead of distributing the data, the shared memory programming model keeps all data in one public memory space from which every processor can retrieve it.

Distributed memory offers memory locality: each processor stores its data in the closest (private) memory location. Because each processor mostly uses local data, the distributed memory model avoids race conditions but incurs unavoidable communication costs. Message Passing Interface (MPI) is the communication library for the distributed memory model. MPI can be called directly from C and Fortran, or packaged as a library and imported into a Java project. MPI programs are efficient and portable because the MPI interface has been widely adopted across distributed memory systems and optimized for good performance.
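
The message passing style described above can be imitated inside a single JVM. The sketch below is not MPI and uses no MPI binding: threads stand in for processors, a private array copy stands in for each processor's local memory, and a `BlockingQueue` stands in for the communication channel. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class MessagePassingSketch {
    // Sums `data` using `workers` threads (assumed > 0). Each worker reads
    // only its own private slice and reports its partial sum as a message.
    static long parallelSum(int[] data, int workers) {
        BlockingQueue<Long> mailbox = new LinkedBlockingQueue<>();
        int chunk = (data.length + workers - 1) / workers;
        for (int w = 0; w < workers; w++) {
            int from = Math.min(data.length, w * chunk);
            int to = Math.min(data.length, from + chunk);
            // Private copy: stands in for the worker's local memory.
            int[] local = Arrays.copyOfRange(data, from, to);
            new Thread(() -> {
                long sum = 0;
                for (int v : local) sum += v;
                mailbox.add(sum); // "send": the only interaction with other threads
            }).start();
        }
        long total = 0;
        try {
            for (int w = 0; w < workers; w++) {
                total += mailbox.take(); // "receive": blocks until a message arrives
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        System.out.println(parallelSum(data, 3)); // 55
    }
}
```

No worker touches another worker's data, so no locking is needed; the cost is the copying and the message exchange, which mirrors the trade-off between locality and communication.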

A shared memory system uses a single memory space to store the data and provides unified global memory addresses to locate and retrieve the data quickly. Because the same data may be accessed by more than one processor at a time, access must be synchronized; otherwise, a shared memory program may suffer from problems such as race conditions.
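
A race condition on shared data can be illustrated with a counter that several threads increment. In this minimal sketch (all names are hypothetical), the unsynchronized increment may lose updates because `count++` is a read-modify-write sequence, while the `synchronized` increment always produces the expected total.

```java
final class SharedCounterSketch {
    private int unsafeCount = 0;
    private int safeCount = 0;

    // Not atomic: two threads can read the same value and both write value + 1.
    void unsafeIncrement() {
        unsafeCount++;
    }

    // Only one thread at a time may execute this method on a given instance.
    synchronized void safeIncrement() {
        safeCount++;
    }

    int getUnsafeCount() {
        return unsafeCount;
    }

    synchronized int getSafeCount() {
        return safeCount;
    }

    // Runs `threads` threads, each performing `perThread` increments of both counters.
    static SharedCounterSketch run(int threads, int perThread) {
        SharedCounterSketch counter = new SharedCounterSketch();
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            pool[i] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    counter.unsafeIncrement();
                    counter.safeIncrement();
                }
            });
            pool[i].start();
        }
        for (Thread t : pool) {
            try {
                t.join(); // wait for every worker before reading the results
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return counter;
    }

    public static void main(String[] args) {
        SharedCounterSketch c = run(4, 100_000);
        System.out.println("synchronized:   " + c.getSafeCount());   // always 400000
        System.out.println("unsynchronized: " + c.getUnsafeCount()); // may be lower
    }
}
```

The unsynchronized total varies from run to run and may happen to be correct, which is part of what makes such bugs hard to find.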

3.1 Library-based Shared Memory Programming Languages

The Java language is considered an option for programming on parallel hardware. With its built-in multi-threading and networking APIs, Java programmers can write parallel applications that use the computing power of the multi-core machines in a cluster. However, writing a low-level multi-threaded application is hard and error-prone, because data synchronization requires an external mechanism and an incorrect design of parallel tasks may lead to deadlock. Therefore, since version 1.5, Java has supported high-level shared memory programming with several concurrency utilities, including thread pools, concurrent collections and atomic variables [26].
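
The three kinds of concurrency utilities mentioned above can be combined in a short sketch. The example below is only an illustration with hypothetical names: a fixed thread pool runs one task per input line, a `ConcurrentHashMap` collects word counts, and an `AtomicLong` counts processed lines. It uses lambdas and `ConcurrentHashMap.merge`, so it assumes Java 8 or later, although the utilities themselves date from Java 1.5.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

final class ConcurrencyUtilitiesSketch {
    // Concurrent collection: safe to update from many threads without locks in user code.
    final ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
    // Atomic variable: incremented without explicit synchronization.
    final AtomicLong linesProcessed = new AtomicLong();

    // Counts words in `lines` using a pool of `threads` worker threads.
    void countWords(List<String> lines, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads); // thread pool
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (String line : lines) {
                pending.add(pool.submit(() -> {
                    for (String word : line.split("\\s+")) {
                        if (!word.isEmpty()) {
                            counts.merge(word, 1, Integer::sum); // atomic per-key update
                        }
                    }
                    linesProcessed.incrementAndGet();
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for each task and surface any failure
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while counting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a counting task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        ConcurrencyUtilitiesSketch sketch = new ConcurrencyUtilitiesSketch();
        sketch.countWords(java.util.Arrays.asList("a b a", "b c"), 2);
        System.out.println(sketch.counts.get("a"));      // 2
        System.out.println(sketch.linesProcessed.get()); // 2
    }
}
```

The programmer never creates or joins a thread directly here; the pool handles scheduling, and the collection and the atomic variable handle synchronization.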

Using Java for HPC can present some difficulties. Although the performance of the JVM has been continuously improved by experts and engineers, the performance of Java HPC solutions may still be reduced by some unpredictable

