Kernel Classes - The OOSH C++ Library - An object-oriented library for shared-memory parallel s

4. The OOSH C++ Library

4.5 Kernel Classes

The kernel classes are the layer of abstraction isolating machine-dependent classes from the rest of the code. Kernel classes are mostly used as building blocks for other classes.

The top of the class hierarchy is the class Object, an abstract class which defines common behaviour for most other classes, including error handling and low-level allocators. Two other classes, Free_objects and Free_page, are used respectively for constructing free lists and for allocating memory a page at a time. Allocation of memory a page at a time is implemented to support potential future work on program structuring for efficient TLB usage (see 8.2.1).

Queues are implemented on top of class Queue (the head of the queue and iterators over elements); objects that can be queued are derived from Queued_object.
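The arrangement described above resembles an intrusive queue, in which the link is stored in the queued object itself. The following is a minimal sketch under that assumption only: the member names (append, remove_first, Iterator) and the Job class are hypothetical and are not taken from the OOSH source.

```cpp
#include <cassert>

// Base class for anything that can be queued; it carries the link itself.
// (Sketch only: the real Queued_object interface is not shown in the text.)
class Queued_object {
public:
    virtual ~Queued_object() = default;
private:
    friend class Queue;
    Queued_object* next = nullptr;
};

// Head of the queue plus a simple iterator over its elements.
class Queue {
public:
    // Add an item at the tail of the queue.
    void append(Queued_object* item) {
        item->next = nullptr;
        if (tail) tail->next = item; else head = item;
        tail = item;
    }
    // Detach and return the first item, or nullptr if the queue is empty.
    Queued_object* remove_first() {
        Queued_object* item = head;
        if (item) {
            head = item->next;
            if (!head) tail = nullptr;
            item->next = nullptr;
        }
        return item;
    }
    bool empty() const { return head == nullptr; }

    // Minimal forward iterator, enough for a range-based for loop.
    class Iterator {
    public:
        explicit Iterator(Queued_object* p) : pos(p) {}
        Queued_object* operator*() const { return pos; }
        Iterator& operator++() { pos = pos->next; return *this; }
        bool operator!=(const Iterator& other) const { return pos != other.pos; }
    private:
        Queued_object* pos;
    };
    Iterator begin() const { return Iterator(head); }
    Iterator end() const { return Iterator(nullptr); }
private:
    Queued_object* head = nullptr;
    Queued_object* tail = nullptr;
};

// Hypothetical queueable class used only to exercise the sketch.
struct Job : Queued_object {
    int id;
    explicit Job(int i) : id(i) {}
};
```

Because the link lives in the base class, queueing an object needs no separate node allocation, which fits a library that manages its own free lists.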

Class Synch_data presents an application-level interface to Synch_info (the machine-dependent class, which contains state identifying a specific barrier). Process is the application-level view of the class Machine_process, while Lock_data is the application-level view of Lock_info.

(a) OOSH approach

    if (*insert_pos==NULL)
    { Lock insert_lock(lock_data);
      // still NULL?
      if (*insert_pos==NULL)
      { *insert_pos = this;
        return;                 // (1)
      }
    }                           // (2)
    if ( xxxx ) { ...

(b) SPLASH approach

    if (*insert_pos==NULL)
    { LOCK(CellLock);
      /* still NULL? */
      if (*insert_pos==NULL)
      { *insert_pos = p;
        flag=FALSE;
      }
      UNLOCK(CellLock);
    }
    if (flag && ( xxxx )) { ...

Figure 4.2 OOSH vs. SPLASH Locks

From Barnes-Hut code, with detail left out. The OOSH version releases the lock at (1) or (2) and relies on the compiler to release the lock in the destructor for the Lock object; on the other hand, in the SPLASH code, the programmer must contrive code to be sure the lock is unlocked exactly once and under the right conditions.

Locks and barriers are implemented by conceptually similar mechanisms. A lock or barrier is identified by a specific object which maintains its state (class Lock_data or Synch_data, respectively). A lock is held for the lifetime of a variable of class Lock, which is given a pointer to a specific Lock_data object, for example:

    { Lock error_region(output_lock);
      cerr << error_message << flush;
    }

The same lock can be used somewhere else:

    { Lock debug_region(output_lock);
      cerr << debug_message << flush;
    }

The constructor for Lock acquires the lock identified by the Lock_data pointer passed to it as an argument and saves a copy of the pointer in the new object of class Lock. When the close of scope of a variable of class Lock is reached, the destructor is called, and the destructor releases the lock. Note that in each case a local variable is being defined, and it has to be given a name (error_region and debug_region in the two lock examples) even though the name is not used anywhere else. Otherwise, the compiler is free to treat the object as a temporary and call the destructor immediately, instead of waiting for the name to go out of scope [Ellis and Stroustrup 1990, p 268].
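The constructor and destructor behaviour just described can be sketched as follows. This is an illustration under stated assumptions, not the OOSH implementation: Lock_data is modelled here with a std::mutex (the real class wraps the machine-dependent lock state), and the holders counter and the two helper functions are hypothetical, added only so that the effect of naming the variable can be observed in a single-threaded check.

```cpp
#include <cassert>
#include <mutex>

// Stand-in for Lock_data: the shared state of one lock.
// Assumption: std::mutex replaces the machine-dependent lock state.
struct Lock_data {
    std::mutex m;
    int holders = 0;   // instrumentation for this sketch only
};

// The lock is held for exactly the lifetime of a Lock object.
class Lock {
public:
    explicit Lock(Lock_data* d) : data(d) {   // constructor acquires
        data->m.lock();
        ++data->holders;
    }
    ~Lock() {                                  // destructor releases
        --data->holders;
        data->m.unlock();
    }
    Lock(const Lock&) = delete;
    Lock& operator=(const Lock&) = delete;
private:
    Lock_data* data;                           // saved copy of the pointer
};

// A named Lock holds the lock until the closing brace of its scope.
inline bool named_lock_spans_scope(Lock_data* d) {
    bool held_inside = false;
    {
        Lock region(d);
        held_inside = (d->holders == 1);
    }
    return held_inside && d->holders == 0;
}

// An unnamed temporary is destroyed at the end of its own statement,
// so the lock is already released on the following line.
inline bool temporary_lock_released_at_once(Lock_data* d) {
    Lock{d};
    return d->holders == 0;
}
```

The second helper shows why the name matters: the temporary acquires and releases the lock within one statement, so the code that follows it is not protected.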


The lock construct has the advantage that it is relatively easy to use, and it is not possible to forget to release a lock. For example, code in the SPLASH version of Barnes-Hut uses complicated logic with a flag to determine which code should be executed in the loop which inserts a body into the octree, as well as whether to terminate the loop. The OOSH version of the code uses a return statement at the point the body has been inserted, and relies on the C++ compiler to release the lock by calling the destructor.

As an example, the two approaches are given in figure 4.2, which contains code from Barnes-Hut, with detail left out, to illustrate the principle. In 4.2a, the OOSH version, the lock is released either when the destructor is called at the return (labelled "(1)") or at the exit of scope of the Lock variable (at the comment labelled "(2)"). In 4.2b, the SPLASH version, explicit LOCK and UNLOCK macros are used, and the logic to ensure that UNLOCK is correctly used is comparatively complicated.

To avoid the untidy situation of more than one UNLOCK macro for a single LOCK, the SPLASH implementers chose to set a flag inside the inner if statement. The flag can then be tested at later points in the loop to ensure that no further execution occurs. If the SPLASH coders had instead chosen to use a return statement at the point where no further work is required, they would have had to remember to put an UNLOCK in two places, which requires careful reading by the programmer to be sure that the construct is correct.
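The contrast between the two styles can be made concrete with a small, checkable sketch. It is not the Barnes-Hut code: Body, Counting_lock and both function names are hypothetical, std::lock_guard stands in for the OOSH Lock class, and the slot is a std::atomic pointer (an assumption made here so that the unlocked first test is well defined under the modern C++ memory model).

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

struct Body { int id; };            // hypothetical payload type

// Hypothetical lock that counts its use so the sketch can be checked.
struct Counting_lock {
    std::mutex m;
    int depth = 0;                  // 1 while held, 0 otherwise
    int acquisitions = 0;           // total number of lock() calls
    void lock()   { m.lock(); ++depth; ++acquisitions; }
    void unlock() { --depth; m.unlock(); }
};

// Scoped style: return as soon as the body has been placed.
// Returns true if b was stored in the empty slot.
bool insert_scoped(std::atomic<Body*>& slot, Body* b, Counting_lock& lk) {
    if (slot.load() == nullptr) {
        std::lock_guard<Counting_lock> insert_lock(lk);
        if (slot.load() == nullptr) {   // still empty now the lock is held?
            slot.store(b);
            return true;                // (1) destructor unlocks here
        }
    }                                   // (2) destructor unlocks here
    return false;                       // slot was taken by someone else
}

// Explicit style: a single unlock, with a flag carrying the outcome.
bool insert_flagged(std::atomic<Body*>& slot, Body* b, Counting_lock& lk) {
    bool flag = true;                   // true means work remains to be done
    if (slot.load() == nullptr) {
        lk.lock();
        if (slot.load() == nullptr) {
            slot.store(b);
            flag = false;
        }
        lk.unlock();                    // must be reached on every path
    }
    return !flag;
}
```

Both functions leave the lock released on every path, but only in the first is that guaranteed by the language rather than by the arrangement of the control flow.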

Barriers are implemented roughly the same way as locks. A barrier has no concept of opening and closing, but to allow for the possibility of absorbing minor load imbalances between phases of a timestep, barriers are implemented in OOSH with two phases: announcing arrival at the barrier, and waiting for all other processes to announce their arrival at the barrier. These two phases are also implemented using constructors and destructors. In principle it is possible to put code which is not dependent on synchronization between these two parts of the barrier. Usage of a barrier looks like this:

    { Delayed_synch wait_name(delay_name);
      // can insert code not dependent on synchronizing here
    }

where delay_name is a pointer to a Synch_data object, and wait_name is a local name. As with locks, the local name is essential to prevent the destructor from being called prematurely, but can also be useful for identifying this synchronization point to the human reader of the code.

If no code is inserted inside the scope of a Delayed_synch, the effect is the same as a conventional barrier.
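A two-phase barrier of this kind can be sketched with standard C++ threading primitives. The sketch below is an assumption-laden illustration rather than the OOSH code: Synch_data is modelled with a mutex, a condition variable and a generation counter, none of which are described in the text, and run_demo is a hypothetical driver.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for Synch_data: the state identifying one barrier.
struct Synch_data {
    explicit Synch_data(int n) : expected(n) {}
    std::mutex m;
    std::condition_variable cv;
    const int expected;      // number of participating threads
    int arrived = 0;         // arrivals announced in the current episode
    long generation = 0;     // incremented each time the barrier completes
};

class Delayed_synch {
public:
    // Phase 1: announce arrival without waiting.
    explicit Delayed_synch(Synch_data* s) : synch(s) {
        std::lock_guard<std::mutex> guard(synch->m);
        my_generation = synch->generation;
        if (++synch->arrived == synch->expected) {
            synch->arrived = 0;          // reset for the next episode
            ++synch->generation;
            synch->cv.notify_all();
        }
    }
    // Phase 2: wait until every participant has announced arrival.
    ~Delayed_synch() {
        std::unique_lock<std::mutex> guard(synch->m);
        synch->cv.wait(guard, [this] { return synch->generation != my_generation; });
    }
private:
    Synch_data* synch;
    long my_generation;
};

// Runs n threads through one barrier and returns how many of them
// saw all n earlier writes after leaving the barrier scope.
int run_demo(int n) {
    Synch_data barrier(n);
    std::vector<int> slots(n, 0);
    std::atomic<int> ok{0};
    std::vector<std::thread> threads;
    for (int id = 0; id < n; ++id) {
        threads.emplace_back([&, id] {
            slots[id] = 1;                    // work the other threads depend on
            int local = 0;
            {
                Delayed_synch wait_here(&barrier);
                local = id * 2;               // independent work overlaps the wait
            }
            int sum = 0;
            for (int v : slots) sum += v;     // safe: every thread has arrived
            if (sum == n && local == id * 2) ++ok;
        });
    }
    for (auto& t : threads) t.join();
    return ok.load();
}
```

The generation counter lets the same Synch_data object be reused for successive barriers, since a thread compares against the generation it observed on arrival rather than against the arrival count.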

One other group of classes of general use is Array1D, Array2D and Array3D, which respectively implement arrays of 1, 2 and 3 dimensions. These array classes can store pointers to any descendant of class Object, and their major advantage over conventional arrays is that their size can be determined dynamically when they are constructed. In principle, the same effect can be achieved in C by implementing multi-dimensional arrays as arrays of pointers [Kernighan and Ritchie 1978]. Classes which redefine the array indexing operator allow this indexing to be done more transparently.
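As an illustration of how a redefined indexing operator can hide the storage layout, the following is a minimal sketch of a two-dimensional array of Object pointers. The member names, the row-major layout and the Body class are assumptions made for the example; the actual OOSH Array2D interface is not given in the text.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the OOSH root class.
struct Object { virtual ~Object() = default; };

// Hypothetical descendant of Object used to exercise the sketch.
struct Body : Object {
    int id;
    explicit Body(int i) : id(i) {}
};

// Two-dimensional array of Object pointers, sized at construction time.
class Array2D {
public:
    Array2D(std::size_t rows, std::size_t cols)
        : n_rows(rows), n_cols(cols), cells(rows * cols, nullptr) {}

    // Indexing a row yields a pointer to its first cell, so a[i][j]
    // reads exactly like a built-in two-dimensional array.
    Object** operator[](std::size_t row) { return cells.data() + row * n_cols; }

    std::size_t rows() const { return n_rows; }
    std::size_t cols() const { return n_cols; }
private:
    std::size_t n_rows;
    std::size_t n_cols;
    std::vector<Object*> cells;   // row-major; does not own the pointed-to objects
};
```

A single contiguous block replaces the array of row pointers used in the C idiom, while the calling code keeps the familiar a[i][j] notation. No bounds checking is done in this sketch.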

In document An object-oriented library for shared-memory parallel simulations (pages 62-65)