
DEC OSF/1 Version 3.0 Symmetric Multiprocessing Implementation

The code base on which the DEC OSF/1 product is built, i.e., the Open Software Foundation's OSF/1 software, provides a strong foundation for SMP. The OSF further strengthened this foundation in OSF/1 versions 1.1 and 1.2, when it corrected multiple SMP problems in the code base and parallelized (and thus unfunneled) additional subsystems. As the multiprocessing bootstrap effort continued, the team analyzed and incorporated the OSF/1 version 1.2 SMP improvements into DEC OSF/1 version 3.0. As strong as this starting point was, however, some structures in the system did not receive the appropriate level of synchronization. The team corrected these problems as they were uncovered through testing and code inspection.

The DEC OSF/1 operating system uses a combination of simple locks, complex locks, elevated SPL, and funneling to guarantee synchronized access to system resources and data structures. Simple locks, SPL, and funneling were described briefly in the earlier discussion of preemption. Complex locks, like elevated SPL, are used in both uniprocessor and multiprocessor environments. These locks are usually sleep locks (threads can block while they wait for the lock) that offer additional features, including multiple-reader/single-writer access and recursive acquisition.
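The multiple-reader/single-writer rule mentioned above can be illustrated with a minimal sketch in C11. This is not the DEC OSF/1 complex lock interface: the type `rw_sketch_t` and all function names are hypothetical, the sketch only shows the admission rule through non-blocking "try" operations, and it models neither sleeping nor recursive acquisition.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical admission rule for a reader/writer lock:
 *   state >= 0  : number of readers currently holding the lock
 *   state == -1 : exactly one writer holds the lock            */
typedef struct { atomic_int state; } rw_sketch_t;

void rw_init(rw_sketch_t *l) { atomic_init(&l->state, 0); }

/* Readers are admitted as long as no writer holds the lock. */
bool rw_try_read(rw_sketch_t *l) {
    int s = atomic_load(&l->state);
    while (s >= 0) {
        /* On failure, s is refreshed with the current state. */
        if (atomic_compare_exchange_weak(&l->state, &s, s + 1))
            return true;
    }
    return false;   /* writer active; a sleep lock would block here */
}

/* A writer is admitted only when the lock is completely free. */
bool rw_try_write(rw_sketch_t *l) {
    int expected = 0;
    return atomic_compare_exchange_strong(&l->state, &expected, -1);
}

void rw_read_done(rw_sketch_t *l) {
    int prev = atomic_fetch_sub(&l->state, 1);
    assert(prev > 0);      /* caller must have held a read lock */
    (void)prev;
}

void rw_write_done(rw_sketch_t *l) {
    int prev = atomic_exchange(&l->state, 0);
    assert(prev == -1);    /* caller must have held the write lock */
    (void)prev;
}
```

With this rule, two successive calls to `rw_try_read` both succeed, a following `rw_try_write` fails until both readers call `rw_read_done`, and no reader is admitted while a writer holds the lock.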

An example of the use of each synchronization technique follows:

• A simple lock is used to protect the kernel's callout (timer) queue. In an SMP environment, multiple threads can update the callout queue at the same time, as each of them adds a timer entry to the queue. Each thread must obtain the callout lock before adding an entry and release the lock when done. The callout simple lock is also a good example of SPL synchronization under multiprocessing because the callout queue is scanned by the system clock ISR. Therefore, before locking the callout lock, a thread must raise the SPL to the clock's IPL. Otherwise, the thread holding the callout lock at an SPL of zero can be interrupted by the clock ISR, which will in turn attempt to take the callout lock. The result is a permanent deadlock.

• A complex lock protects the file system directory structure. A blocking lock is required because the directory lock holder must perform I/O to update the directory, which itself can block. Whenever blocking can occur while a lock is held, a complex lock is required.

• Funneling is used to synchronize access to the ISO 9660 CD-ROM file system. The decision to funnel this file system was largely due to limitations in the DEC OSF/1 version 3.0 schedule; however, the file system is a good choice for funneling because of its generally slow operation and light usage.
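The callout-queue deadlock described in the first example can be modeled with a small single-CPU simulation. This is only a sketch: the SPL values, the variable names, and the functions `spl_raise`, `spl_restore`, and `add_timer` are assumptions made for illustration and are not the kernel's actual interfaces. The model asks one question: if a clock interrupt arrives while the callout lock is held, does the ISR run on the CPU that already holds the lock?

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical priority levels; real SPL/IPL values are platform specific. */
enum { SPL_ZERO = 0, SPL_CLOCK = 5 };

static int  cur_spl = SPL_ZERO;      /* current SPL of the one modeled CPU */
static bool callout_locked = false;  /* state of the callout simple lock */
static int  callout_entries = 0;     /* stand-in for the callout queue */

/* Raise the SPL and return the previous level so it can be restored. */
int spl_raise(int level) {
    int old = cur_spl;
    if (level > cur_spl)
        cur_spl = level;
    return old;
}

void spl_restore(int old) { cur_spl = old; }

/* Models a clock interrupt arriving now. Returns true if the ISR would
 * spin forever: it is allowed to run (SPL below the clock's IPL) on the
 * same CPU whose interrupted thread holds the callout lock. */
bool clock_isr_would_deadlock(void) {
    if (cur_spl >= SPL_CLOCK)
        return false;          /* interrupt is masked and stays pending */
    return callout_locked;     /* ISR needs a lock its own CPU holds */
}

/* Adds a timer entry and reports whether a clock interrupt arriving
 * inside the critical section would deadlock. */
bool add_timer(bool raise_spl_first) {
    int old = cur_spl;
    if (raise_spl_first)
        old = spl_raise(SPL_CLOCK);

    assert(!callout_locked);
    callout_locked = true;                       /* take the simple lock */
    bool deadlock = clock_isr_would_deadlock();  /* interrupt arrives here */
    callout_entries++;                           /* add the timer entry */
    callout_locked = false;                      /* release the lock */

    spl_restore(old);
    return deadlock;
}
```

In this model `add_timer(false)` reports a deadlock, because the lock is taken at an SPL of zero, while `add_timer(true)` does not, because the clock interrupt stays masked until the lock has been released and the SPL restored.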

To ensure adequate performance and scaling as processors are added to the system, an SMP implementation must provide for as much parallelism through the kernel as possible. The granularity of locks placed in the system has a major impact on the amount of parallelism obtained.

Digital Technical Journal Vol. 6 No. 3 Summer 1994

During multiprocessing development, locking strategies were designed to

• Reduce the total number of locks per subsystem

• Reduce the number of locks taken per subsystem operation

• Improve the level of parallelism throughout the kernel

At times, these goals clashed: enhancing parallelism usually involves adding a lock to some structure or code path. This outcome conflicts with the goal of reducing lock counts. Consequently, in practice, the process of successfully parallelizing a subsystem involves striking a balance between lock reduction and the resulting increase in lock granularity. Often, benchmarking different approaches is required to fine-tune this balance.
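The tension between lock count and parallelism can be made concrete with a small sketch. The hash-table layout, `NBUCKETS`, and every function name below are hypothetical and are not taken from the DEC OSF/1 sources. The same table is protected either by one coarse lock or by one lock per bucket, and non-blocking try-locks show which operations would contend.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NBUCKETS 8   /* hypothetical table size */

struct table {
    unsigned    nlocks;            /* 1 = one coarse lock, NBUCKETS = per bucket */
    atomic_bool locks[NBUCKETS];   /* only the first nlocks entries are used */
};

void table_init(struct table *t, unsigned nlocks) {
    assert(nlocks >= 1 && nlocks <= NBUCKETS);
    t->nlocks = nlocks;
    for (unsigned i = 0; i < NBUCKETS; i++)
        atomic_init(&t->locks[i], false);
}

/* Map a key to the lock that protects its bucket. */
atomic_bool *lock_for(struct table *t, unsigned key) {
    return &t->locks[(key % NBUCKETS) % t->nlocks];
}

/* Returns true if the lock was free and is now held by the caller. */
bool table_try_lock(struct table *t, unsigned key) {
    bool expected = false;
    return atomic_compare_exchange_strong(lock_for(t, key), &expected, true);
}

void table_unlock(struct table *t, unsigned key) {
    atomic_store(lock_for(t, key), false);
}
```

With `nlocks` set to 1, operations on keys 1 and 2 contend for the same lock even though they touch different buckets. With `nlocks` set to `NBUCKETS`, they proceed in parallel and only keys that share a bucket (for example 1 and 9) contend, at the cost of eight locks instead of one. Which setting performs better is the kind of question the benchmarking mentioned above would have to answer.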

Several general trends were uncovered during lock analysis and tuning. In some cases locks were removed because they were not needed; they were the products of overzealous synchronization. For example, a structure that is private to a thread may require no locking at all. Moreover, a data element that is read atomically needs no locking. An example of lock removal is the gettimeofday() system call, which is used frequently by DBMS servers. The system call simply reads the system time, a 64-bit quantity, and copies it to a buffer provided by the caller. The original OSF/1 system call, running on a 32-bit architecture, had to take a simple lock before reading the time to guarantee a consistent value. On the Alpha architecture, the system call can read the entire 64-bit time value atomically. Removing the lock resulted in a 40 percent speedup.
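A short sketch shows why the 32-bit version needed the lock and the 64-bit version does not. It is an illustration only: `split_time`, `torn_read`, `set_time`, and `get_time` are hypothetical names, the order in which the writer stores the two halves is an assumption, and C11 atomics stand in for the Alpha's native 64-bit load. The first part simulates a reader that runs between the writer's two 32-bit stores; the second part reads the whole value in one atomic operation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* 32-bit model: the 64-bit time is kept as two separately stored halves. */
struct split_time { uint32_t hi, lo; };

uint64_t join(const struct split_time *t) {
    return ((uint64_t)t->hi << 32) | t->lo;
}

/* Simulates an unlocked reader that runs between the writer's two
 * half-stores (assumed order: low half first, then high half) and
 * returns the value that reader observes. */
uint64_t torn_read(uint64_t old_time, uint64_t new_time) {
    /* The illustration needs an update that changes the high half. */
    assert((uint32_t)(old_time >> 32) != (uint32_t)(new_time >> 32));

    struct split_time t = { (uint32_t)(old_time >> 32), (uint32_t)old_time };
    t.lo = (uint32_t)new_time;           /* writer stores the low half */
    uint64_t seen = join(&t);            /* reader runs without a lock */
    t.hi = (uint32_t)(new_time >> 32);   /* writer stores the high half */
    return seen;
}

/* 64-bit model: one atomic store, one atomic load, no lock.
 * On 64-bit targets these are typically single instructions; on some
 * 32-bit targets the compiler may fall back to a library lock. */
static _Atomic uint64_t system_time;

void set_time(uint64_t now) { atomic_store(&system_time, now); }
uint64_t get_time(void)     { return atomic_load(&system_time); }
```

When the time advances from 0x00000000FFFFFFFF to 0x0000000100000000, the simulated unlocked reader in `torn_read` sees 0, a value that matches neither the old nor the new time. The atomic version can only ever return one of the values that was actually stored.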

In other cases, analyzing how structures are used revealed that no locking was needed. For example, an I/O control block called the buf structure was being locked in several device drivers while the block was in a state that allowed only the device driver to access it. Removing these unnecessary locks saved one complex and one simple locking sequence per I/O operation in these drivers.

Another effective optimization involved postponing locking until a thread determined that it had actual work to do. This technique was used successfully in a routine frequently called in a transaction processing benchmark. The routine, which was locking structures in anticipation of following a rarely used code path, was modified to lock only when the uncommon code path was needed. This optimization significantly reduced lock overhead.

To improve parallelism across the system, the DEC OSF/1 SMP development team modified the lock strategies in numerous other cases.
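The optimization of postponing locking until there is actual work to do can be sketched as follows. The routine, the flag `rare_work_pending`, and the spin lock are hypothetical stand-ins, since the article does not name the routine or its structures, and the acquisition counter exists only to make the saving visible. The deferred version checks cheaply first, takes the lock only on the uncommon path, and rechecks the condition once the lock is held.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool lock_word;           /* hypothetical simple (spin) lock */
static atomic_bool rare_work_pending;   /* set when the uncommon path is needed */
static int lock_acquisitions;           /* instrumentation for this sketch only */
static int rare_work_done;

void lock_take(void) {
    bool expected = false;
    while (!atomic_compare_exchange_weak(&lock_word, &expected, true))
        expected = false;               /* spin until the lock is free */
    lock_acquisitions++;                /* updated while holding the lock */
}

void lock_release(void) {
    bool was_held = atomic_exchange(&lock_word, false);
    assert(was_held);
    (void)was_held;
}

/* Original shape: lock on every call, in anticipation of the rare path. */
void routine_eager(void) {
    lock_take();
    if (atomic_load(&rare_work_pending)) {
        atomic_store(&rare_work_pending, false);
        rare_work_done++;
    }
    lock_release();
}

/* Modified shape: lock only when the uncommon path is actually needed. */
void routine_deferred(void) {
    if (!atomic_load(&rare_work_pending))
        return;                         /* common path: no lock taken */
    lock_take();
    if (atomic_load(&rare_work_pending)) {   /* recheck under the lock */
        atomic_store(&rare_work_pending, false);
        rare_work_done++;
    }
    lock_release();
}
```

Calling `routine_deferred` repeatedly while no rare work is pending leaves `lock_acquisitions` unchanged, whereas every call to `routine_eager` increments it. The recheck under the lock matters because another thread may have handled the pending work between the unlocked test and the lock acquisition.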
