

Parallelism is the driving force of computer design, with energy and cost being the primary design constraints.

There are basically two kinds of parallelism in applications:

1. Data-Level Parallelism (DLP): there are many data items that can be operated on at the same time.

2. Task-Level Parallelism (TLP): arises because tasks of work are created that can operate independently and largely in parallel.

Computer hardware can exploit these two kinds of application parallelism in four major ways:

1. Instruction-Level Parallelism (ILP)

• Exploits DLP with compiler help.

• All processors since about 1985 use pipelining to overlap the execution of instructions and improve performance.

• This potential overlap among instructions is called Instruction-Level Parallelism.

• The instructions can be evaluated in parallel.

2. Vector Architectures and Graphics Processor Units (GPUs)

Exploit DLP by applying a single instruction to a collection of data in parallel.

3. Thread-Level Parallelism

Exploits either DLP or TLP in a tightly coupled hardware model that allows for interaction among parallel threads.

4. Request-Level Parallelism

Exploits parallelism among largely decoupled tasks specified by the programmer or the operating system.

Michael Flynn placed all computers into one of four categories:

1. Single Instruction Single Data (SISD) stream:

• Uniprocessor category.

• Standard sequential computer, but it can exploit ILP.

• SISD architectures that use ILP techniques such as superscalar.

2. Single Instruction Multiple Data (SIMD) stream:

• In a SIMD machine, the same instruction is executed by multiple processors using different data streams.

• Each processor has its own data memory, but there is only one instruction memory and control processor, which fetches and dispatches instructions.

• It exploits DLP by applying the same operations to multiple items of data in parallel.

3. Multiple Instruction Single Data (MISD) stream:

No commercial multiprocessor of this type has been built to date.

4. Multiple Instruction Multiple Data (MIMD) stream:

Each processor fetches its own instructions and operates on its own data.

These processors either use a centralized shared-memory architecture, or each has its own memory and they communicate with each other through crossbar networks.
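The difference between the SIMD and MIMD categories can be modeled in a few lines. This is a toy sketch under stated assumptions, not a description of real hardware: simd_step and mimd_step are hypothetical names, each list element stands for one processor (or lane), and Python's operator functions stand in for machine instructions.

```python
import operator


def simd_step(op, lanes_a, lanes_b):
    # SIMD: one instruction (op) is dispatched to every lane,
    # and each lane applies it to its own data.
    return [op(a, b) for a, b in zip(lanes_a, lanes_b)]


def mimd_step(programs, data):
    # MIMD: each processor has its own instruction and its own data.
    return [op(a, b) for op, (a, b) in zip(programs, data)]


if __name__ == "__main__":
    # One add instruction, four data lanes.
    print(simd_step(operator.add, [1, 2, 3, 4], [10, 20, 30, 40]))
    # Three processors, each running a different instruction.
    print(mimd_step([operator.add, operator.mul, operator.sub],
                    [(1, 2), (3, 4), (9, 5)]))
```

In the SIMD call there is a single op for all lanes, which mirrors the single instruction memory and control processor; in the MIMD call each entry in programs plays the role of a separately fetched instruction stream.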

/IM !rocessors can e1!loit data !arallelism, "ut are not as De1i"le as MIM !rocessors# They are suita"le for algorithms with high data

!arallelism and little data de!endent control Dow#

MIM !rocessors are more De1i"le, they can "e either function as single-user machines, focusing on high !erformance for one !articular a!!lication or as multi-!rogrammed machines running many tas)s simultaneously#

>owever they are much more e1!ensive and com!licated due to

re!lication of control hardware, high instruction "andwidth re0uirement and /ynchroni8ation of data !ath#

Kesides !ure /IM and MIM a!!roaches, a com"ination of "oth /IM

and MIM a!!roaches is also !ossi"le, e1!loiting the advantages of "oth /IM

and MIM architectures#

 Tightly cou!led MIM architectures e1!loits T2P , since multi!le coo!erating  Threads o!erate in !arallel#

2oosely cou!led MIM architectures 3Clusters and 7/C4 e1!loits R2P, where many inde!endent tas)s can !roceed in !arallel with little need for communication and /ynchroni8ation#

MULTITHREADING

M+!'i'-rea/ing /imultaneous e1ecution of two or more threads

"y the multi!le !rocessors#

On a /ingle !rocessor, Multithreading generally occurs "y Time

ivision Multi!le1ing 3TM4# The !rocessor switches "etween dierent threads#

On a Multi!rocessor the threads or tas)s will actually run at the same time with each !rocessor or core running as !articular thread or tas)#
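Time division multiplexing on a single processor can be sketched with Python generators standing in for threads. This is an illustrative model only: worker, time_division_multiplex, and the quantum parameter are hypothetical names introduced here, and a real scheduler also saves and restores register state on every switch.

```python
from collections import deque


def worker(name, steps):
    # A "thread" that performs `steps` units of work, one per time slot.
    for i in range(steps):
        yield f"{name}:{i}"


def time_division_multiplex(threads, quantum=1):
    # Give each ready thread `quantum` slots in turn on the one processor.
    ready = deque(threads)
    trace = []
    while ready:
        thread = ready.popleft()
        for _ in range(quantum):
            try:
                trace.append(next(thread))
            except StopIteration:
                break  # thread finished: do not requeue it
        else:
            ready.append(thread)  # quantum used up: back of the queue
    return trace


if __name__ == "__main__":
    print(time_division_multiplex([worker("A", 2), worker("B", 3)]))
    # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

Only one thread makes progress in any given slot; the threads merely appear to run at the same time because the slots are interleaved.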

Types

1. Coarse-grained Multithreading.

2. Fine-grained Multithreading.

3. Simultaneous Multithreading.

A/an'ages "3 M+!'i'-rea/ing

# If a thread gets a lot of cache misses, the other thread can continue, ta)ing

advantage of unused com!uting resources, which thus can lead to faster overall

e1ecution, as these resources would have "een idle if only a single thread was


Disa/an'ages 

# Multi!le threads can interfere with each other, when sharing hardware

resources such as caches or T2P#

9# E1ecution time of a single threads are not im!roved, due to slower fre0uency or

adding !i!eline stages that are necessary to accommodate thread switching >?7#

:# Re0uires more changes to "oth a!!lica"le !rograms and O/ than multi!rocessing#

Coarse-grained Multithreading

Also known as block or cooperative multithreading.

The simplest type of multithreading: one thread runs until it is blocked by an event that would normally create a long-latency stall.

Such a stall might be a cache miss that has to access off-chip memory, which might take a huge number of CPU cycles for the data to return.

Instead of waiting for the stall to resolve, a threaded processor switches execution to another thread that is ready to run.
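The switch-on-stall behavior can be sketched as a small cycle-by-cycle simulation. This is a simplified model under assumptions that are not in the slides: each thread is a list of operations, 'C' for compute and 'M' for a cache miss, the miss latency is a fixed number of cycles, and switching threads costs nothing. The names coarse_grained and miss_latency are hypothetical.

```python
def coarse_grained(threads, miss_latency=3):
    # threads: dict mapping thread name -> list of ops ('C' or 'M').
    # Returns which thread issued in each cycle; '-' marks an idle cycle.
    order = list(threads)
    if not order:
        return []
    pc = {name: 0 for name in order}        # next op index per thread
    ready_at = {name: 0 for name in order}  # cycle when a miss resolves

    def runnable(name, cycle):
        return pc[name] < len(threads[name]) and ready_at[name] <= cycle

    current = order[0]
    trace = []
    cycle = 0
    while any(pc[name] < len(threads[name]) for name in order):
        if not runnable(current, cycle):
            # Switch only when the running thread is stalled or finished.
            candidates = [n for n in order if runnable(n, cycle)]
            if not candidates:
                trace.append("-")  # every thread is waiting on memory
                cycle += 1
                continue
            current = candidates[0]
        op = threads[current][pc[current]]
        pc[current] += 1
        trace.append(current)
        if op == "M":
            # Long-latency stall: this thread cannot issue until data returns.
            ready_at[current] = cycle + 1 + miss_latency
        cycle += 1
    return trace


if __name__ == "__main__":
    print(coarse_grained({"A": ["C", "M", "C"], "B": ["C", "C"]}))
    # ['A', 'A', 'B', 'B', '-', 'A']
```

Thread A keeps the processor until its miss, B fills the stall cycles, and one idle cycle remains because B runs out of work before A's data returns.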


Fine-grained Multithreading

Its purpose is to remove all dependency stalls from the execution pipeline.

Since one thread is relatively independent of the others, there is less chance of an instruction in one pipeline stage needing an output from an older instruction in the pipeline.

Hardware Cost

There is an additional cost for each pipeline stage to track the thread ID of the instruction it is processing.

Since more threads are being executed concurrently in the pipeline, demand on shared resources increases. Caches need to be larger to avoid thrashing between the different threads.

Si+!'ane"+s M+!'i'-rea/ing

Most advanced ty!e of multithreading a!!lied to su!erscalar
