The design effort focused on two implementation considerations: optimised use of the silicon area to improve the parallelism of integrated nodes, and a short implementation time compatible with the thesis schedule. The combination of VLSI design tools used allowed us to complete the chip design and to meet these two considerations partially.

We experienced some difficulties because we did not use an integrated development system. Producing the final layout took longer than expected, and some planned tasks could not be carried out. For instance, a consistent functional specification for all cells and blocks was produced in order to run a functional test. However, the lack of functional models for nMOS and pMOS transistors in our version of the SOLO system, together with our limited time, prevented us from performing this test. The design tools also provided insufficient support for layout measurements and therefore delivered poor statistics on performance and transistor count. Nevertheless, the automated place and route system played an important role in floor planning and area reduction.

The floor plan organisation of the prototype takes pin access and terminal proximity into account in order to optimise size and routing. Figure 7.6 shows the schematic floor plan for the prototype. The control part is aligned on the left, while the operative part is on the right. The left border contains the control pins, while the busses run along the remaining borders. The relative placement of the blocks was designed to minimise the distance between connecting terminals, reducing routing lengths and propagation times.

Between the PLA column and the operative blocks lies a collection of random logic comprising the register drivers and the flip-flops that store machine states. These flip-flops are arranged as shift registers so that the internal state, together with some important control signals, can be observed by scanning. The drivers are qualified with the global phases to activate access lines in the operative part.
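The scan arrangement described above can be illustrated with a small behavioural model. This is only a sketch in Python and not the prototype's circuit: the class name `ScanChain`, the four-bit width and the bit ordering are assumptions chosen for illustration.

```python
class ScanChain:
    """Hypothetical model of state flip-flops wired as a shift register."""

    def __init__(self, length):
        self.bits = [0] * length

    def load(self, state_bits):
        # Normal mode: every flip-flop captures its own state bit in parallel.
        if len(state_bits) != len(self.bits):
            raise ValueError("state width must match chain length")
        self.bits = list(state_bits)

    def shift(self, scan_in=0):
        # Scan mode: each bit moves one position towards the output,
        # and the last flip-flop drives the scan-out pin.
        scan_out = self.bits[-1]
        self.bits = [scan_in] + self.bits[:-1]
        return scan_out

    def scan_out_all(self):
        # Shift the whole chain out; the bit nearest the output emerges first.
        return [self.shift() for _ in range(len(self.bits))]


chain = ScanChain(4)
chain.load([1, 0, 1, 1])          # assumed machine state, for illustration
observed = chain.scan_out_all()   # state read serially, last bit first
print(observed)
```

In this model the internal state is read one bit per shift through a single output, which is the reason a scan path needs very few extra pins to make the machine state observable.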

[Figure: floor plan diagram. Legible block labels include Network Bus Pads, Output Queue, Input Queue, Method Cache, Object Cache and Memory Bus Pads.]

Figure 7.6: Schematic Floor Plan

The prototype was implemented using a 2 µm CMOS process technology with double-metal interconnection layers. It was designed for an academic multi-project service, which limits the maximum size to 100 mm². The features of the BROOM node for the current implementation are summarised in Table 7.1.

Number of Processors            1
Data Length                     32-b
Instruction Length              8-b
ALU                             32-b
External Bus Length             32-b each (Network and Memory)
Method and Object Cache         32 x 32-b (total)
Message Queues                  128 x 32-b (total)
Control PLAs                    10
Machine Cycle                   100 ns (estimated)
Total I/O Bandwidth             40 Mbyte/s per bus (estimated)
Package                         100 pin PGA (92 used)
Number of standard gate cells   743
Number of full-custom cells     179
Number of transistors           50k (estimated)
Device technology               2 µm CMOS
Die Size                        7.6 x 8.5 mm (64.6 mm²)

Table 7.1: BROOM Node Processor Features

The chip contains about 50K transistors (the precise number could not be established), 743 standard cells and 179 customised cells. The layout measures 7.6 x 8.5 mm, giving a total die area of 64.6 mm², comfortably within the size limit. The 92 pins used include 64 I/O data pins; the remainder serve control, supply/clock and test.
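The area and pin figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the text; the variable names are illustrative, and the final comparison assumes that all 64 data pins belong to the two 32-bit external busses.

```python
# Figures quoted in the text; names are illustrative only.
DIE_WIDTH_MM = 7.6
DIE_HEIGHT_MM = 8.5
AREA_LIMIT_MM2 = 100.0   # maximum size allowed by the multi-project service
PINS_USED = 92
DATA_PINS = 64
EXTERNAL_BUSSES = 2      # Network and Memory
BUS_WIDTH_BITS = 32

die_area_mm2 = round(DIE_WIDTH_MM * DIE_HEIGHT_MM, 1)
area_margin_mm2 = round(AREA_LIMIT_MM2 - die_area_mm2, 1)

# Pins left over for control, supply/clock and test taken together.
other_pins = PINS_USED - DATA_PINS

# Assumption: every data pin belongs to one of the two external busses.
data_pins_match_busses = DATA_PINS == EXTERNAL_BUSSES * BUS_WIDTH_BITS

print(f"die area:    {die_area_mm2} mm^2")
print(f"area margin: {area_margin_mm2} mm^2")
print(f"other pins:  {other_pins}")
```

The text does not break the remaining pins down further, so the sketch reports them only as a single total.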

In the experimental evaluation of operating speed, SPICE was used to estimate the performance of possible critical paths in the processor's circuit. One possible critical path is the long carry chain in the arithmetic logic, which has a delay equivalent to 16 cascaded inverters. Operations involving 32-bit propagation along the chain are allowed two phases to complete, which extends the safety margin for long additions or subtractions. The other suspect path is the cache-to-register transfer, which involves decoding the cache address and delays in the internal and operative busses. The estimate from the SPICE tests and the standard library data is that both paths can operate with a 10 MHz clock.
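As a rough consistency check, the 10 MHz clock estimate can be related to the machine cycle and bus bandwidth listed in Table 7.1. The sketch below assumes that one machine cycle equals one clock period and that each bus moves one full 32-bit word per cycle; neither assumption is stated explicitly in the text.

```python
CLOCK_HZ = 10_000_000     # 10 MHz estimate from the SPICE runs
BUS_WIDTH_BITS = 32       # external bus width from Table 7.1

# Assumption: one machine cycle corresponds to one clock period.
cycle_ns = 1_000_000_000 // CLOCK_HZ

# Assumption: one full-width transfer per machine cycle on each bus.
bytes_per_transfer = BUS_WIDTH_BITS // 8
bandwidth_mbyte_s = bytes_per_transfer * CLOCK_HZ // 1_000_000

print(f"machine cycle: {cycle_ns} ns")
print(f"bandwidth per bus: {bandwidth_mbyte_s} Mbyte/s")
```

Under these assumptions the result agrees with the estimated 100 ns machine cycle and 40 Mbyte/s per bus given in the table.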

Chapter 8: Assessment

This chapter presents the goals set for this work and assesses the results obtained. The goals are divided into three levels: the virtual machine, the BROOM architecture and the VLSI implementation. The main assessment criteria are the suitability, parallelism and integration of the different levels of the project.
