The CQBIC interfaces the CDAL bus to the Q22-bus. This chip provides address translation between the 26-bit CDAL bus and the 22-bit Q22-bus. In addition, the CQBIC handles data buffering between the 32-bit synchronous/asynchronous CDAL bus and the 16-bit asynchronous Q22-bus. Q22-bus addresses are translated to CDAL bus addresses by a programmable mapping function (scatter-gather map), which is software compatible with the MicroVAX II system. This function gives the CPU the capability to map any page of the 4 megabyte (MB) Q22-bus address space to any page of the main memory address space. Thus Q22-bus DMA devices can transfer directly to or from discontiguous pages of main memory. CDAL bus addresses are translated into Q22-bus addresses by a direct mapping function. This function maps the 4MB Q22-bus memory space and the 8KB Q22-bus I/O space into the VAX I/O space. Thus the CPU can directly access Q22-bus memory or device registers by means of two ranges of I/O page addresses.
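The scatter-gather translation described above can be sketched in a few lines. This is only an illustrative model, not the chip's logic: it assumes the 512-byte VAX page size, and the entry layout used here (a valid flag in bit 31, the main memory page number in the remaining bits) and the names `translate` and `sg_map` are assumptions made for the example.

```python
PAGE_SIZE = 512                        # VAX page size in bytes
Q22_SPACE = 4 * 1024 * 1024            # 4MB, the 22-bit Q22-bus address space
MAP_ENTRIES = Q22_SPACE // PAGE_SIZE   # one map entry per Q22-bus page
VALID = 1 << 31                        # assumed position of the valid flag


def translate(q22_addr, sg_map):
    """Translate a Q22-bus address into a main memory address."""
    if not 0 <= q22_addr < Q22_SPACE:
        raise ValueError("address outside the 22-bit Q22-bus space")
    page, offset = divmod(q22_addr, PAGE_SIZE)
    entry = sg_map[page]
    if not entry & VALID:
        raise LookupError(f"Q22-bus page {page} is not mapped")
    frame = entry & (VALID - 1)        # assumed: low bits hold the memory page number
    return frame * PAGE_SIZE + offset


# Two adjacent Q22-bus pages mapped to discontiguous pages of main memory.
sg_map = [0] * MAP_ENTRIES
sg_map[3] = VALID | 0x1234
sg_map[4] = VALID | 0x0042
```

With 512-byte pages the 4MB space needs 8192 entries; at four bytes per entry (an assumption consistent with the 32KB map size quoted later in the text) the map occupies 32KB.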
Digital Technical Journal No. 7 August 1988
DMA write references are buffered in two naturally aligned octaword buffers and transferred to main memory by the most efficient combination of multiword transfers. The two octaword buffers allow an entire block-mode transfer (up to 16 words) to be buffered by the CQBIC. After the first buffer has been filled by the Q22-bus device, it is emptied into main memory while the Q22-bus device fills the second buffer. Since the CDAL bus is faster than the Q22-bus, the first buffer is emptied and ready for input from the Q22-bus device before the second buffer has been filled. This arrangement allows the interface to provide sustained throughput at maximum Q22-bus transfer rates with no additional latency. Q22-bus block-mode DMA read references are translated into quadword transfers on the CDAL bus. The four words are buffered in a single quadword buffer and supplied to the DMA device on demand. Before the buffer is emptied, the next quadword is prefetched. This prefetch eliminates additional latency on all but the first transfer. To keep the latency of the first transfer at a minimum, the CQBIC responds to the DMA device after receiving the first longword of a quadword CDAL bus cycle, rather than waiting for the entire quadword transfer to complete.
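The benefit of the two write buffers can be illustrated with a small timing model. The sketch below is not taken from the design: the function name `device_stall` and the fill and drain times (arbitrary units) are hypothetical. It only shows that when draining a buffer is faster than filling one, a second buffer lets the device run without waiting.

```python
def device_stall(buffers_to_send, fill_time, drain_time, n_buffers=2):
    """Total time the DMA device waits for a free buffer (arbitrary units)."""
    drained_at = []   # completion time of each buffer's transfer to memory
    now = 0           # time as seen by the Q22-bus device
    memory_free = 0   # time at which the memory side finishes its current drain
    stall = 0
    for k in range(buffers_to_send):
        if k >= n_buffers:
            # The device reuses the buffer it filled n_buffers rounds ago,
            # so that buffer must have been emptied first.
            ready = drained_at[k - n_buffers]
            if ready > now:
                stall += ready - now
                now = ready
        now += fill_time                  # device fills the buffer
        start = max(now, memory_free)     # drain starts once memory side is idle
        memory_free = start + drain_time
        drained_at.append(memory_free)
    return stall
```

With a drain time shorter than the fill time, `device_stall(8, fill_time=4, drain_time=1)` reports no waiting, whereas the same call with `n_buffers=1` makes the device wait after every buffer.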
To fit the entire Q22-bus interface in a single chip, some changes had to be made to the bus interface architecture of the MicroVAX II system. On the MicroVAX II, the scatter-gather map was stored in a dedicated 32KB static RAM array within the bus interface. On the CQBIC, not enough space was available to implement this storage array internal to the chip. Moreover, not enough pins were available to provide a dedicated bus to an external static RAM array. The solution was to store the scatter-gather map in a 32KB block of main memory and to implement a 16-entry fully associative cache for map entries in the CQBIC. The cache functions in the same manner as an address translation buffer. When translating a Q22-bus address, the cache is checked for the appropriate map entry. If the entry is found, the translation takes place at maximum speed. If the entry is not found, then there is a delay while the entry is fetched from main memory. The translation is then performed. This delay is eliminated on DMA transfers that cross a page boundary, because the entry that maps the next page is prefetched when the DMA operation reaches a page boundary. On most DMA transfers, this delay is negligible because it is amortized over a large number of Q22-bus transfers.
The design ensures that the operating system does not attempt to use the block of memory where the scatter-gather map resides. The on-board firmware does not include these pages in a list of good memory pages that is passed to the operating system at boot time. An interesting side effect of putting the scatter-gather map in main memory was that the relatively long latency on some Q22-bus DMA cycles uncovered latent design bugs in several Q22-bus DMA devices. The designs of these devices had been verified by empirical testing with existing processors rather than by testing to the Q22-bus specification.
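The behavior of the map-entry cache described above (a lookup, a fetch from main memory on a miss, and a prefetch at a page boundary) can be sketched as follows. This is a simplified model under stated assumptions: the text does not give the replacement policy, so least-recently-used is assumed here, and the names `MapCache` and `dma_stream` are hypothetical.

```python
from collections import OrderedDict

PAGE_SIZE = 512      # VAX page size in bytes
CACHE_ENTRIES = 16   # size of the fully associative cache


class MapCache:
    def __init__(self, sg_map):
        self.sg_map = sg_map           # stands in for the 32KB map in main memory
        self.entries = OrderedDict()   # Q22-bus page number -> map entry
        self.memory_fetches = 0

    def _fetch(self, page):
        self.memory_fetches += 1
        if len(self.entries) >= CACHE_ENTRIES:
            self.entries.popitem(last=False)   # assumed policy: evict the LRU entry
        self.entries[page] = self.sg_map[page]

    def lookup(self, q22_addr):
        """Return (map entry, hit flag) for a Q22-bus address."""
        page = q22_addr // PAGE_SIZE
        hit = page in self.entries
        if not hit:
            self._fetch(page)          # the delay described in the text
        self.entries.move_to_end(page)
        return self.entries[page], hit

    def prefetch_next(self, q22_addr):
        nxt = q22_addr // PAGE_SIZE + 1
        if nxt < len(self.sg_map) and nxt not in self.entries:
            self._fetch(nxt)


def dma_stream(cache, start, n_words):
    """Count translation misses seen by a sequential transfer of 16-bit words."""
    misses = 0
    for i in range(n_words):
        addr = start + 2 * i
        _, hit = cache.lookup(addr)
        if not hit:
            misses += 1
        if addr % PAGE_SIZE == PAGE_SIZE - 2:   # last word of the current page
            cache.prefetch_next(addr)
    return misses
```

In this model a sequential transfer spanning several pages misses only on its first word, since every later page's entry has already been prefetched.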
To maintain software compatibility with the MicroVAX II system, the scatter-gather map is referenced through a 32KB block of I/O space addresses. The CQBIC responds to writes in this address range by buffering the data so the CVAX cycle can complete, updating the cache if there is a hit, requesting the CDAL bus, and updating the entry in main memory. If any DMA operations are pending, they are completed before the CQBIC gives up the CDAL bus. This prevents multiple successive map updates by the CPU from locking out DMA activity long enough to cause Q22-bus devices to time out (in 10 microseconds).
On reads to this address range that miss the cache, the CQBIC has to latch the address and force the CVAX to retry the cycle. In this way, the CQBIC can acquire the CDAL bus to fetch the entry from main memory. When the CQBIC relinquishes the CDAL bus, the CVAX retries the cycle, and the CQBIC provides the processor with the requested map entry. This retry mechanism is also used to implement the interlocked instructions in the VAX instruction set.
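The CPU's view of the map registers, as described in the two paragraphs above, can be modeled roughly as below. The class name `MapRegisterPort`, the `RETRY` marker, and the four-byte entry size are assumptions for illustration. Bus arbitration and the pending-DMA rule are left out, and the model does not load the cache on a CPU read miss, a point the text does not settle.

```python
RETRY = object()   # hypothetical marker for "CVAX must retry the cycle"


class MapRegisterPort:
    def __init__(self, memory_map):
        self.memory_map = memory_map   # map block held in main memory
        self.cache = {}                # cached entries (size limit omitted)
        self.latched = None            # (index, value) fetched for a pending retry

    def write(self, offset, value):
        index = offset // 4            # assumed: 32-bit entries
        if index in self.cache:
            self.cache[index] = value  # the cache is updated only on a hit
        self.memory_map[index] = value # main memory is always updated

    def read(self, offset):
        index = offset // 4
        if index in self.cache:
            return self.cache[index]
        if self.latched is not None and self.latched[0] == index:
            value = self.latched[1]    # retried cycle: supply the fetched entry
            self.latched = None
            return value
        # Miss: latch the address, fetch the entry, and force a retry.
        self.latched = (index, self.memory_map[index])
        return RETRY
```

A first read of an uncached entry returns `RETRY`; repeating the same read returns the value fetched from main memory.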
On all interlocked instructions, the CVAX generates one or more sequences of a read-lock cycle followed immediately by a write-unlock cycle. The CVAX identifies these special locked cycles by placing a unique code on the parity lines at address time. The CQBIC recognizes the read-lock code and forces the CVAX to retry until the CQBIC can become master of the Q22-bus. Once the CQBIC has mastership of the Q22-bus, memory is effectively locked and the cycle proceeds. The CQBIC releases the Q22-bus (unlocking memory) on the next CVAX bus transaction even if it is not a write-unlock cycle. This release prevents memory from staying locked if the CVAX has to abort the instruction due to an error encountered on the read-lock cycle.
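The locking rule above amounts to a two-state machine, sketched here under simplifying assumptions. The class `Interlock`, the cycle names, and the `q22_bus_free` argument are hypothetical; the sketch models only when the Q22-bus is held and released, not the bus signals themselves.

```python
class Interlock:
    def __init__(self):
        self.q22_owned = False   # True while the CQBIC holds the Q22-bus

    def cvax_cycle(self, kind, q22_bus_free=True):
        """Handle one CVAX cycle; return 'retry' or 'proceed'."""
        if self.q22_owned:
            # The transaction after a read-lock always releases the bus,
            # whether or not it is the expected write-unlock.
            self.q22_owned = False
            return "proceed"
        if kind == "read_lock":
            if not q22_bus_free:
                return "retry"       # CVAX retries until mastership is obtained
            self.q22_owned = True    # memory is now effectively locked
            return "proceed"
        return "proceed"
```

If the CVAX aborts after the read-lock and issues an ordinary cycle instead of a write-unlock, the model still releases the bus, which is the property the text calls out.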
Overview of the MicroVAX 3500/3600 Processor Module
Like the MicroVAX II Q22-bus interface, the CQBIC gives the CPU the highest rather than the lowest priority when arbitrating the Q22-bus. This priority assignment reduces interrupt latency, since the processor is delayed for a maximum of one DMA transaction before being granted the bus to acknowledge the interrupt. Because the CPU accesses memory over a dedicated interconnect rather than through the Q22-bus, CPU references to the Q22-bus are very infrequent. Therefore this priority scheme does not have a negative impact on DMA performance.
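The effect of the priority choice on interrupt latency can be shown with a toy ordering model. The function names and device labels below are hypothetical, and the order among waiting DMA devices is simply taken as given. The sketch compares how many DMA transactions complete before the CPU is granted the bus under each scheme.

```python
def grant_order_cpu_highest(in_progress, waiting_dma):
    """Completion order when the CPU outranks all waiting DMA devices."""
    return [in_progress, "cpu"] + list(waiting_dma)


def grant_order_cpu_lowest(in_progress, waiting_dma):
    """Completion order when the CPU is served after all DMA devices."""
    return [in_progress] + list(waiting_dma) + ["cpu"]


def dma_transactions_before_cpu(order):
    """Number of DMA transactions the CPU waits for."""
    return order.index("cpu")


waiting = ["disk", "tape", "network"]
highest = grant_order_cpu_highest("disk", waiting)
lowest = grant_order_cpu_lowest("disk", waiting)
```

With highest priority the CPU waits for at most the one transaction already in progress; with lowest priority the wait grows with the number of devices queued.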
To support a range of CVAX microcycle times and fixed Q22-bus timing, the CQBIC was designed to run at a fixed clock rate, asynchronously to the CPU/memory subsystem. This design made it easier for engineers to optimize performance of the slower asynchronous Q22-bus (where bandwidth is at a premium). These optimizations are made at the expense of lower performance on the faster CDAL bus (where there is extra bandwidth) due to synchronization delays [7].