In addition to the previously described scenarios, there are special scenarios concerning the timing of the EADS and AHOLD assertions. The final result depends on when EADS and AHOLD are asserted relative to other processor-initiated operations.
4.8.7.1 Write Cycle Reordering due to Buffering
Scenario: The MESI cache protocol and the ability to perform and respond to snoop cycles guarantee that writes to the cache are logically equivalent to writes to memory. In particular, the order of read and write operations on cached data is the same as if the operations were on data in memory. Even non-cached memory read and write requests usually occur on the external bus in the same order that they were issued in the program. For example, when a write miss is followed by a read miss, the write data goes on the bus before the read request is put on the bus. However, the posting of writes in write buffers coupled with snooping cycles may cause the order of writes seen on the external bus to differ from the order they appear in the program. Consider the following example, which is illustrated in Figure 14. For simplicity, snooping signals that behave in their usual manner are not shown.
Step 1 AHOLD is asserted. No further processor-initiated accesses to the external bus can be started. No other access is in progress.
Step 2 The processor writes data A to the cache, resulting in a write miss. Therefore, the data is put into the write buffers, assuming they are not full. No external access can be started because AHOLD is still 1.
Figure 13. Cycle Reordering with BOFF (Write-Back)
(Timing diagram; signals shown: CLK, ADR, M/IO, W/R, ADS, BLAST, BRDY, AHOLD, EADS, INV, HITM, Data, BOFF.)
Note: The circled numbers in this figure represent the steps in section 4.8.6.
Step 10 In the same clock cycle, the snooping cache drives HITM back to 1.
Step 11 The write of data A is finished if BRDY transitions to 0 (BLAST = 0), because it is a single word.
The software write sequence was first data A and then data B. But on the external bus the data appear first as data B and then data A. The order of writes is changed. In most cases, it is unnecessary to strictly maintain the ordering of writes. However, some cases (for example, writing to hardware control registers) require writes to be observed externally in the same order as programmed. There are two options to ensure serialization of writes, both of which drive the cache to Write-through mode:
1. Set the PWT bit in the page table entries.
2. Drive the WB/WT signal Low when accessing these memory locations.
Option 1 is an operating-system-level solution not directly implemented by user-level code. Option 2, the hardware solution, is implemented at the system level.
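As a rough illustration of option 1, the sketch below sets the PWT bit in a 32-bit page table entry. It assumes the standard x86 page table entry layout, in which PWT is bit 3; the helper names and the sample entry value are hypothetical, and real code of this kind belongs to the operating system's memory manager, not to user-level Python.

```python
# Illustrative only: bit positions follow the x86 page table entry layout
# (bit 0 = Present, bit 1 = Read/Write, bit 3 = PWT). All names are hypothetical.
PTE_PRESENT = 1 << 0
PTE_WRITABLE = 1 << 1
PTE_PWT = 1 << 3  # page-level write-through


def set_write_through(pte):
    """Return a copy of the 32-bit entry with the PWT bit set."""
    return (pte | PTE_PWT) & 0xFFFFFFFF


def is_write_through(pte):
    """Report whether the entry requests write-through caching."""
    return bool(pte & PTE_PWT)


# Hypothetical entry: page frame at 0x000B8000, present and writable.
entry = 0x000B8000 | PTE_PRESENT | PTE_WRITABLE
entry_wt = set_write_through(entry)
print(hex(entry_wt))
```

Only the PWT bit changes; the page frame address and the other attribute bits of the entry are preserved.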
Figure 14. Write Cycle Reordering Due to Buffering
(Timing diagram; signals shown: CLK, AHOLD, EADS, HITM, ADS, BLAST, BRDY, Data, Cached Data, Write Buffer.)
Note: The circled numbers in this figure represent the steps in section 4.8.7.1.
Step 3 The next write of the processor hits the cache and the line is non-shared. Therefore, data B is written into the cache. The cache line transitions to the modified state.
Step 4 In the same clock cycle, a snoop request to the same address where data B resides is started because EADS = 0. The snoop hits a modified line. Because of the hit to a modified line, EADS is then ignored; it can be detected again as early as Step 10.
Step 5 Two clock cycles after EADS asserts, HITM becomes valid.
Step 6 Because the processor-initiated access cannot be finished (AHOLD is still 1), the BIU gives priority to a write-back access that does not require the use of the address bus. Therefore, in the clock cycle, the cache starts the write-back sequence indicated by ADS = 0 and W/R = 0.
Step 7 During the write-back sequence, AHOLD is deasserted.
Step 8 The write-back access is finished when BLAST and BRDY transition to 0.
Step 9 After the last write-back access, the BIU starts writing data A from the write buffers. This is indicated by ADS = 0 and W/R = 0.
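The reordering described in Steps 1 through 11 can be sketched with a small model. This is a simplified illustration, not a cycle-accurate description of the bus: the function name and data structures are hypothetical, and the write-through branch reflects an assumption that a write hit in Write-through mode is also posted to the write buffers, so the line never becomes modified and the snoop triggers no write-back.

```python
from collections import deque


def bus_write_order(write_through):
    """Return the order in which data A and data B reach the external bus."""
    bus = []                 # writes as observed on the external bus
    write_buffer = deque()   # posted writes waiting for the bus
    line_b_modified = False  # state of the cache line that holds data B

    # Step 1: AHOLD is asserted; no processor-initiated access can start.
    ahold = True

    # Step 2: the write of data A misses the cache and is posted.
    write_buffer.append("A")

    # Step 3: the write of data B hits the cache.
    if write_through:
        write_buffer.append("B")  # assumed: forwarded to memory, behind A
    else:
        line_b_modified = True    # write-back: only the cache is updated

    # Steps 4-8: the snoop hits line B; a modified line is written back
    # even though AHOLD is still asserted.
    if line_b_modified:
        bus.append("B")
        line_b_modified = False

    # Step 7: AHOLD is deasserted. Steps 9-11: the buffer drains in FIFO order.
    ahold = False
    while write_buffer and not ahold:
        bus.append(write_buffer.popleft())
    return bus


print(bus_write_order(write_through=False))  # ['B', 'A']
print(bus_write_order(write_through=True))   # ['A', 'B']
```

In the write-back case the snoop-triggered write-back of data B overtakes the posted write of data A; under the write-through assumption both writes pass through the write buffers and keep program order.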
32 Am5 86 Microprocessor AMD
4.8.7.2 BOFF Write-Back Arbitration Implementation
BOFF is used to snoop the on-chip cache in systems where more than one cacheable bus master resides on the microprocessor bus. The BOFF signal forces the microprocessor to relinquish the bus in the following clock cycle, regardless of the type of bus cycle it was performing at the time. Consequently, the use of BOFF as a bus arbitrator should be implemented with care to avoid system problems.