Besides considering a hybrid SRAM and STT-MRAM design to accelerate service to write operations and improve bank accessibility, we also propose an efficient block insertion/migration policy to maximize the SOS throughput, as shown in Fig. 4.8. The tag store associated with the STT-MRAM banks is equipped with three fields: Read Counter (RC), Write Counter (WC), and PV status. The main purpose of RC is to identify vulnerable read-intensive blocks in the set. If a frequently read block is allocated in a high-PV impacted STT-MRAM array, the cache block must be relocated to a low-PV impacted region of the set to guarantee reliable read sensing. We conducted an extensive exploration to determine the preferred value of the read threshold level, N_Rth, for our design. If N_Rth is small, the fraction of blocks that must be transferred to the low-PV impacted region increases significantly; if N_Rth is large, SOS utilization decreases significantly because only a few read-intensive cache blocks are selected for migration. Thus, we set N_Rth based on an extensive study of the block access patterns of the workloads under test. In addition, when the RC of a block in the high-PV impacted region reaches N_Rth, a non-access-intensive cache block located in a low-PV impacted STT-MRAM data array is selected to be replaced by that vulnerable read-intensive block.
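The relocation rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the evaluated implementation: the `Block` record, the helper names, the value `N_RTH = 4`, and the convention that a larger `lru` value means less recently used are all hypothetical and introduced only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional

N_RTH = 4  # hypothetical read threshold; the text does not state the chosen value


@dataclass
class Block:
    way: int
    pv: int       # 1 = high-PV impacted region, 0 = low-PV impacted region
    rc: int = 0   # read counter
    lru: int = 0  # LRU stack position; larger = less recently used (assumed)


def needs_relocation(block: Block, n_rth: int = N_RTH) -> bool:
    """A block is vulnerable if it is read-intensive and sits in a high-PV region."""
    return block.pv == 1 and block.rc >= n_rth


def pick_victim(stt_ways: List[Block], n_rth: int = N_RTH) -> Optional[Block]:
    """Choose a low-PV block that is not read-intensive, preferring the least recently used."""
    candidates = [b for b in stt_ways if b.pv == 0 and b.rc < n_rth]
    if not candidates:
        return None
    return max(candidates, key=lambda b: b.lru)
```

With a small threshold, `needs_relocation` fires for many blocks and migration traffic grows; with a large one, few blocks ever qualify, which is the trade-off discussed above.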
On the other hand, WC is a saturating counter that keeps track of the write access pattern to a cache block. When the WC of a block reaches the write threshold level, N_Wth, the block is considered write-intensive. We propose to transfer such blocks to the SRAM data array to amortize the latency and high dynamic energy consumption associated with incoming write operations. The PV status indicates whether a cache block is located in a low-PV impacted region or in a high-PV impacted data array. This bit is set in the tag store through a consensus decision-making process during the POST phase, in which a PV-aware March Test traverses the sub-banks of a cache block to identify the PV-impacted sub-banks. Fig. 4.9 illustrates an example of the migration policy for a read-intensive block located in the high-PV impacted region of the STT-MRAM cache. Upon a read hit on way-2 in an STT-MRAM bank, the RC reaches N_Rth, indicating that incoming accesses to this block are very likely to be read-dominant. To reduce the probability of incorrectly sensing the value stored in STT-MRAM, the proposed migration policy swaps the selected read-intensive block residing in the high-PV impacted region with a non-access-intensive block located in the low-PV impacted region, chosen based on the LRU stacks in the tag array. A swap buffer is employed to enable the block transfer between the low-PV impacted region and the high-PV impacted data array. The process is completed by updating the LRU stacks associated with each cache block after the swap operation.
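The swap step can be illustrated with a short sketch. The `Way` record and `swap_blocks` helper are hypothetical names. The choice to let the read counter and LRU position travel with the data is also an assumption, since the text only states that the LRU stacks are updated after the swap and does not spell out the update rule.

```python
from dataclasses import dataclass


@dataclass
class Way:
    data: bytes
    pv: int   # PV status of the physical way: 1 = high-PV, 0 = low-PV
    rc: int   # read counter of the block currently stored in this way
    lru: int  # LRU stack position of the block currently stored in this way


def swap_blocks(high_pv_way: Way, low_pv_way: Way) -> None:
    """Exchange two blocks through a swap buffer (sketch, assumptions noted above)."""
    # The swap buffer holds one block while the other is written over it.
    swap_buffer = high_pv_way.data
    high_pv_way.data = low_pv_way.data
    low_pv_way.data = swap_buffer
    # Assumed: per-block metadata follows the data it describes.
    high_pv_way.rc, low_pv_way.rc = low_pv_way.rc, high_pv_way.rc
    high_pv_way.lru, low_pv_way.lru = low_pv_way.lru, high_pv_way.lru
    # The PV status is left untouched: it describes the physical array, not the block.
```

After the call, the read-intensive block is held in the low-PV way, while each way keeps its own PV bit.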
[Figure 4.8 diagram: the address is split into Tag, Index, and Offset fields; an SRAM-based tag and write counter array (fields Tag, V, WC, RC, PV) covers Set 1 to Set 8192; tag comparators T0 to T7 drive the Hit signal and an 8×1 selector that outputs Data; the data array is divided into SRAM banks and STT-MRAM banks.]
Figure 4.8: The scheme of hybrid 8-way set associative SRAM and STT-MRAM cache design, whereby each bank stores a way. In the above configuration, two SRAM-based banks and six STT-MRAM based banks are illustrated.
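The organization in Fig. 4.8 can be summarized with a small data-structure sketch. The way-to-bank split (ways 0-1 in SRAM, ways 2-7 in STT-MRAM) and the 8192 sets follow the figure and Algorithm 2. The 64-byte block size, the value `N_WTH = 3`, and all identifier names are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple

NUM_WAYS = 8
NUM_SETS = 8192
SRAM_WAYS = range(0, 2)  # ways 0-1 are SRAM banks
STT_WAYS = range(2, 8)   # ways 2-7 are STT-MRAM banks
OFFSET_BITS = 6          # assumed 64-byte cache block
INDEX_BITS = 13          # 8192 sets = 2**13
N_WTH = 3                # hypothetical write threshold


@dataclass
class TagEntry:
    tag: int = 0
    valid: bool = False
    wc: int = 0  # write counter (saturating)
    rc: int = 0  # read counter
    pv: int = 0  # PV status bit


def is_sram(way: int) -> bool:
    return way in SRAM_WAYS


def split_address(addr: int) -> Tuple[int, int, int]:
    """Return (tag, set index, block offset) for a physical address."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


def record_write(entry: TagEntry, n_wth: int = N_WTH) -> bool:
    """Saturating increment of WC; True once the block counts as write-intensive."""
    if entry.wc < n_wth:
        entry.wc += 1
    return entry.wc >= n_wth
```

In this sketch `record_write` reports a block as write-intensive on the write that brings WC to the threshold, and the counter never exceeds `N_WTH`.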
[Figure 4.9 diagram: way-0 to way-7 across the SRAM banks and STT-MRAM banks, with two rows of LRU stack values (3 5 1 2 4 7 8 5 and 4 6 0 3 5 8 0 6); a high-PV block with RC = N_Rth and PV = 1 is swapped with a block with RC << N_Rth and PV = 0; the steps are labeled Phase 1 and Phase 2.]
Figure 4.9: The migration policy to swap a read-intensive block residing in the high-PV impacted region with a non-access-intensive block located in the low-PV impacted region.
Algorithm 2: Block Insertion/Migration Policy
Assumptions:
  - RC: Read Counter, WC: Write Counter, PV: Process Variation Status
  - N_Rth: read threshold level, N_Wth: write threshold level
  - Way 0-1 and Way 2-7 are built in SRAM and STT-MRAM (NVM), respectively, in shared LLC
Function insertion() /* algorithm for inserting requested block */
begin
  if LLC miss then
    if write miss then
      eviction() /* evict LRU block ∈ LLC_{SRAM Bank} */
      copy block ∈ memory into LLC_{SRAM Bank}
    else if read miss then
      if ∃ block’s address ∈ read intensive block profiler then
        eviction() /* evict LRU block ∈ NVM-Bank_{PV=0} */
        copy block ∈ memory into NVM-Bank_{PV=0}
      else
        eviction() /* evict LRU block ∈ LLC_{NVM Bank} */
        copy block ∈ memory into NVM Bank
  if read hit then
    if ∃ block’s address ∈ read intensive block profiler then
      update LRU status
    else if read intensive block profiler is full then
      evict LRU entry and fill the profiler with the new entry’s address
    else if RC_block < N_Rth then
      ++RC_block
    else
      add new entry’s address to read intensive block profiler
      migration()
  if write hit then
    if block ∈ LLC_{NVM Bank} & WC_block < N_Wth then
      ++WC_block
    else if block ∈ LLC_{NVM Bank} then
      migration()
Function migration() /* algorithm for migrating blocks */
begin
  if read from block ∈ Bank_{PV=1} then
    swap (block ∈ Bank_{PV=1}, block ∈ (Bank_{PV=0} & RC < N_Rth))
  if write into block ∈ LLC_{NVM Bank} then