| Technique | Recovery | Detection latency | Sources of failure | µarchitecture specific | SW support | HW cost (complexity) | Performance / power costs |
|---|---|---|---|---|---|---|---|
| RNA [154] | No | Unbounded | Soft + hard errors, bugs | No | No | Low-Medium | Very low power, no performance |
| TAC [154] | Yes (pipe flush) | Bounded | Soft + hard errors, bugs | No | No | Low-Medium | Very low power, no performance |
| Scoreboard / Tag Reuse [33] | Yes (pipe flush) | Bounded | Soft errors | Yes | No | Very low | Very low power, no performance |
| DDFV [115] | No | Unbounded | Soft + hard errors, bugs | No | Yes + ISA extensions | High | Medium |
| Argus [114] | No | Unbounded | Soft + hard errors, bugs | No | Yes + ISA extensions | High | Medium |
| Our approach | Yes (pipe flush) | Bounded | Soft + hard errors, bugs | No | No | Very low | Very low power, no performance |
our baseline core. We can also see for these two policies that increasing the signature size boosts coverage considerably at a small extra cost: area and peak dynamic power overheads grow almost linearly, while the number of undetected faults is halved (coverage grows in a logarithmic trend). However, the overheads for the round-robin class are noticeable even for 2-bit signatures: its area requirements are roughly those of a 4-bit enhanced configuration, but at a fraction of the achieved coverage. We therefore conclude that an enhanced policy is the best choice in the coverage-overhead design space.
5.7 Related Work
A few dynamic verification techniques have been proposed to detect errors in the control logic and hardware blocks implementing register dataflow tasks. Table 5.8 summarizes the features and pros and cons of each one of them.
Chapter 5. Register Dataflow Validation

Reddy et al. [154] propose two ad-hoc hardware assertion checkers. The first one, Register Name Authentication (RNA), aims at detecting errors in the destination tags. RNA assumes there is an additional rename table at the commit stage holding the architectural mappings. When an instruction is renamed, the previous register tag is stored in the ROB. When the instruction retires, the register mapping in the redundant rename table must necessarily contain the previous physical register stored in the ROB; RNA reads it and compares it with the one in the ROB. In order to detect faults in the free list and in the register allocation, RNA manages two extra bits for every register tag in the free list: a ready bit and a free bit. When an instruction writes its result back, these bits are accessed and checked to be zero. RNA detects faults affecting the tags in the rename table, faults in the architectural rename table, faults in the shadow rename tables, faults affecting the destination tags in the ROB, and faults in the free list and in the register allocator. However, RNA has several limitations: (i) it cannot detect errors in the source tags, (ii) its detection latency is unbounded, so an error can be architecturally committed before it is detected, and (iii) it requires a redundant architectural rename table with non-negligible area and latency overheads.
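The commit-time comparison that RNA performs on destination tags can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name `RnaChecker`, its methods, the identity initial mapping and the FIFO free list are hypothetical choices made for illustration, and the ready/free bits are not modeled.

```python
class RnaChecker:
    """Toy model of RNA's commit-time check on destination tags (illustrative only)."""

    def __init__(self, num_arch_regs, num_phys_regs):
        # Assumed initial state: architectural register i maps to physical register i.
        self.rename_table = list(range(num_arch_regs))   # speculative mappings
        self.commit_table = list(range(num_arch_regs))   # redundant architectural mappings
        self.free_list = list(range(num_arch_regs, num_phys_regs))
        self.rob = []  # in-order entries: (arch_dst, new_tag, prev_tag)

    def rename(self, arch_dst):
        """Allocate a new physical tag and remember the previous one in the ROB."""
        new_tag = self.free_list.pop(0)
        prev_tag = self.rename_table[arch_dst]
        self.rename_table[arch_dst] = new_tag
        self.rob.append((arch_dst, new_tag, prev_tag))
        return new_tag

    def commit(self):
        """Retire the oldest instruction; return False if the RNA check fails."""
        arch_dst, new_tag, prev_tag = self.rob.pop(0)
        # The redundant table must still hold the previous tag stored in the ROB.
        if self.commit_table[arch_dst] != prev_tag:
            return False
        self.commit_table[arch_dst] = new_tag
        self.free_list.append(prev_tag)  # the previous register can now be reused
        return True


checker = RnaChecker(num_arch_regs=4, num_phys_regs=8)
checker.rename(1)
checker.rename(1)
first_ok = checker.commit()        # fault-free retirement
checker.rob[0] = (1, 5, 6)         # inject a fault in the previous tag held in the ROB
second_ok = checker.commit()       # the mismatch is flagged at commit
print(first_ok, second_ok)
```

Note that the model only inspects destination tags at retirement, which mirrors the two limitations listed above: source tags are never checked, and a fault is only noticed when the affected instruction commits.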
The second technique, TAC (Timestamp-Based Assertion Checking), detects errors in the issue logic by checking that a chain of dependent instructions follows a valid chronological order. TAC assigns timestamps to instructions when they issue, and compares consumer timestamps with producer timestamps. TAC is hard to implement because every instruction must know its issue timestamp, the issue timestamp of its producers, and the latency of its producers. The size of a timestamp is big (13 bits) and does not scale with respect to the ROB and with respect to main memory latency, incurring non-negligible hardware costs. Furthermore, TAC does not catch the scenario where an instruction ends up consuming wrong values from other datapaths.

Carretero et al. [33] propose two light-weight ad-hoc techniques to protect the issue logic. The detection of errors is achieved by: (i) redundantly checking operand availability at issue time by using idle register scoreboard read ports, and (ii) replicating the source tag in the CAM storage for those instructions that only require one renamable operand. Faults in the select logic, in the tag broadcast buses, in the CAM memories-matchlines, and in the ready bits can be detected with minimal modifications. However, faults affecting the register scoreboard go unnoticed. Most importantly, these techniques fail to define a comprehensive correct behavior for the register dataflow logic, and they are tailored for a specific issue queue design.
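The two issue-logic checks discussed above (TAC's timestamp comparison and the redundant scoreboard read) can be sketched as simple predicates. This is a hedged illustration: the function names are hypothetical, the exact TAC condition (consumer issue time no earlier than producer issue time plus producer latency) is an assumption derived from the description, and timestamp wrap-around for the 13-bit counters is ignored.

```python
def tac_order_ok(consumer_issue, producers):
    """TAC-style check (assumed form).

    producers: list of (issue_timestamp, latency) pairs, one per source operand.
    The consumer must not issue before every producer has had time to finish.
    """
    return all(consumer_issue >= issue + latency for issue, latency in producers)


def scoreboard_recheck_ok(selected_src_tags, scoreboard_ready):
    """Redundant operand-availability check at issue time.

    scoreboard_ready: mapping from physical register tag to its ready bit.
    An instruction selected for issue must find all its sources ready.
    """
    return all(scoreboard_ready[tag] for tag in selected_src_tags)


# A consumer issued at cycle 8 whose producer issued at cycle 5 with latency 3.
print(tac_order_ok(8, [(5, 3)]))
# The same consumer issued one cycle too early.
print(tac_order_ok(7, [(5, 3)]))
# An instruction selected for issue while one source is still not ready.
print(scoreboard_recheck_ok([1, 2], {1: True, 2: False}))
```

The second predicate trusts the scoreboard contents, which is consistent with the observation that faults in the scoreboard itself go unnoticed.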
Meixner’s DDFV scheme (Dynamic DataFlow Verification) [115] detects faults and bugs in the fetch, decode, and register dataflow logic. DDFV is similar to control flow checkers that verify intra-block instruction sequencing by means of compiler support. DDFV dynamically verifies that the dataflow graph specified by an application is the same as the one computed and executed by the core. First, the compiler computes for every basic block a compact representation of its static (expected) dataflow graph, and embeds these signatures into the application binary. At runtime, the dataflow graph for every basic block is reconstructed and compared against the reference one.
Table 5.9: Blocks and logic protection for register dataflow validation techniques

| Technique | Fetch | Decode | Rename | Free List - ROB pdsts | Issue Queue | Ld/St Queue | ALU | RF + Bypasses | Data | Load Replay | CF Recovery |
|---|---|---|---|---|---|---|---|---|---|---|---|
| RNA [154] | No | No | Yes | Yes | No | No | No | No | No | No | No |
| TAC [154] | No | No | No | No | Yes | No | No | No | No | No | No |
| Scoreboard / Tag Reuse [33] | No | No | No | No | Yes | No | No | No | No | No | No |
| DDFV [115] | Yes† | Yes† | Yes† | Yes† | Yes† | No | No | Yes† | Yes | No | No |
| Argus [114] | Yes | Yes | N/A | N/A | Yes† | N/A | Yes§ | Yes† | Yes | No | Yes |
| Our approach | No | No | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes | No |

†: Protection within basic block, not across basic blocks
§: ALU uses different error detection mechanisms than the one used for protecting values

A state history signature (SHS) is computed for each architectural register: it captures the instruction that generated the value and the history of the input operands, but not their values. Hence, a signature is recursively dependent on the chain of backward register-dependent instructions. Every register, data bus, value in the ROB, etc. is extended to keep the SHS associated with that value. When the last instruction in a basic block commits, the SHSs are combined to form the execution-time DGS (dataflow graph signature). DGSs are 24 bits and SHSs are 10 bits each. Large area overheads are clearly required. The most critical issue is that checking is not supported for registers crossing basic blocks, as this information is unknown at compile time. In addition, fetch, decode and commit are put under pressure by the extra instructions and the added commit cycle. Errors are detected at the end of basic blocks, causing unbounded error detection latencies and allowing errors to be committed before being caught. Furthermore, SHSs must be saved by the OS to support exception and interrupt handling.
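To make the recursive nature of the SHS concrete, the following sketch folds the per-register signatures of one basic block into a DGS. It is a toy model under stated assumptions: the mixing function, the opcode encoding, the per-register seeds at block entry and the names `mix`, `shs` and `block_dgs` are all hypothetical; only the 10-bit and 24-bit widths come from the text.

```python
SHS_BITS = 10   # SHS width reported in the text
DGS_BITS = 24   # DGS width reported in the text

# Hypothetical opcode encoding, for illustration only.
OPCODES = {"add": 1, "sub": 2, "mul": 3, "load": 4}


def mix(value, item, bits):
    """Toy mixing step (an assumption, not DDFV's actual hash)."""
    return (value * 33 + item) & ((1 << bits) - 1)


def shs(opcode, source_signatures):
    """SHS of a result: depends on the producing opcode and on the SHSs of
    its sources, never on data values."""
    value = OPCODES[opcode]
    for s in source_signatures:
        value = mix(value, s, SHS_BITS)
    return value


def block_dgs(instructions, num_regs=4):
    """Fold the per-register SHSs left at the end of a basic block into a DGS.

    instructions: list of (opcode, dst_reg, src_regs) tuples.
    Registers start from a fixed per-register seed (assumed), reflecting that
    history from earlier basic blocks is not tracked.
    """
    sig = [r + 1 for r in range(num_regs)]
    for opcode, dst, srcs in instructions:
        sig[dst] = shs(opcode, [sig[s] for s in srcs])
    dgs = 0
    for s in sig:
        dgs = mix(dgs, s, DGS_BITS)
    return dgs


reference = [("add", 2, (0, 1)), ("mul", 3, (2, 0))]  # dataflow seen by the compiler
faulty = [("add", 2, (0, 1)), ("mul", 3, (1, 0))]     # wrong source tag at runtime

print(block_dgs(reference) == block_dgs(list(reference)))  # fault-free execution
print(block_dgs(reference) == block_dgs(faulty))           # execution with a wrong tag
```

Because no data value ever enters `shs`, two executions that differ only in the values they produce yield the same DGS in this model.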
The Argus proposal [114] by Meixner et al. extends DDFV to include computation and control flow checking capabilities. Argus is, however, meant for simple in-order cores. Unlike DDFV, Argus embeds into each basic block the DGSs of potentially two legal successors, rather than inserting its own DGS. During execution, Argus picks among the two DGSs the one belonging to the target basic block. For computation checking, Argus uses residue checking or operand shifting. Even though Argus extends DDFV’s coverage, it poses the same problems: ISA and OS modifications, compiler support, no failure containment, and big area and performance overheads.
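The successor-signature selection described for Argus can be sketched as follows. This is a minimal illustration, not the actual Argus encoding: the `BlockSignatures` layout, the taken/fall-through keying and the function names are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlockSignatures:
    """Successor DGSs embedded in a basic block (hypothetical layout)."""
    taken_dgs: Optional[int]        # DGS of the branch-target successor, if any
    fallthrough_dgs: Optional[int]  # DGS of the fall-through successor, if any


def expected_successor_dgs(embedded, branch_taken):
    """Pick the embedded DGS that belongs to the block actually executed next."""
    return embedded.taken_dgs if branch_taken else embedded.fallthrough_dgs


def successor_ok(embedded, branch_taken, runtime_dgs):
    """Compare the DGS computed while executing the successor with the embedded one."""
    expected = expected_successor_dgs(embedded, branch_taken)
    return expected is not None and expected == runtime_dgs


sigs = BlockSignatures(taken_dgs=0x00A1B2, fallthrough_dgs=0x00C3D4)
print(successor_ok(sigs, True, 0x00A1B2))   # execution followed the legal taken path
print(successor_ok(sigs, True, 0x00C3D4))   # signature of the other successor: mismatch
```

Embedding the successors' signatures rather than a block's own signature lets a single comparison cover both the dataflow inside the successor and the control transfer that reached it.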
Table 5.9 summarizes for each of these register dataflow validation techniques the different features, control logic or blocks that are covered.
To begin with, DDFV and Argus are the only solutions that perform control flow checking (’Fetch’ and ’Decode’ are covered): they build upon existing control flow checker techniques that verify intra/inter-block instruction sequencing by means of compiler support (recall Section 3.4). However, DDFV only provides protection within basic blocks, which ultimately limits the achievable coverage. Our technique does not check the control flow, but the baseline RAS features described in Section 4.2 can cover these blocks in a simple manner.
The rename table and rename logic (’Rename’ column), as well as the free list and register allocation-release functionalities (’Free List - ROB pdsts’ column), are covered to a varying degree. RNA detects errors in the rename table and rename logic as long as they affect destination tags, not source operands. DDFV covers all these scenarios, but only at a basic block level. Conversely, our technique extends the protection to all ’Wrong tag’ and ’Register free list misuse’ cases by removing this basic block level restriction. Argus is meant for in-order cores, and thus these blocks are not applicable.

None of the techniques covers the ’Load-Store Queue’ logic. For DDFV or Argus, the compiler cannot help identify producer-consumer memory instruction pairs. In Chapter 7 we introduce a unique solution to verify the Load-Store Queue logic in a targeted manner, so that coverage can be further extended.
ALUs are not covered by DDFV: a parity bit is just added to each produced register value. Argus does computation checking, but it relies on a set of techniques that are different from the mechanism used to protect values (parity). As a consequence, DDFV and Argus introduce extra delay before and after computation to check and produce the codes for the sources and results. Our technique protects computation and values using a unified mechanism, avoiding extra delays.
Regarding access to the RF and bypasses, neither TAC nor the Scoreboard Reuse techniques protect against scenarios like ’Wrong register file access’, ’Selection of wrong inputs’ or ’Data stall in the bypass network’. DDFV and Argus cover them as long as the consumed operands are produced within the same basic block. Our technique removes this severe constraint and covers any possible failure scenario.

The ’Issue Queue’ column captures faults manifesting as ’Premature Issue’ and ’Wrong Tag’ scenarios. TAC can only detect scenarios where instructions are issued prematurely, but cannot detect errors in the operand tags. The techniques in [33] catch faults in tags for single-source instructions, and ’Premature Issue’ is covered as long as the scoreboard is not faulty. DDFV and Argus protect against ’Premature Issue’ and ’Wrong Tag’ scenarios, as long as the wrongly consumed value belongs to the same basic block.
None of the existing techniques except ours is able to detect ’Load replay errors’. Since DDFV and Argus signatures do not capture value information, a load hitting or missing in the cache will have the same signature.
Finally, in column ’CF Recovery’ we list the techniques that validate that the state of the processor is correctly recovered upon a control flow recovery event (such
5.8. Conclusions