We briefly describe our contributions in this thesis in the context of related work.

11.1 Formal probabilistic analysis of hardware

For hardware designs, Markov chains have frequently been used to compute high-level system performance and power [60]. These models do not represent the details of the hardware implementation that are required to compute bit-error-related performance. Markov chains have also been used at the circuit level to design circuits with high error tolerance [100] and to analyze stability [101]. However, these models carry an excess of detail, which restricts the size of the systems that can be analyzed.

In [102], the authors use the probabilistic model checking tool PRISM to evaluate the reliability of defect-tolerant systems. However, the evaluation is restricted to gate-level descriptions of the systems. The defects in gate functionality are considered to be stochastic in nature. The authors illustrate their technique using a NAND multiplexing example. The state space of the DTMCs that are used to represent the gate-level descriptions depends on the number of gates in the system. RTL designs map to gate-level descriptions that may have hundreds of thousands of gates. Therefore, it is infeasible to use such gate-level analysis techniques to evaluate the reliability of RTL designs.

The authors in [103] obtain analytical expressions for the errors introduced in RTL due to internal quantization of data. However, this approach is intractable for complex MIMO designs. Moreover, the analytical expressions do not model the probabilistic nature of errors that are caused by external data corruption.

The Mobius tool [104] provides a flexible formalism that can be used to model and formally analyze probabilistic systems. However, we find that several extraneous variables need to be introduced into the model to represent correct RTL functionality. Therefore, the scalability afforded by this tool is limited.

11.2 Macromodeling

Macromodels [18],[19],[105] propagate information from the lower levels of hardware to the higher levels of design. High-level analyses use macromodels as plugins to provide performance (e.g., timing and power) estimates early in the design flow. Typically, macromodels provide estimates in RTL that are within 20% of the actual measurements obtained at the gate level.

Statistical timing estimates in RTL can be obtained with commercial CAD tools that use delay macromodels [18],[19]. However, such variation-aware RTL timing analysis tools almost exclusively consider only process variations and cannot be used in the context of input variations. In this work, we consider RTL delay macromodels for both input variations and for process variations.

To the best of our knowledge, ours is the first delay macromodeling strategy in the context of input variations. Our macromodels are constructed offline and can be made more accurate by including more features from the later stages of the design flow. For example, optimizations during logic synthesis can modify the design delay significantly. In this work, we show that our macromodels can faithfully capture delay changes resulting from downstream synthesis optimizations. In future work, we could refine our macromodels to model other downstream features such as parasitics, interconnect delay and crosstalk. These refined macromodels can then be plugged into our SHARPE methodology to achieve the desired level of estimation accuracy. Since the macromodels are only plugins, refining them will not require any departure from the fundamental SHARPE methodology that we propose in this thesis.

11.3 Performance analysis of MIMO systems

Conventionally, performance estimation is done by performing Monte Carlo simulations [61] of MIMO RTL using random input vectors. Estimates that are reasonably accurate can be obtained by simulating the MIMO systems [25] over many cycles. This technique is time-consuming and incomplete. FPGA implementations [106] and ASIC prototypes [107] provide accelerated simulations, thereby speeding up performance estimation. However, both these methods involve significant overheads in terms of cost.
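The simulation-based estimation described above can be sketched as follows. This is only an illustrative stand-in, not the flow used in the cited works: a scalar BPSK link over an additive white Gaussian noise channel takes the place of the MIMO RTL, and the names `simulate_ber` and `theoretical_ber` are hypothetical. It shows why many simulated bits are needed before the estimate settles near the true error rate.

```python
import math
import random


def simulate_ber(snr_db: float, num_bits: int, seed: int = 0) -> float:
    """Monte Carlo BER estimate for BPSK over AWGN (assumed toy channel)."""
    rng = random.Random(seed)
    snr = 10 ** (snr_db / 10)          # Eb/N0 on a linear scale
    sigma = math.sqrt(1 / (2 * snr))   # noise std. dev. for unit-energy symbols
    errors = 0
    for _ in range(num_bits):
        bit = rng.getrandbits(1)                 # random input vector (one bit)
        symbol = 1.0 if bit else -1.0            # BPSK mapping
        received = symbol + rng.gauss(0.0, sigma)
        decided = 1 if received > 0 else 0       # hard decision at the receiver
        errors += decided != bit
    return errors / num_bits


def theoretical_ber(snr_db: float) -> float:
    """Closed-form BPSK/AWGN bit error rate, used here only as a reference."""
    snr = 10 ** (snr_db / 10)
    return 0.5 * math.erfc(math.sqrt(snr))


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(n, simulate_ber(4.0, n), theoretical_ber(4.0))
```

The standard error of such an estimate shrinks only with the square root of the number of simulated bits, which is one way to see why low error rates make simulation time-consuming.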

The performance of high-level systems can be computed formally using probabilistic model checking [108] and Markov chains [60]. Markov chains have also been used at a circuit level, to design circuits with high error tolerance [100] and to analyze stability [101]. In [62], we present a novel methodology that uses probabilistic model checking at RTL in order to estimate error-related performance. However, none of the above techniques model faults that may be present in the physical hardware implementation. Therefore, they cannot be used to formally analyze the vulnerability of performance to physical faults.

Several simulation-based techniques exist that study the effects of physical faults by injecting them into RTL designs [8],[67]. In [70], the authors propose a formal verification methodology in order to determine whether a fault that is present in the interior of an RTL design can propagate to an output of interest. However, we are interested in computing the average probability with which such propagations can occur rather than checking for a single instance of their occurrence.
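The distinction between checking for a single propagation and computing an average propagation probability can be illustrated on a toy example. The sketch below assumes a hypothetical three-input circuit, out = (a AND b) OR c, with a bit-flip fault on the internal node a AND b and uniformly distributed inputs. None of this is taken from the cited works. Exhaustive enumeration is used only because the example is tiny.

```python
from itertools import product


def circuit_output(a: int, b: int, c: int, flip_internal: bool = False) -> int:
    """Hypothetical circuit: out = (a AND b) OR c, with an optional internal fault."""
    n = a & b
    if flip_internal:
        n ^= 1  # bit-flip fault injected on the internal node
    return n | c


def propagation_exists() -> bool:
    """Reachability-style question: can the fault reach the output at all?"""
    return any(
        circuit_output(a, b, c) != circuit_output(a, b, c, True)
        for a, b, c in product((0, 1), repeat=3)
    )


def propagation_probability() -> float:
    """Average probability that the fault reaches the output (uniform inputs)."""
    inputs = list(product((0, 1), repeat=3))
    propagated = sum(
        circuit_output(a, b, c) != circuit_output(a, b, c, True)
        for a, b, c in inputs
    )
    return propagated / len(inputs)


if __name__ == "__main__":
    print(propagation_exists(), propagation_probability())
```

Here the fault is masked whenever c = 1, so a yes/no propagation check reports only that propagation is possible, while the average over inputs quantifies how often it happens.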

Although several techniques perform a probabilistic analysis of the effects of hardware faults [9],[109],[110], they are mostly simulation-based and therefore not rigorous. In [102], the authors use probabilistic model checking to formally evaluate the reliability of defect-tolerant systems. However, the evaluation is restricted to gate-level descriptions of the systems. An exact probabilistic analysis of faults for RTL designs is presented in [111]. However, each bit of an RTL variable needs to be represented individually. Therefore, it is infeasible to use the techniques in [102],[111] for complex RTL designs.

To the best of our knowledge, ours is the first work that provides a unified framework which incorporates the effects of physical faults while formally analyzing BER performance from the RTL description.

11.4 Probabilistic timing analysis

At the lower levels of design (e.g., gate level), statistical static timing analysis (SSTA) is expected to provide highly accurate estimates. Gate-level SSTA is a well-established research topic that has matured over the past few decades. During this time period, SSTA has evolved to use sophisticated delay models in order to provide statistical timing estimates that are highly accurate. At the gate level, timing verification methods include SSTA techniques such as [5],[75],[76],[77]. Some circuit design techniques like [72] use such gate-level timing analyses to enable better design goals than pessimistic worst-case design in the presence of process variations. For input-dependent timing variations, the performance of better-than-worst-case designs [2],[3],[73],[74] can be verified using the gate-level probabilistic timing analysis described in [26]. However, such gate-level techniques do not offer a scalable solution for statistical timing analysis at the higher levels of design.

In recent work, SSTA has been adapted for use early in the design flow, during high-level synthesis [7],[78],[79]. These high-level SSTA techniques use relatively simple delay macromodels to introduce variation-awareness early in the design flow. The authors demonstrate that such variation-awareness can improve high-level design exploration. Such high-level approaches emphasize predictability and are not intended to be highly accurate in comparison to downstream analysis.

Unlike existing high-level SSTA, our SHARPE methodology is applied to RTL. Moreover, our methodology can be applied in the context of both input variations and process variations. Therefore, a direct comparison of our SHARPE methodology with existing high-level SSTA is not possible.

11.5 Aging analysis in hardware

Commercial tools like RelXpert [112] perform extensive simulations at the transistor level in order to estimate the delay degradation of the circuit. In [30],[31],[32],[33],[34], aging effects are analyzed at the gate level. Our methodology is more scalable than these techniques since we perform analysis at a higher level of abstraction. In [32], the delay degradation of microarchitectural components is estimated by synthesizing them into gate-level netlists. Our methodology operates at a finer granularity since the degradation is estimated for each RTL statement.

11.6 Compositional reasoning for formal verification

Compositional techniques have been used before to improve the scalability of formal hardware verification [38]. A form of circular assume-guarantee reasoning is used while model checking the individual components. Additionally, a case-splitting technique is used while verifying properties of a component over a set of different data values. However, the decomposition strategy is not automatic and is specifically intended for non-probabilistic model checking.

Automatic decomposition has been proposed for systems with Boolean variables [90]. The dependence between components is expressed through relations that are obtained through learning-based techniques [90],[113]. However, this approach considers only non-probabilistic systems, and therefore cannot be extended to probabilistic model checking of hardware designs.

Several compositional reasoning approaches have been presented in the context of probabilistic model checking [37],[39],[40],[41],[42]. However, these approaches rely on the ability of the designer to identify each component M_i and the corresponding property φ_i. These approaches do not describe any automatic methodology to derive the components and their corresponding properties. Therefore, a large amount of manual intervention is required when employing such techniques. Moreover, these approaches are not intended specifically for hardware designs, and therefore cannot exploit the characteristics of hardware systems.

11.7 Abstraction for formal verification

In the realm of software verification, there exist several techniques [114],[115] for predicate abstraction. Properties regarding program correctness or safety can be expressed using a set of predicates that are either specified or automatically inferred. These predicates can be used to abstract a program and convert it into a Boolean program on which the properties can be easily verified. More generally, abstract interpretation [20] is the theory of reasoning with the approximate semantics of a large program rather than the set of all possible concrete behaviors. However, unlike predicate abstraction, not all such abstractions are property-specific. In all these abstractions, the concrete numeric values of data can either be completely abstracted out of the program or be restricted to finite intervals [116].
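Restricting data values to finite intervals can be illustrated with a small interval domain. The sketch below is a generic textbook-style construction, not the abstraction used in [20] or [116], and the `Interval` class and its methods are hypothetical names chosen for illustration. Each abstract operation returns an interval that contains every result the concrete operation could produce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Abstract value: all integers v with lo <= v <= hi (illustrative only)."""
    lo: int
    hi: int

    def __add__(self, other: "Interval") -> "Interval":
        # The sum of two ranges is bounded by the sums of their endpoints.
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other: "Interval") -> "Interval":
        # With possibly negative bounds, any endpoint product can be extreme.
        products = [
            self.lo * other.lo, self.lo * other.hi,
            self.hi * other.lo, self.hi * other.hi,
        ]
        return Interval(min(products), max(products))

    def contains(self, value: int) -> bool:
        return self.lo <= value <= self.hi


if __name__ == "__main__":
    x = Interval(-2, 3)
    y = Interval(1, 4)
    print(x + y)  # bounds on x + y for any concrete x, y in range
    print(x * y)  # bounds on x * y for any concrete x, y in range
```

The abstraction trades precision for size: the analysis tracks two bounds per variable instead of every concrete value, and the result may include values that no concrete execution produces.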

Data abstraction techniques have been applied even in the context of hardware verification [92]. These techniques employ predicate abstraction in order to focus on the verification of Boolean control logic for which the exact numeric values of datapath variables are inconsequential. In [44], RTL designs are verified by restricting data values to intervals that are imposed by the execution of the RTL program. Therefore, these intervals are not property-specific.

Abstraction techniques have been employed in the context of probabilistic systems as well [42],[43],[117],[118]. In [118], the abstraction is performed on the source code itself. However, this technique is intended for probabilistic software and cannot be extended to RTL designs.

CHAPTER 12