4.4.1 Introduction to Static Timing Analysis
Before describing the timing-driven version of the hierarchical partitioning method, we describe how the timing properties of a circuit are analyzed. These properties are typically analyzed statically, without simulating the full circuit. In synchronous systems, the clock signal controls the sequential elements, such as flip-flops or latches, which copy their input to their output at the rising or falling edge of the clock signal. Logic signals are supposed to move in lockstep, advancing one stage on each tick of the clock signal. Static timing analysis checks whether the timing constraints are met. A typical constraint is a lower limit for the clock frequency. Only two kinds of timing violations are possible:
• Setup time violation: a signal arrives too late at the input pin of a sequential element.
• Hold time violation: an input signal of a sequential element changes too soon after the clock’s active transition.
To analyze when violations will occur, a timing graph is constructed in which each node is a pin of a primitive block and each edge is a connection between these primitives. After technology mapping, little is known about the routing pathways of the connections, so typically each edge is given the same delay. In the packing stage, different delays can be assigned to the edges, because connections on a higher hierarchical level will be slower than low-level connections. The further downstream in the compilation flow, the more accurate the timing graph becomes.
The arrival time of a signal, T_arr, is the time elapsed for the signal to arrive at a certain point. The reference, or time 0.0, is often taken as the arrival time of the clock signal. Calculating the arrival time requires a forward topological traversal through the timing graph. The required time, T_req, is the latest time at which a signal can arrive at a node without causing a timing violation. The timing constraints determine the required times at the sequential inputs. A backward topological traversal is required to calculate the required time at each node. The slack is the difference between the required time and the arrival time, Slack = T_req - T_arr.
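The two traversals can be sketched as follows. This is a minimal illustration and not the implementation used in this work: the function names, the dictionary-based graph representation, and the choice to start all source nodes at time 0.0 are assumptions made for the example.

```python
from collections import defaultdict, deque


def topological_order(nodes, edges):
    """Kahn's algorithm. `edges` maps (src, dst) -> delay."""
    indegree = {n: 0 for n in nodes}
    successors = defaultdict(list)
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for m in successors[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                queue.append(m)
    if len(order) != len(nodes):
        raise ValueError("timing graph contains a cycle")
    return order


def analyze(nodes, edges, required_at_sinks):
    """Return (t_arr, t_req, slack) dictionaries keyed by node.

    `required_at_sinks` gives T_req for every node without successors
    (assumed here to be the sequential inputs).
    """
    preds, succs = defaultdict(list), defaultdict(list)
    for (src, dst), delay in edges.items():
        preds[dst].append((src, delay))
        succs[src].append((dst, delay))
    order = topological_order(nodes, edges)

    # Forward traversal: latest arrival over all incoming edges.
    t_arr = {}
    for n in order:
        t_arr[n] = max((t_arr[p] + d for p, d in preds[n]), default=0.0)

    # Backward traversal: earliest requirement over all outgoing edges.
    t_req = {}
    for n in reversed(order):
        if succs[n]:
            t_req[n] = min(t_req[s] - d for s, d in succs[n])
        else:
            t_req[n] = required_at_sinks[n]

    slack = {n: t_req[n] - t_arr[n] for n in nodes}
    return t_arr, t_req, slack


# Hypothetical four-pin example with two paths from a to d.
nodes = ["a", "b", "c", "d"]
edges = {("a", "b"): 1.0, ("b", "d"): 2.0, ("a", "c"): 1.0, ("c", "d"): 0.5}
t_arr, t_req, slack = analyze(nodes, edges, {"d": 4.0})
print(slack)  # {'a': 1.0, 'b': 1.0, 'c': 2.5, 'd': 1.0}
```

In this example the path a-b-d is the slowest one, so its nodes share the smallest slack, while node c on the faster path has more room.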
A positive slack at a node implies that the arrival time at that node may be increased without affecting the maximum delay in the circuit. Conversely, a negative slack implies that a path is too slow and must be sped up (or the reference signal delayed) if the timing constraints are to be met. The worst negative slack (WNS) indicates the critical path of the circuit. The timing violations cannot be resolved unless this path is sped up. To indicate how critical an edge is, a criticality measure is introduced:
\[
\mathrm{Crit}(\mathit{edge}) = \frac{\mathrm{Slack}(\mathit{edge})}{\mathrm{WNS}} \tag{4.3}
\]
If the only timing constraint is that the clock frequency should be as high as possible, the critical path is defined as the path with the maximum delay, D_max.
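As a small numerical illustration of the criticality measure, the sketch below divides each edge slack by the worst negative slack. The function name and the input format are hypothetical. The sketch also assumes that edge slacks are already available and that at least one of them is negative, so that WNS is a negative number.

```python
def edge_criticality(edge_slack):
    """Criticality per edge as Slack(edge) / WNS.

    `edge_slack` maps an edge (src, dst) to its slack. WNS is taken as the
    smallest slack and is assumed to be negative (an assumption of this sketch).
    """
    wns = min(edge_slack.values())
    if wns >= 0:
        raise ValueError("sketch assumes a negative worst slack")
    return {edge: s / wns for edge, s in edge_slack.items()}


# Hypothetical slacks: the edge with the worst slack gets criticality 1.0.
slacks = {("a", "b"): -2.0, ("b", "c"): -1.0, ("a", "c"): -0.5}
print(edge_criticality(slacks))
# {('a', 'b'): 1.0, ('b', 'c'): 0.5, ('a', 'c'): 0.25}
```

Note that under this literal reading an edge with positive slack obtains a negative criticality, so it never exceeds a positive threshold.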
4.4.2 Timing Edges in Partitioning
Edges on the critical path or on a near-critical path should not be cut on high hierarchical levels, because this leads to longer and slower connections in the interconnection network. These edges should remain uncut as long as possible. For this reason, weighted timing edges are added to the circuit graph before it is passed to the partitioning tool. These timing edges prevent the critical path or a near-critical path from being cut whenever a partition is possible without cutting this path. The number of timing edges added and the weights of these edges are important parameters for obtaining good results for both the total wirelength and the critical path delay. Adding too many timing edges with too large weights results in partitions that violate the natural hierarchy of the circuit and leads to an increase in total wirelength.
A timing analysis of the mapped circuit determines where the timing edges should be added. We optimistically assume that the fastest possible connection is used between two blocks. A timing edge is added for each edge in the circuit that has a criticality larger than a threshold value. However, there are two types of designs: designs with only a few long paths and designs with a more gradual path delay distribution.
For circuits with only a few long paths, the process is straightforward. The threshold criticality is set to 0.65. A timing edge is added for every edge with a higher criticality, thereby ensuring that these edges are not cut during partitioning. However, many circuits have a more gradual path delay distribution. Adding every critical connection would lead to too many timing edges in the circuit. For these circuits, we avoid adding too many edges by considering only the 20% most critical edges.
We also assign a weight to each timing edge in order to differentiate between critical and near-critical edges. The larger the criticality of an edge, the larger its corresponding weight. hMetis only allows integer values for the weights [68], so the criticality of the edge is multiplied by a factor M and rounded to the nearest integer. We found experimentally that M = 12 gives the best results.
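The selection and weighting of timing edges described above can be sketched as follows. This is an illustrative sketch, not the tool's code: the function names and the boolean flag that distinguishes the two design types are assumptions, since the text does not state how a design is classified. The sketch also interprets "the 20% most critical edges" as 20% of all edges, and rounds halves upward.

```python
import math

THRESHOLD = 0.65     # criticality threshold from the text
TOP_FRACTION = 0.20  # share of most critical edges kept for gradual designs
M = 12               # experimentally chosen scaling factor from the text


def timing_edge_weight(crit, m=M):
    """Integer weight: criticality scaled by m, rounded to the nearest integer."""
    return math.floor(m * crit + 0.5)


def select_timing_edges(criticality, gradual_distribution):
    """Return {edge: integer weight} for the edges that get a timing edge.

    `criticality` maps edge -> criticality. `gradual_distribution` is a
    hypothetical flag for designs with a gradual path delay distribution.
    """
    if gradual_distribution:
        ranked = sorted(criticality, key=criticality.get, reverse=True)
        keep = ranked[: math.ceil(TOP_FRACTION * len(ranked))]
    else:
        keep = [e for e, c in criticality.items() if c > THRESHOLD]
    return {e: timing_edge_weight(criticality[e]) for e in keep}


# Hypothetical criticalities for five edges.
crit = {"e1": 1.0, "e2": 0.8, "e3": 0.66, "e4": 0.5, "e5": 0.1}
print(select_timing_edges(crit, gradual_distribution=False))
# {'e1': 12, 'e2': 10, 'e3': 8}
print(select_timing_edges(crit, gradual_distribution=True))
# {'e1': 12}
```

With M = 12, edges above the 0.65 threshold receive weights between 8 and 12, which keeps critical and near-critical edges distinguishable after rounding.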
Timing Edge Weight Update
If a critical or near-critical path is cut during recursive bi-partitioning, this path needs special attention. Otherwise, it could be cut several times, which would lead to multiple slow connections in the interconnection network. To prevent this, once such a path has been cut, all timing edges on it are assigned a very large weight. This ensures that other, still uncut critical connections are cut first if a partition is not possible without cutting critical edges.
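A possible form of this update step is sketched below. The names, the path representation as lists of edges, and the value of the large weight are hypothetical, since the text only states that the weight is very large.

```python
LARGE_WEIGHT = 10**6  # hypothetical value; the text only says "very large"


def update_weights_after_cut(timing_weights, critical_paths, part_of):
    """Raise the weight of all timing edges on every path that was cut.

    timing_weights: dict edge (src, dst) -> integer weight, updated in place
    critical_paths: list of paths, each a list of edges (src, dst)
    part_of:        dict node -> partition id after the last bi-partition
    """
    for path in critical_paths:
        was_cut = any(part_of[u] != part_of[v] for u, v in path)
        if was_cut:
            for edge in path:
                if edge in timing_weights:
                    timing_weights[edge] = LARGE_WEIGHT
    return timing_weights


# Hypothetical example: the first path is cut between b and c, the second is not.
weights = {("a", "b"): 12, ("b", "c"): 12, ("d", "e"): 10}
paths = [[("a", "b"), ("b", "c")], [("d", "e")]]
parts = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 1}
update_weights_after_cut(weights, paths, parts)
print(weights)
# {('a', 'b'): 1000000, ('b', 'c'): 1000000, ('d', 'e'): 10}
```

In the next bi-partitioning step, the remaining edge of the first path carries a much larger weight than the timing edge of the uncut second path, so the partitioner prefers cutting the latter.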