In some cases, one is interested in good solutions that provide an upper bound for $W^{nf}_{\mathrm{opt}}(\mathcal{F})$ and can be computed efficiently. In particular, we propose Algorithm 2 to compute such an upper bound.
Lemma 5. If Algorithm 2 returns $\Lambda_{\mathrm{best}}$, then $\Lambda_{\mathrm{best}}$ is a legal layout with $W(\Lambda_{\mathrm{best}}) \ge W^{nf}_{\mathrm{opt}}(\mathcal{F})$.
Proof. It suffices to show that $\Lambda_{\mathrm{best}}$ is a legal layout. In fact, we claim that every execution of line 4 in Algorithm 3 stores a legal layout, and that this line is executed at least once. For legality, observe that the recursive calls guarantee that $\mathcal{F} = \mathcal{F}_l \cup \mathcal{F}_u$ always holds, and that $x$ and $c_l$ are set such that $\mathcal{F}_l$ is laid out legally every time Algorithm 3 is entered.
To see that $\Lambda_{\mathrm{best}}$ is updated at least once, which is necessary for the resulting layout to be legal, consider the point where the execution of Algorithm 3 ends. This can only happen when $\mathcal{F}_u = \emptyset$ (in this case $\Lambda_{\mathrm{best}}$ was updated) or after a recursive call of Algorithm 3 with a strictly smaller $\mathcal{F}_u$. The return statement in line 2 cannot be reached as long as $W_{\mathrm{best}}$ has not been set to a finite value. In other words, backtracking does not happen before line 4 has been reached for the first time.
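The argument above rests on a generic property of branch and bound: while the incumbent width is still infinite, the bounding test cannot prune, so the first descent always reaches a leaf and stores a layout. The following Python sketch illustrates only this pattern. It is not Algorithm 3: legality is reduced to a toy model in which two neighbouring FETs share a contact if their nets match and are otherwise separated by `GAP` tracks, and the names `search`, `GAP` and the tuple layout `(nf, left_net, right_net)` are assumptions made for the illustration.

```python
import math

GAP = 2  # assumed extra tracks when neighbouring contacts cannot be shared


def search(fets):
    """Toy branch and bound over FET orders.

    fets: list of (nf, left_net, right_net) tuples (hypothetical model).
    Returns (best_width, best_order).
    """
    best = {"width": math.inf, "order": None}

    def recurse(order, width, unplaced):
        # Bounding step: a finite value is never >= inf, so nothing is
        # pruned before the first complete layout has been stored.
        if width + sum(fets[i][0] for i in unplaced) >= best["width"]:
            return
        if not unplaced:
            best["width"], best["order"] = width, list(order)
            return
        for i in sorted(unplaced):
            nf, left_net, _ = fets[i]
            if order and fets[order[-1]][2] != left_net:
                new_width = width + GAP + nf  # contacts differ: leave a gap
            else:
                new_width = width + nf        # first FET or shared contact
            order.append(i)
            recurse(order, new_width, unplaced - {i})
            order.pop()

    recurse([], 0, frozenset(range(len(fets))))
    return best["width"], best["order"]


result = search([(2, "A", "B"), (2, "C", "A"), (2, "B", "C")])
```

In this example the first descent stores a layout of width 10, which is later improved to width 6 by the order `[0, 2, 1]`.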
Note that this upper bound computation may require exponential runtime, even though it is not guaranteed to find an optimal placement.
Lemma 6. There exist instances with $n$ transistors for which Algorithm 2 requires $\Omega(2^n)$ steps.
Proof. Let $\mathcal{F} = \{F_1, \dots, F_n\}$ and $\mathcal{N} = \{N_0, \dots, N_{n+2}\}$. We set $nf(F) = 2$ for
4.4. ALGORITHMS 63
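Algorithm 3: ContinuePlacementUB
Data: FETs $\mathcal{F}_l$ with a legal layout $\Lambda_l$, unplaced FETs $\mathcal{F}_u$, and $nf(F)$ for all $F \in \mathcal{F}_u$
 1  if LowerBound($\mathcal{F}_l$, $\mathcal{F}_u$) $\ge W_{\mathrm{best}}$ then
 2      return
 3  if $\mathcal{F}_u = \emptyset$ then
 4      $\Lambda_{\mathrm{best}} \leftarrow \Lambda_l$, $W_{\mathrm{best}} \leftarrow W(\Lambda_l)$
 5  else
 6      $F_{\mathrm{prev}} \leftarrow$ rightmost FET in $\mathcal{F}_l$
 7      for all $F \in \mathcal{F}_u$ do
 8          if $F$ can legally overlap the right contact of $F_{\mathrm{prev}}$ then
 9              $x(F) \leftarrow x(F_{\mathrm{prev}}) + nf(F_{\mathrm{prev}})$
10              Set $c_l(F)$ such that $\mathcal{F}_l \cup \{F\}$ is placed legally
11              ContinuePlacementUB($\mathcal{F}_l \cup \{F\}$, $\mathcal{F}_u \setminus \{F\}$)
12      if $F_{\mathrm{prev}}$ can be swapped without losing legality then
13          Swap $F_{\mathrm{prev}}$ and repeat steps 7–11
14      if line 11 was not yet reached then
15          Let $F \in \mathcal{F}_u$ such that $F$ can be legally placed with the smallest possible x-coordinate and set $x(F)$ accordingly
16          Set $c_l(F)$ such that $\mathcal{F}_l \cup \{F\}$ is placed legally
17          ContinuePlacementUB($\mathcal{F}_l \cup \{F\}$, $\mathcal{F}_u \setminus \{F\}$)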
for $1 \le i < n$ and finally $N_s(F_n) = N_{n+1}$ and $N_d(F_n) = N_{n+2}$. When no FET is swapped, the configuration graph consists of $n-1$ loops attached to $v_{N_0}$ and one loop attached to $v_{N_{n+2}}$.
We can assume that the loop in line 7 iterates over the transistors from $F_1$ to $F_n$ in the order of their indices. Then the algorithm recurses in line 11 until $F_n$ is reached, and subsequently places $F_n$ with a gap to the mutually overlapping chain formed by the first $n-1$ FETs. Afterwards, $W_{\mathrm{best}}$ is set to $2n+2$.
As long as $F_n \in \mathcal{F}_u$, the result of the lower bound computation in line 1 is always $2n$, because FETs with an even number of fingers cannot be used by Algorithm 1 for the prediction of gaps. Consequently, the bounding in line 2 is only reached when $\mathcal{F}_u = \emptyset$.
This means that the algorithm generates all permutations of the first $n-1$ unswapped FETs and only backtracks when the last remaining FET is placed. This implies a runtime of $\Omega((n-1)!)$, which grows faster than $2^n$ and is therefore in $\Omega(2^n)$.
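The last step uses that $(n-1)!$ eventually dominates $2^n$. A short check of this, valid for $n \ge 6$ (for smaller $n$ the inequality can fail, e.g. $4! = 24 < 32 = 2^5$):

\[
(n-1)! \;=\; 5! \cdot \prod_{k=6}^{n-1} k \;\ge\; 120 \cdot 6^{\,n-6} \;\ge\; 2^{6} \cdot 2^{\,n-6} \;=\; 2^{n}.
\]

For $n = 6$ the product is empty and the chain reduces to $120 \ge 64$.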
Table 4.2: Evaluation of upper bounds computed for 191 FET rows.

Number of FET rows for which the ...
  ... upper bound is already optimal                    134
  ... upper bound is 1 track larger than optimum         26
  ... upper bound is 2 tracks larger than optimum        27
  ... upper bound is >2 tracks larger than optimum        4
Sum of runtimes for ...
  ... all 191 experiments (sec.)                      0.159
  ... the 187 fastest experiments (sec.)              0.029

Although upper bounds could also be determined in polynomial or even linear time, the following reasons support the method proposed in this section:
• The approach does not make any assumptions on the distance function. For specific distance functions, for example $d_{0/2}$, even the size of an optimal placement could be computed efficiently. However, an implementation for a general distance function is able to flexibly react to rule changes as the technology evolves.
• Experiments show that the method is both accurate and fast enough in practice: in many cases its results are already optimal, and they rarely exceed the optimum by more than 2 tracks. The runtime is mostly in the range of milliseconds, including all initialization steps. Table 4.2 summarizes the results on 191 FET rows extracted from the CLC test bed (cf. Section 7.1).
• The bounding that is performed in line 2 of Algorithm 3 can be extended to cover more complicated design rules coming from the technology. If a new rule is introduced that excludes specific configurations, the bounding function can be amended by a check for such a forbidden situation. If it is detected, LowerBound returns $\infty$, thereby preventing further branching. All layouts that are saved to $\Lambda_{\mathrm{best}}$ in Algorithm 3 are guaranteed to be legal by that extended definition. From an implementation point of view, this is even more convenient because later algorithms will employ the same bounding function and consequently satisfy the same constraints. With this technique, we gain a significant amount of flexibility that outweighs the longer runtime compared to a more specialized upper bound routine.
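The extension described in the last item can be sketched as a wrapper around an existing bounding function. The Python fragment below is a minimal illustration under assumptions: `make_lower_bound`, `base_bound` and the rule predicates are hypothetical names, and a partial layout is represented simply as a list of FET names rather than the data structures of the actual implementation.

```python
import math


def make_lower_bound(base_bound, forbidden_checks):
    """Wrap a bounding function with checks for forbidden configurations.

    base_bound(placed, unplaced) -> number   (hypothetical base bound)
    forbidden_checks: predicates on the partial layout; each returns True
    if the layout violates the design rule it encodes.
    """
    def lower_bound(placed, unplaced):
        for is_forbidden in forbidden_checks:
            if is_forbidden(placed):
                # An infinite bound is >= any incumbent width, so the
                # search never branches below this node.
                return math.inf
        return base_bound(placed, unplaced)

    return lower_bound


# Toy usage: the base bound counts FETs, and one made-up rule forbids
# placing "B" directly to the right of "A".
def a_then_b(placed):
    return any(p == "A" and q == "B" for p, q in zip(placed, placed[1:]))


bound = make_lower_bound(lambda placed, unplaced: len(placed) + len(unplaced),
                         [a_then_b])
```

Because every algorithm that calls the wrapped function inherits the same checks, a new rule only has to be added in one place.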
In a setting like Algorithm 3, where a lower bound computation is performed in every node of a branch and bound tree, this computation can even be implemented to run in constant time.
Lemma 7. Algorithm 1 can be implemented with runtime $O(1)$ within the setting of Algorithm 3.
Proof. $W(\mathcal{F}_l)$ and $\sum_{F \in \mathcal{F}_u} nf(F)$ can easily be updated with a single addition or subtraction, respectively, before every call of Algorithm 3. Using an array that maps every FET class to the number of FETs in $\mathcal{F}_u \cup \{F_{\mathrm{prev}}\}$ that belong to
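Algorithm 4: ContinuePlacement
Data: FETs $\mathcal{F}_l$ with a legal layout $\Lambda_l$, unplaced FETs $\mathcal{F}_u$, and $nf(F)$ for all $F \in \mathcal{F}_u$
 1  if LowerBound($\mathcal{F}_l$, $\mathcal{F}_u$) $\ge W_{\mathrm{best}}$ then
 2      return
 3  if $\mathcal{F}_u = \emptyset$ then
 4      $\Lambda_{\mathrm{best}} \leftarrow \Lambda_l$, $W_{\mathrm{best}} \leftarrow W(\Lambda_l)$
 5  else
 6      for all $F \in \mathcal{F}_u$ do
 7          for $c_l(F) \in \{N_s(F), N_d(F)\}$ do
 8              Set $x(F)$ to the smallest legal value
 9              ContinuePlacement($\mathcal{F}_l \cup \{F\}$, $\mathcal{F}_u \setminus \{F\}$)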
the classes’ configuration graphs and the numbers of odd vertices in those graphs also change in a constant number of places in every step. Therefore, all expressions in the return statement can be updated in constant time with every recursive call and need not be recomputed every time.
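The bookkeeping for the two running sums in this proof can be sketched as follows. This is a simplified illustration, not Algorithm 1: the gap prediction term is omitted because it depends on the configuration graphs, so the bound shown is only the placed width plus the unplaced fingers. The class name `BoundState` and its methods are hypothetical.

```python
class BoundState:
    """Running sums that are updated in O(1) per placement step.

    Assumed simplified bound: width of the placed FETs plus the total
    number of fingers of the unplaced FETs (gap prediction omitted).
    """

    def __init__(self, nf_by_fet):
        self.nf = dict(nf_by_fet)
        self.placed_width = 0
        self.unplaced_fingers = sum(self.nf.values())

    def place(self, fet, width_increase):
        # One addition and one subtraction before the recursive call.
        self.placed_width += width_increase
        self.unplaced_fingers -= self.nf[fet]

    def unplace(self, fet, width_increase):
        # Inverse update when the search backtracks.
        self.placed_width -= width_increase
        self.unplaced_fingers += self.nf[fet]

    def lower_bound(self):
        return self.placed_width + self.unplaced_fingers


state = BoundState({"F1": 2, "F2": 4, "F3": 2})
initial = state.lower_bound()   # nothing placed yet
state.place("F1", 2)            # F1 occupies 2 tracks
state.place("F2", 6)            # F2 needs 4 tracks plus an assumed gap of 2
after_gap = state.lower_bound()
state.unplace("F2", 6)          # backtrack
restored = state.lower_bound()
```

Each recursive call touches a constant number of fields, so the bound never has to be recomputed from scratch.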