Tree:

    /
    ├── i
    └── <expr>

Possible code:

    a:  <expr> → register n
        STORE n, #t1
        LOAD n, i
        DIV n, #t1

    b:  LOAD n, i
        <expr> → register m
        DIVr n, m

    c:  <expr> → register n
        LOAD m, i
        DIVr m, n
        ?(LOADr n, m)

    d:  <expr> → register m
        LOAD n, i
        DIVr n, m

    e:  <expr> → register n
        xDIV n, i
Figure 5.5: The cost of a reverse divide operation
Figure 5.5 shows that reverse operations are occasionally expensive, and that therefore they should be avoided whenever possible. I show below that in many cases it is difficult to avoid reverse operations, particularly when there aren’t enough registers available to calculate the value of a complicated expression.
5.2 Register Dumping
Although the TranBinOp procedure of figure 5.4 reduces the number of registers required to calculate the value of an expression, it assumes that there are always a sufficient number available, no matter how complicated the expression. Even on multi-register machines this may not be so – some of the registers may be devoted to other purposes as chapter 11 shows – and on a machine with only one or two registers it is vitally important to use a translation algorithm which doesn’t demand an infinite supply of registers. It is easy to invent an expression which uses all the available registers: if an expression contains a function call, for example, it is reasonable to assume (see below) that it requires all the registers no matter how many there are available. Minimising the number of registers required in order to evaluate an expression is an issue whether the object machine has only one register or has thirty-two or even (as on ATLAS of fond memory) one hundred and twenty-eight.
Consider, for example, the expression shown in figure 5.6. Figure 5.7 shows the
2 I leave it as an exercise for the reader to discover the modifications required to the TranBinOp procedure, as developed in this chapter, which would allow it to generate case (d) from figure 5.5.
86 CHAPTER 5. TRANSLATING ARITHMETIC EXPRESSIONS
Expression: a*b / ((c*d-e*f) / (g+h))
Tree:

    /
    ├── *
    │   ├── a
    │   └── b
    └── /
        ├── -
        │   ├── *
        │   │   ├── c
        │   │   └── d
        │   └── *
        │       ├── e
        │       └── f
        └── +
            ├── g
            └── h
Figure 5.6: A more complicated example expression
    LOAD 1, a
    MULT 1, b      /* r1 := a*b */
    LOAD 2, c
    MULT 2, d      /* r2 := c*d */
    LOAD 3, e
    MULT 3, f      /* r3 := e*f */
    SUBr 2, 3      /* r2 := c*d-e*f */
    LOAD 3, g
    ADD 3, h       /* r3 := g+h */
    DIVr 2, 3      /* r2 := (c*d-e*f)/(g+h) */
    DIVr 1, 2      /* r1 := a*b/((c*d-e*f)/(g+h)) */
    { let firstreg = TranArithExpr(first, regno)
      let secondreg = TranArithExpr(second, regno+1)

      if dumped(firstreg) then
      { Gen(reverse(op), secondreg, placedumped(firstreg))
        resultis secondreg
      }
      else
      { Gen(op++‘r’, firstreg, secondreg)
        resultis firstreg
      }
    }
Figure 5.8: Dealing with dumped registers
code which would be produced by the TranBinOp algorithm, as developed so far, when given this rather involved expression. The code is acceptable provided the machine has three registers available: if there are only two then it seems that the translator must generate instructions which ‘dump’ the value held in one of the registers during evaluation of the expression. It’s complicated to describe a register-dumping scheme and I won’t attempt to give one in detail here.

It would require quite a lot of modification to make the procedure of figure 5.4 dump registers. First it would need a table of ‘registers in use’, filled in by the TranArithExpr procedure each time it generates an instruction which loads a value into a register and altered by TranBinOp when that value is no longer required. Second, if a procedure is called with a ‘regno’ which is too large (register 3 when there are only registers 1 and 2, say) the procedure must dump one of the registers currently in use and perform the operation in that dumped register. Third, each procedure must return as its result the number of the register in which the operation was actually performed – because of register dumping this may not be the same as the argument ‘regno’.

The section of the TranBinOp procedure which handles ‘<expr>op<expr>’ nodes, in which neither operand is a leaf, must take account of the fact that something which it loads into a register may be dumped later. Part of a possible procedure is shown in figure 5.8.
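The three modifications listed above, together with the fragment in figure 5.8, can be put into runnable form. The Python below is only a minimal sketch under stated assumptions, not the procedure of figure 5.4: the names `Translator`, `claim` and `tran`, the tuple representation of tree nodes, and the policy of always dumping the highest-numbered register are all invented for the illustration. It also assumes, as the figures suggest, that a leaf right operand is used directly from memory, and that the reverse of a commutative operation is the operation itself.

```python
MNEMONIC = {"+": "ADD", "-": "SUB", "*": "MULT", "/": "DIV"}
COMMUTATIVE = {"+", "*"}


class Translator:
    """Hypothetical translator: nodes are (op, first, second) tuples, leaves are names."""

    def __init__(self, nregs):
        self.nregs = nregs
        self.code = []
        # For each register, the temporaries that hold values dumped from it.
        self.dumps = {r: [] for r in range(1, nregs + 1)}
        self.ntemps = 0

    def gen(self, op, reg, operand):
        self.code.append(f"{op} {reg}, {operand}")

    def claim(self, regno):
        """Return a usable register, dumping the highest one if regno is too large."""
        if regno <= self.nregs:
            return regno
        victim = self.nregs  # naive policy (an assumption): may be the 'wrong' register
        self.ntemps += 1
        temp = f"#t{self.ntemps}"
        self.gen("STORE", victim, temp)
        self.dumps[victim].append(temp)
        return victim

    def tran(self, node, regno):
        """Generate code for node; return the register that actually holds its value."""
        if isinstance(node, str):  # leaf: load it into a register
            reg = self.claim(regno)
            self.gen("LOAD", reg, node)
            return reg
        op, first, second = node
        if isinstance(second, str):  # <expr> op leaf
            reg = self.tran(first, regno)
            self.gen(MNEMONIC[op], reg, second)
            return reg
        # <expr> op <expr>: the case handled in figure 5.8
        firstreg = self.tran(first, regno)
        depth = len(self.dumps[firstreg])
        secondreg = self.tran(second, regno + 1)
        if len(self.dumps[firstreg]) > depth:  # first operand was dumped meanwhile
            place = self.dumps[firstreg].pop()
            reverse = MNEMONIC[op] if op in COMMUTATIVE else "x" + MNEMONIC[op]
            self.gen(reverse, secondreg, place)
            return secondreg
        self.gen(MNEMONIC[op] + "r", firstreg, secondreg)
        return firstreg


# The expression of figure 5.6: a*b / ((c*d-e*f) / (g+h))
tree = ("/", ("*", "a", "b"),
             ("/", ("-", ("*", "c", "d"), ("*", "e", "f")), ("+", "g", "h")))

two = Translator(nregs=2)
result_reg = two.tran(tree, 1)
print("\n".join(two.code))
print("result in register", result_reg)
```

With three registers (`Translator(nregs=3)`) the same sketch generates the code of figure 5.7. With only two it generates thirteen instructions, including two STOREs and two reverse operations, and leaves the answer in register 1.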
The trouble with this procedure is that it may dump the ‘wrong’ register. Figure 5.9 shows the effect of dumping the ‘wrong’ register (l.h. column) compared to the ‘right’ register (r.h. column). Note that the two different dumping strategies produce the final answer in different registers.[3] The dumping instructions are
marked ‘*’, the corresponding reverse operations with ‘!’. Note that the more
3 In some cases – e.g. where the value of an expression is to form the result of a function –
it may be important that the value is actually loaded into a particular register, so there is more to this problem than appears at first sight.
    ‘Wrong’ register:          ‘Right’ register:

      LOAD  1, a                 LOAD  1, a
      MULT  1, b                 MULT  1, b
      LOAD  2, c                 LOAD  2, c
      MULT  2, d                 MULT  2, d
    * STORE 2, #t1             * STORE 1, #t1
      LOAD  2, e                 LOAD  1, e
      MULT  2, f                 MULT  1, f
    ! xSUB  2, #t1               SUBr  2, 1
    * STORE 2, #t2               LOAD  1, g
      LOAD  2, g                 ADD   1, h
      ADD   2, h                 DIVr  2, 1
    ! xDIV  2, #t2             ! xDIV  2, #t1
      DIVr  1, 2
Figure 5.9: Dumping ‘right’ and ‘wrong’ registers
dumping that goes on, the more reverse operations are required and the longer and slower the object program. Remember also that reverse operations may be more expensive than the forward variant, as illustrated in figure 5.5.
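The two columns of figure 5.9 can be checked by running them on a toy machine. The interpreter below is a sketch, and its instruction semantics are assumptions inferred from the comments in the figures: ‘OP n, x’ computes r[n] := r[n] op x, ‘OPr n, m’ takes its second operand from register m, and ‘xOP n, x’ computes r[n] := x op r[n]. The function name `run` and the sample values for a to h are invented for the example.

```python
from fractions import Fraction
from operator import add, sub, mul, truediv

BINOPS = {"ADD": add, "SUB": sub, "MULT": mul, "DIV": truediv}


def run(program, memory):
    """Execute a listing; return (registers, number of instructions executed)."""
    regs, mem, count = {}, dict(memory), 0
    for line in program.strip().splitlines():
        mnemonic, rest = line.split(None, 1)
        reg_text, operand = (part.strip() for part in rest.split(","))
        reg = int(reg_text)
        if mnemonic == "LOAD":
            regs[reg] = mem[operand]
        elif mnemonic == "STORE":
            mem[operand] = regs[reg]
        elif mnemonic in BINOPS:        # forward: r := r op memory
            regs[reg] = BINOPS[mnemonic](regs[reg], mem[operand])
        elif mnemonic[0] == "x":        # reverse: r := memory op r
            regs[reg] = BINOPS[mnemonic[1:]](mem[operand], regs[reg])
        elif mnemonic[-1] == "r":       # register-to-register: r := r op r'
            regs[reg] = BINOPS[mnemonic[:-1]](regs[reg], regs[int(operand)])
        else:
            raise ValueError(f"unknown instruction: {line}")
        count += 1
    return regs, count


WRONG = """
LOAD 1, a
MULT 1, b
LOAD 2, c
MULT 2, d
STORE 2, #t1
LOAD 2, e
MULT 2, f
xSUB 2, #t1
STORE 2, #t2
LOAD 2, g
ADD 2, h
xDIV 2, #t2
DIVr 1, 2
"""

RIGHT = """
LOAD 1, a
MULT 1, b
LOAD 2, c
MULT 2, d
STORE 1, #t1
LOAD 1, e
MULT 1, f
SUBr 2, 1
LOAD 1, g
ADD 1, h
DIVr 2, 1
xDIV 2, #t1
"""

# Sample values (arbitrary): a*b / ((c*d-e*f) / (g+h)) = 24 / (14 / 7) = 12
values = {name: Fraction(v) for name, v in
          dict(a=6, b=4, c=5, d=4, e=2, f=3, g=3, h=4).items()}

wrong_regs, wrong_count = run(WRONG, values)
right_regs, right_count = run(RIGHT, values)
print(wrong_regs[1], wrong_count)   # answer in register 1
print(right_regs[2], right_count)   # answer in register 2
```

Both listings compute the same value, but the left-hand one needs thirteen instructions to the right-hand one's twelve, and the answers end up in different registers.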
If the translation algorithm is to generate the code shown in the right-hand column of figure 5.9, it must always dump the register whose next use is farthest away in the list of machine instructions. This is easy enough to decide in simple cases like that in figure 5.6, but in general it needs proper flow analysis of the program and flow analysis isn’t simple translation – it’s optimisation. In any case, there is a better answer to the problem of register dumping in the shape of the tree-weighting algorithm developed below.
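The ‘farthest next use’ rule itself is simple to state in code once the order of future uses is known; the hard part, as noted above, is discovering that order in general. The sketch below assumes it is simply handed over as a list. The function `choose_victim` and its two arguments are hypothetical names for this illustration.

```python
def choose_victim(live_values, future_uses):
    """Pick the register to dump: the one whose value is next used farthest away.

    live_values maps register number -> name of the value it holds.
    future_uses lists value names in the order later instructions consume them
    (assumed to be known already; in general this needs flow analysis).
    """
    def next_use(reg):
        value = live_values[reg]
        if value in future_uses:
            return future_uses.index(value)
        return len(future_uses)  # never used again: farthest of all

    return max(live_values, key=next_use)


# The situation in figure 5.9 just before e*f is evaluated: register 1 holds
# a*b (needed only by the final division), register 2 holds c*d (needed by
# the subtraction, which comes first).
victim = choose_victim({1: "a*b", 2: "c*d"}, ["c*d", "a*b"])
print(victim)
```

Here the rule picks register 1, which is the choice made in the right-hand column of figure 5.9.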
When a program uses temporary ‘dump’ locations – e.g. #t1 in figure 5.9 – they can be allocated to run-time memory locations in the same way as other variables (see chapter 8). It’s tempting, on a machine with a hardware stack, to use the stack for dump locations but it’s usually inefficient and always unnecessary.