First, we show that a single call-free IR step simulates a single call-free bytecode step, where the abstract stacks from the BC2IRinstr algorithm describe the operand stacks in the bytecode semantics.
Proposition B.3 Suppose we have the transition $(m,i,s,h,\rho) \xrightarrow{B}^{\diamond}_{0} (m,i',s',h',\rho')$. Let $s_t$ be a temporary variable store, let $as, as'$ be abstract stacks, and let $I$ be an IR instruction such that
$\llbracket as \rrbracket^{\diamond}_{s,s_t,h} = \rho$ and BC2IRinstr$(i,B,as) = (I,as')$.
Then there exists a temporary variable store $s'_t$ such that $(m,i,s,s_t,h) \xrightarrow{I}^{\diamond}_{0} (m,i',s',s'_t,h')$ and $\llbracket as' \rrbracket^{\diamond}_{s',s'_t,h'} = \rho'$.
PROOF By case distinction on the bytecode instruction $B$.
• $B$ = nop. Then $I = \text{block}\ \varepsilon$ and $as' = as$. By the bytecode and IR semantics, we get $s' = s$, $h' = h$, $\rho' = \rho$, and $s'_t = s_t$; hence the proposition follows from the assumptions.
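• $B$ = push $c$. Then $I = \text{block}\ \varepsilon$ and $as' = c :: as$ by definition of BC2IRinstr. We have the bytecode transition $(m,i,s,h,\rho) \xrightarrow{B}^{\diamond}_{0} (m,i+1,s,h,\llbracket c \rrbracket^{\diamond} :: \rho)$ and the IR transition $(m,i,s,s_t,h) \xrightarrow{I}^{\diamond}_{0} (m,i+1,s,s_t,h)$. Since $\llbracket as \rrbracket^{\diamond}_{s,s_t,h} = \rho$, it follows that $\llbracket as' \rrbracket^{\diamond}_{s,s_t,h} = \llbracket c :: as \rrbracket^{\diamond}_{s,s_t,h} = \llbracket c \rrbracket^{\diamond}_{s,s_t,h} :: \llbracket as \rrbracket^{\diamond}_{s,s_t,h} = \llbracket c \rrbracket^{\diamond} :: \rho = \rho'$.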
• $B$ = pop. Then $I = \text{block}\ \varepsilon$ and $as = e :: as'$. Stores and heaps remain unchanged in both the bytecode and IR transitions. We only need to show $\llbracket as' \rrbracket^{\diamond}_{s,s_t,h} = \rho'$, but this follows from $\llbracket e :: as' \rrbracket^{\diamond}_{s,s_t,h} = v :: \rho'$, where $v$ is the topmost value on the initial stack.
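• $B$ = prim $op$. Then $I = \text{block}\ \varepsilon$, $as = e_2 :: e_1 :: \bar{e}$, and $as' = (e_1\ op\ e_2) :: \bar{e}$ for some $\bar{e}$. We have $(m,i,s,h,v_2 :: v_1 :: \bar{v}) \xrightarrow{B}^{\diamond}_{0} (m,i+1,s,h,(v_1\ op\ v_2) :: \bar{v})$ for some $\bar{v}$, and $(m,i,s,s_t,h) \xrightarrow{I}^{\diamond}_{0} (m,i+1,s,s_t,h)$. With the assumption, we have $\bar{v} = \llbracket \bar{e} \rrbracket^{\diamond}_{s,s_t,h}$, $v_1 = \llbracket e_1 \rrbracket^{\diamond}_{s,s_t,h}$, and $v_2 = \llbracket e_2 \rrbracket^{\diamond}_{s,s_t,h}$, hence $v_1\ op\ v_2 = \llbracket e_1\ op\ e_2 \rrbracket^{\diamond}_{s,s_t,h}$. Therefore, $\llbracket as' \rrbracket^{\diamond}_{s,s_t,h} = (v_1\ op\ v_2) :: \bar{v} = \rho'$.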
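• $B$ = load $x$. Since $x$ appears in the bytecode, it cannot be a temporary variable; rather, $x \in \mathit{dom}(s)$. We have $I = \text{block}\ \varepsilon$ and $as' = x :: as$. The transitions are $(m,i,s,h,\rho) \xrightarrow{B}^{\diamond}_{0} (m,i+1,s,h,s(x) :: \rho)$ and $(m,i,s,s_t,h) \xrightarrow{I}^{\diamond}_{0} (m,i+1,s,s_t,h)$. Thus we have $\llbracket as' \rrbracket^{\diamond}_{s,s_t,h} = \llbracket x \rrbracket^{\diamond}_{s,s_t,h} :: \llbracket as \rrbracket^{\diamond}_{s,s_t,h} = s(x) :: \rho$.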
B.2 Proof of Semantics Preservation
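• $B$ = store $x$. Since $x$ appears in the bytecode, it cannot be a temporary variable; rather, $x \in \mathit{dom}(s)$. We have $v :: \rho = \llbracket e :: as \rrbracket^{\diamond}_{s,s_t,h}$, hence $v = \llbracket e \rrbracket^{\diamond}_{s,s_t,h}$. We define $s' = s[x \mapsto v] = s[x \mapsto \llbracket e \rrbracket^{\diamond}_{s,s_t,h}]$. Then $(m,i,s,h,v :: \rho) \xrightarrow{B}^{\diamond}_{0} (m,i+1,s',h,\rho)$. By definition, BC2IRinstr$(i,B,e :: as) = (\text{block}[x := e],\ as[t'_i/x])$. We get the IR transition $(m,i,s,s_t,h) \xrightarrow{I}^{\diamond}_{0} (m,i+1,s',s'_t,h)$, where $s'_t = s_t[t'_i \mapsto s(x)]$. In other words, the IR step results in the same store $s'$, and the heap $h$ is unchanged. We still have to show that $\llbracket as[t'_i/x] \rrbracket^{\diamond}_{s',s'_t,h} = \rho$. Because of the side condition $t'_i \notin as$, we get $\llbracket as[t'_i/x] \rrbracket^{\diamond}_{s',s'_t,h} = \llbracket as[t'_i/x] \rrbracket^{\diamond}_{s[x \mapsto v],\,s_t[t'_i \mapsto s(x)],\,h} = \llbracket as[s(x)/x] \rrbracket^{\diamond}_{s[x \mapsto v],\,s_t,\,h} = \llbracket as \rrbracket^{\diamond}_{s,s_t,h} = \rho$.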
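• $B$ = bnz $j$. Let $\rho = v :: \bar{v}$. There are two cases, depending on the value $v$ on top of the stack. We only treat the jump case here, i.e., where $v \neq 0$. We have the bytecode transition $(m,i,s,h,v :: \bar{v}) \xrightarrow{B}^{\diamond}_{0} (m,j,s,h,\bar{v})$. Also, by definition, BC2IRinstr$(i,B,e :: \bar{e}) = (\text{if}\ e\ j,\ \bar{e})$ and $as' = \bar{e}$. As $\llbracket e :: \bar{e} \rrbracket^{\diamond}_{s,s_t,h} = \llbracket as \rrbracket^{\diamond}_{s,s_t,h} = v :: \bar{v}$, we get $\llbracket e \rrbracket^{\diamond}_{s,s_t,h} = v \neq 0$. Therefore, the IR step is $(m,i,s,s_t,h) \xrightarrow{I}^{\diamond}_{0} (m,j,s,s_t,h)$. Stores and heaps remain unchanged, and $\rho' = \bar{v} = \llbracket \bar{e} \rrbracket^{\diamond}_{s,s_t,h} = \llbracket as' \rrbracket^{\diamond}_{s,s_t,h}$.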
• $B$ = jmp $j$. Then BC2IRinstr$(i,B,as) = (\text{jmp}\ j,\ as)$. We have the two transitions $(m,i,s,h,\rho) \xrightarrow{B}^{\diamond}_{0} (m,j,s,h,\rho)$ and $(m,i,s,s_t,h) \xrightarrow{I}^{\diamond}_{0} (m,j,s,s_t,h)$. Stores, heaps, and stacks remain unchanged.
• $B$ = cpush $j$. This case is the same as for $B$ = nop, since the IR semantics of cpush $j$ is the same as for $\text{block}\ \varepsilon$.
• $B$ = cjmp $j$. This case is the same as for $B$ = jmp $j$, since the IR semantics of cjmp $j$ is the same as for jmp $j$.
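• $B$ = new $C$. Let $r$ be fresh in $h$, and $h' = h \cup [r \mapsto (C, [\mathit{fields}(C) \mapsto \bar{v}])]$. We have the bytecode transition $(m,i,s,h,\bar{v} :: \rho) \xrightarrow{B}^{\diamond}_{0} (m,i+1,s,h',r :: \rho)$. Also, by definition, BC2IRinstr$(i,B,\bar{e} :: as) = (\text{block}[t'_i := \text{new}\ C(\bar{e})],\ t'_i :: as)$. Since we can use the same memory location $r$ in the IR semantics, we get the IR transition $(m,i,s,s_t,h) \xrightarrow{I}^{\diamond}_{0} (m,i+1,s,s'_t,h')$, where $s'_t = s_t[t'_i \mapsto r]$. That is, both resulting heaps are the same, and the store $s$ remains unchanged. Since $t'_i \notin as$ by the side condition, we get $\llbracket as' \rrbracket^{\diamond}_{s,s'_t,h'} = \llbracket t'_i \rrbracket^{\diamond}_{s,s'_t,h'} :: \llbracket as \rrbracket^{\diamond}_{s,s_t,h} = r :: \rho = \rho'$.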
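• $B$ = getf $f$. We have BC2IRinstr$(i,B,e :: \bar{e}) = (\text{block}\ \varepsilon,\ e.f :: \bar{e})$. Thus, we have the bytecode transition $(m,i,s,h,r :: \rho) \xrightarrow{B}^{\diamond}_{0} (m,i+1,s,h,h(r)(f) :: \rho)$ and the IR transition $(m,i,s,s_t,h) \xrightarrow{I}^{\diamond}_{0} (m,i+1,s,s_t,h)$. Stores and heaps remain unchanged. By assumption, $\llbracket as \rrbracket^{\diamond}_{s,s_t,h} = \llbracket e :: \bar{e} \rrbracket^{\diamond}_{s,s_t,h} = r :: \rho$, thus $\llbracket e \rrbracket^{\diamond}_{s,s_t,h} = r$. It follows that $\llbracket as' \rrbracket^{\diamond}_{s,s_t,h} = \llbracket e.f :: \bar{e} \rrbracket^{\diamond}_{s,s_t,h} = h(\llbracket e \rrbracket^{\diamond}_{s,s_t,h})(f) :: \rho = h(r)(f) :: \rho = \rho'$.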
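• $B$ = putf $f$. We have the transition $(m,i,s,h,v :: r :: \bar{v}) \xrightarrow{B}^{\diamond}_{0} (m,i+1,s,h',\bar{v})$, where $h' = h[r \mapsto h(r)[f \mapsto v]]$. The input stack is $as = e' :: e :: \bar{e}$, such that
$\llbracket as \rrbracket^{\diamond}_{s,s_t,h} = \llbracket e' :: e :: \bar{e} \rrbracket^{\diamond}_{s,s_t,h} = v :: r :: \bar{v}$.
By definition, BC2IRinstr$(i,B,as) = (\text{block}[\bar{t}_i := \bar{e},\ e.f := e'],\ \bar{t}_i)$. When we execute the IR instruction, we get exactly the heap $h'$. Since $\llbracket \bar{e} \rrbracket^{\diamond}_{s,s_t,h} = \bar{v}$ and $\bar{t}_i \notin \bar{e}$ by the side condition, we get the IR transition $(m,i,s,s_t,h) \xrightarrow{I}^{\diamond}_{0} (m,i+1,s,s'_t,h')$, where $s'_t = s_t[\bar{t}_i \mapsto \bar{v}]$. Finally, we have $\llbracket as' \rrbracket^{\diamond}_{s,s'_t,h'} = \llbracket \bar{t}_i \rrbracket^{\diamond}_{s,s'_t,h'} = \bar{v} = \rho'$.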
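The simulation invariant of Proposition B.3 can be illustrated with a small executable sketch. It covers only the heap-free cases nop, push, pop, prim, and load, for which the IR instruction is the empty block, so the stores never change. The names `Const`, `Var`, `BinOp`, `bytecode_step`, `bc2ir_instr`, and `check_simulation` are hypothetical and chosen for this illustration; they are not taken from the formal development.

```python
from dataclasses import dataclass
import operator

# Expressions that may appear on the abstract stack (illustrative subset).
@dataclass(frozen=True)
class Const:
    value: int

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class BinOp:
    op: str
    left: object
    right: object

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(expr, s, st):
    """Evaluate one abstract-stack expression in the union of stores s and st."""
    if isinstance(expr, Const):
        return expr.value
    if isinstance(expr, Var):
        return {**s, **st}[expr.name]
    return OPS[expr.op](evaluate(expr.left, s, st), evaluate(expr.right, s, st))

def evaluate_stack(abstract_stack, s, st):
    return [evaluate(e, s, st) for e in abstract_stack]

def bytecode_step(instr, s, rho):
    """One call-free bytecode step on the operand stack rho (top at index 0)."""
    kind = instr[0]
    if kind == "nop":
        return rho
    if kind == "push":
        return [instr[1]] + rho
    if kind == "pop":
        return rho[1:]
    if kind == "prim":
        v2, v1, *rest = rho
        return [OPS[instr[1]](v1, v2)] + rest
    if kind == "load":
        return [s[instr[1]]] + rho
    raise ValueError(f"unsupported instruction: {instr}")

def bc2ir_instr(instr, abstract_stack):
    """Symbolic counterpart: the new abstract stack (the IR block is empty here)."""
    kind = instr[0]
    if kind == "nop":
        return abstract_stack
    if kind == "push":
        return [Const(instr[1])] + abstract_stack
    if kind == "pop":
        return abstract_stack[1:]
    if kind == "prim":
        e2, e1, *rest = abstract_stack
        return [BinOp(instr[1], e1, e2)] + rest
    if kind == "load":
        return [Var(instr[1])] + abstract_stack
    raise ValueError(f"unsupported instruction: {instr}")

def check_simulation(program, s):
    """Check that the evaluated abstract stack equals the operand stack after each step."""
    st = {}  # no temporaries are needed for the instructions covered here
    rho, abstract_stack = [], []
    for instr in program:
        rho = bytecode_step(instr, s, rho)
        abstract_stack = bc2ir_instr(instr, abstract_stack)
        if evaluate_stack(abstract_stack, s, st) != rho:
            return False
    return True
```

Calling `check_simulation([("push", 2), ("load", "x"), ("prim", "+")], {"x": 5})` walks through the push, load, and prim cases of the proof in sequence. The cases that modify the store or heap (store, new, putf) would additionally need temporaries and are deliberately left out of this sketch.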
To show the semantic preservation for big-step transitions, we now consider a fixed IR program that has been created by the BC2IR algorithm, i.e., $P_{IR}$ = BC2IR$(P_{BC})$, along with the arrays $\mathit{AS}_{in}[m,i]$ and $\mathit{AS}_{out}[m,i]$. We first show a result for call-free big-step transitions.
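Proposition B.4 Suppose we have $(m,i,s,h,\rho) \Longrightarrow_{BC}{}^{\diamond}_{0} (m,i',s',h',\rho')$. Let $s_t$ be a temporary variable store such that $\rho = \llbracket \mathit{AS}_{in}[m,i] \rrbracket^{\diamond}_{s,s_t,h}$. Then there exists a store $s'_t$ such that $(m,i,s,s_t,h) \Longrightarrow_{IR}{}^{\diamond}_{0} (m,i',s',s'_t,h')$ and $\rho' = \llbracket \mathit{AS}_{in}[m,i'] \rrbracket^{\diamond}_{s',s'_t,h'}$.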
PROOF By induction on the number of execution steps, i.e., the definition of big-step semantics.
• If zero steps are taken, i.e., $(m,i,s,h,\rho) \Longrightarrow_{BC}{}^{\diamond}_{0} (m,i,s,h,\rho)$, then we can derive $(m,i,s,s_t,h) \Longrightarrow_{IR}{}^{\diamond}_{0} (m,i,s,s_t,h)$. Since $i' = i$, we get the desired properties from the assumptions.
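• Suppose more than zero steps have been taken, that is, we have the small-step transition $(m,i,s,h,\rho) \xrightarrow{\mathit{BC}(m,i)}^{\diamond}_{0} (m,i'',s'',h'',\rho'')$ and the big-step transition $(m,i'',s'',h'',\rho'') \Longrightarrow_{BC}{}^{\diamond}_{0} (m,i',s',h',\rho')$. Since $P_{IR}$ = BC2IR$(P_{BC})$, we get by definition of the algorithm BC2IRinstr$(i, \mathit{BC}(m,i), \mathit{AS}_{in}[m,i]) = (\mathit{IR}[m,i], \mathit{AS}_{out}[m,i])$. With Proposition B.3, we immediately get $(m,i,s,s_t,h) \xrightarrow{\mathit{IR}[m,i]}^{\diamond}_{0} (m,i'',s'',s''_t,h'')$ and $\rho'' = \llbracket \mathit{AS}_{out}[m,i] \rrbracket^{\diamond}_{s'',s''_t,h''}$. Since $(m,i) \xmapsto{\mathit{IR}[m,i]} (m,i'')$, we get with Proposition 5.5 on page 73 that $\mathit{AS}_{out}[m,i] = \mathit{AS}_{in}[m,i'']$, hence $\rho'' = \llbracket \mathit{AS}_{in}[m,i''] \rrbracket^{\diamond}_{s'',s''_t,h''}$. Therefore, we can apply the induction hypothesis, and get the transition $(m,i'',s'',s''_t,h'') \Longrightarrow_{IR}{}^{\diamond}_{0} (m,i',s',s'_t,h')$ and $\rho' = \llbracket \mathit{AS}_{in}[m,i'] \rrbracket^{\diamond}_{s',s'_t,h'}$. Also, by definition of the IR big-step semantics, we get $(m,i,s,s_t,h) \Longrightarrow_{IR}{}^{\diamond}_{0} (m,i',s',s'_t,h')$.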
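The following proposition applies to arbitrary big-step transitions.
Proposition B.5 Suppose we have $(m,i,s,h,\rho) \Longrightarrow_{BC}{}^{\diamond}_{n} (m,i',s',h',\rho')$. Let $s_t$ be a temporary variable store such that $\rho = \llbracket \mathit{AS}_{in}[m,i] \rrbracket^{\diamond}_{s,s_t,h}$. Then there exists a store $s'_t$ such that $(m,i,s,s_t,h) \Longrightarrow_{IR}{}^{\diamond}_{n} (m,i',s',s'_t,h')$ and $\rho' = \llbracket \mathit{AS}_{in}[m,i'] \rrbracket^{\diamond}_{s',s'_t,h'}$.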
We call the previous proposition $P(m,n)$, and prove it by strong induction on $n$. For this, we need the following auxiliary proposition.
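Proposition B.6 Let $n \geq 0$, and suppose $P(m,k)$ holds for all methods $m$ and all call depths $k < n$. Suppose $(m,i,s,h,\rho) \xrightarrow{\mathit{BC}(m,i)}^{\diamond}_{n} (m,i',s',h',\rho')$. Let $s_t$ be a store such that $\rho = \llbracket \mathit{AS}_{in}[m,i] \rrbracket^{\diamond}_{s,s_t,h}$. Then there is an $s'_t$ such that $(m,i,s,s_t,h) \xrightarrow{\mathit{IR}[m,i]}^{\diamond}_{n} (m,i',s',s'_t,h')$ and $\rho' = \llbracket \mathit{AS}_{in}[m,i'] \rrbracket^{\diamond}_{s',s'_t,h'}$.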
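PROOF If $n = 0$, we can directly apply Proposition B.3 and get the desired properties. Assume $n > 0$. By construction of the bytecode semantics, it follows that $\mathit{BC}(m,i) = \text{call}\ m'$.
We have $(m,i,s,h,\bar{v} :: r :: \bar{v}') \xrightarrow{B}^{\diamond}_{n} (m,i+1,s,h',s''(\mathit{ret}) :: \bar{v}')$ and the method execution $(s',h) \xRightarrow{m'}_{BC}{}^{\diamond}_{n-1} (s'',h')$, where $s' = [\mathit{this} \mapsto r] \cup [\mathit{margs}(m') \mapsto \bar{v}] \cup [\mathit{ret} \mapsto \mathit{defval}]$.
By definition of big-step method executions, this means there exists $\rho''$ such that $(m',\mathit{mentry}(m'),s',h,\varepsilon) \Longrightarrow_{BC}{}^{\diamond}_{n-1} (m',\mathit{mexit}(m'),s'',h',\rho'')$. From Proposition 5.4 on page 73, we know $\mathit{AS}_{in}[m',\mathit{mentry}(m')] = \varepsilon$. Therefore, we apply $P(m',n-1)$, and get that there is a store $s''_t$ such that
$(m',\mathit{mentry}(m'),s',[\mathit{tvars}(m') \mapsto \mathit{defval}],h) \Longrightarrow_{IR}{}^{\diamond}_{n-1} (m',\mathit{mexit}(m'),s'',s''_t,h')$,
which means by definition $(s',h) \xRightarrow{m'}_{IR}{}^{\diamond}_{n-1} (s'',h')$. (*)
At the same time, we have $\mathit{IR}[m,i] = \text{block}[\bar{t}_i := \bar{e}';\ t'_i := e.m'(\bar{e})]$ and the stacks $\mathit{AS}_{in}[m,i] = (\bar{e} :: e :: \bar{e}')$ and $\mathit{AS}_{in}[m,i+1] = t'_i :: \bar{t}_i$. With the assumption, we know that
$\llbracket \bar{e} :: e :: \bar{e}' \rrbracket^{\diamond}_{s,s_t,h} = \bar{v} :: r :: \bar{v}'$.
For the execution of the block, we get with the IR semantics and result (*) the transition $(m,i,s,s_t,h) \xrightarrow{\mathit{IR}[m,i]}^{\diamond}_{n} (m,i+1,s,s'_t,h')$, where $s'_t = [\bar{t}_i \mapsto \llbracket \bar{e}' \rrbracket^{\diamond}_{s,s_t,h}] \cup [t'_i \mapsto s''(\mathit{ret})]$. All that is left to show is that $\llbracket \mathit{AS}_{in}[m,i+1] \rrbracket^{\diamond}_{s,s'_t,h'} = \llbracket t'_i :: \bar{t}_i \rrbracket^{\diamond}_{s,s'_t,h'} = s''(\mathit{ret}) :: \bar{v}' = \rho'$. But this follows from $\llbracket t'_i \rrbracket^{\diamond}_{s,s'_t,h'} = s''(\mathit{ret})$ and $\llbracket \bar{t}_i \rrbracket^{\diamond}_{s,s'_t,h'} = \llbracket \bar{e}' \rrbracket^{\diamond}_{s,s_t,h} = \bar{v}'$.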
PROOF (OF PROPOSITION B.5) By induction on the length $c$ of the execution.
• If zero steps are taken, we have $(m,i,s,h,\rho) \Longrightarrow_{BC}{}^{\diamond}_{n} (m,i,s,h,\rho)$. For any $s_t$, we can derive $(m,i,s,s_t,h) \Longrightarrow_{IR}{}^{\diamond}_{n} (m,i,s,s_t,h)$ and get the desired properties directly from the assumptions.
• If $c > 0$ steps are taken, then suppose this proposition $P(m,n)$ holds for all $c' < c$. By definition of big-step semantics, we have
$(m,i,s,h,\rho) \xrightarrow{\mathit{BC}(m,i)}^{\diamond}_{n_1} (m,i'',s'',h'',\rho'') \Longrightarrow_{BC}{}^{\diamond}_{n_2} (m,i',s',h',\rho')$
where $n_1 \leq n$ and $n_2 \leq n$. We can apply Proposition B.6 to the first step and the induction hypothesis $P(m,n)$ to the remaining execution, because it contains fewer execution steps. It follows that
$(m,i,s,s_t,h) \xrightarrow{\mathit{IR}[m,i]}^{\diamond}_{n_1} (m,i'',s'',s''_t,h'') \Longrightarrow_{IR}{}^{\diamond}_{n_2} (m,i',s',s'_t,h')$
for some $s'_t, s''_t$ such that $\llbracket \mathit{AS}_{in}[m,i''] \rrbracket^{\diamond}_{s'',s''_t,h''} = \rho''$ and $\llbracket \mathit{AS}_{in}[m,i'] \rrbracket^{\diamond}_{s',s'_t,h'} = \rho'$.
Putting the executions together, we get $(m,i,s,s_t,h) \Longrightarrow_{IR}{}^{\diamond}_{n} (m,i',s',s'_t,h')$.
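Theorem B.7 Let $P_{BC}$ be a bytecode program, and $P_{IR}$ be an IR program such that $P_{IR}$ = BC2IR$(P_{BC})$. Let $m$ be a method, $s, s'$ be stores, $h, h'$ be heaps, and $n$ be a call depth. If $(s,h) \xRightarrow{m}_{BC}{}^{\diamond}_{n} (s',h')$, then $(s,h) \xRightarrow{m}_{IR}{}^{\diamond}_{n} (s',h')$.
PROOF By definition of the $(s,h) \xRightarrow{m}_{BC}{}^{\diamond}_{n} (s',h')$ semantics, we know there is a stack $\rho$ such that $(m,\mathit{mentry}(m),s,h,\varepsilon) \Longrightarrow_{BC}{}^{\diamond}_{n} (m,\mathit{mexit}(m),s',h',\rho)$. Let $s_t = [\mathit{tvars}(m) \mapsto \mathit{defval}]$. With Proposition 5.4 on page 73, we have $\mathit{AS}_{in}[m,\mathit{mentry}(m)] = \varepsilon$, therefore we know that $\llbracket \mathit{AS}_{in}[m,\mathit{mentry}(m)] \rrbracket^{\diamond}_{s,s_t,h} = \varepsilon$. We can apply Proposition B.5 and get that there is some temporary store $s'_t$ such that $(m,\mathit{mentry}(m),s,s_t,h) \Longrightarrow_{IR}{}^{\diamond}_{n} (m,\mathit{mexit}(m),s',s'_t,h')$, which by definition means $(s,h) \xRightarrow{m}_{IR}{}^{\diamond}_{n} (s',h')$.
C Correctness Proof for the IR Type System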
This appendix contains the complete soundness proof for the type system of the intermediate representation. In the following, the meta-variable $\omega$ is used for IR state triples $(s,s_t,h)$.
C.1 Properties of Single Executions
In this section, we first show how assignments in the IR language are related to assignments in the high-level DSD language. Then we define equivalence relations for confluence point stacks and for IR configurations. Finally, we show that a program does not change “low” variables and fields under a “high” $\mathit{pc}$ label, and other properties that apply to single executions of well-typed IR instructions.
C.1.1 Connection to the High-Level Language
We extend some high-level definitions to IR program states:
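constraint set satisfiability: $(s,s_t,h) \models_{\diamond} Q \iff (s \cup s_t,h) \models_{\diamond} Q$
expression evaluation: $\llbracket e \rrbracket^{\diamond}_{s,s_t,h} = \llbracket e \rrbracket^{\diamond}_{s \cup s_t,h}$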
We formalize the similarity between assignment blocks and the respective high-level sequential compositions. We define an auxiliary function makestmt that combines a sequence of assignment statements into a valid high-level statement (either a nested sequential composition or skip):
$$\mathit{makestmt}(\bar{a}) = \begin{cases} \text{skip} & \text{if } \bar{a} = \varepsilon \\ a & \text{if } \bar{a} = a \\ a;\ \mathit{makestmt}(\bar{a}_r) & \text{if } \bar{a} = a :: \bar{a}_r \end{cases}$$
Then for call-free assignment blocks, the IR semantics corresponds directly to the high-level semantics.
Lemma C.1 Let $\text{block}\ \bar{a}$ be an assignment block IR instruction. If the small-step IR transition $(m,i,s,s_t,h) \xrightarrow{\text{block}\ \bar{a}}^{\diamond}_{0} (m,i+1,s',s'_t,h')$ can be derived, then the high-level big-step transition $(s \cup s_t,h) \xrightarrow{\mathit{makestmt}(\bar{a})}^{\diamond} (s' \cup s'_t,h')$ can be derived.
PROOF Follows directly from the definition of $\xRightarrow{\bar{a}}{}^{\diamond}_{0}$.
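The three cases of makestmt translate directly into a recursive function. The following is a minimal sketch under the assumption that statements are modelled by the hypothetical classes `Skip`, `Assign`, and `Seq`; these names, and the representation of an assignment as a target/source pair of strings, are illustrative and not part of the formal development.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Skip:
    """The high-level statement skip."""

@dataclass(frozen=True)
class Assign:
    """A single assignment statement (hypothetical target/source representation)."""
    target: str
    source: str

@dataclass(frozen=True)
class Seq:
    """Sequential composition first; second."""
    first: object
    second: object

def makestmt(assignments):
    """Combine a list of assignments into one statement, nesting to the right."""
    if not assignments:            # empty sequence: skip
        return Skip()
    if len(assignments) == 1:      # single assignment: the assignment itself
        return assignments[0]
    # a :: a_r with non-empty rest: a; makestmt(a_r)
    return Seq(assignments[0], makestmt(assignments[1:]))
```

For a block with three assignments `a`, `b`, `c`, the result is `Seq(a, Seq(b, c))`, i.e., a right-nested sequential composition with no trailing skip.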
The correspondence of the IR typing for assignment blocks and the high-level typing judgement is formalized by the following lemma.
Lemma C.2 Let $I = \text{block}\ \bar{a}$ be an assignment block instruction. If the IR typing judgement $\Gamma,\Gamma_t \vdash i,\mathit{pc},\Delta,Q \xrightarrow{I} i+1,\mathit{pc},\Delta,Q'$ can be derived, then the high-level typing judgement $\Gamma \cup \Gamma_t,\mathit{pc} \vdash \{Q\}\ \mathit{makestmt}(\bar{a})\ \{Q'\}$ can be derived.
PROOF Follows directly from the definition of $\Gamma,\Gamma_t,\Delta,\mathit{pc} \vdash \{Q\}\ \bar{a}\ \{Q'\}$.
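Finally, we show that the evaluation of $\mathit{pc}$ and $\Delta$ remains unchanged during the execution of assignment blocks.
Lemma C.3 Let $I = \text{block}\ \bar{a}$. If we can derive the transition $(m,i,\omega) \xrightarrow{I}^{\diamond}_{n} (m,i+1,\omega')$ and the typing judgement $\Gamma,\Gamma_t \vdash i,\mathit{pc},\Delta,Q \xrightarrow{I} i+1,\mathit{pc},\Delta,Q'$, then $\llbracket \mathit{pc} \rrbracket^{\diamond}_{\omega} = \llbracket \mathit{pc} \rrbracket^{\diamond}_{\omega'}$ and $\llbracket \Delta \rrbracket^{\diamond}_{\omega} = \llbracket \Delta \rrbracket^{\diamond}_{\omega'}$.
PROOF This follows from the definition of the typing rules. They require that the variables or fields that can possibly change do not occur syntactically in $\Delta$ or $\mathit{pc}$.
C.1.2 Equivalences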
We define the following equivalence relation for IR program states, which naturally extends the definition of DSD program states:
We also give a definition for confluence point stack equivalences: we require that there exists a common “low” prefix such that the program points and evaluated labels in this “low” prefix are equal, the labels evaluate to some domain that is less than or equal to $k$, and in the remaining “high” parts of the stacks, the labels evaluate to a domain that is neither less than nor equal to $k$.
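Definition C.4 Let $\omega_1,\omega_2$ be IR states, let $\diamond$ be a domain lattice, and let $k \in \mathit{Dom}_{\diamond}$. Two evaluated confluence point stacks $\llbracket \Delta_1 \rrbracket^{\diamond}_{\omega_1} = i_1,k_1$ and $\llbracket \Delta_2 \rrbracket^{\diamond}_{\omega_2} = i_2,k_2$ are $\diamond,k$-equivalent, written $\vdash_{\diamond} \llbracket \Delta_1 \rrbracket^{\diamond}_{\omega_1} \overset{k}{\sim} \llbracket \Delta_2 \rrbracket^{\diamond}_{\omega_2}$, if there exists a prefix index $p$ such that
• $0 \leq p \leq \min(|k_1|,|k_2|)$,
• $\forall j \in \{1,\dots,p\}.\ k_1.j = k_2.j \leq_{\diamond} k \,\wedge\, i_1.j = i_2.j$,
• $\forall j \in \{p+1,\dots,|k_1|\}.\ k_1.j \not\leq_{\diamond} k$, and
• $\forall j \in \{p+1,\dots,|k_2|\}.\ k_2.j \not\leq_{\diamond} k$.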
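Definition C.4 can be read as a search for a suitable prefix index $p$. The sketch below makes this explicit, assuming that an evaluated confluence point stack is represented as a Python list of (program point, domain) pairs, with list position 0 corresponding to index 1 of the definition, and that the lattice order is passed in as a function `leq`. The function name `stacks_equivalent` and the two-point lattice in the usage note are hypothetical choices for illustration.

```python
def stacks_equivalent(stack1, stack2, k, leq):
    """Check k-equivalence of two evaluated confluence point stacks.

    stack1, stack2: lists of (program_point, domain) pairs.
    leq(a, b): the order of the domain lattice.
    Returns True if some prefix index p satisfies the four conditions
    of the definition.
    """
    limit = min(len(stack1), len(stack2))
    for p in range(limit + 1):
        # "Low" prefix: identical program points and domains, each domain <= k.
        low_prefix_ok = all(
            stack1[j] == stack2[j] and leq(stack1[j][1], k) for j in range(p)
        )
        # "High" remainders: no domain is <= k.
        high1_ok = all(not leq(domain, k) for _, domain in stack1[p:])
        high2_ok = all(not leq(domain, k) for _, domain in stack2[p:])
        if low_prefix_ok and high1_ok and high2_ok:
            return True
    return False
```

As a usage example, take a two-point lattice with `RANK = {"low": 0, "high": 1}` and `leq = lambda a, b: RANK[a] <= RANK[b]`. Then `[(3, "low"), (7, "high")]` and `[(3, "low"), (9, "high"), (11, "high")]` are equivalent for `k = "low"` with $p = 1$: the stacks agree on the low prefix and may differ arbitrarily, even in length, on the high remainder.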
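Definition C.5 (Equivalent IR configurations) Let $\diamond$ be a domain lattice, and let $k$ be a domain with $k \in \mathit{Dom}_{\diamond}$. Two IR configurations are $\diamond,k$-equivalent, written
$\Gamma,\Gamma_t \vdash (m,i_1,\omega_1) : \mathit{pc}_1,\Delta_1,Q_1 \approx^{\diamond}_{k,\beta} (m,i_2,\omega_2) : \mathit{pc}_2,\Delta_2,Q_2$
if and only if:
• $\vdash_{\diamond} \omega_1 \sim^{\beta}_{\Gamma,\Gamma_t,k} \omega_2$,
• $\omega_1 \models_{\diamond} Q_1$ and $\omega_2 \models_{\diamond} Q_2$,
• $\vdash_{\diamond} \llbracket \Delta_1 \rrbracket^{\diamond}_{\omega_1} \overset{k}{\sim} \llbracket \Delta_2 \rrbracket^{\diamond}_{\omega_2}$, and
• either $\llbracket \mathit{pc}_1 \rrbracket^{\diamond}_{\omega_1} = \llbracket \mathit{pc}_2 \rrbracket^{\diamond}_{\omega_2} \leq_{\diamond} k$ and $i_1 = i_2$, or $\llbracket \mathit{pc}_1 \rrbracket^{\diamond}_{\omega_1} \not\leq_{\diamond} k$ and $\llbracket \mathit{pc}_2 \rrbracket^{\diamond}_{\omega_2} \not\leq_{\diamond} k$.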
Lemma C.6 All equivalence relations defined in this section are reflexive, transitive, and symmetric.
C.1.3 Properties of single executions
Then we show in two lemmas that for a single execution of a small step,
• if the pre-state satisfies $Q$, then the post-state satisfies $Q'$,
• if the step is executed under a “high” $\mathit{pc}$, then the states are indistinguishable,
• there is an invariant on $\Delta$: it contains ascending $\mathit{pc}$ labels, and if there is a bottom element, it does not change during the execution, and if the final $\mathit{pc}$ is “high”, the stack does not change its “low” prefix.
Using these lemmas, we then show that the satisfiability of constraint sets $Q$ and the indistinguishability of states under a high $\mathit{pc}$ remain invariant for single executions of an arbitrary chain of steps.
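Lemma C.7 Let $\Gamma,\Gamma_t \vdash i,\mathit{pc},\Delta,Q \xrightarrow{I} i',\mathit{pc}',\Delta',Q'$ and $(m,i,\omega) \xrightarrow{I}^{\diamond}_{0} (m,i',\omega')$. If $\omega \models_{\diamond} Q$, then $\omega' \models_{\diamond} Q'$.
PROOF By induction over the type derivation.
• If the typing judgement has been derived by the weakening rule, we have the judgement $\Gamma,\Gamma_t \vdash i,\mathit{pc},\Delta,Q_0 \xrightarrow{I} i',\mathit{pc}',\Delta',Q'_0$ with $Q \Rightarrow Q_0$ and $Q'_0 \Rightarrow Q'$. Since $\omega \models_{\diamond} Q$, we get with Lemma 3.5 on page 34 that $\omega \models_{\diamond} Q_0$. Hence we can apply the induction hypothesis and get $\omega' \models_{\diamond} Q'_0$, and hence, again with Lemma 3.5, $\omega' \models_{\diamond} Q'$.
• $I = \text{if}\ e\ j$. We have $(m,i,\omega) \xrightarrow{I}^{\diamond}_{0} (m,i',\omega)$. We invert the rule for the instruction and make a case distinction over the shape of the postcondition:
– If $Q' = Q$, then $\omega \models_{\diamond} Q'$ follows trivially.
– If $Q' = Q \cup \{\ell_1 \sqsubseteq \ell_2\}$, then $e = \ell_1 \sqsubseteq \ell_2$. Also, $i' = j$, thus $(m,i,\omega) \xrightarrow{I}^{\diamond}_{0} (m,j,\omega)$, thus $\llbracket \ell_1 \sqsubseteq \ell_2 \rrbracket^{\diamond}_{\omega} \neq 0$. Therefore, $\llbracket \ell_1 \rrbracket^{\diamond}_{\omega} \leq_{\diamond} \llbracket \ell_2 \rrbracket^{\diamond}_{\omega}$, and with $\omega \models_{\diamond} Q$, we get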