The particularity of our approach to pipeline analysis is that the state of the dynamically allocated sequences is simultaneously updated after each pipeline stage, by invoking the abstract state-transformers of the allocated resources in order to obtain the “actual” resource state. These allocated resources are the register and data memory abstract environments as well as the abstract instruction cache contents. For this reason, we redefine the notion of concrete pipeline state given in [122] and introduce the notion of a “hybrid pipeline state”, which combines concrete timing information with the abstract state of resources in a single definition.
We define an abstract pipeline state, denoted by $P^{\sharp}$, as a collection of hybrid pipeline states, $P$, computed for each program point. This definition corresponds to the canonical extension of the hybrid pipeline states to sets of states. As already mentioned in [122], this design of the abstract pipeline domain is forced by the fact that no abstraction of sets of concrete pipeline states is known. Although the efficiency of the pipeline analysis depends heavily on the number of hybrid pipeline states that must be computed, the termination of the analysis is guaranteed because, for a particular program, there are only finitely many hybrid pipeline states.
Formally, the abstract pipeline domain $P^{\sharp}$ is defined as the collecting semantics $P^{\sharp} \triangleq 2^{P}$ of hybrid pipeline states. It forms the powerset complete lattice with set inclusion $\subseteq$ as its partial order, set union $\cup$ as its least upper bound, set intersection $\cap$ as its greatest lower bound, $\bot = \emptyset$ as its least element and $\top = P$ as its greatest element. Consequently, the join of abstract pipeline states is given by set union: $p_1^{\sharp} \sqcup^{\sharp}_{P} p_2^{\sharp} = p_1^{\sharp} \cup p_2^{\sharp}$.
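The powerset lattice described above can be sketched with `Data.Set` from the containers package. This is only an illustration: the `Hybrid` record is a hypothetical stand-in that keeps just two concrete fields of a hybrid pipeline state, and the names `AbsPipeline`, `bottomP`, `leqP`, `joinP` and `meetP` are assumptions made for this example, not definitions from the original text.

```haskell
import qualified Data.Set as Set

-- Hypothetical stand-in for a hybrid pipeline state: only the global
-- time and the next program counter are modelled here.
data Hybrid = Hybrid { hTime :: Int, hPc :: Int }
  deriving (Eq, Ord, Show)

-- An abstract pipeline state is a set of hybrid pipeline states.
type AbsPipeline = Set.Set Hybrid

-- Least element of the powerset lattice: the empty set.
bottomP :: AbsPipeline
bottomP = Set.empty

-- Partial order: set inclusion.
leqP :: AbsPipeline -> AbsPipeline -> Bool
leqP = Set.isSubsetOf

-- Least upper bound (join): set union.
joinP :: AbsPipeline -> AbsPipeline -> AbsPipeline
joinP = Set.union

-- Greatest lower bound (meet): set intersection.
meetP :: AbsPipeline -> AbsPipeline -> AbsPipeline
meetP = Set.intersection
```

Because the join is plain set union, no timing information is lost when two abstract pipeline states meet at a program point; the price is that the number of hybrid states to process may grow.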
Previously, in Section 6.3, we introduced the notion of store buffers to express the need to store intermediate abstract states of the allocated resources during the pipeline analysis of every single instruction. Since the least upper bound between the store buffers $R'^{\sharp}$, $D'^{\sharp}$, $C'^{\sharp}$ and $M'^{\sharp}$ and the top-level domains $R^{\sharp}$, $D^{\sharp}$, $C^{\sharp}$ and $M^{\sharp}$ is performed at the level of the abstract pipeline state-transformer, the hybrid pipeline state $P$ is defined in terms of the store buffers:
$$P \triangleq (Time \times Pc \times Demand \times R'^{\sharp} \times D'^{\sharp} \times C'^{\sharp} \times M'^{\sharp} \times Coord) \tag{6.29}$$
where $Time$ is the global number of CPU cycles, $Pc$ is the “program counter” of the next instruction to fetch, $Demand$ is a 32-bit word used to model the dependences between registers in such a way that each register is either a blocked or an unblocked resource, and $Coord$ is a vector of size $N$, where $N$ is the number of instructions allowed inside the pipeline.
$$Coord \triangleq [TimedTask]^{N} \tag{6.30}$$
A TimedTask is defined for a single instruction, Instr, and consists of the currently elapsed CPU Cycles and the current Stage of a given Task. A Task is associated with one instruction and also holds the store buffers inside the “context” of a hybrid state.
$$TimedTask \triangleq (Cycles \times Stage \times Task) \tag{6.31}$$
$$Cycles \triangleq Int \tag{6.32}$$
$$Stage \triangleq FI \mid DI \mid EX \mid MEM \mid WB \tag{6.33}$$
$$Task \triangleq (Instr \times Pc \times Demand \times R'^{\sharp} \times D'^{\sharp} \times C'^{\sharp} \times M'^{\sharp}) \tag{6.34}$$
For the purpose of WCET analysis, we are interested in the timing properties of instructions at the end of the WriteBack stage. These properties are measured as CPI (Cycles Per Instruction) and are easily extracted from a hybrid pipeline state by selecting, from the N-sized Coord vector, the TimedTask of the desired instruction (Instr) and then extracting from it the value of Cycles when the stage is WB.
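The extraction step just described can be sketched as follows. The types are deliberately simplified and hypothetical: an instruction is identified by a plain string, the store buffers are omitted, and the function name `cyclesAtWB` is an assumption made for this example.

```haskell
import Data.List (find)

data Stage = FI | DI | EX | MEM | WB
  deriving (Eq, Show)

-- Hypothetical simplification: an instruction is identified by its name.
type Instr = String

-- Simplified timed task: elapsed cycles, current stage and instruction.
data TimedTask = TimedTask
  { cycles :: Int
  , stage  :: Stage
  , instr  :: Instr
  } deriving (Eq, Show)

newtype Coord = Coord [TimedTask]

-- Select the timed task of the given instruction and return its elapsed
-- cycles, but only if that task has reached the WB stage.
cyclesAtWB :: Instr -> Coord -> Maybe Int
cyclesAtWB i (Coord ts) =
  cycles <$> find (\t -> instr t == i && stage t == WB) ts
```

Returning `Maybe Int` makes explicit that no timing property is available while the instruction is still in an earlier stage or is not in the pipeline at all.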
The Haskell definitions of these domains are obtained straightforwardly. In order to distinguish the concrete part of a hybrid state, we use the parametrized datatype P a, where the type variable a denotes a concrete timing property. To emphasize that the number N of instructions inside the pipeline is variable, the type of the Coord coordinates vector is isomorphic to the list type.
data P a = P { time :: Int, nextpc :: Word32, demand :: Word32,
               regs :: R♯, datam :: D♯, instrm :: I♯,
               coords :: Coord a }
newtype Coord a = Coord [TimedTask a]
data Stage = FI | DI | EX | MEM | WB
data TimedTask a = TimedTask { property :: a, stage :: Stage, task :: TaskState }
As already mentioned, a resource association is a pair of a stage $s \in PS$ and a set of resources. In our functional pipeline model, a resource association is denoted by TaskState, which uses the constructors Ready, Fetched, Decoded, Stalled, Executed and Done to distinguish the different resource associations inside a resource allocation sequence. Moreover, since we perform the analysis of the resources simultaneously with the pipeline analysis, some instances of TaskState also require a temporary register file $R^{\sharp}$. The datatype Reason is used to specify the cause of a stall. It contains only the constructors for structural and data hazards, because control hazards are handled in a particular way, as will be described later in this section.
data TaskState = Fetched Task R♯ | Decoded Task R♯ | Stalled Reason Task R♯
               | Executed Task R♯ | Done Task | Ready Task
data Reason = Structural | Data
data Task = Task { taskInstr :: Instr, taskNextPc :: Word32, taskDemand :: Word32,
                   taskRegs :: R♯, taskDmem :: D♯, taskIMem :: I♯ }
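To illustrate how the TaskState constructors are meant to be consumed, the following minimal sketch pattern-matches on them. The `Task` and `RegFile` types are hypothetical placeholders for the real task record and abstract register file, and the helper functions `taskOf` and `stallReason` are assumptions introduced for this example only.

```haskell
-- Hypothetical placeholder for the real Task record.
data Task = Task { taskInstr :: String, taskNextPc :: Int }
  deriving (Eq, Show)

-- Hypothetical placeholder for the abstract register file.
newtype RegFile = RegFile [Int]
  deriving (Eq, Show)

data Reason = Structural | Data
  deriving (Eq, Show)

data TaskState
  = Ready Task
  | Fetched Task RegFile
  | Decoded Task RegFile
  | Stalled Reason Task RegFile
  | Executed Task RegFile
  | Done Task
  deriving (Eq, Show)

-- Every constructor carries a Task, so it can always be recovered.
taskOf :: TaskState -> Task
taskOf (Ready t)       = t
taskOf (Fetched t _)   = t
taskOf (Decoded t _)   = t
taskOf (Stalled _ t _) = t
taskOf (Executed t _)  = t
taskOf (Done t)        = t

-- The cause of a stall, if the task is currently stalled.
stallReason :: TaskState -> Maybe Reason
stallReason (Stalled r _ _) = Just r
stallReason _               = Nothing
```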
The partial order on the domain (TimedTask a) is simply the order on natural numbers ($\leq$) applied to its record field property. Hence, the partial order on a coordinates vector (Coord a) is determined by the maximal element among the N elements of the corresponding list. Finally, the partial order on (P a) combines the global timing time with the relative elapsed CPU cycles contained in the coordinates vector Coord. It is solely determined by these concrete components and is defined by a proper instance of the type class Ord. The combination of these two timing properties is compared using the componentwise ordering $(p_1, p_2) \sqsubseteq^{2}_{P} (q_1, q_2) \triangleq p_1 \sqsubseteq_{P} q_1 \wedge p_2 \sqsubseteq_{P} q_2$.
instance (Eq a, Ord a, Num a) => Ord (P a) where
  compare a b = compare (time a, maxCycles (coords a)) (time b, maxCycles (coords b))
The maximal element inside a coordinate vector is given by the function maxCycles. This function selects the timing property of each TimedTask using the function map and then computes the maximal value using the function maximum.
maxCycles :: (Ord a, Num a) => Coord a -> a
maxCycles (Coord vec) = maximum $ map property vec
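A self-contained variant of these definitions is sketched below so that the ordering can be tried out. It makes several assumptions that are not in the original: the abstract resource fields and the stage are dropped, an explicit Eq instance is supplied because Ord requires one, and maxCycles falls back to 0 for an empty vector, since maximum is undefined on the empty list. Note also that Haskell's compare on pairs is lexicographic, which yields a total order that extends the componentwise order.

```haskell
-- Simplified timed task: only the concrete timing property is kept.
newtype TimedTask a = TimedTask { property :: a }

newtype Coord a = Coord [TimedTask a]

-- Simplified hybrid state: global time plus the coordinates vector.
data P a = P { time :: Int, coords :: Coord a }

-- Maximal timing property in the vector; 0 for an empty vector
-- (an assumption of this sketch, to avoid calling maximum on []).
maxCycles :: (Ord a, Num a) => Coord a -> a
maxCycles (Coord vec) = maximum (0 : map property vec)

-- Equality is derived from the ordering on the concrete components.
instance (Ord a, Num a) => Eq (P a) where
  a == b = compare a b == EQ

-- Compare the global time first, then the maximal elapsed cycles.
instance (Ord a, Num a) => Ord (P a) where
  compare a b =
    compare (time a, maxCycles (coords a)) (time b, maxCycles (coords b))
```

For example, `P 10 (Coord [TimedTask 3, TimedTask 7])` compares as smaller than `P 10 (Coord [TimedTask 8])`, because both states share the same global time and 7 is less than 8.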