This chapter gave the background of the dissertation. First, a short introduction to real-time systems and scheduling was given. Second, two run-time energy management techniques, DPM and DVS, were thoroughly reviewed.
In particular, the main problem of using DPM and DVS together in the context of hard real-time systems was discussed. In addition, the three types of multi-core processor platforms, distinguished by their DPM and DVS capabilities, were explained. After that, a state-of-the-art analysis of energy-efficient real-time scheduling was presented. Finally, a brief overview of the ACPI standard was given.
Chapter 3
System Models and Problem Formulation
This chapter formally introduces basic notions and terminology and defines the fundamental system models with a special focus on power aspects. In the context of this dissertation, the hardware of a system is composed of a processor and multiple I/O devices, such as a hard disk, an Ethernet controller, a flash card controller and other possible peripherals. The processor power model and the device power model are described in the first and second sections, respectively. This dissertation assumes that a processor supports both DPM and DVS techniques while a device is only equipped with DPM capability. This assumption is widely applied in the context of system-wide energy optimization and can be found in numerous other studies, e.g., [DA10], [NL11] and [CG09]. Moreover, a real-time task model is described to address the software components of a system. Finally, before this chapter is concluded, the energy optimization problems in hard real-time systems are formulated.
3.1 Processor Power Model
Since this work primarily targets energy optimization, the definitions here concentrate on the power characteristics of processors rather than the functional aspects. The processor power model is defined following the ACPI recommendation by adopting the concept of C-states and P-states explained in the previous chapter. The C-states describe different power states, comprising one active state and multiple low power (sleep) states with different sleep depths, while the P-states represent different performance states, each with a corresponding operating speed. In other words, C-states are the states related to the DPM technique and P-states are the states related to DVS. Therefore, in the remaining text the C-states and P-states are also referred to as the DPM states and DVS states, respectively. In what follows, the power models of single-core processors and multi-core processors are defined in detail.
Figure 3.1: The power state machine of a processor with 3 P-states and 4 C-states
3.1.1 Single-Core Processor
Let C = {C0, C1, ..., Cc} denote a finite set of C-states of a processor, where C0 is the only working state, i.e., the processor can only execute tasks in this state, and C1, C2, ..., Cc are low power states (sleep states) in non-increasing order of their power consumption. Moreover, the state C0 is defined as a superstate that contains a finite set of P-states denoted by S = {S1, S2, ..., Ss}. S1 is the full performance state with the maximal operating speed and the remaining states are sorted in non-increasing order of their power consumption. ∀i : 1 ≤ i ≤ s, F(Si) and P(Si) denote the corresponding frequency (normalized with regard to F(S1)) and the power consumption, respectively. For instance, if a processor supports two operating speeds 100 MHz and 50 MHz, then F(S1) = 1 and F(S2) = 0.5. The overhead for state switching among different P-states is defined as follows:
• P(Si, Sj) is the power consumption of state switching from Si to Sj.
• T(Si, Sj) is the latency of state switching from Si to Sj.
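The P-state quantities defined above can be illustrated with a minimal Python sketch. The function names `normalize_frequencies` and `switching_energy` are hypothetical helpers introduced only for this illustration. The sketch also assumes that the switching power P(Si, Sj) is constant over the switching latency T(Si, Sj), so that the switching energy is their product.

```python
def normalize_frequencies(freqs_hz):
    """Return F(S1), ..., F(Ss): frequencies sorted from highest to lowest
    and normalized with regard to the maximal frequency F(S1)."""
    if not freqs_hz:
        raise ValueError("at least one operating frequency is required")
    if min(freqs_hz) <= 0:
        raise ValueError("frequencies must be positive")
    ordered = sorted(freqs_hz, reverse=True)
    f_max = ordered[0]
    return [f / f_max for f in ordered]


def switching_energy(power_w, latency_s):
    """Energy of one P-state switch, assuming (for illustration only) that
    the switching power P(Si, Sj) is constant during the latency T(Si, Sj)."""
    return power_w * latency_s


# The example from the text: operating speeds of 100 MHz and 50 MHz.
print(normalize_frequencies([100e6, 50e6]))
```

For the two speeds used in the text, the normalized values are 1 for S1 and 0.5 for S2.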
Furthermore, ∀i : 1 ≤ i ≤ c, P(Ci), Ton→off(Ci), Toff→on(Ci), Pon→off(Ci), Poff→on(Ci) and Tbe(Ci) are defined as follows:
• P(Ci) is the power consumption of the state Ci.
• Ton→off(Ci) is the latency for state switching from C0 to Ci.
• Toff→on(Ci) is the latency for state switching from Ci to C0.
• Pon→off(Ci) is the power consumption for state switching from C0 to Ci.
• Poff→on(Ci) is the power consumption for state switching from Ci to C0.
• Tbe(Ci) is the break-even time to enter the state Ci.
In general, the C- and P-states of a processor can be modeled as a so-called power state machine [BBM00]. Figure 3.1 shows an example power state machine of a processor with 3 P-states and 4 C-states. In the figure the power properties of the state C3 are illustrated. As can be observed, this dissertation assumes that C-state switching may only take place between the active state and the low power states. Direct switching between two low power states is not allowed. In addition, if a processor needs to be switched to a low power state, then the current P-state is remembered as the history state that will be entered when the processor wakes up later.
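The switching rules of the power state machine can be sketched as follows. This is a minimal illustration under stated assumptions, not an implementation taken from this work: the class name `PowerStateMachine` and its methods are hypothetical, C-states are indexed from 0 (active) and P-states from 1 (full performance), and switching latencies and power values are left out.

```python
class PowerStateMachine:
    """Illustrative power state machine: C0 is the active superstate that
    contains the P-states; all other C-states are low power states."""

    def __init__(self, num_c_states, num_p_states):
        self.num_c_states = num_c_states   # C0 .. C(num_c_states - 1)
        self.num_p_states = num_p_states   # S1 .. S(num_p_states)
        self.c_state = 0                   # start in the active state C0
        self.p_state = 1                   # start in the full performance state S1
        self.history_p_state = None        # P-state remembered while sleeping

    def set_p_state(self, j):
        # P-states exist only inside the superstate C0.
        if self.c_state != 0:
            raise RuntimeError("P-state switching requires the active state C0")
        if not 1 <= j <= self.num_p_states:
            raise ValueError("unknown P-state")
        self.p_state = j

    def sleep(self, i):
        # Only C0 -> Ci is allowed; Ci -> Cj between low power states is not.
        if self.c_state != 0:
            raise RuntimeError("switching between two low power states is not allowed")
        if not 1 <= i < self.num_c_states:
            raise ValueError("unknown low power state")
        self.history_p_state = self.p_state  # remember the history state
        self.p_state = None
        self.c_state = i

    def wake_up(self):
        # Ci -> C0: the remembered history P-state is entered again.
        if self.c_state == 0:
            raise RuntimeError("processor is already active")
        self.c_state = 0
        self.p_state = self.history_p_state
        self.history_p_state = None
```

With 4 C-states and 3 P-states as in Figure 3.1, a processor running in S2 that is sent to C3 returns to S2 after `wake_up()`, and a second `sleep()` call while it is in C3 is rejected.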
The definition of break-even time was introduced in the previous chapter and its calculation formula is given in (2.20). Thus, ∀i : 1 ≤ i ≤ c, Tbe(Ci) can be derived as follows.
$$T_{be}(C_i) = \max\left\{ T_{overhead}(C_i),\ \frac{E_{overhead}(C_i) - P(C_i) \cdot T_{overhead}(C_i)}{P_{on} - P(C_i)} \right\} \tag{3.1}$$
where Toverhead(Ci) and Eoverhead(Ci) are defined in (3.2) and (3.3), respectively. Intuitively, they are the latency and energy consumption required to enter and exit the low power state.
$$T_{overhead}(C_i) = T_{off \to on}(C_i) + T_{on \to off}(C_i) \tag{3.2}$$
$$E_{overhead}(C_i) = T_{off \to on}(C_i) \cdot P_{off \to on}(C_i) + T_{on \to off}(C_i) \cdot P_{on \to off}(C_i) \tag{3.3}$$
Furthermore, Pon denotes the power consumption when the processor is in the active state C0. As C0 is a superstate, the value of Pon depends on the concrete P-state. In other words, Tbe(Ci) differs if different P-states are selected. As the break-even time is defined as the minimum time required to stay in a low power state, choosing the maximal value of Tbe(Ci) over all P-states keeps the computation on the safe side. Thus, the final calculation formula of Tbe(Ci) is derived.
$$T_{be}(C_i) = \max_{S_j \in S}\left\{ T_{overhead}(C_i),\ \frac{E_{overhead}(C_i) - P(C_i) \cdot T_{overhead}(C_i)}{P(S_j) - P(C_i)} \right\} \tag{3.4}$$
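The computation in (3.2), (3.3) and (3.4) can be sketched in Python as follows. The names `SleepState`, `overhead_time`, `overhead_energy` and `break_even_time` are hypothetical and introduced only for this illustration, and the numeric values in the usage example are invented, not measured. The sketch additionally assumes that every P-state consumes more power than the low power state, i.e., P(Sj) > P(Ci), so that the denominator in (3.4) is positive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SleepState:
    """Power properties of one low power state Ci (watts and seconds)."""
    power: float      # P(Ci)
    t_on_off: float   # latency of switching from C0 to Ci
    t_off_on: float   # latency of switching from Ci to C0
    p_on_off: float   # power consumption of switching from C0 to Ci
    p_off_on: float   # power consumption of switching from Ci to C0


def overhead_time(c):
    """T_overhead(Ci) as in (3.2)."""
    return c.t_off_on + c.t_on_off


def overhead_energy(c):
    """E_overhead(Ci) as in (3.3)."""
    return c.t_off_on * c.p_off_on + c.t_on_off * c.p_on_off


def break_even_time(c, p_state_powers):
    """T_be(Ci) as in (3.4): the maximum over all P-state powers P(Sj)."""
    if not p_state_powers:
        raise ValueError("at least one P-state is required")
    t_ov = overhead_time(c)
    e_ov = overhead_energy(c)
    candidates = []
    for p_on in p_state_powers:
        if p_on <= c.power:
            # Assumption of this sketch: P(Sj) > P(Ci) for every P-state.
            raise ValueError("P-state power must exceed sleep state power")
        candidates.append(max(t_ov, (e_ov - c.power * t_ov) / (p_on - c.power)))
    return max(candidates)


# Invented example values: one sleep state and three P-states.
c3 = SleepState(power=0.1, t_on_off=0.001, t_off_on=0.002,
                p_on_off=0.5, p_off_on=0.8)
print(break_even_time(c3, [1.0, 0.6, 0.3]))
```

In this example the P-state with the lowest power determines the result, because a smaller difference P(Sj) - P(Ci) means that less energy is saved per unit of sleep time.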
3.1.2 Multi-Core Processor
The multi-core processor power model is a generalization of the single-core processor power model, where a finite set of processor cores is denoted by O = {O1, O2, ..., Oo}. Each processor core Oi is represented by a pair Oi = (Ci, Si), where Ci and Si denote the sets of its C-states and P-states, respectively. More concretely, Ci = {Ci,0, Ci,1, ..., Ci,ci} and Si = {Si,1, Si,2, ..., Si,si} hold. The power consumption and switching overhead of each state are defined in the same way as in the single-core processor power model.
Furthermore, this dissertation considers cluster-based multi-core processors, where processor cores are clustered into disjoint groups G = {G1, G2, ..., Gg} with
$$O = \bigcup_{i=1}^{g} G_i \tag{3.5}$$
$$\forall G_i, G_j : i \neq j \Rightarrow G_i \cap G_j = \emptyset \tag{3.6}$$
and
$$\forall G_i : G_i \in G,\ G_i \subseteq O \tag{3.7}$$
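The three conditions (3.5), (3.6) and (3.7) state that the groups cover all cores, do not overlap, and contain only cores of the processor. A minimal Python sketch of such a check is given below. The function name `is_valid_clustering` and the string identifiers for the cores are hypothetical and used only for illustration.

```python
def is_valid_clustering(cores, groups):
    """Check conditions (3.5)-(3.7) for a set of cores O and a list of groups G."""
    cores = set(cores)
    covered = set()
    for group in groups:
        group = set(group)
        if not group <= cores:   # (3.7): every group contains only cores of O
            return False
        if covered & group:      # (3.6): groups are pairwise disjoint
            return False
        covered |= group
    return covered == cores      # (3.5): the union of all groups equals O


# Example: four cores clustered into two groups of two cores each.
print(is_valid_clustering({"O1", "O2", "O3", "O4"},
                          [{"O1", "O2"}, {"O3", "O4"}]))
```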
To formally define the clustering, the function group : O → G is introduced.
Here, group(Oi) denotes the group containing Oi. One important constraint on cluster-based multi-core processors is that the cores in the same cluster can only operate at the same speed. In case of a speed conflict, i.e., two cores in one cluster require two different speeds, the highest speed is used as the cluster-wide operating speed. This speed coordination strategy is, on the one hand, applied due to hard real-time constraints (more details are shown later) and, on the other hand, even enforced on some platforms due to hardware restrictions, e.g., Intel Core™ 2 Quad [Intc]. Naturally, this work assumes that the processor cores in the same cluster share a common power model, i.e., ∀Oi ∈ O, Oj ∈ O, if group(Oi) = group(Oj), then Oi = Oj.
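The speed coordination rule can be sketched in Python as follows. This is an illustration under stated assumptions: the function group is represented by a hypothetical dictionary `group_of` that maps each core to its group, speeds are normalized frequencies, and cores that request no speed (e.g., because they are in a low power state) are simply omitted from `requested_speed`.

```python
def cluster_speeds(group_of, requested_speed):
    """Resolve speed conflicts: each cluster runs at the highest speed
    requested by any of its cores.

    group_of:        dict mapping a core to its group, i.e., group(Oi)
    requested_speed: dict mapping a core to its requested normalized speed
    """
    speeds = {}
    for core, speed in requested_speed.items():
        group = group_of[core]
        # Keep the maximum of the speeds requested within the same cluster.
        speeds[group] = max(speeds.get(group, 0.0), speed)
    return speeds


def effective_speed(core, group_of, requested_speed):
    """Speed at which a core actually runs after cluster-wide coordination."""
    return cluster_speeds(group_of, requested_speed)[group_of[core]]


# Example: O1 requests half speed, but O2 in the same cluster needs full speed.
group_of = {"O1": "G1", "O2": "G1", "O3": "G2", "O4": "G2"}
requested = {"O1": 0.5, "O2": 1.0, "O3": 0.5, "O4": 0.5}
print(cluster_speeds(group_of, requested))
```

Choosing the highest requested speed means that no core runs slower than it asked for, which is the property the hard real-time constraints rely on. The cost is that some cores run faster, and thus at higher power, than they would individually need.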