Programa de Mejoras - Aplicacion de IDEAL hasta la Etapa de Diagnostico para las areas de proce

This chapter presented the background of the dissertation. First, a short introduction to real-time systems and scheduling was given. Second, two run-time energy management techniques, DPM and DVS, were thoroughly reviewed.

In particular, the main problem of using DPM and DVS together in the context of hard real-time systems was discussed. In addition, the three types of multi-core processor platforms, distinguished by their DPM and DVS capabilities, were explained. After that, a state-of-the-art analysis of energy-efficient real-time scheduling was presented. Finally, a brief overview of the ACPI standard was given.

Chapter 3 System Models and Problem Formulation

This chapter formally introduces basic notions and terminology and defines the fundamental system models with a special focus on power aspects. In the context of this dissertation, the hardware of a system is composed of a processor and multiple I/O devices, such as a hard disk, an Ethernet controller, a flash card controller and other possible peripherals. The processor power model and the device power model are described in the first and second sections, respectively. This dissertation assumes that a processor supports both DPM and DVS techniques, while a device is only equipped with DPM capability. This assumption is widely applied in the context of system-wide energy optimization and can be found in numerous other studies, e.g., [DA10], [NL11] and [CG09]. Moreover, a real-time task model is described to address the software components of a system. Finally, before this chapter is concluded, the energy optimization problems in hard real-time systems are formulated.

3.1 Processor Power Model

Since this work primarily targets energy optimization, the definitions here concentrate on the power characteristics of processors rather than the functional aspects. The processor power model is defined following the ACPI recommendation by adopting the concept of C-states and P-states explained in the previous chapter. The C-states describe different power states, comprising one active state and multiple low power (sleep) states with different sleep depths, while the P-states describe different performance states with their corresponding operating speeds. In other words, C-states are the states related to the DPM technique and P-states are the states related to DVS. Therefore, in the remaining text the C-states and P-states are also referred to as the DPM states and DVS states, respectively. In what follows, the power models of single-core processors and multi-core processors are defined in detail.

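[Figure: state diagram. The active superstate C₀ contains the P-states S₁, S₂ and S₃; the low power states are C₁, C₂ and C₃. Annotated labels: T(S₁, S₃), P(S₁, S₃), P(S₃), F(S₃), P(C₃), T_be(C₃), T_{on→off}(C₃), P_{on→off}(C₃), T_{off→on}(C₃), P_{off→on}(C₃).]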

Figure 3.1: The power state machine of a processor with 3 P-states and 4 C-states

3.1.1 Single-Core Processor

Let C = {C₀, C₁, ..., C_c} denote a finite set of C-states of a processor, where C₀ is the only working state, i.e., the processor can only execute tasks in this state, and C₁, C₂, ..., C_c are low power states (sleep states) in non-increasing order of their power consumption. Moreover, the state C₀ is defined as a superstate that contains a finite set of P-states denoted by S = {S₁, S₂, ..., S_s}. S₁ is the full performance state with the maximal operating speed, and the remaining states are sorted in non-increasing order of their power consumption. ∀i : 1 ≤ i ≤ s, F(S_i) and P(S_i) denote the corresponding frequency (normalized with regard to F(S₁)) and the power consumption, respectively. For instance, if a processor supports two operating speeds, 100 MHz and 50 MHz, then F(S₁) = 1 and F(S₂) = 0.5. The overhead for state switching among different P-states is defined as follows:

• P(S_i, S_j) is the power consumption of state switching from S_i to S_j.

• T(S_i, S_j) is the latency of state switching from S_i to S_j.
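The normalization of F(S_i) used in the example above can be sketched in a few lines of Python. The function name normalize_frequencies is hypothetical, and the sketch assumes that a higher frequency implies a higher power consumption, so that sorting by frequency yields the order S₁, S₂, ..., S_s.

```python
def normalize_frequencies(freqs_mhz):
    """Return F(S_1), ..., F(S_s), normalized so that F(S_1) = 1.

    Assumption of this sketch: a higher frequency means a higher power
    consumption, so sorting by frequency gives the P-state order.
    """
    ordered = sorted(freqs_mhz, reverse=True)  # S_1 (fastest) comes first
    f_max = ordered[0]
    return [f / f_max for f in ordered]


# The two-speed processor from the text: 100 MHz and 50 MHz.
print(normalize_frequencies([50, 100]))  # [1.0, 0.5]
```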

Furthermore, ∀i : 1 ≤ i ≤ c, P(C_i), T_{on→off}(C_i), T_{off→on}(C_i), P_{on→off}(C_i), P_{off→on}(C_i) and T_be(C_i) are defined as follows:

• P(C_i) is the power consumption of the state C_i.

• T_{on→off}(C_i) is the latency for state switching from C₀ to C_i.

• T_{off→on}(C_i) is the latency for state switching from C_i to C₀.

• P_{on→off}(C_i) is the power consumption for state switching from C₀ to C_i.

• P_{off→on}(C_i) is the power consumption for state switching from C_i to C₀.

• T_be(C_i) is the break-even time to enter the state C_i.

In general, the C- and P-states of a processor can be modeled as a so-called power state machine [BBM00]. Figure 3.1 shows an example power state machine of a processor with 3 P-states and 4 C-states. The figure illustrates the power properties of the state C₃. As can be observed, this dissertation assumes that C-state switching may only take place between the active state and the low power states. Switching between two low power states is not allowed. In addition, if a processor is switched to a low power state, the current P-state is remembered as the history state, which is entered again when the processor wakes up later.
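The transition rules described above can be illustrated with a minimal Python sketch. The class PowerStateMachine and its method names are hypothetical, states are represented only by their indices, and the sketch additionally assumes that the P-state can be changed only while the processor is in C₀.

```python
class PowerStateMachine:
    """Tracks the current C-state and P-state of one processor by index."""

    def __init__(self, num_sleep_states, num_p_states):
        self.num_sleep_states = num_sleep_states  # c: sleep states C_1..C_c
        self.num_p_states = num_p_states          # s: P-states S_1..S_s
        self.c_state = 0                          # start in the active state C_0
        self.p_state = 1                          # start in S_1 (full speed)

    def set_p_state(self, j):
        # Assumption of this sketch: P-states are only selectable in C_0.
        if self.c_state != 0:
            raise ValueError("P-states can only be changed in C_0")
        if not 1 <= j <= self.num_p_states:
            raise ValueError("unknown P-state")
        self.p_state = j

    def sleep(self, i):
        # C-state switching is only allowed between C_0 and a sleep state.
        if self.c_state != 0:
            raise ValueError("switching between low power states is not allowed")
        if not 1 <= i <= self.num_sleep_states:
            raise ValueError("unknown sleep state")
        self.c_state = i  # p_state is left untouched: it is the history state

    def wake_up(self):
        if self.c_state == 0:
            raise ValueError("processor is already active")
        self.c_state = 0  # C_0 is re-entered in the remembered P-state


# Processor of Figure 3.1: sleep states C_1..C_3 and P-states S_1..S_3.
psm = PowerStateMachine(num_sleep_states=3, num_p_states=3)
psm.set_p_state(3)
psm.sleep(3)
psm.wake_up()
print(psm.c_state, psm.p_state)  # 0 3
```

Keeping p_state unchanged while asleep is what implements the history state: no extra bookkeeping is needed on wake-up.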

The definition of break-even time was introduced in the previous chapter, and its calculation formula is given in (2.20). Thus, ∀i : 1 ≤ i ≤ c, T_be(C_i) can be derived as follows.

$$T_{be}(C_i) = \max\left\{ T_{overhead}(C_i),\ \frac{E_{overhead}(C_i) - P(C_i) \cdot T_{overhead}(C_i)}{P_{on} - P(C_i)} \right\} \tag{3.1}$$

where T_overhead(C_i) and E_overhead(C_i) are defined in (3.2) and (3.3), respectively. Intuitively, they are the latency and the energy consumption required to enter and exit the low power state.

$$T_{overhead}(C_i) = T_{off \to on}(C_i) + T_{on \to off}(C_i) \tag{3.2}$$

$$E_{overhead}(C_i) = T_{off \to on}(C_i) \cdot P_{off \to on}(C_i) + T_{on \to off}(C_i) \cdot P_{on \to off}(C_i) \tag{3.3}$$

Furthermore, P_on denotes the power consumption when the processor is in the active state C₀. As C₀ is a superstate, the value of P_on depends on the concrete P-state. In other words, T_be(C_i) differs if different P-states are selected. As the break-even time is defined as the minimum time required to stay in a low power state, choosing the maximal value of T_be(C_i) over all P-states keeps the computation on the safe side. Thus, the final formula for T_be(C_i) is obtained as follows.

$$T_{be}(C_i) = \max_{S_j \in S}\left\{ T_{overhead}(C_i),\ \frac{E_{overhead}(C_i) - P(C_i) \cdot T_{overhead}(C_i)}{P(S_j) - P(C_i)} \right\} \tag{3.4}$$
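As an illustration, the computation in (3.2) to (3.4) can be sketched in Python as follows. The function name break_even_time, its parameter names and all numeric values are hypothetical and serve only as an example; units are seconds and watts. The sketch also assumes P(S_j) > P(C_i) for every P-state, since otherwise the fraction in (3.4) is not meaningful.

```python
def break_even_time(p_sleep, t_on_off, t_off_on, p_on_off, p_off_on, p_active):
    """Break-even time T_be(C_i) according to (3.4).

    p_sleep  -- P(C_i), power in the sleep state
    t_on_off -- T_{on->off}(C_i), latency C_0 -> C_i
    t_off_on -- T_{off->on}(C_i), latency C_i -> C_0
    p_on_off -- P_{on->off}(C_i), power while switching C_0 -> C_i
    p_off_on -- P_{off->on}(C_i), power while switching C_i -> C_0
    p_active -- list of P(S_j) for all P-states S_j
    """
    t_overhead = t_off_on + t_on_off                         # (3.2)
    e_overhead = t_off_on * p_off_on + t_on_off * p_on_off   # (3.3)

    candidates = [t_overhead]
    for p_sj in p_active:
        if p_sj <= p_sleep:
            # Assumption of this sketch: sleeping must save power.
            raise ValueError("P(S_j) must exceed P(C_i)")
        candidates.append((e_overhead - p_sleep * t_overhead) / (p_sj - p_sleep))
    return max(candidates)  # maximum over T_overhead and all P-states


# Hypothetical values, for illustration only.
t_be = break_even_time(
    p_sleep=0.1, t_on_off=0.001, t_off_on=0.002,
    p_on_off=0.5, p_off_on=0.6, p_active=[1.0, 0.6, 0.4],
)
print(f"{t_be * 1e3:.2f} ms")  # 4.67 ms
```

With these values the slowest P-state determines the result: the smaller the difference P(S_j) − P(C_i), the longer the processor must sleep to amortize the switching energy.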

3.1.2 Multi-Core Processor

The multi-core processor power model is a generalization of the single-core processor power model, where a finite set of processor cores is denoted by O = {O₁, O₂, ..., O_o}. Each processor core O_i is represented by a pair O_i = (Cⁱ, Sⁱ), where Cⁱ and Sⁱ denote the sets of its C-states and P-states, respectively. More concretely, Cⁱ = {C₀ⁱ, C₁ⁱ, ..., C^i_{c_i}} and Sⁱ = {S₁ⁱ, S₂ⁱ, ..., S^i_{s_i}} hold. The power consumption and switching overhead of each state are defined in the same way as in the single-core processor power model.

Furthermore, this dissertation considers cluster-based multi-core processors, where processor cores are clustered into disjoint groups G = {G₁, G₂, ..., G_g} with

$$O = \bigcup_{i=1}^{g} G_i \tag{3.5}$$

$$\forall G_i, G_j : i \neq j \Rightarrow G_i \cap G_j = \emptyset \tag{3.6}$$

and

$$\forall G_i : G_i \in G,\ G_i \subseteq O \tag{3.7}$$

To formally define the clustering, the function group : O → G is introduced. Here, group(O_i) denotes the group containing O_i. One important constraint on cluster-based multi-core processors is that the cores in the same cluster may only operate at the same speed. In case of a speed conflict, i.e., two cores in one cluster require two different speeds, the highest speed is used as the cluster-wide operating speed. On the one hand, this speed coordination strategy is applied due to hard real-time constraints (more details are shown later); on the other hand, it is even mandatory on some platforms due to hardware restrictions, e.g., Intel Core™ 2 Quad [Intc]. Naturally, this work assumes that the processor cores in the same cluster share a common power model, i.e., ∀O_i ∈ O, O_j ∈ O, if group(O_i) = group(O_j), then O_i = O_j.
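The clustering conditions (3.5) to (3.7) and the speed coordination rule can be checked with a short Python sketch. The names is_valid_clustering, group_of and cluster_speed, as well as the core labels and requested speeds, are hypothetical; cores are identified by strings and speeds are normalized frequencies.

```python
def is_valid_clustering(cores, groups):
    """Check that `groups` partitions `cores` as required by (3.5)-(3.7)."""
    covered = set()
    for g in groups:
        if not g <= cores:   # (3.7): every group is a subset of O
            return False
        if covered & g:      # (3.6): groups are pairwise disjoint
            return False
        covered |= g
    return covered == cores  # (3.5): the union of all groups is O


def group_of(core, groups):
    """The function group: O -> G, returning the group that contains `core`."""
    for g in groups:
        if core in g:
            return g
    raise KeyError(core)


def cluster_speed(requested):
    """Resolve a speed conflict: the highest requested speed wins."""
    return max(requested.values())


cores = {"O1", "O2", "O3", "O4"}
groups = [{"O1", "O2"}, {"O3", "O4"}]
print(is_valid_clustering(cores, groups))      # True
print(cluster_speed({"O1": 0.5, "O2": 1.0}))   # 1.0
```

Choosing the maximum means that no core in the cluster runs slower than it requested, which is the property needed for hard real-time constraints.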
