As stated in previous sections, the cpuidle subsystem is a very complex component. Kernel timer events are the main input for the governor algorithm, as they often indicate the next wake-up of the CPU, but the running average used to scale the latter makes it unpredictable, since it depends on the recent state of the whole machine.
The purpose of this section is to shed some light on the linkage between the residence time of C-states, the number of wake-ups per second, the CPU load and the transmission of wireless frames.
We implemented a very simple application (available at https://github.com/Enchufa2/udperf) with two modes of operation: it sets a kernel timer at a given constant rate and, each time this timer is triggered, it either (i) does nothing or (ii) sends a UDP packet. At the same time, it calculates the mean residence time of each C-state over the whole execution.
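As a rough illustration of the measurement just described, the following Python sketch (not the actual udperf code) wakes up at a constant rate, optionally sends a UDP packet on each wake-up, and derives C-state residence from two snapshots of the Linux cpuidle sysfs counters. Several points are assumptions made for the example: `time.sleep` stands in for the tool's kernel timer, residence is reported as a fraction of elapsed time on a single CPU, and the function names and the destination address are hypothetical placeholders.

```python
import glob
import os
import socket
import time


def read_cstate_times(cpu=0, base="/sys/devices/system/cpu"):
    """Return {state name: cumulative idle time in microseconds} for one CPU.

    Reads the Linux cpuidle sysfs counters; yields an empty dict when the
    cpuidle directory is not present.
    """
    times = {}
    pattern = os.path.join(base, f"cpu{cpu}", "cpuidle", "state*")
    for state_dir in sorted(glob.glob(pattern)):
        with open(os.path.join(state_dir, "name")) as f:
            name = f.read().strip()
        with open(os.path.join(state_dir, "time")) as f:
            times[name] = int(f.read())
    return times


def residence_fractions(before, after, elapsed_us):
    """Fraction of the elapsed time spent in each C-state between snapshots."""
    return {name: (after[name] - before[name]) / elapsed_us for name in after}


def run(rate_hz, duration_s, send_udp=False, dest=("127.0.0.1", 9)):
    """Wake up rate_hz times per second; optionally send one UDP packet each time.

    dest is a placeholder: a real experiment would target a host reached
    through the wireless interface.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM) if send_udp else None
    period = 1.0 / rate_hz
    before = read_cstate_times()
    start = time.monotonic()
    next_tick = start
    while next_tick - start < duration_s:
        next_tick += period
        delay = next_tick - time.monotonic()
        if delay > 0:
            time.sleep(delay)  # the process sleeps until the next tick
        if sock is not None:
            sock.sendto(b"x", dest)  # mode (ii): one packet per wake-up
    elapsed_us = (time.monotonic() - start) * 1e6
    after = read_cstate_times()
    if sock is not None:
        sock.close()
    return residence_fractions(before, after, elapsed_us)
```

Calling `run(1000, 10)` would correspond to mode (i) at 1000 wake-ups/s, and `run(1000, 10, send_udp=True)` to mode (ii). At high rates an interpreted loop adds CPU load of its own, so this sketch only conveys the structure of the experiment.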
The plots in Figure 4.7 have been compiled using this tool. The additional CPU load was added on top of the latter using a modified version of lookbusy (http://www.devin.com/lookbusy). Figure 4.8 compares the two previous figures in terms of power consumption.
Figure 4.7: Residence time of each C-state vs. wake-ups/s for kernel 3.14.
Each wake-up does nothing (left) or performs a UDP transmission (right).
In Figure 4.7 (left), the only source of wake-ups is the kernel timer that our tool sets. Each C-state is represented by a different colour, and shapes and line types distinguish between CPU loads. The first observation is that the addition of a substantial source of CPU load has no impact on the distribution of residence times. Another important observation is that, up to 2000 wake-ups/s and from 3500 wake-ups/s onwards, there is only one active idle state (C2 and C1, respectively), and the behaviour is linear. This fact can be verified by checking the power consumption (Figure 4.8, red lines). From 2000 to 3500 wake-ups/s, the transition between C-states occurs in a non-linear way.
Figure 4.8: Power consumption offset (P [W]) vs. wake-ups/s for kernel 3.14.
In Figure 4.7 (right), on the other hand, there is another source of wake-ups: hardware interrupts caused by the wireless card each time a packet is sent. The transition between states occurs earlier because there is actually twice the number of wake-ups. And, again, the CPU load shows no impact on the distribution of residence times.
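The doubling argument can be made concrete with a small Python sketch. It assumes, as an idealisation, exactly one hardware interrupt per transmitted packet, and that the 2000 and 3500 wake-ups/s boundaries observed in the timer-only case apply to the total number of wake-ups; the function names are hypothetical.

```python
def effective_wakeups(timer_rate, sends_packet):
    """Total wake-ups/s: the timer, plus one interrupt per packet if one is sent."""
    return timer_rate * (2 if sends_packet else 1)


def regime(wakeups_per_s, low=2000, high=3500):
    """Classify the idle behaviour using the boundaries seen in the timer-only case."""
    if wakeups_per_s <= low:
        return "C2 only"
    if wakeups_per_s >= high:
        return "C1 only"
    return "transition"


# The same timer rate falls into different regimes depending on the mode.
for rate in (1000, 1500, 1750):
    print(rate, regime(effective_wakeups(rate, False)), regime(effective_wakeups(rate, True)))
```

Under these assumptions, a timer rate of 1500 wake-ups/s stays within the C2-only region when the wake-up does nothing, but already lies in the transition region when each wake-up sends a packet.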
These are partial results, limited to constant-rate wake-ups, but they are in line with the non-linearities previously discovered in the cross-factor, and they confirm the enormous complexity we face.
4.6 Summary
This chapter follows the path set out by Serrano et al. [2015] with the discovery of the cross-factor, an energy toll not accounted for by classical energy models and associated with the very fact that frames are processed along the network stack. We have introduced the laptop as a more suitable device to perform whole-device energy measurements in order to deseed the root causes of the cross-factor by taking advantage of the wide range of debugging tools that such a platform enables.
Our results (Ucar et al. [2014], Ucar and Azcorra [2015]) were reported in I. Ucar, A. Azcorra, and A. Banchs, Deseeding Energy Consumption of Network Stacks, in 6th Annual International IMDEA Networks Workshop, June 2014 (invited poster); and in I. Ucar and A. Azcorra, Deseeding energy consumption of network stacks, in IEEE 1st International Forum on Research and Technologies for Society and Industry Leveraging a better tomorrow (RTSI), pages 7–16, Sept. 2015, ISBN 978-1-4673-8167-3, doi: 10.1109/RTSI.2015.7325085. Albeit preliminary, they provide several fundamental insights on this matter:
• We have identified the CPU as the leading cause of the cross-factor in laptops. Consistently with this, the cross-factor shows no dependence on the frame size, because the RAM has no significant impact on the overall energy consumption of wireless transmissions. On the other hand, low-powered devices, like the Soekris, show a very small but perceptible dependency that can be ascribed to the RAM.
• The CPU’s C-state management plays a central role in the energy consumption, because a CPU spends most of the time in idle mode.
• When the C-state management subsystem is not present in the OS, the device enters C1 in idle mode (halted) and cannot benefit from lower idle states.
• In contrast to low-powered devices, the C1 state of a laptop’s CPU saves a very small amount of power.
• With a fully functional C-state management subsystem, the linear behaviour disappears. Consequently, we cannot talk about the cross-factor as a fixed energy toll per frame.
• A non-linear behaviour implies that we cannot perform energy breakdowns by dropping packets inside the transmission chain. Therefore, new methodologies and techniques are required to enable energy debugging.
• C-state residence times depend primarily on the number of wake-ups per second produced by software and hardware interrupts. However, they show no dependence on the CPU load.
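One way to check whether a set of measurements is compatible with a fixed energy toll per wake-up is to test whether power grows linearly with the wake-up rate, that is, whether the marginal energy per wake-up (the slope of power against rate, in joules) stays constant. The Python sketch below illustrates that check; it is not the methodology used in this chapter, the function names and the 10 % tolerance are assumptions, and the numbers in the usage example are made up for illustration rather than measured.

```python
def marginal_energy(rates, powers):
    """Slope of power [W] vs. rate [1/s] between consecutive samples, in joules."""
    if len(rates) != len(powers) or len(rates) < 2:
        raise ValueError("need at least two (rate, power) samples")
    pairs = list(zip(rates, powers))
    return [(p1 - p0) / (r1 - r0) for (r0, p0), (r1, p1) in zip(pairs, pairs[1:])]


def is_fixed_toll(rates, powers, rel_tol=0.1):
    """True if every slope lies within rel_tol of the mean slope (linear behaviour)."""
    slopes = marginal_energy(rates, powers)
    mean = sum(slopes) / len(slopes)
    return all(abs(s - mean) <= rel_tol * abs(mean) for s in slopes)


# Made-up samples: the first series is linear, the second is not.
rates = [0, 1000, 2000, 3000]
print(is_fixed_toll(rates, [1.0, 1.5, 2.0, 2.5]))
print(is_fixed_toll(rates, [1.0, 1.5, 3.0, 3.2]))
```

When the slopes differ across the range of rates, no single per-frame constant describes the data, which is the situation the bullets above describe once C-state management is active.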
Further research is needed in order to fully understand the key role of the C-state subsystem in the energy consumption of wireless communications, as well as to investigate other processor capabilities not accounted for in this work, such as P-states and multicore support.
In this chapter, we revisit the idea of packet overhearing as a trigger for sleep opportunities, and we take it one step further to the range of microseconds. To this end, we experimentally explore the timing limitations of 802.11 cards. Then, we analyse 802.11 to identify potential micro-sleep opportunities, taking into account practical CSMA-related issues (e.g., capture effect, hidden nodes) not considered in prior work.
Building on this knowledge, we design µNap, a local standard-compliant energy-saving mechanism for 802.11 WLANs. With µNap, a station is capable of saving energy during packet overhearing autonomously, with full independence from the 802.11 capabilities supported or other power saving mechanisms in use, which makes it backwards compatible and incrementally deployable.
Finally, the performance of the algorithm is evaluated based on our measurements and real wireless traces. In a brief discussion of the impact and applicability of our mechanism, we draw attention to the need for standardising hardware capabilities in terms of energy in 802.11.