about 1,500 nodes located in ten different sites. Grid’5000 experimenters benefit from a controlled environment and reproducible experimental conditions. To experiment on Grid’5000, users configure the complete software stack using virtual operating system images and deploy these images on each node. This is not enough when evaluating a Desktop Grid system, because in a real Desktop Grid the computing nodes and networks have very different characteristics. We therefore proposed a new methodological approach for assessing the feasibility of running a system on a real-world Desktop Grid infrastructure. The experimental protocol consists of ten experiments, all very simple to set up and run on Grid’5000, each of which tests one aspect of a Desktop Grid deployment; taken together, they give a good picture of how a system would behave when deployed in a real environment. We call these experiments the Desktop Grid checklist because, to be validated, a system has to pass all the tests, which address network connectivity (firewall, NAT), node and network failures, sabotage, heterogeneous networks and computing nodes, stragglers and more. We applied this method broadly to fill the gap between in-house development and real-world deployment. For example, it has been used in [173] to compare regular Hadoop with our own implementation of MapReduce for Desktop Grids when deployed on a WAN.
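The all-or-nothing nature of the checklist can be illustrated with a minimal sketch. It only encodes the rule stated above (a system is validated when every experiment passes); the class and function names are hypothetical, only the aspects explicitly named in the text are listed, and the outcomes are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChecklistResult:
    aspect: str   # aspect of a Desktop Grid deployment under test
    passed: bool  # outcome of the corresponding experiment


def is_validated(results):
    """A system is validated only if every checklist experiment passed."""
    results = list(results)
    # An empty checklist validates nothing, so it is treated as a failure.
    return bool(results) and all(r.passed for r in results)


def failed_aspects(results):
    """Names of the aspects whose experiment did not pass."""
    return [r.aspect for r in results if not r.passed]


# Aspects named in the text; the outcomes are invented for illustration.
example = [
    ChecklistResult("network connectivity (firewall, NAT)", True),
    ChecklistResult("node and network failures", True),
    ChecklistResult("sabotage", False),
    ChecklistResult("heterogeneous network and computing nodes", True),
    ChecklistResult("stragglers", True),
]
```

With these invented outcomes, `is_validated(example)` is false and `failed_aspects(example)` points to the single aspect that still needs work before a real deployment.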
3.3 DSL-Lab: a Platform to Experiment on Domestic Broadband Internet
Experimental platforms such as PlanetLab [189] and Grid’5000 are promising methodological approaches to study distributed systems. However, both platforms focus on high-end service and network deployments available only on a restricted part of the Internet, leaving researchers unable to experiment in conditions close to those of a typical domestic Internet connection. High-speed Internet access has become common in homes; ADSL (Asymmetric Digital Subscriber Line) lines are widespread and fiber-optic communication is now gaining significant market penetration. The progress made by these technologies allows Internet providers to offer their customers an Internet connection comparable, in terms of bandwidth, to a local area network (up to 1 Gb/s). However, the architecture of a network of home PCs interconnected by ADSL presents special characteristics: i) the physical characteristics of the network differ substantially from those of a LAN, which are already well studied, because of the asymmetric communication performance (download/upload) and the internal ISP topologies; ii) within each home, users share their Internet connection between several machines, using wired and/or WiFi local networks as well as NAT and firewalls to protect their network; iii) new classes of network appliances join this network alongside the regular PC: WiFi phones, media centers and IPTV, Network Attached Storage, networked gaming consoles, etc. Furthermore, the network resource might be shared between several communication-demanding applications (VoIP, P2P, gaming).
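To make the asymmetry in point i) concrete, the following minimal sketch estimates the throughput achievable between two home nodes. It assumes, as a simplification, that a transfer is limited by the slower of the sender's upload link and the receiver's download link, and that competing applications on the sender's side share the upload link equally. The function names and the bandwidth figures are illustrative assumptions, not measurements from DSL-Lab.

```python
def effective_throughput_mbps(sender_up_mbps, receiver_down_mbps, competing_flows=0):
    """Rough upper bound on node-to-node throughput over asymmetric links.

    Assumptions (not from the source): the bottleneck is the slower of the
    sender's upload and the receiver's download, and the sender's upload is
    split equally among this transfer and `competing_flows` other flows.
    """
    if sender_up_mbps <= 0 or receiver_down_mbps <= 0:
        raise ValueError("bandwidths must be positive")
    if competing_flows < 0:
        raise ValueError("competing_flows must be non-negative")
    fair_share_up = sender_up_mbps / (competing_flows + 1)
    return min(fair_share_up, receiver_down_mbps)


def transfer_time_seconds(size_megabytes, throughput_mbps):
    """Time to move `size_megabytes` (MB) at `throughput_mbps` (Mb/s)."""
    return size_megabytes * 8 / throughput_mbps


# Hypothetical ADSL-like lines: 1 Mb/s up, 8 Mb/s down.
peer_to_peer = effective_throughput_mbps(1.0, 8.0)
shared = effective_throughput_mbps(1.0, 8.0, competing_flows=1)
```

Under these assumed figures, the receiver's 8 Mb/s download capacity is irrelevant: the sender's 1 Mb/s upload link bounds the transfer, and one competing application (for instance a VoIP call) halves it again. This is the kind of behaviour that a symmetric LAN testbed does not expose.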
In 2007, I coordinated the DSL-Lab project, which aimed to establish a platform to experiment on distributed computing over broadband domestic Internet [190].
DSL-Lab was a collaboration with Laurent Lefevre and Jean Patrick Gelas from the INRIA Reso team in Lyon and Olivier Richard and George Da Costa from the MESCAL team in Grenoble. The two main contributors to this platform were two PhD students: Lucas Nussbaum, advised by Olivier Richard, and Paul Malécot, co-advised by Franck Cappello and myself.
DSL-Lab is a complementary approach to PlanetLab and Grid’5000 for experimenting with distributed computing in an environment closer to how the Internet appears when applications run on end-user PCs. DSL-Lab is a set of 40 low-power and low-noise nodes, which are hosted by participants and use the participants’ xDSL or cable access to the Internet. The objective is to provide a validation and experimentation platform for new protocols, services, simulators and emulators for these systems. DSL-Lab features:
• Hardware and Network: we had to select specialized hardware that would be powerful enough to conduct all our experiments, yet low-profile enough not to disturb volunteers. We selected the Neo CI852A-4RN10 barebone (Celeron M 1GHz, 512MB RAM, 2 GB Compact Flash storage), which belongs to the Mini-ITX class of PC, characterized by a small form factor, silent operation thanks to the absence of fans or moving parts, and a low-power processor. In January 2009, 32 DSLnodes were distributed across the major French DSL providers (Orange, Free, Neuf, Tele2), giving a good perspective on broadband heterogeneity, as four technologies are allowed in France: ADSL, ADSL2, ADSL2+ and ReADSL.
• Remote OS Deployment: the DSL-Lab system is able to remotely deploy a new OS on every DSLnode without asking for volunteer intervention. One of the major concerns when designing the DSL-Lab platform was to avoid volunteer intervention on the nodes as much as possible. We therefore needed to be able to reinstall (in case an experimenter breaks the installed system by mistake) and upgrade (for security reasons, or to install additional software) the whole software stack, including the operating system installed on DSLnodes.
• Connectivity and Security: the platform is managed in such a way that only identified experimenters have access to it. Experimenters first log into a central DSL-Lab server, which acts as a gateway and provides remote access through SSH to each DSLnode within a VPN.
• Resources and Power Management: most of the DSL-Lab experimenters are familiar with the Grid’5000 platform. To leverage the knowledge they acquired on Grid’5000, we adapted the Grid’5000 batch scheduler, called OAR [160], to the DSL-Lab platform so that i) experimenters have a similar work environment and ii) connecting the two platforms would eventually be easier. Thanks to OAR, several experimenters may reserve nodes in advance and deploy their experiments simultaneously. Because DSLnodes are hosted on a volunteer basis, volunteers request that the DSLnode does not waste power. Besides selecting thrifty hardware, the system ensures that the DSLnodes are powered off when not used, thus reducing electricity consumption. The node stays up if the
3.4 The European Desktop Grid Infrastructure