6.2 Comparative Graphs
6.2.2 Reducing Traffic
A number of different methods and techniques are currently employed to provide resource management controls in the cloud environment. We examine VMware’s Distributed Resource Scheduler (DRS) [22] in more detail as a representative system. All such systems perform two basic functions: (1) application placement and (2) load balancing. The DRS scheduler has a rich set of controls for efficient multi-resource management whilst providing differentiated QoS to groups of VMs, but only at a small scale compared with a typical cloud. DRS currently supports approximately 32 hosts and 3,000 VMs in a management domain called a cluster (see Fig. 6.4).
In a large-scale cloud environment, a rich set of resource management controls alleviates the noisy-neighbour problem for tenants if, as in the case of DRS, the underlying management infrastructure natively supports automated enforcement of guarantees. Such controls also allow the cloud service provider to overcommit hardware resources safely, gaining efficiency from statistically multiplexing resources without sacrificing the expected guarantees.
6.3.1.1 Basic Resource Controls in DRS
VMware’s enterprise hypervisor (ESX) and DRS provide resource controls that allow administrators and users to express allocations in terms of either absolute VM service rate requirements or relative VM importance. The same controls are provided for CPU and memory allocations at both the host and cluster levels. Similar controls are under development for input/output (I/O) resources. VMware’s Distributed Power Management product powers hosts on and off whilst respecting these controls.
• Reservations: A reservation specifies the minimum guaranteed resources required, i.e. the lower bound that applies even when the system is overcommitted. Reservations are specified in absolute units, such as megahertz (MHz) for CPU and megabytes (MB) for memory. Admission control during VM power-on ensures that the sum of reservations for a resource does not exceed the total capacity.
• Limits: A limit specifies an upper bound on consumption, even when the system is undercommitted. A VM is prevented from consuming more than its pre-specified upper bound, even if resources remain idle. Like reservations, limits are specified in absolute units, such as MHz and MB.
• Shares: Shares specify relative importance and, in contrast to reservations and limits, are expressed as abstract numeric values. A VM is allowed to consume resources in proportion to its allocated share value; it is guaranteed a minimum resource fraction equal to its fraction of the total shares in the system. Shares therefore represent relative resource rights that depend on the total number of shares contending for the resource.
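The interaction of the three controls can be illustrated with a minimal sketch for a single resource on a single host. It is not VMware's implementation: the `VM` class, the `admit` and `allocate` functions and the rule that spare capacity beyond the reservations is split in proportion to shares are assumptions made for illustration only.

```python
from dataclasses import dataclass

EPS = 1e-9


@dataclass
class VM:
    name: str
    reservation: float  # guaranteed minimum, e.g. in MHz
    limit: float        # hard upper bound, same unit
    shares: int         # abstract relative weight (assumed positive)
    demand: float       # what the VM currently wants, same unit


def admit(vms, capacity):
    """Admission control: the reservations must fit within the capacity."""
    return sum(vm.reservation for vm in vms) <= capacity


def allocate(vms, capacity):
    """Grant every reservation, then split the spare capacity by shares.

    A VM never receives more than min(demand, limit), except that its
    reservation is always honoured.
    """
    if not admit(vms, capacity):
        raise ValueError("sum of reservations exceeds capacity")
    alloc = {vm.name: float(vm.reservation) for vm in vms}
    target = {vm.name: max(vm.reservation, min(vm.demand, vm.limit)) for vm in vms}
    spare = capacity - sum(alloc.values())
    active = [vm for vm in vms if target[vm.name] - alloc[vm.name] > EPS]
    while spare > EPS and active:
        total_shares = sum(vm.shares for vm in active)
        leftover = 0.0
        for vm in active:
            grant = spare * vm.shares / total_shares
            give = min(grant, target[vm.name] - alloc[vm.name])
            alloc[vm.name] += give
            leftover += grant - give  # capacity a capped VM could not use
        spare = leftover
        active = [vm for vm in active if target[vm.name] - alloc[vm.name] > EPS]
    return alloc


vms = [
    VM("a", reservation=500, limit=2000, shares=2000, demand=2000),
    VM("b", reservation=0, limit=2000, shares=1000, demand=2000),
]
print(allocate(vms, capacity=2000))
```

In this example both VMs want more than the host can offer, so after the reservation of `a` is granted, the remaining capacity is divided 2:1 according to the share values.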
Reservations and limits play an important role in managing cloud resources. Without these guarantees, users would suffer from performance unpredictability unless the cloud provider took the non-work-conserving approach of statically partitioning the physical hardware, a measure that would lead to the inefficiency of over-provisioning. This is the main reason for providing such controls for enterprise workloads running on top of VMware ESX and VMware DRS.
6.3.1.2 Resource Pools
Administrators and users gain additional flexibility from resource management policies that apply to groups of VMs. This is made possible by the concept of a logical resource pool: a container that specifies an aggregate resource allocation for a set of VMs. Admission control is then performed at the resource pool level, under the constraint that the sum of the reservations of a pool’s children must not exceed the pool’s own reservation. Separate, per-pool allocations can be used to provide both isolation between pools and sharing within pools. For example, if some VMs within a pool are idle, their unused allocation will be reallocated preferentially to other VMs within the same pool. Such resource pools may be organised in a flexible hierarchical tree, as shown in Fig. 6.5, with each pool having an enclosing parent pool and children that may be VMs or sub-pools. Resource pools can be used to divide a large capacity among logically grouped users. Organisational administrators can use resource pool hierarchies to mirror human organisational structures and to support delegated administration.
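The pool-level admission constraint described above can be sketched as a recursive check over the pool tree. The `Node` class and the `pool_violations` function are hypothetical names chosen for this illustration; a VM is modelled simply as a node without children.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    reservation: float
    children: list = field(default_factory=list)  # empty for a VM


def pool_violations(node):
    """Return the names of pools whose children reserve more than the pool."""
    bad = []
    if node.children:
        if sum(child.reservation for child in node.children) > node.reservation:
            bad.append(node.name)
        for child in node.children:
            bad.extend(pool_violations(child))
    return bad


root = Node("root", 1000, [
    Node("eng", 600, [Node("vm1", 400), Node("vm2", 300)]),  # 700 > 600
    Node("sales", 400, [Node("vm3", 100)]),
])
print(pool_violations(root))
```

Here the `eng` pool would be rejected, because its two VMs together reserve more than the pool itself, whereas `root` and `sales` satisfy the constraint.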
The resource pool construct is well suited to the cloud setting, as organisational administrators typically buy capacity in bulk from providers and run several VMs. Several large-scale cloud providers have nevertheless not adopted it, despite its usefulness to thousands of enterprise customers using VMware DRS and VMware ESX, each managing the resource needs of thousands of VMs.
6.3.1.3 DRS Load Balancing
The DRS scheduler performs load balancing through three key resource-related operations: (1) computing the amount of resources that each VM should get, based on the reservation, limit and shares settings of VMs as well as resource pool nodes; (2) performing the initial placement of VMs onto hosts, so that a user does not have to make a manual placement decision; and (3) recommending and performing live VM migrations to balance load across hosts in a dynamic environment where the VMs’ resource demands may change over time.
DRS manages a cluster of distributed hosts by providing the illusion that the entire cluster operates as a single host with the aggregate capacity of all individual hosts. This is implemented by breaking up the user-specified resource pool hierarchy into per-host resource pool hierarchies with appropriate host-level resource pool settings. Once the VMs are placed on a host, the local scheduler on each ESX host allocates resources to VMs fairly, based on host-level resource pool and VM resource settings. DRS is invoked every 5 min by default, but can also be invoked on demand.
DRS load balancing uses its own load-balancing metric [22]. In particular, it does not use host utilisation. In DRS, load reflects VM importance, as captured by the concept of dynamic entitlement. Dynamic entitlement is computed from the resource controls and the actual demand for CPU and memory resources of each VM. The entitlement lies between the reservation and the limit; its actual value depends on the cluster capacity and total demand. Dynamic entitlement equals demand when the demands of all the VMs in the cluster can be met. Otherwise, it is a scaled-down demand value, with the scaling dependent on cluster capacity, the demands of other VMs, the VM’s place in the resource pool hierarchy, and its shares, reservation and limit. Dynamic entitlement is computed in two passes over the resource pool hierarchy tree: the first allocates to all VMs and resource pools their CPU and memory reservations and constrains their demand by their limits; the second allocates spare resources to address limit-constrained demand above reservation in accordance with the associated share values. DRS currently uses normalised entitlement as its core per-host load metric. For a host $h$, normalised entitlement $N_h$ is defined as the sum of the per-VM entitlements $E_i$ for all VMs running on $h$, divided by the host capacity $C_h$ available to VMs: $N_h = \left(\sum_i E_i\right) / C_h$. If $N_h \le 1$, then all VMs on host $h$ would receive their entitlements, assuming that the host-level scheduler is operating properly. If $N_h > 1$, then host $h$ is deemed to have insufficient resources to meet the entitlements of all its VMs, and as a result, some VMs would be treated unfairly. After calculating $N_h$ for each host, the centralised load balancer computes the cluster-wide imbalance, $I_c$, which is defined as the standard deviation over all $N_h$.
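The two metrics just defined can be computed in a few lines. The following is a sketch only: the function names and the `hosts` data layout are assumptions, the per-VM entitlements are taken as given rather than derived from the resource pool tree, and the population standard deviation is used because the text does not say which variant DRS applies.

```python
from statistics import pstdev


def normalised_entitlement(entitlements, capacity):
    """N_h: sum of the per-VM entitlements on a host over its capacity."""
    return sum(entitlements) / capacity


def cluster_imbalance(hosts):
    """I_c: standard deviation of N_h over all hosts.

    `hosts` maps a host name to (list of VM entitlements, host capacity).
    """
    loads = [normalised_entitlement(e, c) for e, c in hosts.values()]
    return pstdev(loads)


hosts = {
    "h1": ([600.0, 600.0], 1000.0),  # N_h = 1.2, cannot meet all entitlements
    "h2": ([400.0], 1000.0),         # N_h = 0.4, has headroom
}
for name, (entitlements, capacity) in hosts.items():
    print(name, normalised_entitlement(entitlements, capacity))
print("imbalance", cluster_imbalance(hosts))
```

In this example host `h1` is overloaded while `h2` has spare capacity, which shows up as a large imbalance value.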
In a simplified description [23], the DRS load-balancing algorithm uses a greedy hill-climbing optimisation technique. This approach, as opposed to an exhaustive one that would try to find the best target balance, is driven by practical considerations: the VMotion operations needed to improve load balancing have a cost, and VM demand changes over time, so highly optimising for a particular dynamic situation is not worthwhile. DRS minimises $I_c$ by evaluating all possible migrations, many of which are filtered out quickly in practice, and selecting the move that reduces $I_c$ the most. The selected move is applied to the algorithm’s current internal cluster snapshot so that it then reflects the state that would result when the migration is completed. This move-selection step is repeated until no additional beneficial moves remain, there are enough moves for this pass, or the cluster imbalance is at or below the threshold T specified by the DRS administrator. The actual implementation of the load-balancing algorithm considers many other factors, including the risk-adjusted benefit of each move given the range and stability of VMs’ dynamic demand over the last hour, as well as the cost of the migration and any potential impact of the migration on the workload running in the VM. DRS is currently able to handle a small cloud of 32 hosts and 3,000 VMs; to extend its application, a number of issues need to be resolved: (1) inventory management, that is, efficient collection of host and VM data as the cloud grows; (2) efficient management of a cluster of heterogeneous hosts; (3) efficient adaptation to a fast-changing environment; and (4) survivability of resource management system failures.
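The simplified move-selection loop described above can be sketched as follows. This is an illustration under stated assumptions, not the DRS implementation: the function names, the `threshold` and `max_moves` defaults and the data layout are hypothetical, per-VM entitlements are fixed inputs, and migration cost, risk and workload impact are not modelled.

```python
from statistics import pstdev


def imbalance(placement, entitlement, capacity):
    """Standard deviation of the normalised entitlement over all hosts."""
    load = {host: 0.0 for host in capacity}
    for vm, host in placement.items():
        load[host] += entitlement[vm]
    return pstdev([load[host] / capacity[host] for host in capacity])


def greedy_balance(placement, entitlement, capacity, threshold=0.05, max_moves=10):
    """Repeatedly apply the single migration that lowers the imbalance most."""
    snapshot = dict(placement)  # internal copy; the caller's dict is untouched
    moves = []
    current = imbalance(snapshot, entitlement, capacity)
    while current > threshold and len(moves) < max_moves:
        best = None
        for vm, src in snapshot.items():
            for dst in capacity:
                if dst == src:
                    continue
                trial = dict(snapshot)
                trial[vm] = dst
                score = imbalance(trial, entitlement, capacity)
                if score < current - 1e-12 and (best is None or score < best[0]):
                    best = (score, vm, src, dst)
        if best is None:  # no beneficial move remains
            break
        current, vm, src, dst = best
        snapshot[vm] = dst  # snapshot now reflects the completed migration
        moves.append((vm, src, dst))
    return moves, snapshot, current


capacity = {"h1": 1000.0, "h2": 1000.0}
entitlement = {"a": 500.0, "b": 400.0, "c": 300.0}
placement = {"a": "h1", "b": "h1", "c": "h1"}
moves, final_placement, final_imbalance = greedy_balance(placement, entitlement, capacity)
print(moves, final_imbalance)
```

The loop stops for the same three reasons as in the text: no beneficial move remains, the move budget for the pass is exhausted, or the imbalance has reached the threshold. Because each step only takes the locally best move, the result is not guaranteed to be the best achievable balance.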
6.3.1.4 The Future of Resource Management
Expanding resource management to larger clouds, comprising thousands of machines and sites beyond those offered at the private level, is a complex optimisation task. Most existing systems do not, in a combined and integrated form, (1) dynamically adapt existing placements, (2) dynamically scale resources and (3) scale beyond a few thousand physical machines. To overcome these constraints, research in the area is moving towards optimisation techniques for large distributed environments, such as stochastic processes, gossip protocols or even multiplayer gaming.
The use of a gossip protocol for dynamic resource management is proposed in [24]. The authors present a middleware architecture whose key element is a gossip protocol that ensures fair resource allocation among sites whilst dynamically adapting the allocation to load changes and meeting the scalability requirement. The proposed architecture comprises three managing entities: the machine, resource and overlay managers. Each machine runs a machine manager component that computes the resource allocation policy, which includes deciding the module instances to run. The resource allocation policy is computed by a protocol that runs in the resource manager component. This component takes as input the estimated demand for each module that the machine runs. The computed allocation policy is sent to the module scheduler for implementation/execution, as well as to the site managers for making decisions on request forwarding. The overlay manager implements a distributed algorithm that maintains an overlay graph of the machines in the cloud and provides each resource manager with a list of machines to interact with. The protocol is simulated with varying input and a large number of machines (160,000) and sites (368,000); it maintains its efficiency, showing a high degree of scalability.
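To give a feel for how gossip-style interaction evens out load using only local exchanges, the following sketch implements generic pairwise averaging. It is not the protocol of [24]: the function names are hypothetical, load is reduced to a single number per machine, and the `neighbours` map merely stands in for the list of machines that an overlay manager would supply.

```python
import random


def gossip_round(load, neighbours, rng):
    """Each machine averages its load with one randomly chosen neighbour."""
    for node in list(load):
        peer = rng.choice(neighbours[node])
        average = (load[node] + load[peer]) / 2
        load[node] = load[peer] = average  # total load is preserved


def run_gossip(load, neighbours, rounds, seed=0):
    """Run several rounds on a copy of `load` and return the result."""
    rng = random.Random(seed)  # seeded for reproducible runs
    load = dict(load)
    for _ in range(rounds):
        gossip_round(load, neighbours, rng)
    return load


# Six machines on a ring overlay; all load starts on machine 0.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
start = {i: 0.0 for i in range(6)}
start[0] = 60.0
final = run_gossip(start, ring, rounds=100)
print(final)
```

No machine needs global knowledge: each exchange involves two machines only, yet the loads drift towards the cluster-wide average, which is the property that makes gossip attractive at large scale.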
The use of load prediction algorithms for cloud platforms is proposed in [24], using a two-step approach of load trend tracking followed by load prediction, based on cubic spline interpolation and a hotspot detection algorithm for sudden spikes. The results indicate that the autonomic management framework is able to respond accurately to both synthetic and real-world load trends.
The use of a hybrid P2P cloud computing architecture for massively multiplayer online games is proposed in [25]. The proposed architecture uses a two-level load management system: multi-threshold load management for each game server, and load management among game servers. The resulting architecture can support more whilst reducing response time.
Finally, an area that should not be overlooked in resource management is that of network resources, such as bandwidth allocation. With multiple streams in use for multiple applications, fairer TCP stream handling is needed. The work in [26] focuses on fair bandwidth allocation among cloud users. The proposed mechanism prevents the use of additional flows or non-compliant protocols from gaining an uneven share of the network resources of the shared environment. The ‘fair-share’ dropping mechanism builds on this by ensuring an equal division of bandwidth among the TCP flows of the local entity. The results show that the improvement in efficiency justifies the small overhead incurred. The trends presented here are a clear indication of the direction of things to come in resource allocation and management in the cloud environment.