In an application with a low cycle per byte cost, the overall throughput is indeed limited by the throughput of the underlying disks. This is true in both the Active Disk and server case, as shown in Figure 3-1. In a sense, this is the best possible situation for a storage system. If the raw disk bandwidth (i.e. physical performance of the disk assembly and density of the media) is the limiting performance factor, this means the disk is essentially operating at maximum efficiency. In the limit, it is simply not possible to process data faster than the disk media can provide it.
In this case, the benefit of Active Disks is in providing the possibility of better scheduling of requests before they go to the disk internals, to make more efficient use of the underlying bandwidth. With additional higher-level knowledge [Patterson95, Mowry96], overall throughput of the disk can be increased. The work of Worthington and Ganger [Worthington94] shows that with more sophisticated scheduling, the performance for random requests can be increased by up to 20%. These types of benefits will be less dramatic for large, sequential requests, which already use the disk very efficiently. There may be a benefit with the use of extended interfaces that allow more flexible reordering of requests. One possible way to take advantage of this is by allowing a “background” workload that can take advantage of idleness in a “foreground” workload to opportunistically improve its performance, as illustrated in Section 5.4.
The far bigger effect on disk throughput comes from the addition of extra disks across which data is partitioned. If the data rates of particular applications are known, then disks can be added to provide the appropriate level of performance [Golding95]. Active Disks will not help this directly, but will allow more efficient use of the disk resources that are available. Active Disks can also aid in the collection of statistics and performance metrics of individual devices and workloads. This information can then be used by a higher-level management system to optimize the layout and placement of the workload [Borowsky96, Borowsky98]. This makes possible systems with a much greater amount of self-tuning and self-management than typical storage systems today. In order to scalably perform such monitoring and control, it is necessary to have control and computation at the end devices, rather than attempting to monitor everything centrally.
3.2.2 Processing
An application that is limited by the CPU processing rate, such as the multimedia applications discussed in the next chapter, benefits from the inherent parallelism in the Active Disk system. Where the server is limited by its single CPU, the processing power of the Active Disks scales with the number of disks available.
Of course, it is also possible to add additional processing capability to a server system, for example by using an SMP architecture rather than a single-processor machine. This is not precluded by an Active Disk system, nor does it change the basic model. This simply replaces the processing rate (scpu) with a higher value. For example, Figure 3-3 shows the expected performance if the number of processors in an SMP is scaled as additional disks are added. The chart considers the details of the AlphaServer 8400 system from the previous chapter, and assumes that the number of disks is balanced with the number of processors in that system, i.e. both increase linearly. One processor is added for approximately every 45 disks. This shows a step function in the host processing power, with a large boost whenever another processor is added. This still does not scale nearly as far as the Active Disk system at the high end, because there are still many more disks than host processors. The chart compares today’s systems, using the 25 MHz value for on-disk processing power. The benefit would be even greater with the 200 MHz on-disk processors. The line for Active Disks is much smoother because processing power is added in much smaller increments: for each 4 GB of additional storage, another 25 MHz of processing power is added.
Also note the much bigger performance gap at the very high end. It is simply not possible to add processors beyond twelve in this AlphaServer system, which is still one of the largest SMP systems available. This is true in most SMP systems sold today. This limitation comes primarily from physical limitations of building a system of that size, including the basic speed of the memory bus connecting all the processors and the single shared memory. The Active Disk system, on the other hand, will continue to scale as additional disks are added (to over 10,000 disks for this particular application, at which point the system is network bottlenecked).
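The scaling argument above can be sketched numerically. The following is a minimal illustration, not the model used to produce Figure 3-3: it considers only the CPU-limited rate (clock rate divided by cycles per byte) and ignores disk media and interconnect limits. The constants are taken from the text and the figure caption; the function names and the exact rounding rule for "one processor for approximately every 45 disks" are assumptions made for illustration.

```python
import math

# Values quoted in the text and the Figure 3-3 caption.
CYCLES_PER_BYTE = 10.0     # average-cost Data Mining application
DISK_CPU_MHZ = 25.0        # on-disk processor in today's disks
HOST_CPU_MHZ = 612.0       # one AlphaServer 8400 processor
MAX_HOST_CPUS = 12         # SMP limit stated in the text
DISKS_PER_HOST_CPU = 45    # "approximately every 45 disks"


def active_disk_rate(num_disks):
    """CPU-limited throughput (MB/s) when every disk has its own processor."""
    return num_disks * DISK_CPU_MHZ / CYCLES_PER_BYTE


def host_cpus(num_disks):
    """Host processors installed for a given disk count.

    Assumption: round up to the next processor, with at least one
    processor and at most MAX_HOST_CPUS.
    """
    wanted = math.ceil(num_disks / DISKS_PER_HOST_CPU)
    return min(MAX_HOST_CPUS, max(1, wanted))


def smp_rate(num_disks):
    """CPU-limited throughput (MB/s) of the SMP host."""
    return host_cpus(num_disks) * HOST_CPU_MHZ / CYCLES_PER_BYTE


if __name__ == "__main__":
    for n in (128, 256, 384, 512, 640, 768):
        print(n, host_cpus(n), smp_rate(n), active_disk_rate(n))
```

Under these assumptions the host rate rises in steps of 61.2 MB/s and stops growing once twelve processors are installed, while the Active Disk rate grows by 2.5 MB/s with every disk added.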
Figure 3-3 Performance of an SMP system. The chart shows the performance of a multi-processor system as processors and disks are added. We see a coarse step function as additional processors are added, but a much smoother increase with additional Active Disks. This is the model prediction for the AlphaServer 8400 system introduced in the previous chapter. The application parameters assume an average-cost Data Mining application with a computation requirement of 10 cycles/byte. The disks have 25 MHz processors and the host has up to twelve 612 MHz processors. Note that the Active Disk system would continue to scale as additional disks are added, while the SMP system cannot support more than 12 processors.
[Chart: "Multi-processor Scaling, Today’s System"; x-axis: Number of Disks (128 to 768); y-axis: Throughput (MB/s, 0 to 2000); legend entry: Today’s Disks (25 MHz).]
3.2.3 Interconnect
Applications that are network-limited benefit from the filtering of Active Disks (leaving data on the disks, if it doesn’t have to move) and the scheduling that can be done with intelligence at the edges of the network. In a traditional disk system, a set of bytes must be moved from the disk, across the interconnect, and through the host memory system before the CPU can operate on it and make a decision (e.g. “take or leave”, “where to route”). In an Active Disk, these decisions can be offloaded to the devices and bytes will never have to leave the device if they will not be used in further processing.
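The effect of filtering on a network-limited application can be illustrated with a small calculation. This is a sketch under stated assumptions, not part of the model in the text: the function name, the `fraction_forwarded` parameter (the share of scanned bytes that must still cross the interconnect), and the example numbers are all hypothetical.

```python
def interconnect_limited_scan_rate(link_mb_per_s, fraction_forwarded=1.0):
    """Upper bound on the data scan rate (MB/s) imposed by the interconnect.

    link_mb_per_s:      usable interconnect bandwidth (hypothetical value).
    fraction_forwarded: share of scanned bytes that leave the disks.
                        1.0 models a traditional system, where every byte
                        crosses the interconnect; a smaller value models
                        Active Disks discarding data at the device.
    """
    if not 0.0 < fraction_forwarded <= 1.0:
        raise ValueError("fraction_forwarded must be in (0, 1]")
    return link_mb_per_s / fraction_forwarded


if __name__ == "__main__":
    link = 100.0  # hypothetical interconnect bandwidth in MB/s
    print("traditional:", interconnect_limited_scan_rate(link))
    print("active, 25% forwarded:", interconnect_limited_scan_rate(link, 0.25))
```

In this simplified view, forwarding only a quarter of the scanned bytes lets the same interconnect support four times the scan rate, provided the disks and on-disk processors can keep up.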
Additional pressure is placed on storage interconnects by the introduction of Storage Area Networks (SANs) and the increased sharing demanded by today’s applications and customers [Locke98].
Table 3-3 shows the interconnect limitations in a number of today’s TPC-D systems. The table lists the theoretical and the actual throughput of delivering data to the processors in these large SMP systems. The throughput values are obtained by using the time to complete Query 6 in the benchmark, which must sequentially scan 1/7 of the lineitem table, and requires very few cycles/byte (so it should be interconnect limited in all cases).
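The derivation of a throughput value from a Query 6 time can be sketched as follows. This is only an illustration of the arithmetic described above; the function name and the example table size and query time are hypothetical and are not taken from Table 3-3.

```python
def query6_scan_throughput(lineitem_mb, query6_seconds, scan_divisor=7):
    """Effective data delivery rate (MB/s) implied by a Query 6 run.

    The query scans 1/scan_divisor of the lineitem table (1/7 per the
    text), so the delivered rate is the scanned volume over elapsed time.
    """
    if query6_seconds <= 0:
        raise ValueError("query6_seconds must be positive")
    scanned_mb = lineitem_mb / scan_divisor
    return scanned_mb / query6_seconds


if __name__ == "__main__":
    # Hypothetical inputs: a 70,000 MB lineitem table scanned in 100 s.
    print(query6_scan_throughput(70000.0, 100.0))
```

Comparing a value obtained this way with the theoretical interconnect bandwidth of the same system indicates how much of that bandwidth is actually delivered to the processors.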