ACUERDO POR EL QUE CONSTITUYE LA “POLÍTICA INSTITUCIONAL DE INTEGRIDAD” Y SE CREA EL “COMITÉ DE INTEGRIDAD” DE LA AUDITORÍA SUPERIOR DE LA CIUDAD DE MÉXICO

To convert the application-level bandwidth requirement of a VD into a physical-level bandwidth requirement, Cheetah analyzes a sample of the VD’s workload for a small time window of T seconds on a set of dedicated DAs reserved for sampling. During this stage, each VD is considered one at a time, because the DA is not yet in a position to efficiently handle workloads from multiple VDs simultaneously. To comprehensively capture the locality information of the VD’s workload, Cheetah expresses physical-level bandwidth in terms of DRUT. To measure the DRUT capacity of a VD, Cheetah measures the I/O latency of every disk access request on the DA and aggregates all such latencies to compute the total I/O time. Because of advanced NCQ and caching techniques on the DA, some I/O requests overlap in time; Cheetah therefore identifies the overlapped portions of the I/O latencies and filters them out of the total I/O time. The result is the total effective duration for which the DA is actively servicing requests from the given VD, which is the DRUT capacity of that VD. Since DRUT measures the effective disk usage time, it inherently captures the core variables of a data workload, such as read/write ratio, request size and the amount of randomness (locality), which is why Cheetah expresses the physical-level bandwidth requirement of a VD in terms of DRUT rather than IOPS, MBPS or similar metrics. Later, in the performance evaluation (Section 7.9.1), we show the DRUT capacity for several real-world applications, such as web search engines and online transaction processing systems, and show that the DRUT metric indeed captures the important locality information in the I/O workload.
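The overlap filtering described above can be pictured as taking the union of the busy intervals on a disk. The following is only a minimal sketch of that idea, not Cheetah's implementation: the `(start, completion)` interval representation and the function name `drut` are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (start, completion) of one request on a disk, in seconds.
Interval = Tuple[float, float]


def drut(intervals: List[Interval]) -> float:
    """Return the time covered by at least one in-flight request.

    Overlapping portions are counted once, which is the effect of
    filtering overlaps out of the summed I/O latencies.
    """
    busy = 0.0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this request: close the previous busy period.
            if cur_end is not None:
                busy += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Request overlaps the current busy period: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        busy += cur_end - cur_start
    return busy


# Made-up sample: the first two requests overlap between t=2 and t=4.
sample = [(0.0, 4.0), (2.0, 6.0), (10.0, 12.0)]
total_io_time = sum(end - start for start, end in sample)  # 10.0
effective_time = drut(sample)                              # 8.0
```

In this made-up sample the summed latencies are 10 seconds, but the disk is busy for only 8 seconds, so 8 is the value that would count toward DRUT.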

Figure 7.2 illustrates how Cheetah computes the DRUT capacity of a VD in a DA. The DA receives I/O requests R1, R2, R3, R4 and R5 from VD1 at times T1, T2, T3, T4 and T5 respectively. In this example, we assume that the DA is configured as a RAID0 setup with 2 disks and that the incoming requests are sequential. Therefore, alternate requests are serviced by the same disk in the DA. Due to advanced disk scheduling techniques, requests are merged and reordered as shown in the figure. R3 and R5 are merged together and hence their interrupts are coalesced. Similarly, R2 and R4 are processed together by the DA. R1 is processed alone; it could not be merged with R3 and R5, probably because it arrived earlier than the merging threshold time on disk 1 of the DA. Cheetah computes the average I/O latency for each group of I/O requests that are processed together (that is, whose interrupts are coalesced). The I/O latency of R1 and the average I/O latencies of the group of R3 and R5 are aggregated to compute the DRUT capacity for disk 1. Similarly, the average I/O latencies of the group of R2 and R4 are aggregated to compute the DRUT capacity for disk 2. Cheetah computes the average I/O latency for a group of requests carefully, so that the latencies do not overlap in time. For the group of R3 and R5, Cheetah computes the difference between T10 (the time at which the last request in the group received its interrupt acknowledgement) and T6 (the time at which the first request in the group began to be processed), and assigns the average of this difference as the I/O latency of each of the requests R3 and R5. If T3 happens to occur after T6, then the start time of the group is taken to be T3. Cheetah aggregates the average I/O latency of each I/O request of a VD on every disk in the DA to determine the VD’s DRUT capacity on each disk, and then aggregates these per-disk DRUT capacities to determine the overall DRUT capacity of the VD on that DA. It should be noted that the average I/O latency associated with each I/O request in a VD is used only for DRUT computation and does not correspond to the actual end-to-end I/O latency of the request on its DA.
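The grouping arithmetic in this example can be sketched as follows. This is an illustrative reading of the description, not Cheetah's code: the `Group` record, the function names and all timestamps are hypothetical. It also assumes that "the average of this difference" means the group's span divided by the number of requests in the group, and that the group's start time is the later of the service start and the first request's arrival.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Group:
    """Requests whose interrupts were coalesced on one disk (hypothetical record)."""
    disk: int
    arrivals: List[float]   # arrival time of each request in the group
    service_start: float    # time the first request began to be processed
    last_ack: float         # time the last request's interrupt was acknowledged


def per_request_latency(g: Group) -> float:
    # The group cannot start before its first request has arrived.
    start = max(g.service_start, min(g.arrivals))
    # Assumed reading of "average": split the span evenly across the group.
    return (g.last_ack - start) / len(g.arrivals)


def drut_per_disk(groups: List[Group]) -> Dict[int, float]:
    totals: Dict[int, float] = {}
    for g in groups:
        # Summing the averaged latency over each request gives back the span.
        totals[g.disk] = totals.get(g.disk, 0.0) + per_request_latency(g) * len(g.arrivals)
    return totals


# Made-up timestamps loosely following the R1..R5 example on a 2-disk RAID0.
groups = [
    Group(disk=1, arrivals=[1.0], service_start=1.0, last_ack=4.0),        # R1 alone
    Group(disk=1, arrivals=[3.0, 5.0], service_start=6.0, last_ack=10.0),  # R3 + R5
    Group(disk=2, arrivals=[2.0, 4.0], service_start=4.0, last_ack=8.0),   # R2 + R4
]
per_disk = drut_per_disk(groups)      # {1: 7.0, 2: 4.0}
drut_on_da = sum(per_disk.values())   # 11.0
```

With these made-up numbers, R3 and R5 are each assigned a latency of 2.0, so their group contributes its non-overlapping span of 4.0 to disk 1.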

It is extremely hard for Cheetah to measure the DRUT capacity of a DA configured with a checksum-based RAID setup, because it is difficult to identify and attribute the latency of the I/O requests that correspond to the checksum. We discuss this problem in greater detail in Section 7.6.2, and in this work we restrict the DA to a software RAID0 or similar stripe-based RAID setup.

During the sampling phase of T seconds, Cheetah computes the DRUT capacity of the given VD and uses it to compute the physical-level bandwidth factor (PBF), PBF = DRUT / T. Since the I/O latency is measured on the DA, it does not include the network latency or the queuing latencies incurred at other components of the SDDS system, and hence it captures the core characteristics of the workload in terms of read/write ratio, I/O request size and locality (randomness). The calculated PBF is the fraction of the DA’s raw disk bandwidth (RB) that is actively used. RB is the maximum disk I/O bandwidth available on the DA when the workload consists only of write I/O requests with 100% sequential locality, and it is measured in units of MBPS. The physical-level bandwidth (PB) corresponding to the application-level bandwidth (AB) is calculated using the formula PB = E * PBF * RB, where E is the elasticity configured by the tenant in the QoS specification. The E factor is included in this formula under the following implicit assumptions: a) the core characteristics of the sample workload will remain approximately the same in the application’s real-time workload, except for the arrival rate of the I/O requests, and b) the number of I/O requests is proportional to the number of application requests. The PB value thus calculated represents the sample workload in its entirety, and the entire process is automated without requiring the tenants to explicitly specify PB requirements. The PB value is not meaningful in isolation, but when it is incorporated into a QoS-aware disk I/O scheduler, the requests from different VDs can be processed on a DA with different priorities, in accordance with the bandwidth guarantees configured in the QoS specification for the respective VDs.
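As a quick worked instance of the two formulas, PBF = DRUT / T and PB = E * PBF * RB, the sketch below plugs in made-up numbers. The function name, parameter names and sample values are illustrative assumptions; only the formulas come from the text.

```python
def physical_bandwidth(drut_s: float, window_s: float,
                       raw_bw_mbps: float, elasticity: float) -> float:
    """Compute PB = E * PBF * RB, with PBF = DRUT / T.

    drut_s      -- DRUT capacity measured during sampling, in seconds
    window_s    -- sampling window T, in seconds
    raw_bw_mbps -- raw disk bandwidth RB of the DA, in MBPS
    elasticity  -- elasticity E from the tenant's QoS specification
    """
    if window_s <= 0:
        raise ValueError("sampling window T must be positive")
    pbf = drut_s / window_s
    return elasticity * pbf * raw_bw_mbps


# Made-up sample: the disk was busy for 2.5 s of a 10 s window (PBF = 0.25),
# RB is 200 MBPS and the tenant configured E = 2.
pb = physical_bandwidth(drut_s=2.5, window_s=10.0, raw_bw_mbps=200.0, elasticity=2.0)
# pb == 100.0 (MBPS)
```

Here a VD that kept the disk busy for a quarter of the sampling window, with elasticity 2, would be assigned half of the DA's raw bandwidth.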

The PB extraction procedure described above focuses only on VD-level QoS granularity, but it is straightforward to extend it to VDC-level granularity. At VDC granularity, a tenant’s application is spread across multiple VDs located on physically isolated CNs. Given a VDC, the tenant’s application workload is sampled on all the VDs belonging to the VDC as previously explained, with one minor exception: the CFVC scheduler in each DA maintains a queue for each VDC rather than for each VD. Therefore, Cheetah aggregates the PB values from all the DAs that hold the VDs belonging to the given VDC and uses the aggregated PB value as the physical-level bandwidth to be guaranteed for that VDC. Since the CFVC scheduler in a DA aggregates the I/O requests from all VDs belonging to a VDC, temporary fluctuations in one of the VDs of a VDC are efficiently absorbed in the corresponding DA. However, if the VDs are located in different DAs, Cheetah needs additional non-trivial optimizations, such as coordinating between those DAs in real time, in order to handle such short-term fluctuations in the workload; we leave this to future work.

Figure 7.3: Illustration of a scenario where a naive greedy RLB algorithm fails to load balance the DAs
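The VDC-level aggregation of PB values described above can be sketched as a simple per-VDC sum. This is only an illustration under stated assumptions: the text says the PB values are "aggregated" without specifying how, so summation is assumed here, and the tuple layout, identifiers and sample values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical sample record: (vdc_id, vd_id, pb_mbps), one per sampled VD.
Sample = Tuple[str, str, float]


def aggregate_vdc_pb(samples: Iterable[Sample]) -> Dict[str, float]:
    """Sum the per-VD PB values of each VDC (summation is an assumption)."""
    totals: Dict[str, float] = defaultdict(float)
    for vdc_id, _vd_id, pb_mbps in samples:
        totals[vdc_id] += pb_mbps
    return dict(totals)


# Made-up PB values for three VDs spread over two VDCs.
samples = [
    ("vdc-a", "vd1", 40.0),
    ("vdc-a", "vd2", 25.0),
    ("vdc-b", "vd3", 30.0),
]
vdc_pb = aggregate_vdc_pb(samples)  # {"vdc-a": 65.0, "vdc-b": 30.0}
```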
