The performance of MINs is usually determined by modeling, using either simulation [52] or mathematical methods [53]. In this chapter we estimate network performance by simulation. We developed a generic simulator for MINs in a packet communication environment. The simulator can handle several switch types, inter-stage interconnection patterns, load conditions, switch operation policies, and priorities. We focused on an (N × N) Banyan network that consists of (2 × 2) SEs and uses internal queuing. Each (2 × 2) SE in every stage of the MIN was modeled by two non-shared buffer queues, with the crossbar segment located in front of the queues. Buffer operation followed the FCFS principle. In non-priority MINs, contention between two packets was resolved randomly (algorithms 2.1 and 2.2). The performance of non-priority MINs was compared against that of internal-priority MINs, in which contention is resolved in favor of the packet transmitted from the SE with the longest transmission queue (algorithms 2.1 and 2.3). The simulation was performed at packet level, assuming fixed-length packets transmitted in equal-length time slots, where a slot is the time required to forward a packet from one stage to the next.
The parameters of the packet traffic model were varied across simulation experiments to generate different offered loads and traffic patterns. Metrics such as packet throughput and packet delay were collected at the output ports. We performed extensive simulations to validate our results. All statistics were obtained from simulations running for 10^5 clock cycles. The number of simulation runs was adjusted to ensure a steady-state operating condition for the MIN. To allow the network to reach a steady state, the first 10^3 network cycles were discarded before the statistics were collected.
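The measurement procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the simulator itself: `step` is a hypothetical callable that advances the network by one cycle and returns the number of packets delivered, the 10^3 warm-up cycles are assumed to be part of the 10^5 total, and `bernoulli_arrivals` assumes independent arrivals with uniformly distributed destinations, which the text does not specify.

```python
import random


def measure_throughput(step, total_cycles=10**5, warmup_cycles=10**3):
    """Average packets delivered per cycle, ignoring the warm-up cycles.

    `step` is a hypothetical callable: it advances the network by one
    cycle and returns how many packets left the network in that cycle.
    """
    delivered = 0
    for cycle in range(total_cycles):
        packets = step()
        if cycle >= warmup_cycles:      # collect only after stabilization
            delivered += packets
    return delivered / (total_cycles - warmup_cycles)


def bernoulli_arrivals(n_ports, load, rng):
    """One time slot of an assumed traffic model.

    Each input port receives a packet with probability `load`; the
    destination is drawn uniformly (an assumption). None means no arrival.
    """
    return [rng.randrange(n_ports) if rng.random() < load else None
            for _ in range(n_ports)]
```

Dividing by the number of measured cycles only (rather than by all cycles) keeps the transient start-up phase from biasing the average.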
SendQueue Process (csid, sqid, bm)
Input: Current stage id (csid); send-queue id (sqid) of current stage; blocking mechanism (bm).
Output: Population for send- and accept-queue (Pop); total number of serviced and blocked packets for send-queue (Serviced, Blocked respectively); total number of packet delay cycles for send-queue (Delay); routing address (RA) of each buffer position of queue.
{
  if (Pop[sqid][csid] > 0) // send-queue is not empty
  {
    RAbit = get_bit(RA[sqid][csid][1], csid);
    // get the (csid)th bit of Routing Address (RA) of the leading packet of
    // send-queue by a cyclic logical left shift; perfect shuffle algorithm
    if (RAbit == 0) // upper port routing
      aqid = 2 * (sqid % (N/2)); // upper link; perfect shuffle algorithm
    else // lower port routing
      aqid = 2 * (sqid % (N/2)) + 1; // lower link; perfect shuffle algorithm
    // where aqid is the accept-queue id of next stage
    if (Pop[aqid][csid+1] == B) // where B is the buffer-size
    { // blocking state
      Blocked[sqid][csid] = Blocked[sqid][csid] + 1;
      if (bm == "blm") // block and lost mechanism
      {
        Pop[sqid][csid] = Pop[sqid][csid] - 1;
        for (bfid = 1; bfid <= Pop[sqid][csid]; bfid++)
          RA[sqid][csid][bfid] = RA[sqid][csid][bfid+1];
        // where RA is the Routing Address of the packet
        // located at (bfid)th position of send-queue
      }
    }
    else // unicast forwarding
    {
      Pop[sqid][csid] = Pop[sqid][csid] - 1;
      Pop[aqid][csid+1] = Pop[aqid][csid+1] + 1;
      RA[aqid][csid+1][Pop[aqid][csid+1]] = RA[sqid][csid][1];
      for (bfid = 1; bfid <= Pop[sqid][csid]; bfid++)
        RA[sqid][csid][bfid] = RA[sqid][csid][bfid+1];
    }
    Delay[sqid][csid] = Delay[sqid][csid] + Pop[sqid][csid];
    return Pop, Serviced, Blocked, Delay, RA;
  }
}
Algorithm 2.1: Send-queue process for single- and internal-priority MINs

SinglePriority SEs Process (csid, useid, bm)
Input: Current stage id (csid); Switching Element id (useid) of upper segment; blocking mechanism (bm).
{
  lseid = useid + N/4; // where N is the number of input/output ports
  // and lseid is the adjacent Switching Element (SE) of lower segment
  r = random(); // where r ∈ [0, 1)
  if (r < 0.5) // upper segment is clocked first
  {
    // process for upper queues of SEs
    SendQueue Process (csid, 2 * useid, bm);
    SendQueue Process (csid, 2 * lseid, bm);
    // process for lower queues of SEs
    SendQueue Process (csid, 2 * useid + 1, bm);
    SendQueue Process (csid, 2 * lseid + 1, bm);
  }
  else // lower segment is clocked first
  {
    // process for upper queues of SEs
    SendQueue Process (csid, 2 * lseid, bm);
    SendQueue Process (csid, 2 * useid, bm);
    // process for lower queues of SEs
    SendQueue Process (csid, 2 * lseid + 1, bm);
    SendQueue Process (csid, 2 * useid + 1, bm);
  }
}
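As a rough illustration, the send-queue process and the single-priority SE process above might be transcribed into Python as follows. This is a sketch under stated assumptions rather than the simulator's actual code: indices are zero-based, each queue is a `deque` of routing addresses (replacing the Pop and RA arrays), stage 0 is assumed to inspect the most significant address bit, `lossy=True` stands in for the "blm" mechanism, and all function and variable names are hypothetical.

```python
import random
from collections import deque


def send_queue_process(queues, blocked, delay, stage, sqid,
                       n_stages, buf_size, lossy=False):
    """Try to move the head packet of one send-queue to the next stage.

    queues[stage][qid] is a deque of routing addresses (head at the left).
    """
    send_q = queues[stage][sqid]
    if not send_q:                        # send-queue is empty
        return
    n_ports = len(queues[stage])
    # Assumption: stage 0 inspects the most significant address bit.
    bit = (send_q[0] >> (n_stages - 1 - stage)) & 1
    aqid = 2 * (sqid % (n_ports // 2)) + bit   # perfect shuffle link
    accept_q = queues[stage + 1][aqid]
    if len(accept_q) == buf_size:         # blocking state
        blocked[stage][sqid] += 1
        if lossy:                         # block-and-lost: drop the packet
            send_q.popleft()
    else:                                 # unicast forwarding
        accept_q.append(send_q.popleft())
    delay[stage][sqid] += len(send_q)     # packets still waiting here


def single_priority_se_process(queues, blocked, delay, stage, useid,
                               n_stages, buf_size, rng, lossy=False):
    """Clock an upper-segment SE and its lower-segment partner in random order."""
    n_ports = len(queues[stage])
    lseid = useid + n_ports // 4
    order = (useid, lseid) if rng.random() < 0.5 else (lseid, useid)
    for offset in (0, 1):                 # upper queues first, then lower
        for seid in order:
            send_queue_process(queues, blocked, delay, stage,
                               2 * seid + offset, n_stages, buf_size, lossy)


# Hypothetical 8 x 8 network: 3 stages plus one extra row of output queues.
N, STAGES, B = 8, 3, 1
queues = [[deque() for _ in range(N)] for _ in range(STAGES + 1)]
blocked = [[0] * N for _ in range(STAGES)]
delay = [[0] * N for _ in range(STAGES)]
queues[0][0].append(0b101)   # both packets need accept-queue 1 of stage 1
queues[0][4].append(0b100)
single_priority_se_process(queues, blocked, delay, 0, 0, STAGES, B,
                           random.Random(1))
print(len(queues[1][1]), blocked[0][0] + blocked[0][4])   # prints "1 1"
```

In the example, queues 0 and 4 of the first stage contend for the same single-slot accept-queue, so exactly one packet advances and the other is counted as blocked, whichever segment happens to be clocked first.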
InternalPriority SEs Process (csid, useid, bm)
Input: Current stage id (csid); Switching Element id (useid) of upper segment; blocking mechanism (bm).
{
  lseid = useid + N/4; // where N is the number of input/output ports
  // and lseid is the adjacent Switching Element (SE) of lower segment
  r = random(); // where r ∈ [0, 1)
  // process for upper queues of SEs
  if (Pop[2 * useid][csid] > Pop[2 * lseid][csid])
     or ((Pop[2 * useid][csid] == Pop[2 * lseid][csid]) and (r < 0.5))
  { // upper segment clocking takes precedence
    SendQueue Process (csid, 2 * useid, bm);
    SendQueue Process (csid, 2 * lseid, bm);
  }
  else // Pop[2 * useid][csid] < Pop[2 * lseid][csid], or equal populations and r >= 0.5
  { // lower segment clocking takes precedence
    SendQueue Process (csid, 2 * lseid, bm);
    SendQueue Process (csid, 2 * useid, bm);
  }
  // process for lower queues of SEs
  if (Pop[2 * useid + 1][csid] > Pop[2 * lseid + 1][csid])
     or ((Pop[2 * useid + 1][csid] == Pop[2 * lseid + 1][csid]) and (r < 0.5))
  { // upper segment clocking takes precedence
    SendQueue Process (csid, 2 * useid + 1, bm);
    SendQueue Process (csid, 2 * lseid + 1, bm);
  }
  else // Pop[2 * useid + 1][csid] < Pop[2 * lseid + 1][csid], or equal populations and r >= 0.5
  { // lower segment clocking takes precedence
    SendQueue Process (csid, 2 * lseid + 1, bm);
    SendQueue Process (csid, 2 * useid + 1, bm);
  }
}
Algorithm 2.3: Switching Element process for internal-priority MINs
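The ordering rule at the heart of the internal-priority process can be isolated in a few lines of Python. This is a minimal sketch with hypothetical names (`service_order`, `rng`); it only decides which of the two contending queues is clocked first and leaves the actual forwarding to the send-queue process.

```python
import random


def service_order(pop_upper, pop_lower, rng):
    """Return the clocking order of two contending queues.

    The longer queue goes first; equal lengths are broken at random.
    """
    if pop_upper > pop_lower:
        return ("upper", "lower")
    if pop_upper < pop_lower:
        return ("lower", "upper")
    # tie: fall back to the random choice of the non-priority scheme
    return ("upper", "lower") if rng.random() < 0.5 else ("lower", "upper")


rng = random.Random(42)
print(service_order(3, 1, rng))   # ('upper', 'lower')
print(service_order(0, 2, rng))   # ('lower', 'upper')
```

Deciding the order once, from the queue lengths observed before either queue is serviced, ensures that each queue is clocked exactly once per cycle even though servicing changes the queue populations.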
Figure 2.4 shows the normalized throughput of a single-buffered MIN with 6 stages as a function of the arrival probability, for the three classical models [44, 31, 23] and for our simulation. All models are very accurate at low loads, and their accuracy decreases as the input load increases. In particular, when the input load approaches the maximum network throughput, the accuracy of Jenq's model is insufficient. One reason is that, at high traffic rates, many packets are blocked, mainly at the first stages of the network. Mun therefore introduced a "blocked" state into his model to improve accuracy. Theimer's model additionally considers the dependencies between the two buffers of an SE, which leads to further improvement. We also validated our simulator by comparing its results with those of Theimer's model; the two were found to be in close agreement (differences of less than 1%).
Figure 2.4: Th of single-buffered, 6-stage, single- (or non-) priority MIN
Figure 2.5: Th of double-buffered, n-stage, internal- vs. non-priority scheme
Figure 2.5 illustrates the gains in normalized throughput of a MIN using an internal priority scheme vs. a non priority (or single priority) scheme. In the diagram, curve NPS[b][n] depicts the normalized throughput of an n-stage MIN, where n = 3, 6, 8, 10, built from 2 × 2 SEs with queues of buffer length b and employing a non priority scheme. Similarly, curve IPS[b][n] shows the corresponding normalized throughput of an n-stage MIN, where n = 3, 6, 8, 10, built from 2 × 2 SEs with queues of buffer length b and employing an internal priority scheme. In this figure, all curves represent the normalized throughput of double-buffered MINs (b = 2) at different offered loads (λ = 0.1, 0.2, ..., 1). The gains in normalized throughput of a MIN using an internal priority vs. non priority scheme are 1.9%, 3.3%, 3.7%, and 4.0% of the optimal value, which is Thmax = 1, when n = 3, 6, 8, 10 respectively, under full load traffic. The normalized throughput clearly falls as the network size (bandwidth) increases. However, the throughput gains of the internal priority over the non priority scheme become more considerable as the network size increases.
Figure 2.6 illustrates the gains in normalized throughput of a MIN using an internal priority scheme compared with the single priority one for buffer size b = 4. The gains in normalized throughput of a MIN using an internal priority vs. non priority scheme are 1.4%, 3.7%, 4.3%, and 4.7% of the optimal value, when n = 3, 6, 8, 10 respectively, under full load traffic. As the diagram shows, the gains in normalized throughput remain considerable for all network setups, especially where n >= 6. Figure 2.7 presents the case of a MIN with a large queue configuration, where the buffer size is b = 8. The results show that the gains in normalized throughput for b = 8 are lower in all network setups (n = 3, 6, 8, 10), but still considerable. According to that diagram, the gains of a MIN using an internal priority vs. non priority scheme are 0.8%, 2.7%, 3.0%, and 3.5% of the optimal value, when n = 3, 6, 8, 10 respectively, under full load traffic. It is worth noting that the normalized throughput of both single and internal priority MINs improves with the larger buffer size (b = 8), which is most visible under heavy offered load (λ > 0.7).
Figure 2.6: Th of finite-buffered (b=4), n-stage, internal- vs. non-priority scheme
Figure 2.7: Th of finite-buffered (b=8), n-stage, internal- vs. non-priority scheme
Figure 2.8 presents the corresponding increases in normalized packet delay for internal priority vs. single priority packets of a 6-stage MIN under different buffer sizes (b = 1, 2, 4, 8); these increases are negligible for all configuration setups. In the worst case, when the buffer size of the MIN takes its maximum value (b = 8), the normalized delay under full load traffic increases from 5.63 (the normalized delay of single priority packets) to 6.02 for internal priority packets. Single-buffered (b = 1) MINs have the same values for all performance factors under both the single and the internal priority scheme. The reason is that, when two packets at a stage contend for the same buffer at the next stage and there is not enough free space to store both, the contention is resolved in the same way under both schemes: every queue can hold only one packet, so one of the two packets is selected at random regardless of the priority scheme. It is also noteworthy that larger buffers introduce larger delays, because packets fill the buffers and stay in the network longer, thereby increasing queuing delays. Large packet delay values can adversely affect applications sensitive to packet delay or jitter, such as streaming media traffic.
Figure 2.8: D of finite-buffered, 6-stage, internal- vs. non-priority scheme
Figure 2.9: Upf of finite-buffered, 6-stage, internal- vs. non-priority scheme
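The argument for the single-buffered case can be checked by enumerating queue states. The sketch below uses a hypothetical helper, `decided_contentions`, that lists every pair of queue lengths in which both contending queues hold at least one packet and counts the pairs in which the internal priority rule (longer queue first) actually determines the winner. It counts states only and makes no claim about how often each state occurs in a simulation.

```python
def decided_contentions(buf_size):
    """Count contention states decided by queue length rather than by chance.

    Returns (decided, total): `total` is the number of (upper, lower)
    length pairs with both queues non-empty, `decided` the number of
    those pairs with unequal lengths.
    """
    pairs = [(u, l)
             for u in range(1, buf_size + 1)
             for l in range(1, buf_size + 1)]
    decided = [(u, l) for (u, l) in pairs if u != l]
    return len(decided), len(pairs)


for b in (1, 2, 4, 8):
    print(b, decided_contentions(b))
# 1 (0, 1)
# 2 (2, 4)
# 4 (12, 16)
# 8 (56, 64)
```

For b = 1 the only contention state is a tie, so the internal priority rule always falls back to the random choice and the two schemes coincide; for larger buffers a growing share of states is decided by queue length.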
Figure 2.9 illustrates the relation between the combined performance indicator Upf of a 6-stage MIN and the offered load λ, under different buffer size configurations (b = 1, 2, 4, 8). Recall from section 2.3 that the combined performance indicator Upf depicts the overall performance of a MIN when the individual performance factors (throughput and packet delay) are given equal weight. The performance indicator Upf clearly takes lower (better) values as the buffer length increases; however, when the buffer size reaches b = 4 or b = 8, Upf deteriorates significantly under moderate and heavy traffic (λ > 0.6), with either the internal or the single priority scheme.