
Having described the memory requirements of two conjunctive filtering algorithms and our general Boolean representative, we now compare the memory usage of the conjunctive solutions to the general Boolean one. When only considering the filtering component, these memory requirements directly affect the scalability of a solution (see Section 2.2 and Section 2.6 on page 23 and page 58, respectively). From the following analysis, we can thus also deduce under what circumstances a Boolean filtering algorithm should be preferred with respect to scalability, and which settings favor a conjunctive solution.

5.5.1 Point of Interchanging Memory Requirements

In the subsequent comparison, we directly use the accumulated memory requirements of the three algorithms, given in Equations 5.1 to 5.3 (pages 128, 130, and 132). The memory usage in all three cases grows linearly with the number of subscriptions |s| (and is zero if no subscriptions are registered).

Hence, to compare the memory requirements, we only need to analyze the first derivatives of the functions in Equations 5.1 to 5.3 with respect to |s|. For the counting algorithm (Equation 5.1), it thus holds:

\[
mem'_{\text{counting}}(|s|) = 2|s_s| + w(s) \times |s_p| \times |p| + w(p) \times |s_p| \times |p| + \frac{|p| \times (1 - p_c)}{8}. \tag{5.4}
\]

Similarly for the cluster algorithm (Equation 5.2) we derive:

\[
mem'_{\text{cluster}}(|s|) = |s_s| \times (w(c) + w(s)) + |s_p| \times |p| \times (w(s) + w(p)) + \frac{|p| \times (1 - p_c)}{8}. \tag{5.5}
\]

Finally, the general Boolean approach (Equation 5.3) leads to the following first derivative:

\[
mem'_{\text{Boolean}}(|s|) = |p| \times (1 + w(p) + w(s)) + 2|op| + w(l) + 2 + \frac{|p| \times (1 - p_c)}{8}. \tag{5.6}
\]
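The three derivatives can also be written down as small functions, which makes it easy to plug in numbers. The following is a minimal sketch and not part of the original analysis: the function and argument names are hypothetical, `p_c` is simply passed through as it appears in the last term of each equation, and the input values in the usage lines are chosen for illustration only.

```python
def shared_term(p, p_c):
    """Last term of Equations 5.4 to 5.6; it is identical for all three algorithms."""
    return p * (1 - p_c) / 8


def mem_counting(s_s, s_p, p, p_c, w_s, w_p):
    """Equation 5.4: memory growth per subscription for the counting algorithm."""
    return 2 * s_s + w_s * s_p * p + w_p * s_p * p + shared_term(p, p_c)


def mem_cluster(s_s, s_p, p, p_c, w_s, w_p, w_c):
    """Equation 5.5: memory growth per subscription for the cluster algorithm."""
    return s_s * (w_c + w_s) + s_p * p * (w_s + w_p) + shared_term(p, p_c)


def mem_boolean(p, op, p_c, w_s, w_p, w_l):
    """Equation 5.6: memory growth per subscription for the general Boolean approach."""
    return p * (1 + w_p + w_s) + 2 * op + w_l + 2 + shared_term(p, p_c)


# Illustrative inputs only (assumptions, not values from the text).
counting = mem_counting(s_s=2, s_p=1.4, p=10, p_c=0.5, w_s=4, w_p=4)
cluster = mem_cluster(s_s=2, s_p=1.4, p=10, p_c=0.5, w_s=4, w_p=4, w_c=4)
boolean = mem_boolean(p=10, op=8, p_c=0.5, w_s=4, w_p=4, w_l=4)
print(counting, cluster, boolean)
```

Because the last term is shared, it cancels in any pairwise comparison of the three functions.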

To reduce the number of variables in these equations, let us now assume typical values for the algorithm-specific parameters, as stated in Section 5.1.3: w(s) = 4, w(p) = 4, w(l) = 4, and w(c) = 4. That is, the widths of subscription identifiers, predicate identifiers, subscription locations, and cluster references are 4 bytes each; these values hold on 32-bit machines when using standard (unsigned) integers as identifiers and standard memory pointers. Finally, let us further reduce the number of variables by utilizing the proportional notions of op_prop (proportional number of operators) and s_prop (proportional number of conjunctive elements per predicate), as defined in Section 5.1.

With these specifications, we now compare the memory usage of the conjunctive algorithms (Equation 5.4 and Equation 5.5) to that of the general Boolean approach (Equation 5.6). The following inequalities denote the points where the general Boolean approach requires less memory for its event filtering data structures than the respective conjunctive solution. These points are described in terms of the characterizing parameter |s_s|. That is, the general Boolean approach requires less memory if a canonical conversion to disjunctive normal form creates more than the stated number of conjunctive subscriptions.

We refer to these points as turning points because they describe for which values of |s_s| a general Boolean filtering algorithm becomes worthwhile. To allow for a better overview, we use the notation |s_s|(algorithm/Boolean) to denote the conjunctive algorithm "algorithm" compared to the general Boolean approach:

\[
|s_s|\genfrac{(}{)}{0pt}{}{\text{counting}}{\text{Boolean}} > \frac{|p| \times (2\,op_{prop} + 9) + 6}{2 + 8\,s_{prop} \times |p|}, \tag{5.7}
\]

\[
|s_s|\genfrac{(}{)}{0pt}{}{\text{cluster}}{\text{Boolean}} > \frac{|p| \times (2\,op_{prop} + 9) + 6}{8 + 8\,s_{prop} \times |p|}. \tag{5.8}
\]
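The intermediate steps leading to the counting turning point can be sketched as follows. This sketch assumes that the proportional notions translate to |s_p| = s_prop × |s_s| and |op| = op_prop × |p|, which is our reading of the definitions in Section 5.1 and is consistent with the resulting inequality. The term |p| × (1 − p_c)/8 appears on both sides and cancels.

```latex
\begin{align*}
mem'_{\text{counting}}(|s|) &> mem'_{\text{Boolean}}(|s|) \\
2|s_s| + 8\,|s_p| \times |p| &> 9\,|p| + 2\,|op| + 6
  && \text{(all widths set to 4)} \\
2|s_s| + 8\,s_{prop} \times |s_s| \times |p| &> 9\,|p| + 2\,op_{prop} \times |p| + 6
  && \text{(proportional notions)} \\
|s_s| \times (2 + 8\,s_{prop} \times |p|) &> |p| \times (2\,op_{prop} + 9) + 6 \\
|s_s| &> \frac{|p| \times (2\,op_{prop} + 9) + 6}{2 + 8\,s_{prop} \times |p|}
\end{align*}
```

For the cluster algorithm, the same steps apply with 8|s_s| in place of 2|s_s| on the left-hand side, which yields the constant 8 instead of 2 in the denominator.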

Having found these turning points, we illustrate them graphically in the following subsection.

5.5.2 Graphic Illustration of the Turning Point

Figure 5.1 shows the turning point for different parameter combinations. The turning point when comparing the counting and the general Boolean approach is illustrated in Figure 5.1(a); Figure 5.1(b) depicts the cluster algorithm in comparison to the general Boolean algorithm. On the abscissae of the figures, we show the number of predicates per subscription |p|. The ordinates show the number of conjunctions |s_s| that the canonical conversion needs to create for the general Boolean filtering approach to be more space-efficient.

In the figures, we vary s_prop (proportional number of conjunctions per predicate) between 0.3 and 0.7 to show the influence of this parameter on the turning point. Our example subscription classes show values of s_prop between approximately 0.4 and 0.7 (see Section 5.1.4). For parameter op_prop (proportional number of operators), we choose op_prop = 0.8 in these figures, which is approximately the average of op_prop in our example classes (between approximately 0.7 and 0.9).

To interpret Figure 5.1, one chooses one conjunctive algorithm (i.e., either Figure 5.1(a) or Figure 5.1(b)), one of the curves (specifying op_prop and s_prop), and the number of predicates |p| on the abscissa. One then gets the mapping for this scenario on the ordinate, specifying the turning point. We demonstrate this process in the following example:

[Figure 5.1: Turning point (point of interchanging memory requirements) for (a) the counting and Boolean approach and (b) the cluster and Boolean approach. Each panel plots the number of conjunctions |s_s| (ordinate, 1 to 6) over the number of predicates per subscription |p| (abscissa, 5 to 50) for op_prop = 0.8 and s_prop = 0.3, 0.5, and 0.7.]

Example 5.2 (Finding the turning point) To determine the turning point for the counting algorithm in comparison to the general Boolean approach, we need to consider Figure 5.1(a). Let us assume that subscriptions, on average, specify 10 predicates (|p| = 10). We thus fix the value "10" on the abscissa. Let us further assume that the number of operators proportional to the number of predicates is approximately 0.8 (op_prop = 0.8), and that after the conversion a predicate occurs in approximately 70 percent of the created conjunctions (s_prop = 0.7).

Thus we find the turning point as the value of the lowermost curve in Figure 5.1(a) for argument |p| = 10: the turning point lies slightly below two (Inequality 5.7 yields approximately 1.93), so the general Boolean approach requires less memory as soon as the conversion creates at least two conjunctions (|s_s| ≥ 2). Hence whenever a subscription is not purely conjunctive, the general Boolean algorithm requires less memory than the counting approach (and is thus more scalable) for this scenario.
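The value read off the figure in Example 5.2 can be checked numerically against Inequalities 5.7 and 5.8. The following is a minimal sketch; the function name `turning_point` and its argument names are hypothetical and only serve this illustration.

```python
import math


def turning_point(p, op_prop, s_prop, algorithm="counting"):
    """Right-hand side of Inequality 5.7 (counting) or 5.8 (cluster)."""
    constant = {"counting": 2, "cluster": 8}[algorithm]
    return (p * (2 * op_prop + 9) + 6) / (constant + 8 * s_prop * p)


# Scenario of Example 5.2: |p| = 10, op_prop = 0.8, s_prop = 0.7.
tp = turning_point(p=10, op_prop=0.8, s_prop=0.7)
print(round(tp, 2))  # 1.93

# Smallest integer number of conjunctions that exceeds the turning point.
min_conjunctions = math.floor(tp) + 1
print(min_conjunctions)  # 2
```

With at least two conjunctions after conversion, that is, for any subscription that is not purely conjunctive, the Boolean approach needs less memory in this scenario.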

From the viewpoint of the general Boolean approach, the lower a curve is situated in Figure 5.1, the more advantageously this algorithm performs in comparison to a conjunctive solution. The reason for this property is that even a slight increase in complexity after a canonical conversion leads to less memory use for the Boolean approach. Conversely, from the viewpoint of the conjunctive algorithms, the higher a curve is located in comparison to other conjunctive approaches, the more space-efficient the respective solution is.

For both conjunctive algorithms, an increase in the number of operators (increasing op_prop) while holding the other parameters fixed makes them more space-efficient compared to the Boolean approach. This is because the Boolean algorithm needs to encode and store these operators, whereas the conjunctive solutions do not. For the other parameter, the number of conjunctions per predicate after conversion s_prop, an increase leads to a more space-efficient Boolean approach. This is because the complexity of the converted conjunctive subscriptions increases if the other parameters stay fixed.

Comparing the counting and cluster approach, the counting algorithm is more space-efficient than the cluster algorithm (curves for the same parameter setting are located higher in Figure 5.1(a) than in Figure 5.1(b)). In particular, for small predicate numbers (left on the abscissa), the counting approach outperforms the cluster algorithm. The reason for this behavior is the requirement to store the subscription cluster table and the subscription identifiers in clusters regardless of the number of predicates, which leads to a larger proportional memory use for an overall small number of predicates |p|. For higher predicate numbers, both conjunctive algorithms lead to comparable turning points. The counting algorithm, though, always stays slightly more space-efficient than the cluster algorithm.
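These qualitative observations follow directly from Inequalities 5.7 and 5.8 and can be reproduced with a short script. The sketch below uses a hypothetical helper `turning_point`; the sampled values of |p| are chosen for illustration, while op_prop = 0.8 and s_prop = 0.5 correspond to one of the curves in Figure 5.1.

```python
def turning_point(p, op_prop, s_prop, algorithm="counting"):
    """Right-hand side of Inequality 5.7 (counting) or 5.8 (cluster)."""
    constant = {"counting": 2, "cluster": 8}[algorithm]
    return (p * (2 * op_prop + 9) + 6) / (constant + 8 * s_prop * p)


# Counting versus cluster turning points for a growing number of predicates.
rows = []
for p in (2, 5, 10, 20, 50):
    counting = turning_point(p, 0.8, 0.5, "counting")
    cluster = turning_point(p, 0.8, 0.5, "cluster")
    rows.append((p, counting, cluster))
    print(f"|p|={p:2d}  counting={counting:.2f}  cluster={cluster:.2f}")

# More operators raise the turning point (favoring the conjunctive algorithm).
more_operators_raise = turning_point(10, 0.9, 0.5) > turning_point(10, 0.7, 0.5)

# More conjunctions per predicate lower it (favoring the Boolean approach).
more_conjunctions_lower = turning_point(10, 0.8, 0.7) < turning_point(10, 0.8, 0.3)

print(more_operators_raise, more_conjunctions_lower)
```

In every row the counting value lies above the cluster value, and the gap between them shrinks as |p| grows, matching the description of the two panels.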

5.5.3 Properties of Example Subscription Classes

In this dissertation, we focus on general-purpose filtering algorithms for pub-sub systems. After our conceptual analysis of the turning point, we now compare the memory requirements of our example subscription classes using the counting and our general Boolean approach.

To determine the preferable general-purpose filtering algorithm for subscriptions of these classes, we use the findings from Section 5.1.4, describing these classes with the help of our subscription characterization framework. Having determined the turning point with the help of Equation 5.7, we can compare this point to the actual number of conjunctions |s_s| that is created by the canonical conversion of these classes. If |s_s| is greater than or equal to the derived turning point, the general Boolean filtering solution requires no more memory than the counting approach and is thus favorable with respect to memory requirements; it requires strictly less memory if |s_s| exceeds the turning point. For Subscription Class 1, we derive the following turning point:

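\[
|s_s|\genfrac{(}{)}{0pt}{}{\text{counting}}{\text{Boolean}} > \frac{6 \times (2 \times 0.667 + 9) + 6}{2 + 8 \times 0.667 \times 6} = \frac{68}{34} = 2.
\]

The turning point for Subscription Class 2 is as follows:

\[
|s_s|\genfrac{(}{)}{0pt}{}{\text{counting}}{\text{Boolean}} > \frac{12 \times (2 \times 0.833 + 9) + 6}{2 + 8 \times 0.417 \times 12} = \frac{134}{42} \approx 3.19.
\]

Finally, for Subscription Class 3 the turning point is:

\[
|s_s|\genfrac{(}{)}{0pt}{}{\text{counting}}{\text{Boolean}} > \frac{7 \times (2 \times 0.857 + 9) + 6}{2 + 8 \times 0.429 \times 7} = \frac{81}{26} \approx 3.12.
\]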

The actual number of conjunctive subscriptions created for these classes is given in Table 5.2 (page 125): |s_s| = 2 for Class 1, |s_s| = 4 for Class 2, and |s_s| = 6 for Class 3. Hence, for all subscription classes the general Boolean algorithm requires less memory than (Classes 2 and 3) or the same memory as (Class 1) the conjunctive solution. Consequently, with respect to memory use, one should apply a Boolean filtering approach for our online auction scenario.
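The comparison for the three subscription classes can be replayed with a few lines of code. This is a sketch under stated assumptions: the helper `turning_point` and the dictionary layout are hypothetical, and the inputs are the three-decimal values printed in the turning point calculations for the three classes, so the computed turning points can differ from the printed ones in the last digit. For that reason, a small tolerance decides whether |s_s| counts as equal to the turning point.

```python
import math


def turning_point(p, op_prop, s_prop):
    """Right-hand side of Inequality 5.7 (counting versus Boolean)."""
    return (p * (2 * op_prop + 9) + 6) / (2 + 8 * s_prop * p)


# Parameters as printed for the three example subscription classes.
classes = {
    "Class 1": dict(p=6, op_prop=0.667, s_prop=0.667, s_s=2),
    "Class 2": dict(p=12, op_prop=0.833, s_prop=0.417, s_s=4),
    "Class 3": dict(p=7, op_prop=0.857, s_prop=0.429, s_s=6),
}

verdicts = {}
for name, c in classes.items():
    tp = turning_point(c["p"], c["op_prop"], c["s_prop"])
    if math.isclose(c["s_s"], tp, abs_tol=0.01):
        verdicts[name] = "equal memory"
    elif c["s_s"] > tp:
        verdicts[name] = "Boolean needs less"
    else:
        verdicts[name] = "counting needs less"
    print(f"{name}: turning point {tp:.2f}, |s_s| = {c['s_s']}, {verdicts[name]}")
```

The verdicts agree with the conclusion above: equal memory for Class 1 and less memory for the Boolean approach in Classes 2 and 3.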