
Having the theoretical means to determine whether a conjunctive or a Boolean filtering algorithm is preferable for a given setting, we now verify our findings by experiment. In this empirical evaluation, we compare the counting algorithm and our general Boolean approach, in accordance with the focus of this dissertation.

5.6.1 Experimental Setup

A practical implementation of filtering algorithms requires memory resources additional to those described by our theoretical framework. For example, one needs suitable data structures that efficiently support the required operations and these structures need extra memory for their effective management. Thus, in a practical implementation, one has to face a higher memory cost than described in the theoretical model.

Also, the data structures have to be implemented reasonably. For example, indexing structures (e.g., predicate-subscription association table) require a dynamic implementation to allow for both registrations and deregistrations. For sole filtering structures (e.g., fulfilled predicate vector), on the other hand, it is sufficient to provide static implementations.

Our implementation contains the subscription indexing parts of both the counting and the general Boolean algorithm (both candidate and final subscription matching for the Boolean approach). We exclude predicate indexes because both approaches can apply the same indexes for this purpose and thus require the same memory in practice.
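To make the structures named above concrete, the following is a minimal sketch of counting-based matching for conjunctive subscriptions. It is not the implementation evaluated here: the class name `CountingFilter`, its methods, and the use of Python dictionaries in place of the dynamic arrays are assumptions made for illustration only.

```python
from collections import defaultdict


class CountingFilter:
    """Illustrative counting-based matcher (hypothetical, not the thesis code)."""

    def __init__(self):
        # Predicate-subscription association table: predicate id -> subscription ids.
        self.pred_to_subs = defaultdict(list)
        # Number of distinct predicates each conjunctive subscription requires.
        self.required = {}

    def register(self, sub_id, predicate_ids):
        preds = set(predicate_ids)
        self.required[sub_id] = len(preds)
        for pid in preds:
            self.pred_to_subs[pid].append(sub_id)

    def deregister(self, sub_id):
        # Dynamic structure: entries can be removed again at any time.
        del self.required[sub_id]
        for pid in list(self.pred_to_subs):
            subs = self.pred_to_subs[pid]
            if sub_id in subs:
                subs.remove(sub_id)
            if not subs:
                del self.pred_to_subs[pid]

    def match(self, fulfilled_predicates):
        # Count, per subscription, how many of its predicates are fulfilled.
        hits = defaultdict(int)
        for pid in set(fulfilled_predicates):
            for sub_id in self.pred_to_subs.get(pid, ()):
                hits[sub_id] += 1
        # A conjunction matches when all of its predicates were hit.
        return sorted(s for s, n in hits.items() if n == self.required[s])


# (p1 AND p2) OR p3 has to be converted into two conjunctive subscriptions.
f = CountingFilter()
f.register("s1/0", [1, 2])
f.register("s1/1", [3])
f.register("s2", [2, 3])
print(f.match([2, 3]))  # ['s1/1', 's2']
```

The usage lines show why conversion matters for memory: one Boolean subscription becomes several entries in the tables of the counting approach.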

For the dynamic data structures of the algorithms, we used dynamic array implementations that consumed less memory than their STL [SL95] variants in empirical studies. These dynamic structures include the required tables (e.g., predicate-subscription association table), whose implementation is also based on our dynamic array.

Because the general Boolean approach extends the counting algorithm and because of our choice to use comparable implementations for both of these approaches, our experiments reveal whether the practical memory overhead is comparable for these two classes of filtering algorithms. That is, we can verify the findings of our theoretical framework with the provided practical implementation.

Our experiments required us to use an artificial test setup to derive data points for a wide range of parameter assignments. We analyzed predicate numbers in the interval from |p| = 5 to |p| = 50. The number of conjunctive subscriptions due to conversion was varied between |ss| = 1 and |ss| = 5. We also used different numbers of operators |op| and conjunctions per predicate |sp| in our experiments. Here we present the results for the setting |op| = 0.5, and the three assignments |sp| = 0.3, |sp| = 0.5, and |sp| = 0.7. The turning point is generally independent of the number of subscriptions; in our experiments we used 1,000,000 subscriptions (|s| = 1,000,000).
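The parameter ranges above can be written down as a small grid. This is only a sketch: the variable and function names are hypothetical, and the even spacing of |p| in steps of 5 is an assumption.

```python
from itertools import product

# Assumed sampling: |p| = 5, 10, ..., 50 (even steps of 5 are an assumption).
predicate_counts = list(range(5, 51, 5))
# |ss| = 1, ..., 5 conjunctive subscriptions created by conversion.
conjunction_counts = list(range(1, 6))
# The three reported assignments of |sp|.
sp_values = [0.3, 0.5, 0.7]


def parameter_grid():
    """All (|p|, |ss|) combinations measured for one assignment of |sp|."""
    return list(product(predicate_counts, conjunction_counts))


grid = parameter_grid()
print(len(grid))  # 50
```

Under this assumption, each of the three assignments of |sp| yields 50 measurement points for the counting algorithm.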

In the following, we report the total memory requirements of the filtering process using information provided by the process status application programming interface (PSAPI).

5.6.2 Illustrating the Memory Usage

Figure 5.2(a) illustrates the memory requirements (z-axis) for the counting algorithm (light surface) and the general Boolean algorithm (dark surface). The surface that represents the counting algorithm is derived (i.e., interpolated) from 50 points (10 values for |p| on the x-axis and five values for |ss| on the y-axis). The surface that illustrates the general Boolean approach is derived from the 10 different values for |p| (x-axis). The memory requirements of this algorithm are independent of conversion and thus have the same values for all assignments of |ss| (y-axis).

[Figure 5.2, panels (a) Perspective view and (b) Top view. Axes: predicates per subscription |p| (5 to 50), number of conjunctions |ss| (1 to 5), memory in MB (0 to 1,000).]

Figure 5.2: Memory requirements for counting algorithm (light surface) and Boolean algorithm (dark surface) for the setting |s| = 1,000,000, opprop = 0.5, and |sp| = 0.3. In the right figure, we show the same setting and a top view to the left figure. The light surface is illustrated transparently in the right figure and our theoretical result is indicated by the additional curve.

Footnote 3: STL denotes the Standard Template Library [SL95].

As illustrated in Figure 5.2(a), the specialized counting algorithm requires less memory than the Boolean approach for small values of |ss|. However, the more conjunctions are created due to conversion (higher values on the y-axis), the higher the memory usage of the counting approach (z-axis). One can directly observe that both surfaces intersect at some point: the turning point, theoretically described in Equation 5.7.
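The intersection of the two surfaces can be located from measured values for a fixed |p|. The sketch below is one possible way to do this and is not the procedure used for the figures: the function name is hypothetical, linear interpolation between grid points is an assumption, and the numbers in the usage lines are made up purely for illustration.

```python
def empirical_turning_point(ss_values, counting_mem, boolean_mem):
    """Return the interpolated |ss| at which the counting algorithm's memory
    reaches the Boolean algorithm's memory, or None if it never does.

    ss_values    -- increasing values of |ss|
    counting_mem -- counting-algorithm memory per |ss| value (same length)
    boolean_mem  -- Boolean-algorithm memory (independent of |ss|)
    """
    points = list(zip(ss_values, counting_mem))
    # Counting already needs at least as much memory at the first grid point.
    if points[0][1] >= boolean_mem:
        return float(points[0][0])
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 < boolean_mem <= y1:
            # Linear interpolation between two neighbouring measurements.
            return x0 + (boolean_mem - y0) * (x1 - x0) / (y1 - y0)
    return None


# Hypothetical values in MB, not measurements from the experiments.
print(empirical_turning_point([1, 2, 3, 4, 5], [100, 200, 300, 400, 500], 250))  # 2.5
```

Because the Boolean algorithm's memory does not depend on |ss|, a single value per |p| suffices for the comparison.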

To get a better overview of the turning point, Figure 5.2(b) shows a top view of the behavior of the algorithms in the described setting. In this figure, we remove the surface that represents the counting algorithm and show only the surface for the Boolean approach. This surface is shown up to the point where it is cut, and thus covered in the top view, by the surface of the counting algorithm. Figure 5.2(b) additionally contains a curve that represents the theoretically derived turning point in the same setting. For our two other settings, we show a similar top view of the empirical results in Figure 5.3(a) (|sp| = 0.5) and Figure 5.3(b) (|sp| = 0.7). There we also indicate the theoretical result by an additional curve.

[Figure 5.3, panels (a) Top view, |sp| = 0.5 and (b) Top view, |sp| = 0.7. Axes: predicates per subscription |p| (5 to 50), number of conjunctions |ss| (1 to 5).]

Figure 5.3: Turning point for the setting |s| = 1,000,000, opprop = 0.5, and varying values of |sp|. The theoretical result is indicated by the additional curve.

The theoretically predicted turning point broadly aligns with the behavior in practice in all three of these figures. However, for small predicate numbers (left on the abscissae) the turning point in practice lies below the theoretically determined one. This is particularly the case for small values of |sp| (cf. Figure 5.2(b)). This behavior means that, for small predicate numbers, the general Boolean approach leads to even better results in practice than in theory: even disjunctive normal forms less complex than predicted by the theoretical characterization framework already favor a general Boolean filtering solution.

The reason for this behavior is found in the data structures for these algorithms: the subscription-predicate association table that is required in the counting algorithm has a relatively high management overhead for small predicate numbers and small values of |sp| because the created conjunctive subscriptions involve an even smaller number of predicates in this case. Hence the memory used for management purposes is relatively high in proportion to the data stored in this table. However, this is not the case for the subscription trees in the Boolean approach. For larger predicate numbers in the created conjunctions, the ratio of management memory to stored data gets smaller, and becomes comparable in both approaches.
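The effect of a fixed management cost per table row can be illustrated with a small calculation. This is a sketch under stated assumptions, not a model of the actual implementation: the function name, the 4-byte entry size, and the 16-byte per-row header are all hypothetical values chosen for illustration.

```python
def management_share(entries, entry_bytes=4, header_bytes=16):
    """Fraction of a table row's memory spent on management data.

    entries      -- number of predicates stored in the row
    entry_bytes  -- assumed size of one stored entry (hypothetical)
    header_bytes -- assumed fixed bookkeeping cost per row (hypothetical)
    """
    payload = entries * entry_bytes
    return header_bytes / (header_bytes + payload)


# The share shrinks as a row stores more predicates.
for n in (1, 4, 12):
    print(n, management_share(n))
```

With these assumed sizes, a row holding a single predicate spends 80% of its memory on management, whereas a row holding twelve predicates spends 25%.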

[Figure 5.4, panels (a) General Boolean algorithm and (b) Counting algorithm. Axes: predicates per subscription |p| (5 to 50), memory in MB (0 to 1,000). Curves in (a): |ss| varies, with pc = 0, pc = 0.25, and pc = 0.5. Curves in (b): |ss| = 2 and |ss| = 5, each with pc = 0, pc = 0.25, and pc = 0.5.]

Figure 5.4: Influence of predicate commonality on the general Boolean algorithm and the counting algorithm, using the setting |s| = 1,000,000, opprop = 0.5, and sprop = 0.3.

5.6.3 Predicate Commonality

Although our theoretical analysis shows that the memory requirements depend only marginally on predicate commonality pc (only the fulfilled predicate vector is influenced by pc), the behavior in practice is different. The reason for this difference is again found in the varying overhead for the management of data structures in an implementation, in this case for the predicate-subscription association table that is required in both algorithms. The result is a decreasing memory usage for increasing predicate commonality pc.

Figure 5.4(a) shows this behavior for the Boolean algorithm; the counting approach is illustrated in Figure 5.4(b). The abscissae of these figures show the number of predicates |p|; the ordinates show the memory usage. In this set of experiments, we use the following parameters: |s| = 1,000,000, opprop = 0.5, and sprop = 0.3. The curves in the figures represent different predicate commonalities: pc = 0, pc = 0.25, and pc = 0.5. For the counting algorithm (Figure 5.4(b)), we illustrate two settings, |ss| = 2 and |ss| = 5.

Although predicate commonality pc changes the memory use for both algorithms in practice, this effect does not influence the turning point of the memory requirements (see Section 5.5.2): predicate commonality has the same effect on the Boolean and the conjunctive algorithm, as shown in Figure 5.4.