
This sub-section presents the evaluation results for the binary classification task. The strategies in the proposed IRL mechanism are denoted RL, with the identifier of the rule refinement strategy appended (for example, RL + UP). Table 6.2 shows the micro-averaged F1-measure and the average number of rules generated by the eight strategies in the proposed IRL mechanism using χ2 and IG for the 20NG-A dataset. Table 6.4 shows the same results for the 20NG-B dataset. The micro-averaged F1-measures of the other machine learning techniques, in comparison with the best RL strategy, for the 20NG-A and 20NG-B datasets are shown in Tables 6.3 and 6.5 respectively.
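As a reminder of how the evaluation measure is formed, the following is a minimal sketch of micro-averaged F1: the true positive, false positive and false negative counts are pooled over all binary classifiers before precision and recall are computed. The function name `micro_f1` and the example counts are hypothetical and are not taken from the experiments reported here.

```python
from typing import Iterable, Tuple


def micro_f1(counts: Iterable[Tuple[int, int, int]]) -> float:
    """Micro-averaged F1 from per-class (tp, fp, fn) counts."""
    tp = fp = fn = 0
    # Pool the counts over all classes first (this is what "micro" means).
    for c_tp, c_fp, c_fn in counts:
        tp += c_tp
        fp += c_fp
        fn += c_fn
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for two binary classifiers (illustration only).
example_counts = [(80, 20, 10), (40, 10, 30)]
example_score = micro_f1(example_counts)
```

Because the counts are pooled, classes with more documents contribute more to the score than they would under macro-averaging.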

| Feature selection | Strategy | F1 | Average # of rules | Average # of rules with negation | % of rules with negation |
|---|---|---|---|---|---|
| χ2 | RL + UP | 0.800 | 1783.8 | 0.0 | 0.0 |
| χ2 | RL + UN | 0.810 | 997.6 | 366.1 | 36.7 |
| χ2 | RL + Ov | 0.803 | 1601.7 | 0.0 | 0.0 |
| χ2 | RL + UP-UN-Ov | 0.800 | 1786.4 | 5.7 | 0.3 |
| χ2 | RL + UN-UP-Ov | 0.810 | 1011.8 | 372.7 | 36.8 |
| χ2 | RL + BestStrategy | 0.830 | 1065.0 | 224.5 | 21.1 |
| χ2 | RL + BestPosRule | 0.824 | 1226.0 | 0.0 | 0.0 |
| χ2 | RL + BestRule | 0.821 | 1112.1 | 402.3 | 36.2 |
| IG | RL + UP | 0.759 | 1122.6 | 0.0 | 0.0 |
| IG | RL + UN | 0.794 | 652.5 | 321.0 | 49.2 |
| IG | RL + Ov | 0.785 | 998.0 | 0.0 | 0.0 |
| IG | RL + UP-UN-Ov | 0.759 | 1124.3 | 0.6 | 0.0 |
| IG | RL + UN-UP-Ov | 0.794 | 657.9 | 322.5 | 49.0 |
| IG | RL + BestStrategy | 0.815 | 649.8 | 205.6 | 31.6 |
| IG | RL + BestPosRule | 0.802 | 798.9 | 0.0 | 0.0 |
| IG | RL + BestRule | 0.803 | 666.5 | 329.7 | 49.5 |

Table 6.2: Micro-averaged F1-measure and the average number of rules for the classification of the 20NG-A dataset using χ2 and IG for feature selection and the keyword representation

Inspection of Table 6.2 indicates that RL + BestStrategy was the best overall strategy, with an F1-measure (0.830 with χ2 and 0.815 with IG) slightly higher than that of the other strategies.

This strategy is one of those that can generate rules with negation. Of the rules generated using RL + BestStrategy, 21.1% (when χ2 was used) and 31.6% (when IG was used) incorporated negation. Of the first three strategies, each of which utilized only one sub-space, the best was RL + UN, which also generated rules with negation. In addition, when χ2 was used, RL + UN had the smallest ruleset of the three, an average of 997.6 rules, of which 36.7% were rules with negation. Similarly, when IG was used, RL + UN had the highest F1-measure of the first three strategies and again the smallest ruleset of the three, an average of 652.5 rules, of which 49.2% were rules with negation. Regardless of the feature selection technique used, the worst strategies were RL + UP and RL + UP-UN-Ov, which also featured the largest rulesets.
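The percentages quoted above can be related to the two rule-count columns of Table 6.2. The sketch below assumes that the percentage of rules with negation is the ratio of the average number of rules with negation to the average number of rules; this is an assumption, since the percentages may instead have been averaged per run, in which case small differences in the last digit would be expected. The function name `negation_share` is hypothetical.

```python
def negation_share(avg_rules: float, avg_negated: float) -> float:
    """Percentage of rules with negation, rounded to one decimal place.

    Assumes the percentage is the ratio of the two reported averages.
    """
    if avg_rules <= 0:
        return 0.0
    return round(100.0 * avg_negated / avg_rules, 1)


# Values taken from Table 6.2 (20NG-A).
un_chi2 = negation_share(997.6, 366.1)     # RL + UN, chi-squared
best_chi2 = negation_share(1065.0, 224.5)  # RL + BestStrategy, chi-squared
un_ig = negation_share(652.5, 321.0)       # RL + UN, IG
best_ig = negation_share(649.8, 205.6)     # RL + BestStrategy, IG
```

Under this assumption the four rows used here reproduce the reported 36.7%, 21.1%, 49.2% and 31.6%.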

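| Technique | F1 (χ2) | Rank (χ2) | F1 (IG) | Rank (IG) |
|---|---|---|---|---|
| RL | 0.830 | 2 | 0.815 | 3 |
| SMO | 0.849 | 1 | 0.842 | 1 |
| NB | 0.636 | 6 | 0.661 | 6 |
| JRip | 0.760 | 5 | 0.756 | 5 |
| OlexGreedy | 0.824 | 3 | 0.821 | 2 |
| OlexGA | 0.817 | 4 | 0.813 | 4 |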

Table 6.3: Micro-averaged F1-measure for the classification of the 20NG-A dataset using χ2 and IG for feature selection and the keyword representation for the best RL strategy in comparison with the other machine learning techniques

When the best RL strategy was compared with the other machine learning techniques, as shown in Table 6.3, it ranked second behind SMO when χ2 was used and third behind SMO and OlexGreedy when IG was used. While the top four techniques each had an F1-measure above 0.800, JRip and NB did not perform well, with NB performing the worst of all the techniques. When only the rule-based techniques were compared (excluding SMO and NB), the RL strategy came first when χ2 was used and second when IG was used. Recall that rule-based techniques offer the advantage of being more readily understandable by the end user.
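The ranks in Table 6.3 follow directly from sorting the techniques by F1-measure in descending order. A minimal sketch is given below; the function name `rank_by_f1` is hypothetical, and ties are not treated specially because none occur in the table.

```python
from typing import Dict


def rank_by_f1(scores: Dict[str, float]) -> Dict[str, int]:
    """Assign rank 1 to the highest F1-measure, rank 2 to the next, and so on."""
    ordered = sorted(scores, key=lambda name: scores[name], reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


# F1-measures from Table 6.3 (20NG-A).
chi2_scores = {"RL": 0.830, "SMO": 0.849, "NB": 0.636,
               "JRip": 0.760, "OlexGreedy": 0.824, "OlexGA": 0.817}
ig_scores = {"RL": 0.815, "SMO": 0.842, "NB": 0.661,
             "JRip": 0.756, "OlexGreedy": 0.821, "OlexGA": 0.813}

chi2_ranks = rank_by_f1(chi2_scores)
ig_ranks = rank_by_f1(ig_scores)

# Restricting the comparison to the rule-based techniques (no SMO, no NB).
rule_based = ("RL", "JRip", "OlexGreedy", "OlexGA")
rule_based_ig_ranks = rank_by_f1({name: ig_scores[name] for name in rule_based})
```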

In the classification of the 20NG-B dataset, when χ2 was used, the best strategy was RL + BestRule, followed closely by RL + BestStrategy, as shown in Table 6.4. Both of these strategies generated rules with negation: 37.7% of the rules generated by RL + BestRule incorporated negation, compared with 19.3% for RL + BestStrategy. The worst strategy when χ2 was used was RL + UN-UP-Ov. Of the first three strategies, which utilized only one sub-space each, the best was RL + UP, which did not generate any rules with negation. RL + UP had a slightly higher F1-measure than RL + UN, but this was achieved with a much larger ruleset: an average of 1707.0 rules, compared with an average of 837.7 rules for RL + UN (less than half as many). When IG was used, the best strategy was RL + BestStrategy, for which 25.7% of the generated rules incorporated negation. This was followed closely by RL + BestRule, which generated a ruleset in which 50.6% of the rules incorporated negation. The worst strategy was RL + Ov. Of the first three strategies, which used only one sub-space each, RL + UN was again slightly better, with an average of only 625.2 rules, 47.3% of which incorporated negation.

| Feature selection | Strategy | F1 | Average # of rules | Average # of rules with negation | % of rules with negation |
|---|---|---|---|---|---|
| χ2 | RL + UP | 0.844 | 1707.0 | 0.0 | 0.0 |
| χ2 | RL + UN | 0.825 | 837.7 | 305.4 | 36.5 |
| χ2 | RL + Ov | 0.824 | 1481.1 | 0.0 | 0.0 |
| χ2 | RL + UP-UN-Ov | 0.844 | 1706.3 | 1.9 | 0.1 |
| χ2 | RL + UN-UP-Ov | 0.823 | 848.6 | 310.9 | 36.6 |
| χ2 | RL + BestStrategy | 0.861 | 1062.7 | 205.2 | 19.3 |
| χ2 | RL + BestPosRule | 0.858 | 1197.9 | 0.0 | 0.0 |
| χ2 | RL + BestRule | 0.862 | 1058.7 | 399.1 | 37.7 |
| IG | RL + UP | 0.819 | 1238.7 | 0.0 | 0.0 |
| IG | RL + UN | 0.821 | 625.2 | 295.6 | 47.3 |
| IG | RL + Ov | 0.809 | 1089.6 | 0.0 | 0.0 |
| IG | RL + UP-UN-Ov | 0.819 | 1238.5 | 1.6 | 0.1 |
| IG | RL + UN-UP-Ov | 0.821 | 625.2 | 295.6 | 47.3 |
| IG | RL + BestStrategy | 0.852 | 698.6 | 179.8 | 25.7 |
| IG | RL + BestPosRule | 0.842 | 911.6 | 0.0 | 0.0 |
| IG | RL + BestRule | 0.850 | 713.7 | 360.8 | 50.6 |

Table 6.4: Micro-averaged F1-measure and the average number of rules for the classification of the 20NG-B dataset using χ2 and IG for feature selection and the keyword representation

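| Technique | F1 (χ2) | Rank (χ2) | F1 (IG) | Rank (IG) |
|---|---|---|---|---|
| RL | 0.862 | 2 | 0.852 | 2 |
| SMO | 0.892 | 1 | 0.891 | 1 |
| NB | 0.656 | 6 | 0.672 | 6 |
| JRip | 0.808 | 5 | 0.812 | 5 |
| OlexGreedy | 0.845 | 3 | 0.844 | 3 |
| OlexGA | 0.844 | 4 | 0.837 | 4 |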

Table 6.5: Micro-averaged F1-measure for the classification of the 20NG-B dataset using χ2 and IG for feature selection and the keyword representation for the best RL strategy in comparison with the other machine learning techniques

In comparison with the other machine learning techniques, as shown in Table 6.5, the best RL strategy came second behind SMO with both χ2 and IG. Its performance was close to, but slightly better than, that of the Olex systems. NB was again the worst-performing technique. Comparison of only the rule-based techniques (excluding SMO and NB) showed that the RL strategy was the best.

Overall, with regard to the binary classification task for the 20 Newsgroups datasets, the better RL strategies were those that generated rules with negation. With respect to the feature selection techniques, the percentage of rules with negation was generally higher when IG was adopted; the exception was RL + UP-UN-Ov on the 20NG-A dataset. Moreover, the average number of rules generated by the RL strategies was much smaller when IG was used than when χ2 was used. χ2 and IG use different computations to determine the significance of a feature as a keyword, so the features ranked in the top 10% differed. Generally, when a keyword that occurs in more documents is used as a feature for rule learning, the rule learnt has wider coverage (it covers more documents), so fewer rules are needed to cover all the documents in a dataset. This suggests that when IG was used, more of the rules learnt covered two or more documents, whereas the features deemed more significant by χ2 led to more rules that covered only one document, and thus to a bigger ruleset. However, the use of χ2 led to better classification results in terms of F1-measure. Comparison with the other machine learning techniques showed that the best RL strategy outperformed most of them and was competitive with SMO: it ranked second to SMO in three of the four settings and third, behind SMO and OlexGreedy, for 20NG-A with IG.
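The coverage argument above can be illustrated with a toy sketch. It is not the proposed IRL mechanism: it assumes single-keyword rules and a simple greedy covering loop, and the function name `greedy_rule_count` and both feature sets are hypothetical. It only shows that features occurring in more documents need fewer rules to cover the same document set.

```python
from typing import Dict, Set


def greedy_rule_count(feature_docs: Dict[str, Set[int]], all_docs: Set[int]) -> int:
    """Count single-keyword rules needed to cover all_docs, chosen greedily."""
    uncovered = set(all_docs)
    rules = 0
    while uncovered:
        # Pick the keyword that covers the most still-uncovered documents.
        best = max(feature_docs, key=lambda f: len(feature_docs[f] & uncovered))
        gained = feature_docs[best] & uncovered
        if not gained:
            break  # The remaining documents contain none of the keywords.
        uncovered -= gained
        rules += 1
    return rules


docs = {1, 2, 3, 4, 5, 6}

# Hypothetical keywords that each occur in several documents (wide coverage).
wide_features = {"a": {1, 2, 3}, "b": {4, 5, 6}}
# Hypothetical keywords that each occur in a single document (narrow coverage).
narrow_features = {"p": {1}, "q": {2}, "r": {3}, "s": {4}, "t": {5}, "u": {6}}

wide_rules = greedy_rule_count(wide_features, docs)
narrow_rules = greedy_rule_count(narrow_features, docs)
```

In this toy setting the wide-coverage keywords need two rules and the narrow-coverage keywords need six, mirroring the smaller rulesets observed with IG.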
