
Our bag-based fuzzy rough classifiers rely on the definition of three parameters: (i) the bag similarity relation R(X, B), (ii) the bag-to-class membership degree C(B) and (iii) the way to compute C(X) in (6.7). We evaluate our classifiers by means of their majority class accuracy, minority class accuracy, AUC and balanced accuracy values on the datasets in Table 6.13.
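As a minimal sketch of the class-wise evaluation measures, assuming a two-class problem and the usual definition of balanced accuracy as the mean of the two class-wise accuracies, the following Python function computes the majority class accuracy, minority class accuracy and balanced accuracy from true and predicted labels. The function name and the toy labels are hypothetical and only serve as illustration; the AUC is not covered.

```python
def class_accuracies(y_true, y_pred, minority_label):
    """Return (majority accuracy, minority accuracy, balanced accuracy)."""
    maj_total = maj_hit = min_total = min_hit = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == minority_label:
            min_total += 1
            min_hit += truth == pred
        else:
            maj_total += 1
            maj_hit += truth == pred
    acc_maj = maj_hit / maj_total
    acc_min = min_hit / min_total
    # Balanced accuracy: mean of the two class-wise accuracies.
    return acc_maj, acc_min, (acc_maj + acc_min) / 2


# Toy example: four majority bags (label 0) and two minority bags (label 1).
acc_maj, acc_min, balacc = class_accuracies(
    [0, 0, 0, 0, 1, 1], [0, 0, 0, 1, 1, 0], minority_label=1
)
print(acc_maj, acc_min, balacc)
```

Because the balanced accuracy weighs both classes equally, a method that labels nearly every bag as majority can score close to 1 on the majority class and still obtain a balanced accuracy near 0.5.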

Table 6.17: Five best performing BFRMIC methods for each evaluation measure. The results are taken as averages over the datasets in Table 6.13. Weight combination W4 is used. For each evaluation measure, the bottom line lists the method with the lowest mean result.

Classifier              Acc maj    Classifier           Acc min
HInvadd-MaxExp          0.9762     AvgH-MaxAdd          0.7613
AvgH-MaxExp             0.9757     AvgH-Avg             0.7557
HAdd-MaxExp             0.9750     AvgHExp-Avg          0.7434
H-MaxExp                0.9742     HAdd-Avg             0.7409
HExp-MaxExp             0.9739     HExp-Avg             0.7336
------------------------------------------------------------------
HAdd-Avg                0.6516     HAdd-MaxExp          0.2776

Classifier              AUC        Classifier           Balacc
AvgH-Max                0.8807     AvgH-MaxAdd          0.7608
AvgH-MaxExp             0.8784     AvgHExp-MaxAdd       0.7451
AvgH-MaxInvadd          0.8776     AvgHInvadd-MaxAdd    0.7410
AvgHExp-MaxInvadd       0.8576     AvgH-MaxInvadd       0.7402
AvgHInvadd-MaxInvadd    0.8556     AvgHAdd-MaxAdd       0.7397
------------------------------------------------------------------
H-Avg                   0.7494     AvgHAdd-Max          0.6255

As we did in Section 6.6.2, we first fix the weight combination to W4 and compare the eight alternatives for R(X, B) and five alternatives for C(B) listed in Table 6.3. Table 6.17 provides the five BFRMIC methods with the best results for each evaluation measure. For the AUC and balanced accuracy measures, we encounter methods with similar settings as the best performing ones in Section 6.4.3 on top. It is striking that four out of five methods in the top five for the balanced accuracy use MaxAdd to compute the C(B) values. We compare the different parameter settings in Section 6.6.3.1 and study the effect of the IFROWANN weight combination in Section 6.6.3.2. The version of BFRMIC with crisp class membership degrees for the training bags is evaluated separately in Section 6.6.3.3.

6.6.3.1 Setting rankings

Fixing the weight combination in (6.7) to W4, we evaluate the performance of the eight settings for R(X, B) and five settings for C(B). Table 6.18 lists the average results and standard deviations of all alternatives for these two parameters. We observe that the rankings of the R(X, B) settings are similar to the ones listed for the BFMIC methods in Table 6.11 and that AvgH comes out on top for all included measures. However, notable differences are observed for the rankings of the C(B) versions.

With respect to the C(B) calculations, we observe that Max, MaxExp and MaxInvadd perform very well on the majority class at the cost of many classification errors on the minority class. The behaviour of MaxAdd and Avg is distinctly different. These settings obtain a better balance between the minority and majority class accuracies. This is particularly true for MaxAdd, which is reflected in its superior balanced accuracy value. We explain the poor performance of Max (and MaxExp, MaxInvadd) in the following paragraphs.

As described in Section 6.5.1.3, the BFRMIC methods precompute their C(B) values for the training bags in a learning phase. They do not use a leave-one-out procedure, that is, the similarity of a bag with itself is included in the aggregation to C_B(B), where C_B is the class label of bag B. As a result, a training bag should be pulled more to its own class. In particular, Max sets C_B(B) = 1. When using an average-related procedure, this effect diminishes as the class size increases. Based on this observation, we can derive the following. To classify a bag X to the majority class Cmaj or minority class Cmin, the aggregation lengths in Cmaj(X) and Cmin(X) are the same. In particular,

\[
C_{maj}(X) = \mathrm{OWA}_{W^{exp}_{Maj}}\bigl(\{\min(1 - R(X, B) + C_{maj}(B),\, 1) \mid B \in T\}\bigr),
\]
\[
C_{min}(X) = \mathrm{OWA}_{W^{exp}_{Min}}\bigl(\{\min(1 - R(X, B) + C_{min}(B),\, 1) \mid B \in T\}\bigr).
\]
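The self-similarity effect noted above can be illustrated with a small sketch. It assumes that Max and Avg aggregate the similarities of a training bag with all training bags of a class, including the bag itself; the actual definitions are those of Table 6.3, which are not reproduced here, and the function names and the similarity value 0.6 are hypothetical.

```python
def membership_max(similarities):
    """Assumed Max setting: largest similarity with a bag of the class."""
    return max(similarities)


def membership_avg(similarities):
    """Assumed Avg setting: mean similarity with the bags of the class."""
    return sum(similarities) / len(similarities)


def own_class_memberships(other_similarity, class_size):
    """Membership of a training bag to its own class, no leave-one-out.

    The bag has similarity 1 with itself and, for simplicity, the same
    similarity `other_similarity` with every other bag of its class.
    """
    sims = [1.0] + [other_similarity] * (class_size - 1)
    return membership_max(sims), membership_avg(sims)


for n in (2, 10, 100):
    print(n, own_class_memberships(0.6, n))
```

Under these assumptions, the Max value is 1 for every class size, while the Avg value moves from 0.8 for a class of two bags towards 0.6 as the class grows, so the contribution of the self-similarity fades.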

Due to the large impact of the similarity of a training bag with itself in the calculation of C(B) by Max, the training bags taking part in the aggregation of Cmaj(X) will mostly belong to the minority class, while those in the aggregation of Cmin(X) will mostly belong to the majority class. Concretely, we will find

\[
C_{maj}(X) = \mathrm{OWA}_{W^{exp}_{Maj}}\Bigl(\underbrace{\{1, 1, \ldots, 1\}}_{B \in C_{maj}} \cup \{\min(1 - R(X, B) + C_{maj}(B),\, 1) \mid B \in C_{min}\}\Bigr), \tag{6.11}
\]
\[
C_{min}(X) = \mathrm{OWA}_{W^{exp}_{Min}}\Bigl(\underbrace{\{1, 1, \ldots, 1\}}_{B \in C_{min}} \cup \{\min(1 - R(X, B) + C_{min}(B),\, 1) \mid B \in C_{maj}\}\Bigr). \tag{6.12}
\]

As a result of the sorting step in the OWA procedure, the values stemming from bags B ∈ Cmaj in (6.11) and from bags B ∈ Cmin in (6.12) are placed at the beginning of the ordered sequence (Definition 3.1.1) and are likely assigned a zero weight by the exponential weights used in W4. This supports our above claim that Cmaj(X) and Cmin(X) depend primarily on contributions of minority class and majority class bags respectively.
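The effect of the sorting step can be sketched in a few lines of Python. The sketch makes two assumptions that are not taken from this section: the exponential weights for a lower approximation are taken to be w_i = 2^(i-1) / (2^p - 1), and the class-specific weight vectors of (6.7) are replaced by a plain exponential vector with one position per training bag. All names and numbers are hypothetical.

```python
def exp_lower_weights(p):
    """Assumed exponential weights: w_i = 2^(i-1) / (2^p - 1), i = 1..p."""
    denom = 2 ** p - 1
    return [2 ** i / denom for i in range(p)]


def owa(values, weights):
    """OWA aggregation: sort in descending order, then take the weighted sum."""
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def class_membership(similarities, memberships):
    """OWA of the values min(1 - R(X, B) + C(B), 1) over the training bags."""
    implications = [min(1.0 - r + c, 1.0) for r, c in zip(similarities, memberships)]
    return owa(implications, exp_lower_weights(len(implications)))


# 30 bags of the class itself (C(B) = 1 under Max, so the value is 1)
# and 10 bags of the other class with lower membership degrees.
own, other = 30, 10
sims = [0.9] * (own + other)
membs = [1.0] * own + [0.5 + 0.04 * k for k in range(other)]

weights = exp_lower_weights(own + other)
print(sum(weights[:own]))                      # total weight on the leading ones
full = class_membership(sims, membs)
other_only = class_membership(sims[own:], membs[own:])
print(full, other_only)
```

In this toy setting the 30 leading positions together receive a weight below 0.001, so the aggregated value barely changes when the bags of the class itself are left out entirely.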

On average across the minority class 1 datasets, the maximum similarity values of minority class bags with majority class bags have a minimum value of 0.9160, a mean value of 0.9449, a median value of 0.9456 and a maximum value of 0.9697. For the minority class 0 datasets, these average values are 0.7556 (minimum), 0.8363 (mean), 0.8295 (median) and 0.9609 (maximum). In the same way, we can determine that the maximum similarity values of majority class bags with minority class bags have an average minimum value of 0.8545, average mean value of 0.9164, average median value of 0.9143 and average maximum value of 0.9697 for minority class 1 datasets. On the minority class 0 datasets, these values are 0.7234, 0.8144, 0.8121 and 0.9609 respectively. It is evident that the C(B) values computed with Max can be expected to be noticeably higher for minority bags and C = Cmaj than for majority bags and C = Cmin. As a result, we can expect Cmaj(X) computed with (6.11) to be usually higher than Cmin(X) computed with (6.12), which results in an easy assignment of the majority class label and a high accuracy on this class. This is particularly true for Max, on which the above reported values are based, but similar conclusions can be drawn for the strongly related MaxExp and MaxInvadd aggregations.

For completeness (and to show why Avg and MaxAdd are not expected to have this problem), the distributions of the average similarity values of minority class bags with majority class bags are 0.8482 (min), 0.8800 (mean), 0.8815 (median) and 0.8946 (max) for the minority class 1 datasets and 0.6739 (min), 0.7130 (mean), 0.7126 (median) and 0.7501 (max) for the minority class 0 datasets. The distributions of the average similarity values of majority class bags with minority class bags are 0.8135 (min), 0.8800 (mean), 0.8799 (median) and 0.9169 (max) for the minority class 1 datasets and 0.6715 (min), 0.7130 (mean), 0.7142 (median) and 0.7434 (max) for the minority class 0 datasets. Clearly, these values are far closer together, which can lead to more confusion between classes, but a better recognition of the minority class.

Table 6.18: Setting rankings of the R(X, B) and C(B) alternatives. Weight combination W4 is used.

R(X, B)       Acc maj          C(B)         Acc maj
AvgH          0.8713±0.1222    MaxExp       0.9706±0.0060
AvgHExp       0.8637±0.1198    Max          0.9579±0.0081
H             0.8609±0.1294    MaxInvadd    0.9540±0.0059
AvgHInvadd    0.8601±0.1201    MaxAdd       0.7432±0.0124
HInvadd       0.8545±0.1190    Avg          0.6691±0.0139
AvgHAdd       0.8545±0.1357
HExp          0.8543±0.1359
HAdd          0.8524±0.1349

R(X, B)       Acc min          C(B)         Acc min
AvgH          0.5646±0.1643    Avg          0.7310±0.0179
AvgHExp       0.5399±0.1684    MaxAdd       0.7204±0.0231
AvgHInvadd    0.5181±0.1803    MaxInvadd    0.4588±0.0367
AvgHAdd       0.5062±0.1923    Max          0.3397±0.0332
HInvadd       0.5015±0.1827    MaxExp       0.3120±0.0355
HAdd          0.5000±0.1992
HExp          0.4971±0.1841
H             0.4717±0.1790

R(X, B)       AUC              C(B)         AUC
AvgH          0.8605±0.0230    MaxInvadd    0.8277±0.0380
AvgHExp       0.8397±0.0155    MaxAdd       0.8076±0.0249
AvgHInvadd    0.8255±0.0162    Max          0.8034±0.0404
AvgHAdd       0.8171±0.0172    MaxExp       0.8027±0.0408
HAdd          0.7974±0.0199    Avg          0.7929±0.0240
HInvadd       0.7825±0.0158
HExp          0.7713±0.0083
H             0.7608±0.0059

R(X, B)       Balacc           C(B)         Balacc
AvgH          0.7180±0.0309    MaxAdd       0.7318±0.0166
AvgHExp       0.7018±0.0339    MaxInvadd    0.7064±0.0199
AvgHInvadd    0.6891±0.0397    Avg          0.7000±0.0121
AvgHAdd       0.6803±0.0465    Max          0.6488±0.0181
HInvadd       0.6780±0.0306    MaxExp       0.6413±0.0179
HAdd          0.6762±0.0399
HExp          0.6757±0.0294
H             0.6663±0.0300

6.6.3.2 Weight combinations

We fix R(X, B) to AvgH and C(B) to MaxAdd and assess the effect of the IFROWANN weight combinations in the C(X) lower approximation calculations in (6.7). Table 6.19 lists the results of this evaluation as average values across the datasets in Table 6.13. As for our IFRMIC methods, weighting schemes W4 and W7 yield the best balance between the majority and minority class accuracies, which is reflected in their high balanced accuracy results. The other schemes favour one of the two classes. Combinations W4 and W7 have the highest AUC values as well, but the results of the other alternatives are not much lower for this measure. Only W1 and W2 have a poor AUC value, which indicates that using weight vector $W^{add*}_{Min}$ is not a good idea.

Table 6.19: Evaluation of the eight weight combinations within BFRMIC-AvgH-MaxAdd-*. The value for γ in W5 and W6 is 0.1.

Weights    Acc maj    Weights    Acc min    Weights    AUC       Weights    Balacc
W3         0.9526     W2         0.9873     W4         0.8403    W7         0.7637
W4         0.7604     W1         0.9716     W7         0.8371    W4         0.7608
W7         0.7566     W6         0.9574     W8         0.8034    W5         0.6954
W5         0.5779     W8         0.8996     W5         0.7967    W3         0.6749
W8         0.4357     W5         0.8129     W3         0.7965    W8         0.6677
W6         0.2326     W7         0.7709     W6         0.7433    W6         0.5950
W1         0.1138     W4         0.7613     W1         0.6915    W1         0.5427
W2         0.0440     W3         0.3972     W2         0.6548    W2         0.5156

We have studied the effect of varying the γ parameter on the performance of W5 and W6 as well, which is set to 0.1 in Table 6.19. The same conclusion holds as for our IFRMIC methods (see Section 6.6.2.2), namely that lower γ values provide better results. In particular, a lower γ leads to (i) a higher majority class accuracy, (ii) a somewhat lower minority class accuracy, (iii) a higher AUC and (iv) a higher balanced accuracy. Setting γ = 0 yields AUC results of 0.8294 (W5) and 0.7826 (W6) and balanced accuracy values of 0.7363 (W5) and 0.6419 (W6).

6.6.3.3 Crisp class memberships

As argued in Section 6.5.1.3, it can make more intuitive sense to use crisp class membership degrees for the training bags in (6.7) instead of computing C(B) with Max, MaxExp, MaxInvadd, MaxAdd or Avg. As we did in Section 6.6.3.2, the bag similarity values R(X, B) are still computed with AvgH, the preferred alternative according to Table 6.18.

Table 6.20: Performance of the BFRMIC-AvgH-Crisp-* methods with crisp class membership relation (6.8) for the training bags.

Weights    Acc maj    Weights    Acc min    Weights    AUC       Weights    Balacc
W3         1.0000     W2         0.9872     W6-0.0     0.9030    W6-0.0     0.8370
W5-0.0     0.9910     W6-0.1     0.9129     W8         0.9006    W8         0.8120
W5-0.1     0.9815     W6-0.0     0.8649     W6-0.1     0.8989    W6-0.1     0.7821
W4         0.9696     W1         0.8302     W4         0.8909    W7         0.7771
W7         0.9505     W8         0.7016     W7         0.8837    W4         0.7609
W8         0.9225     W7         0.6037     W1         0.8813    W1         0.7563
W6-0.0     0.8091     W4         0.5523     W5-0.1     0.8790    W5-0.1     0.6646
W1         0.6824     W5-0.1     0.3477     W5-0.0     0.8708    W5-0.0     0.5801
W6-0.1     0.6512     W5-0.0     0.1691     W2         0.8692    W2         0.5499
W2         0.1127     W3         0.0256     W3         0.8342    W3         0.5128

In Table 6.20, we repeat the analysis conducted in Section 6.6.3.2 and evaluate the performance of the different IFROWANN weight combinations within this BFRMIC method. We use two values of γ in combinations W5 and W6, namely γ = 0 and γ = 0.1. These results should be compared to the ones listed in Table 6.19, where the C(B) values are computed with MaxAdd instead of the crisp class membership degrees (6.8). We observe the following:

• On average, the majority class accuracy benefits from using a crisp class relation, while the minority class accuracy decreases.

• On average, the AUC of the BFRMIC method increases when a crisp class relation is used instead of the MaxAdd setting for C(B). This holds for all weight combinations. The largest increase is observed for W2 with a rise in mean AUC of 0.2144. The smallest increase is observed for W3, but a positive mean difference of 0.0377 is still reported.

• A decrease in balanced accuracy is observed for combinations W3 (-0.1621), W5-0.0 (-0.1562) and W5-0.1 (-0.0308). For the remaining evaluated weight combinations, an increase in balanced accuracy is obtained, ranging from a modest mean improvement of 0.0001 for W4 to a notable increase of 0.2136 for W1.

• When using weight combination W6 with γ = 0 within BFRMIC with the crisp class membership degree (6.8), both the highest mean AUC and highest mean balanced accuracy are obtained. In fact, this is the best mean performance of any method on the imbalanced datasets in Table 6.13 encountered so far. Note that by setting γ to zero, the sixth weight combination reduces to $\langle W^{add*}_{Maj}, W^{exp*}_{Maj} \rangle$. As a consequence, the lower approximation values computed in (6.7) are a priori based on the same number of observations, because $W^{add*}_{Maj}$ and $W^{exp*}_{Maj}$ have the same number of leading zero positions (namely, |Maj|). The lower approximation of the minority class is obtained with additive weights and that of the majority class with exponential weights.
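The differences quoted in these observations can be recomputed from the reported means. The short script below copies the mean AUC and balanced accuracy values from Tables 6.19 and 6.20 and, for γ = 0 with MaxAdd, from the text of Section 6.6.3.2; the dictionary and function names are hypothetical.

```python
# Mean results per weight combination: MaxAdd (Table 6.19 and the gamma = 0
# values quoted in Section 6.6.3.2) versus crisp memberships (Table 6.20).
auc_maxadd = {"W1": 0.6915, "W2": 0.6548, "W3": 0.7965, "W4": 0.8403,
              "W5-0.0": 0.8294, "W5-0.1": 0.7967, "W6-0.0": 0.7826,
              "W6-0.1": 0.7433, "W7": 0.8371, "W8": 0.8034}
auc_crisp = {"W1": 0.8813, "W2": 0.8692, "W3": 0.8342, "W4": 0.8909,
             "W5-0.0": 0.8708, "W5-0.1": 0.8790, "W6-0.0": 0.9030,
             "W6-0.1": 0.8989, "W7": 0.8837, "W8": 0.9006}
bal_maxadd = {"W1": 0.5427, "W2": 0.5156, "W3": 0.6749, "W4": 0.7608,
              "W5-0.0": 0.7363, "W5-0.1": 0.6954, "W6-0.0": 0.6419,
              "W6-0.1": 0.5950, "W7": 0.7637, "W8": 0.6677}
bal_crisp = {"W1": 0.7563, "W2": 0.5499, "W3": 0.5128, "W4": 0.7609,
             "W5-0.0": 0.5801, "W5-0.1": 0.6646, "W6-0.0": 0.8370,
             "W6-0.1": 0.7821, "W7": 0.7771, "W8": 0.8120}


def differences(crisp, maxadd):
    """Crisp minus MaxAdd, rounded to the four decimals used in the tables."""
    return {w: round(crisp[w] - maxadd[w], 4) for w in crisp}


auc_diff = differences(auc_crisp, auc_maxadd)
bal_diff = differences(bal_crisp, bal_maxadd)

# Combinations with the largest and smallest change in mean AUC.
print(max(auc_diff, key=auc_diff.get), min(auc_diff, key=auc_diff.get))
# Combinations for which the mean balanced accuracy decreases.
print(sorted(w for w, d in bal_diff.items() if d < 0))
```

The recomputed values agree with the figures in the list: the AUC difference is positive for every combination, and the balanced accuracy decreases only for W3, W5-0.0 and W5-0.1.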

As a final step, we study this last observation in more detail. The weight vectors used in (6.7) when using W6 with γ = 0 consist of |Maj| leading zeros, followed by $W_L^{add}$ for the minority class and $W_L^{exp}$ for the majority class, both with p = |Min| (see Section 3.2.1 for the weight vector definitions). In Table 6.21, we evaluate the performance of this BFRMIC method when options other than $W_L^{add}$ and $W_L^{exp}$ are selected to fill up the weight vectors in W6. We compare combinations of Strict, Add, Exp and Invadd. The original definition of W6 with γ = 0 corresponds to Add-Exp in the table. This clearly is a good choice, as it attains the second highest AUC and balanced accuracy values and the third highest minority class accuracy. However, its majority class accuracy is relatively low, namely the lowest but two. The Invadd-Exp alternative has a considerably higher majority class accuracy (0.9059 compared to 0.8091) but lower minority class accuracy (0.7799 compared to 0.8649). Nevertheless, its mean AUC and balanced accuracy are both highest among the evaluated methods. These are also the highest mean values observed on the imbalanced datasets so far.

Table 6.21: Variants of W6-0.0 within BFRMIC-AvgH-Crisp. The first weight component refers to the weights for the minority class, the second to weights for the majority class.

Combination      Acc maj    Combination      Acc min    Combination      AUC       Combination      Balacc
Exp-Add          1.0000     Add-Strict       0.9298     Invadd-Exp       0.9038    Invadd-Exp       0.8429
Strict-Add       1.0000     Invadd-Strict    0.8660     Add-Exp          0.9030    Add-Exp          0.8370
Strict-Invadd    0.9993     Add-Exp          0.8649     Invadd-Strict    0.8965    Invadd-Strict    0.8326
Exp-Invadd       0.9990     Invadd-Exp       0.7799     Add-Strict       0.8945    Exp-Strict       0.8091
Invadd-Add       0.9979     Exp-Strict       0.7026     Add-Invadd       0.8936    Add-Strict       0.8059
Add-Add          0.9910     Strict-Strict    0.6037     Exp-Strict       0.8918    Strict-Strict    0.7771
Invadd-Invadd    0.9870     Exp-Exp          0.5523     Exp-Exp          0.8909    Exp-Exp          0.7609
Strict-Exp       0.9847     Add-Invadd       0.4884     Invadd-Invadd    0.8856    Add-Invadd       0.7311
Add-Invadd       0.9738     Strict-Exp       0.4537     Strict-Strict    0.8837    Strict-Exp       0.7192
Exp-Exp          0.9696     Invadd-Invadd    0.3016     Strict-Exp       0.8757    Invadd-Invadd    0.6443
Strict-Strict    0.9505     Add-Add          0.1691     Add-Add          0.8708    Add-Add          0.5801
Exp-Strict       0.9156     Exp-Invadd       0.1327     Exp-Invadd       0.8627    Exp-Invadd       0.5658
Invadd-Exp       0.9059     Strict-Invadd    0.0891     Invadd-Add       0.8599    Strict-Invadd    0.5442
Add-Exp          0.8091     Invadd-Add       0.0822     Strict-Invadd    0.8444    Invadd-Add       0.5401
Invadd-Strict    0.7992     Exp-Add          0.0256     Exp-Add          0.8342    Exp-Add          0.5128
Add-Strict       0.6820     Strict-Add       0.0164     Strict-Add       0.8149    Strict-Add       0.5082

When we compare Invadd-Exp and Add-Exp in more detail, we observe that their difference in performance on the minority and majority classes presents itself everywhere, that is, across all datasets of Table 6.13. Invadd-Exp has a higher majority class accuracy than Add-Exp on all datasets. Add-Exp has a higher minority class accuracy than Invadd-Exp on all datasets, except for ties on WIRSel-6, Corel20-8 and Mut bonds. We can refer the reader to Section 3.3.4.3 for a discussion on why the additive weights can lead to a better minority class recognition than inverse additive weights on imbalanced datasets. The AUC values of Invadd-Exp and Add-Exp are mostly close together, although the former attains 22 wins and the latter only 11, reflected in the slightly higher mean AUC for Invadd-Exp. With respect to the balanced accuracy, Invadd-Exp has the highest mean value and attains 22 wins compared to 11 wins for Add-Exp.
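The weight vectors behind the combinations in Table 6.21 can be sketched as follows. The structure (|Maj| leading zeros followed by a vector of length p = |Min|) is taken from the text; the four weight formulas are the commonly used lower approximation definitions and are an assumption here, since the definitions of Section 3.2.1 are not reproduced in this section. All function names are hypothetical.

```python
def strict_weights(p):
    """Strict: all weight on the last (smallest) value, i.e. the minimum."""
    return [0.0] * (p - 1) + [1.0]


def add_weights(p):
    """Additive (assumed): w_i = 2i / (p(p + 1))."""
    return [2 * i / (p * (p + 1)) for i in range(1, p + 1)]


def exp_weights(p):
    """Exponential (assumed): w_i = 2^(i-1) / (2^p - 1)."""
    return [2 ** (i - 1) / (2 ** p - 1) for i in range(1, p + 1)]


def invadd_weights(p):
    """Inverse additive (assumed): w_i = 1 / ((p - i + 1) * H_p)."""
    harmonic = sum(1 / j for j in range(1, p + 1))
    return [1 / ((p - i + 1) * harmonic) for i in range(1, p + 1)]


SCHEMES = {"Strict": strict_weights, "Add": add_weights,
           "Exp": exp_weights, "Invadd": invadd_weights}


def padded_pair(minority_scheme, majority_scheme, n_min, n_maj):
    """Weight vectors of a 'minority-majority' combination such as Add-Exp.

    Both vectors start with |Maj| zeros and end with a vector of length |Min|.
    """
    zeros = [0.0] * n_maj
    w_min = zeros + SCHEMES[minority_scheme](n_min)
    w_maj = zeros + SCHEMES[majority_scheme](n_min)
    return w_min, w_maj


# Toy class sizes: 3 minority bags and 5 majority bags.
w_min, w_maj = padded_pair("Add", "Exp", n_min=3, n_maj=5)
print(w_min)
print(w_maj)
```

Under these assumed formulas, all four schemes place their largest weight on the smallest aggregated value and differ only in how quickly the weights decay towards the larger values, with Strict keeping the minimum only.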

Some other aspects of note:

• The minority class accuracy increases when Strict or Exp are used for the majority class weights, regardless of the weights for the minority class. When Invadd or Add are used for the majority class weights, the majority class accuracy increases at the cost of the minority class accuracy.

• The minority class accuracy is mostly better when Invadd or Add are used for its weights.

• The settings with symmetric weights (Exp-Exp, Strict-Strict, Add-Add, Invadd-Invadd) are not the top performers, but not the worst either.
