Supplementary material for the paper “The mRMR variable selection method: a comparative study for functional data”
Jos´ e R. Berrendero, Antonio Cuevas, Jos´ e L. Torrecilla Departamento de Matem´ aticas
Universidad Aut´ onoma de Madrid, Spain
1 List of models used in the simulation study and a few example outputs Our simulation study consists of 400 experiments based on 100 different underlying models. The optimal classification rule in each case depends only on a finite number of variables. Models differ in complexity and number of relevant variables. The processes involved are chosen among the following: first, the standard Brownian Motion, B.
Second, BT denotes a Brownian Motion with a trend m(t), i.e., BT (t) = B(t)+m(t);
we have considered several choices for m(t), a linear trend, m(t) = ct, a linear trend with random slope, i.e., m(t) = θt, where θ is a Gaussian r.v., and different members of two parametric families: the peak functions Φ
m,kand the hillside functions, defined by
Φ
m,k= Z
t0
ϕ
m,k(s)ds , hillside
t0,b(t) = b(t − t
0) I
[t0,∞),
where, ϕ
m,k(t) =
√ 2
m−1h
I (
2k−22m ,2k−12m) − I (
2k−12m ,22km) i
for m ∈ N, 1 ≤ k ≤ 2
m−1. Third, the Brownian Bridge: BB(t) = B(t)−tB(1). Our fourth class of Gaussian processes is the Ornstein–Uhlenbeck process, with zero mean (OU) or different mean functions m(t) (OU t). Finally some “smooth” processes have been also include. They are obtained by convolving Brownian trajectories with Gaussian kernels. We have considered two levels of smoothing denoted by sB and ssB.
In the following list of models, µ
idenotes de distribution of X|Y = i and variables is the set of relevant variables in each Gaussian or Mixture case. We call them “relevant”
in the sense that the optimal classification rule depends only on these variables. In the list below the variables written in boldface are “especially relevant” regarding their influence in the optimal classifier.
1. Gaussian models considered :
1. G1:
µ0: B(t)
µ1: B(t) +θt , θ∼N(0,3) variables={X100}.
2. G1b:
µ0: B(t)
µ1: B(t) +θt , θ∼N(0,5) variables={X100}.
3. G2:
µ0: B(t) +t µ1: B(t) variables={X100}.
4. G2b:
µ0: B(t) + 3t µ1: B(t) variables={X100}.
5. G3:
µ0: BB(t) µ1: B(t) variables={X100}.
6. G4:
µ0: B(t) +hillside0.5,4(t) µ1: B(t)
variables={X47,X100}.
7. G5:
µ0: B(t) + 3Φ1,1(t) µ1: B(t) variables={X1,X48, X100}.
8. G6:
µ0: B(t) + 5Φ2,2(t) µ1: B(t) variables={X48,X75, X100}.
9. G7:
µ0: B(t) + 5Φ3,2(t) + 5Φ3,4(t)
µ1: B(t)
variables={X22,X35, X49, X74,X88, X100}.
10. G8:
µ0: B(t) + 3Φ2,1.25(t) + 3Φ2,2(t)
µ1: B(t)
variables={X9,X35, X48, X62,X75, X100}.
2. logistic-type models considered: they are all defined according standard (ii) (see Sec. 4 in the main paper). The processX =X(t) follows one of the distributions mentioned above and Y = Binom(1, η(X)) with η(x) = (1 +e−ψ(x(t1),···,x(tk)))−1, a function of the relevant variables x(t1),· · ·, x(tk).
L1: ψ(X) = 10X65.
L2: ψ(X) = 10X30+ 10X70. L3: ψ(X) = 10X30−10X70. L4: ψ(X) = 20X30+ 50X5020X80. L5: ψ(X) = 20X30−50X50+ 20X80.
L6: ψ(X) = 10X10+ 30X40+ 10X72+ 10X80+ 20X95. L7: ψ(X) =P10
i=110X10i.
L8: ψ(X) = 20X302 + 10X504 + 50X803 . L9: ψ(X) = 10X10+ 10|X50|+ 0X302X85. L10: ψ(X) = 20X33+ 20|X68|.
L11: ψ(X) =X20
35 +X30
77. L12: ψ(X) = logX35+ logX77.
L13: ψ(X) = 40X20+ 30X28+ 20X62+ 10X67. L14: ψ(X) = 40X20+ 30X28−20X62−10X67. L15: ψ(X) = 40X20−30X28+ 20X62−10X67.
Some variations of these models have been also considered:
L3b: ψ(X) = 30X30−20X70.
L4b: ψ(X) = 30X30+ 20X50+ 10X80. L5b: ψ(X) = 10X30−10X50+ 10X80.
L6b: ψ(X) = 20X10+ 20X40+ 20X72+ 20X80+ 20X95. L8b: ψ(X) = 10X302 + 10X504 + 10X803.
3. Mixture-type models: they are obtained by combining (via mixtures) in several ways the above mentioned Gaussian distributions assumed forX|Y = 0 andX|Y = 1. These models are denoted M1, ..., M10 in the output tables.
1. M1:
µ0:
B(t) + 3t, 1/2 B(t)−2t, 1/2
µ1: B(t)
variables={X100}.
2. M2:
µ0:
B(t) + 3Φ2,2(t), 1/2 B(t) + 5Φ3,2(t), 1/2
µ1: B(t)
variables={X22,X35, X48,X75, X100}.
3. M3:
µ0:
B(t) + 3Φ2,2(t), 1/10 B(t) + 5Φ3,2(t), 9/10
µ1: B(t)
variables={X22,X35, X48,X75, X100}.
4. M4:
µ0:
B(t) + 3Φ2,2(t), 1/2 B(t) + 5Φ3,3(t), 1/2
µ1: B(t)
variables={X48,X62,X75, X100}.
5. M5:
µ0:
B(t) + 3Φ2,1(t) ,1/3 B(t) + 3Φ2,2(t), 1/3 B(t) + 5Φ3,2(t), 1/3
µ1: B(t)
variables={X1,X22,X35, X48,X75, X100}.
6. M6:
µ0:
B(t) + 3Φ2,1(t) ,1/2 B(t) + 3t ,1/2
µ1: B(t)
variables={X1,X22, X49,X100}.
7. M7:
µ0:
B(t) + 3Φ1,1(t) ,1/2 BB(t) ,1/2
µ1: B(t)
variables={X1,X48,X100}.
8. M8:
µ0:
B(t) +θt, θ∼N(0,5) ,1/2 B(t) +hillside0.5,5(t) ,1/2
µ1: B(t)
variables={X47,X100}.
9. M9:
µ0:
B(t) +θt, θ∼N(0,5) ,1/2
BB(t) ,1/2
µ1: B(t)
variables=X100.
10. M10:
µ0:
B(t) + 3Φ1,1(t) ,1/3 B(t)−3t ,1/3 BB(t) ,1/3
µ1: B(t)
variables={X1,X48,X100}.
11. M11:
µ0:
B(t) + 3Φ1,1(t) ,1/4 B(t)−3t ,1/4 B(t) +hillside0.5,5(t) ,1/4
BB(t) ,1/4
µ1: B(t)
variables={X1,X48,X100}.
Finally, the full list of models involved is as follows:
1. L1 OU
2. L1 OUt
3. L1 B
4. L1 sB
5. L1 ssB
6. L2 OU
7. L2 OUt
8. L2 B
9. L2 sB
10. L2 ssB
11. L3 OU
12. L3b OU
13. L3 OUt
14. L3b OUt
15. L3 B
16. L3b B
17. L3 sB
18. L3 ssB
19. L4 OU
20. L4b OU
21. L4 OUt
22. L4b OUt
23. L4 B
24. L4 sB
25. L4 ssB
26. L5 OU
27. L5b OU
28. L5 OUt
29. L5 B
30. L5 sB
31. L5 ssB
32. L6 OU
33. L6b OU
34. L6 OUt
35. L6b OUt
36. L6 B
37. L6 sB
38. L6 ssB
39. L7 OU
40. L7b OU
41. L7 OUt
42. L7b OUt
43. L7 B
44. L7 sB
45. L7 ssB
46. L8 B
47. L8 sB
48. L8 ssB
49. L8b OU
50. L9 B
51. L9 sB
52. L9 ssB
53. L10 OU
54. L10 B
55. L10 sB
56. L10 ssB
57. L11 OU
58. L11 OUt
59. L11 B
60. L11 sB
61. L11 ssB
62. L12 OU
63. L12 OUt
64. L12 B
65. L12 sB
66. L12 ssB
67. L13 OU
68. L13 OUt
69. L13 B
70. L13 sB
71. L13 ssB
72. L14 OU
73. L14 OUt
74. L14 B
75. L14 sB
76. L15 OU
77. L15 OUt
78. L15 B
79. L15 sB
80. G1
81. G1b
82. G2
83. G2b
84. G3
85. G4
86. G5
87. G6
88. G7
89. G8
90. M1
91. M2
92. M3
93. M4
94. M5
95. M6
96. M7
97. M8
98. M9
99. M10
100. M11
We next provide a few simulation results. Seewww.uam.es/antonio.cuevas/exp/mRMR-outputs.
xlsxfor the full simulation outputs.
NB accuracy outputs
Model MID FCD RD VD CD Base
L7 OU 90.73 87.01 90.41 87.72 90.50 93.35 L1 OUt 72.54 74.92 74.10 74.56 74.26 73.33 L14 B 75.89 77.31 77.17 76.72 76.75 75.33 L9 sB 84.73 86.01 85.84 85.40 85.78 82.27
G1b 79.74 75.94 80.73 81.66 77.67 79.49
G3 76.22 64.54 78.84 78.78 72.67 71.24
G6 83.13 83.50 84.38 83.96 84.28 74.90
M1 79.12 73.09 80.93 81.89 76.77 77.61
M4 64.73 68.04 67.91 67.48 67.89 61.36
M6 81.25 80.09 83.06 82.45 83.21 78.02
Table 10.- Average NB accuracy (proportion of correct classification) outputs, over 200 runs of the considered methods with sample sizen= 50.
NB number of variables
Model MID FCD RD VD CD Base
L7 OU 12.1 14.1 10.1 11.5 11.1 100
L1 OUt 8.9 8.0 7.0 5.9 6.6 100
L14 B 7.9 8.0 6.0 7.5 5.9 100
L9 sB 4.9 7.5 4.7 4.9 5.3 100
G1b 6.9 13.4 4.8 1.5 9.0 100
G3 6.4 11.6 3.7 3.7 8.2 100
G6 3.4 1.9 3.5 3.1 2.7 100
M1 4.9 11.7 4.5 1.4 8.4 100
M4 6.5 5.0 5.1 3.4 4.6 100
M6 7.1 7.6 5.5 5.6 6.4 100
Table 11.- Average number of selected variables over 200 runs of the considered methods with sample sizen= 50 using NB.
k-NN accuracy outputs
Model MID FCD RD VD CD Base
L7 OU 90.74 86.89 90.78 87.79 90.67 92.21 L1 OUt 75.83 77.34 76.77 77.22 77.11 75.81 L14 B 75.34 77.16 77.29 76.20 76.49 74.43 L9 sB 86.79 87.39 87.35 87.03 87.28 86.10
G1b 79.11 75.74 80.00 80.01 78.13 78.57
G3 73.39 61.97 77.28 77.03 68.20 65.26
G6 91.95 84.16 88.22 85.80 84.68 92.19
M1 81.63 75.47 82.79 83.00 80.59 80.72
M4 72.94 71.60 74.72 70.30 71.66 73.29
M6 83.08 79.70 84.13 84.26 84.02 80.99
Table 12.- Averagek-NN accuracy (proportion of correct classification) outputs, over 200 runs of the considered methods with sample sizen= 50.
k-NN number of variables
Model MID FCD RD VD CD Base
L7 OU 11.5 14.5 10.4 12.1 11.3 100
L1 OUt 9.0 7.9 6.9 6.5 6.8 100
L14 B 8.3 7.6 5.5 8.0 6.5 100
L9 sB 6.3 7.7 6.0 7.1 6.0 100
G1b 7.8 11.7 6.5 6.3 8.7 100
G3 5.1 11.2 2.5 2.9 7.8 100
G6 11.5 12.6 9.0 8.2 7.5 100
M1 7.8 11.4 6.3 4.9 8.6 100
M4 11.7 16.1 10.4 9.7 10.1 100
M6 9.9 9.6 7.5 8.7 7.6 100
Table 13.- Average number of selected variables over 200 runs of the considered methods with sample sizen= 50 usingk-NN.
LDA accuracy outputs
Model MID FCD RD VD CD Base
L7 OU 89.27 85.13 89.75 86.81 90.10 -
L1 OUt 72.05 73.74 73.48 73.96 73.66 -
L14 B 75.25 76.35 77.12 75.62 76.27 -
L9 sB 84.91 84.88 84.96 84.69 85.12 -
G1b 53.35 52.22 54.15 54.49 51.67 -
G3 52.26 50.91 53.53 53.51 51.06 -
G6 95.28 87.92 90.54 86.59 87.80 -
M1 54.44 53.64 54.88 54.68 53.70 -
M4 78.95 70.98 76.46 71.37 71.30 -
M6 79.92 79.53 80.80 80.57 80.70 -
Table 14.- Average LDA accuracy (proportion of correct classification) outputs, over 200 runs of the considered methods with sample sizen= 50.
LDA number of variables
Model MID FCD RD VD CD Base
L7 OU 5.5 6.9 6.2 6.1 7.2 -
L1 OUt 5.6 4.6 4.6 4.4 4.9 -
L14 B 3.6 3.2 3.1 4.1 3.6 -
L9 sB 4.0 3.5 3.3 3.4 3.5 -
G1b 5.6 7.0 5.3 4.6 7.4 -
G3 7.1 8.9 5.0 5.0 8.9 -
G6 10.7 14.3 11.2 7.8 11.6 -
M1 5.5 7.2 5.3 5.7 6.8 -
M4 11.1 10.2 10.3 7.4 8.5 -
M6 6.4 4.5 5.2 5.2 4.8 -
Table 15.- Average number of selected variables over 200 runs of the considered methods with sample sizen= 50 using LDA.
SVM accuracy outputs
Model MID FCD RD VD CD Base
L7 OU 90.73 87.01 90.41 87.72 90.50 93.35 L1 OUt 72.54 74.92 74.10 74.56 74.26 73.33 L14 B 75.89 77.31 77.17 76.72 76.75 75.33 L9 sB 84.73 86.01 85.84 85.40 85.78 82.27
G1b 79.74 75.94 80.73 81.66 77.67 79.49
G3 76.22 64.54 78.84 78.78 72.67 71.24
G6 83.13 83.50 84.38 83.96 84.28 74.90
M1 79.12 73.09 80.93 81.89 76.77 77.61
M4 64.73 68.04 67.91 67.48 67.89 61.36
M6 81.25 80.09 83.06 82.45 83.21 78.02
Table 16.- Average SVM accuracy (proportion of correct classification) outputs, over 200 runs of the considered methods with sample sizen= 50.
SVM number of variables
Model MID FCD RD VD CD Base
L7 OU 12.1 14.1 10.1 11.5 11.1 100
L1 OUt 8.9 8.0 7.0 5.9 6.6 100
L14 B 7.9 8.0 6.0 7.5 5.9 100
L9 sB 4.9 7.5 4.7 4.9 5.3 100
G1b 6.9 13.4 4.8 1.5 9.0 100
G3 6.4 11.6 3.7 3.7 8.2 100
G6 3.4 1.9 3.5 3.1 2.7 100
M1 4.9 11.7 4.5 1.4 8.4 100
M4 6.5 5.0 5.1 3.4 4.6 100
M6 7.1 7.6 5.5 5.6 6.4 100
Table 17.- Average number of selected variables over 200 runs of the considered methods with sample sizen= 50 using SVM.
2 Results with quotient criterion and other classifiers
The outputs included in the paper correspond to the difference criterion (in order to combine the relevance and redundancy measures). We provide here some additional results for the quotient criterion, instead of the difference one. Besides, Figure 2 in the paper was produced using only the outputs for the k-NN classfier. In this document we show the analogous displays for LDA, SVM and NB. Again, the entire simulation outputs can be downloaded from www.uam.es/antonio.cuevas/exp/mRMR-outputs.xlsx.
Tables 1a-9a below correspond and Tables 1-9 in the main paper with the difference criterion replaced with the quotient one. Figures 2a, 2b and 2c correspond to Figure 2 using the other classifiers.
Output (NB) Sample size MIQ FCQ RQ VQ CQ Base
Avgerage accuracy n= 30 78.10 78.76 79.53 79.58 79.12 77.28 n= 50 79.59 79.73 80.86 80.81 80.26 78.29 n= 100 80.62 80.54 81.82 81.75 81.16 78.84 n= 200 81.24 81.03 82.48 82.35 81.77 79.13 Average dim. red n= 30 8.9 8.6 7.2 7.0 7.9 100
n= 50 8.3 8.1 6.7 6.7 7.4 100
n= 100 7.7 7.2 6.1 6.1 6.9 100
n= 200 7.1 6.7 5.7 5.9 6.5 100
Victories over Base n= 30 60 65 72 76 68 -
n= 50 67 61 79 78 68 -
n= 100 71 64 88 86 79 -
n= 200 75 68 92 91 84 -
Table 1a.- Performance outputs for the considered methods, using NB and the quotient criterion, with different sample sizes. Each output is the result of the 100 different models for each sample size.
Output (k-NN) Sample size MIQ FCQ RQ VQ CQ Base
Avgerage accuracy n= 30 80.02 79.65 80.85 80.82 80.09 78.98 n= 50 81.32 80.40 81.72 81.67 80.87 80.34 n= 100 82.84 81.34 82.73 82.65 81.83 81.99 n= 200 84.06 82.09 83.56 83.49 82.65 83.38
Average dim. red n= 30 9.3 9.5 7.4 7.6 8.1 100
n= 50 9.6 9.6 7.6 7.8 8.3 100
n= 100 9.9 9.9 8.0 8.2 8.6 100
n= 200 10.1 10.1 8.3 8.5 9.0 100
Victories over Base n= 30 71 58 76 74 67 -
n= 50 67 53 73 72 64 -
n= 100 71 49 64 64 55 -
n= 200 64 42 62 64 54 -
Table 2a.- Performance outputs for the considered methods, usingk-NN and the quotient criterion, with different sample sizes. Each output is the result of the 100 different models for each sample size.
Output (LDA) Sample size MIQ FCQ RQ VQ CQ Base Avgerage accuracy n= 30 78.67 77.58 78.64 78.64 78.15 60.80 n= 50 80.20 78.53 79.58 79.52 79.11 58.75 n= 100 81.74 79.62 80.65 80.52 80.25 53.11 n= 200 82.90 80.47 81.53 81.35 81.10 73.25
Average dim. red n= 30 5.8 4.7 4.7 4.7 5.1 100
n= 50 6.9 5.7 5.6 5.6 6.0 100
n= 100 8.3 7.1 6.9 7.0 7.3 100
n= 200 9.5 8.3 8.0 8.1 8.4 100
Table 3a.- Performance outputs for the considered methods, using LDA and the quotient criterion, with different sample sizes. Each output is the result of the 100 different models for each sample size.
Output (SVM) Sample size MIQ FCQ RQ VQ CQ Base
Avgerage accuracy n= 30 81.62 79.81 80.69 80.65 80.27 81.91 n= 50 82.69 80.42 81.43 81.35 80.96 82.99 n= 100 83.80 81.21 82.20 82.12 81.76 84.11 n= 200 84.61 81.79 82.90 82.76 82.42 84.91 Average dim. red n= 30 10.6 10.3 9.1 9.2 9.5 100
n= 50 10.7 10.4 9.3 9.4 9.7 100
n= 100 11.1 10.5 9.5 9.7 9.9 100
n= 200 11.4 10.7 9.7 9.9 10.0 100
Victories over Base n= 30 32 37 49 47 42 -
n= 50 35 34 51 52 44 -
n= 100 35 33 51 50 48 -
n= 200 33 31 52 51 48 -
Table 4a.- Performance outputs for the considered methods, using SVM and the quotient criterion, with different sample sizes. Each output is the result of the 100 different models for each sample size.
Ranking criterion (NB) Sample size MIQ FCQ RQ VQ CQ
Relative n= 30 2.27 5.67 8.69 8.63 7.65
n= 50 2.69 4.94 9.09 8.70 7.51
n= 100 2.75 4.75 9.21 8.80 7.44
n= 200 2.71 4.57 8.87 8.31 7.41
Positional n= 30 6.78 7.83 8.53 8.50 8.39
n= 50 6.79 7.57 8.93 8.51 8.22
n= 100 6.80 7.47 9.01 8.58 8.14
n= 200 6.84 7.56 8.92 8.43 8.25
F1 n= 30 12.25 15.85 17.39 17.58 17.01
n= 50 12.24 14.90 19.19 17.37 16.35 n= 100 12.35 14.60 19.67 17.38 16 n= 200 12.49 14.92 19.28 16.86 16.45
Table 5a.- Global scores of the considered (quotient-based) methods using three different ranking criteria with the NB classifier.
Ranking criterion (k-NN) Sample size MIQ FCQ RQ VQ CQ
Relative n= 30 3.70 3.97 8.51 8.09 6.03
n= 50 4.39 3.72 7.84 7.59 5.61
n= 100 4.93 3.46 7.24 6.91 5.15
n= 200 5.52 3.00 6.72 6.52 5.00
Positional n= 30 7.31 7.29 9.06 8.64 7.70
n= 50 7.55 7.30 8.90 8.67 7.58
n= 100 7.75 7.37 8.82 8.62 7.49
n= 200 7.96 7.43 8.56 8.45 7.60
F1 n= 30 14.32 13.96 19.66 17.80 14.26
n= 50 15.27 14.06 18.78 17.94 13.95 n= 100 16.02 14.19 18.44 17.90 13.62 n= 200 16.84 14.09 17.48 17.57 14.02 Table 6a.- Global scores of the considered (quotient-based) methods using three different ranking criteria with thek-NN classifier.
Ranking criterion (LDA) Sample size MIQ FCQ RQ VQ CQ
Relative n= 30 3.94 3.14 7.48 7.38 5.38
n= 50 4.26 2.86 6.97 6.66 5.19
n= 100 4.76 2.60 6.98 6.43 5.33
n= 200 5.27 2.36 6.78 6.13 5.23
Positional n= 30 7.49 6.99 9.01 8.96 7.55
n= 50 7.64 7.12 8.90 8.64 7.70
n= 100 7.72 7.13 8.89 8.52 7.74
n= 200 7.80 7.23 8.79 8.36 7.82
F1 n= 30 15.05 12.67 19.11 19.32 13.85
n= 50 15.63 12.95 18.91 18.07 14.44 n= 100 15.80 13.04 18.83 17.72 14.61 n= 200 16.25 13.29 18.62 17.04 14.80
Table 7a.- Global scores of the considered (quotient-based) methods using three different ranking criteria with LDA.
Ranking criterion (SVM) Sample size MIQ FCQ RQ VQ CQ
Relative n= 30 6.02 2.85 6.58 6.28 4.42
n= 50 5.99 2.72 6.70 6.15 4.73
n= 100 6.14 2.61 6.43 5.99 4.66
n= 200 6.42 2.30 6.29 5.75 4.68
Positional n= 30 8.26 7.20 8.74 8.46 7.34
n= 50 8.16 7.19 8.80 8.43 7.42
n= 100 8.26 7.31 8.66 8.32 7.48
n= 200 8.28 7.36 8.61 8.22 7.56
F1 n= 30 17.90 13.78 17.99 17.17 13.16
n= 50 17.58 13.51 18.21 17.28 13.42 n= 100 17.97 13.86 17.71 16.95 13.59 n= 200 17.89 13.85 17.70 16.58 14.06 Table 8a.- Global scores of the considered (quotient-based) methods using three different ranking criteria with the linear SVM.
NB outputs
Output Data MIQ FCQ RQ VQ CQ Base
Classification accuracy Growth 88.17 87.10 86.02 86.02 87.10 84.95 Tecator 96.28 97.67 99.53 99.53 98.14 97.21 Phoneme 73.08 80.20 80.38 80.15 80.32 74.08
Number of variables Growth 1.5 1.1 1.1 1.1 1.1 31
Tecator 4.8 5.0 1.0 1.0 4.4 100
Phoneme 14.5 10.6 16.7 16.8 14.1 256
k-NN outputs
Output Data MIQ FCQ RQ VQ CQ Base
Classification accuracy Growth 95.70 83.87 83.87 83.87 83.87 96.77 Tecator 96.74 99.07 99.53 99.53 98.60 98.60 Phoneme 75.53 81.42 79.79 80.38 80.61 78.80
Number of variables Growth 3.9 1.0 1.0 1.0 1.0 31
Tecator 4.0 3.0 1.0 1.0 4.3 100
Phoneme 18.4 12.1 12.3 15.2 6.7 256
LDA outputs
Output Data MIQ FCQ RQ VQ CQ Base
Classification accuracy Growth 95.70 91.40 91.40 91.40 91.40 -
Tecator 94.88 94.42 94.88 94.42 95.35 -
Phoneme 74.55 78.88 79.10 79.63 80.26 -
Number of variables Growth 3.7 5.0 4.9 4.9 5.0 -
Tecator 6.1 8.4 4.1 2.2 3.1 -
Phoneme 19.0 8.9 10.0 9.0 9.2 -
SVM outputs
Output Data MIQ FCQ RQ VQ CQ Base
Classification accuracy Growth 94.62 87.10 87.10 87.10 86.02 95.70 Tecator 98.14 99.07 99.53 99.07 98.60 99.07 Phoneme 75.30 80.71 80.67 80.37 80.33 80.96
Number of variables Growth 3.5 5.0 4.9 4.9 5.0 31
Tecator 6.7 2.1 1.0 1.0 4.1 100
Phoneme 19.3 10.1 11.3 10.8 12.2 256
Table 9a.- Performances of the different (quotient-based) mRMR methods in three real data sets. From top to bottom tables stand for Naive Bayes,k-NN, LDA and linear SVM outputs respectively.
Simulation experiments
50 100 150 200 250 300 350 400
MID
FDC
RD
VD
CD
0 1 2 3 4 5 6 7 8 9 10
Figure 2a.- Cromatic version of the global relative ranking table taking into account the 400 considered experiments (columns) and the Naive Bayes classifier.
Simulation experiments
50 100 150 200 250 300 350 400
MID
FDC
RD
VD
CD
0 1 2 3 4 5 6 7 8 9 10
Figure 2b.- Cromatic version of the global relative ranking table taking into account the 400 considered experiments (columns) and the Linear Discriminant Analysis.
Simulation experiments
50 100 150 200 250 300 350 400
MID
FDC
RD
VD
CD
0 1 2 3 4 5 6 7 8 9 10
Figure 2c.- Cromatic version of the global relative ranking table taking into account the 400 considered experiments (columns) and the linear SVM.