NOVENO DISTRITO - LOS TESTS DE PLAGIO DEL SISTEMA JUDICIAL DE EEUU

III. ATENDIENDO AL TIPO DE USUARIO Cine infantil

4. ELABORACIÓN DE NUESTRO MÉTODO

4.1. HERRAMIENTAS JUDICIALES

4.1.1. LOS TESTS DE PLAGIO DEL SISTEMA JUDICIAL DE EEUU

4.1.1.2. NOVENO DISTRITO

In this section, the results of the experimental study are presented and discussed. First, analysis and comparison of G3P-kEMLC and RAkEL is performed; then, G3P-kEMLC is compared to other state-of-the-art EMLCs. Hereafter, the two versions of our proposed method are indicated as G3P- kEMLC-3 whenk= 3is used, and as G3P-kEMLC-V when a variable value ofkin the range[3, q/2] is used.

A study of the efficiency of G3P-kEMLC was also carried out; however, due to space constraints, the results of this study are available in the additional material, as well as detailed results of the experiments and alsop-values of statistical tests.

A. G3P-kEMLC vs RAkEL

In this section, we compare the performance of G3P- kEMLC and RAkEL, as methods that use base classifiers focused onk-labelsets. In Table II, the average ranking values for each metric, computed for all datasets, are shown. For each pair dataset-metric, the method that performs best obtains a ranking of1, the next a ranking of2, and so on; the lower the value the better. We can see that in four of the metrics G3P- kEMLC-V has the best average ranking, while G3P-kEMLC-3 is the best in one. Further, both methods are ahead of RAkEL in all cases except in SA, where RAkEL has a better average ranking than G3P-kEMLC-3.

In order to determine if significant differences exist among algorithms, Friedman’s [32] and Holm’s [33] tests were performed, using a confidence value α = 0.05. Friedman’s test has shown that significant differences existed for all metrics except for SA. Further, Holm’s test has shown that for the rest of the metrics, the performance of both configurations of G3P- kEMLC is statistically the same, and both are significantly better than RAkEL. Fig. 5 shows the critical diagrams for

G3P-kEMLC-3 G3P-kEMLC-V RAkEL AHL 1.72 1.47 2.81 SA 2.19 1.81 2.00 ExF 1.72 1.47 2.81 MiF 1.76 1.65 2.59 MaF 1.53 1.78 2.69 1 2 3 G3P−kV G3P−k3 RAkEL (a) AHL 1 2 3 G3P−kV G3P−k3 RAkEL (b) ExF 1 2 3 G3P−kV G3P−k3 RAkEL (c) MiF 1 2 3 G3P−k3 G3P−kV RAkEL (d) MaF

Fig. 5. Critical diagrams of Holm’s test at 95% confidence for the comparison between G3P-kEMLC and RAkEL. G3P-kEMLC-3 and G3P-kEMLC-V are indicated as G3P-k3 and G3P-kV, respectively.

the four metrics comparing G3P-kEMLC with RAkEL; in these diagrams, a line linking two methods indicates that their performance is statistically the same at 95% confidence.

Fig. 6 shows the average number of votes per label,v, of the best configuration in each case. Note that, although RAkEL was executed with a large number of classifiers (such as n calculated forv= 10and forvequal to the best configuration of G3P-kEMLC), in most cases it performed better just with 6votes on average for each label. We see that G3P-kEMLC is able to model and adjust the number of classifiers depending on each dataset, and is flexible to adapt to each specific case. In many cases, G3P-kEMLC obtained better results than RAkEL using less classifiers. Therefore, we also show that the number of classifiers used in the ensemble is not the only characteristic that contributes for our method to achieve good performance, but the selection of classifiers into an optimal tree-shaped ensemble structure is decisive for its performance.

B. G3P-kEMLC vs state-of-the-art

Once we have shown that the performance of G3P-kEMLC is significantly better than RAkEL, we next compare it with other state-of-the-art EMLCs. As in the previous experiment, Table III shows the average ranking values of each method on all datasets. In it, G3P-kEMLC-3 and G3P-kEMLC-V are indicated as G3P-3 and G3P-V respectively. We see that both G3P-kEMLC configurations always have better average ranking than the rest of EMLCs, except for SA, where EPS is better ranked.

Friedman’s test determined that there existed significant differences in the performance of the algorithms for all metrics. Fig. 7 shows the critical diagrams of the Holm’s test for all metrics. For AHL, ExF, and MiF, the performance of

! " # $ %&' (( ) * * (* *+ % ") % ") , )

Fig. 6. Average number of votes per label in each dataset.

TABLE III

AVERAGERANKINGS IN THECOMPARISON AMONGG3P-KEMLCAND STATE-OF-THE-ARTEMLCS.

G3P-3 G3P-V ECC EBR EPS RF-PCT AHL 1.81 1.44 4.59 4.69 3.81 4.66 SA 3.22 2.88 3.22 4.09 2.66 4.94 ExF 2.00 1.63 4.41 4.34 3.88 4.75 MiF 1.75 1.50 4.72 4.16 4.25 4.63 MaF 1.47 1.97 4.41 3.75 4.50 4.91 1 2 3 4 5 G3P−kV G3P−k3 EPS ECC RFPCT EBR (a) AHL 2 3 4 5 EPS G3P−kV G3P−k3 ECC EBR RFPCT (b) SA 1 2 3 4 5 G3P−kV G3P−k3 EPS EBR ECC RFPCT (c) ExF 1 2 3 4 5 G3P−kV G3P−k3 EBR EPS RFPCT ECC (d) MiF 1 2 3 4 5 G3P−k3 G3P−kV EBR ECC EPS RFPCT (e) MaF

Fig. 7. Critical diagrams of Holm’s test at 95% confidence for the comparison between G3P-kEMLC and state-of-the-art EMLCs. G3P-kEMLC-3 and G3P- kEMLC-V are indicated as G3P-k3 and G3P-kV, respectively.

both G3P-kEMLC configurations is significantly better than all other methods. For MaF, the performance of G3P-kEMLC-V is statistically comparable to EBR, while for SA, RF-PCT is the only method that perform significantly worse than both EPS and G3P-kEMLC-V.

Therefore, we demonstrate that given the optimal structure of the ensemble that G3P-kEMLC obtains, independently of the configuration used, it outperformed state-of-the-art EMLCs.

building EMLCs. Given a pool of multi-label classifiers, each focused on a subset of k labels, where k could be different for each, the algorithm evolves individuals representing a tree- shaped ensemble. At each node of the ensemble, predictions of children nodes are combined, while the leaves represent any multi-label classifier of the pool. The experimental study on 16 multi-label dataset using 5 evaluation metrics yielded promising results, demonstrating that the fact of building a tree-shaped ensemble where not all members have the same importance in the final prediction not only outperformed RAkEL (which also is focused on small subsets of the labels) but also the other state-of-the-art EMLCs, obtaining the model in acceptable time. Moreover, the proposed method is flexible and able to adapt the number of members of the ensemble according to each specific case.

In the future, we will explore other types of non-terminal nodes to combine the predictions, which also may use different types or number of child nodes. Further, we will develop a heuristic or evolutionary process to get a pool of multi-label classifiers that are accurate, diverse, and have an optimal size of thek-labelsets.

ACKNOWLEDGMENT

This research was supported by the Spanish Ministry of Economy and Competitiveness and the European Regional Development Fund, project TIN2017-83445-P. This research was also supported by the Spanish Ministry of Education under FPU Grant FPU15/02948.

REFERENCES

[1] E. Gibaja and S. Ventura, “Multi-label learning: a review of the state of the art and ongoing research,”Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 4, no. 6, pp. 411–444, 2014. [2] H. Shao, G. Li, G. Liu, and Y. Wang, “Symptom selection for multi-

label data of inquiry diagnosis in traditional chinese medicine,”Science China Information Sciences, vol. 56, no. 5, pp. 1–13, 2013.

[3] J. Xu, “Fast multi-label core vector machine,” Pattern Recognition, vol. 46, no. 3, pp. 885–898, 2013.

[4] G. Tsoumakas, I. Katakis, and I. Vlahavas, “Effective and efficient multilabel classification in domains with large number of labels,” in

Proceedings of the ECML/PKDD 2008 Workshop on Mining Multidi- mensional Data (MMD’08), 2008, pp. 53–59.

[5] J. Read, B. Pfahringer, G. Holmes, and E. Frank, “Classifier chains for multi-label classification,”Machine Learning, vol. 85, no. 3, pp. 335– 359, 2011.

[6] G. Tsoumakas, I. Katakis, and I. Vlahavas, “Random k-labelsets for multi-label classification,”IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 7, pp. 1079–1089, 2011.

[7] J. M. Moyano, E. L. Gibaja, K. J. Cios, and S. Ventura, “Review of ensembles of multi-label classifiers: Models, experimental study and prospects,”Information Fusion, vol. 44, pp. 33 – 45, 2018.

[8] G. Tsoumakas, I. Partalas, and I. Vlahavas, “A taxonomy and short review of ensemble selection,” inWorkshop on Supervised and Unsu- pervised Ensemble Methods and Their Applications, 2008, pp. 1–6. [9] Y. Bi, “The impact of diversity on the accuracy of evidential classifier

ensembles,”International Journal of Approximate Reasoning, vol. 53, no. 4, pp. 584 – 607, 2012.

[10] P. G. Espejo, S. Ventura, and F. Herrera, “A survey on the application of genetic programming to classification,”IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 40, no. 2, pp. 121–144, 2009.

Intelligence,, 2015, pp. 377–384.

[12] G. Tsoumakas, I. Katakis, and I. Vlahavas,Data Mining and Knowledge Discovery Handbook, Part 6. Springer, 2010, ch. Mining Multi-label Data, pp. 667–685.

[13] J. Read, B. Pfahringer, and G. Holmes, “Multi-label classification using ensembles of pruned sets,” in Data Mining, 2008. ICDM’08. Eighth IEEE International Conference on. IEEE, 2008, pp. 995–1000. [14] G. Tsoumakas and I. Katakis, “Multi-label classification: An overview,”

International Journal of Data Warehousing and Mining, vol. 3, no. 3, pp. 1–13, 2007.

[15] D. Kocev, C. Vens, J. Struyf, and S. Dˇzeroski, “Ensembles of multi- objective decision trees,” inEuropean conference on machine learning, 2007, pp. 624–631.

[16] H. Blockeel, L. D. Raedt, and J. Ramon, “Top-down induction of cluster- ing trees,” inProceedings of the Fifteenth International Conference on Machine Learning, ser. ICML ’98. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 1998, pp. 55–63.

[17] R. I. Mckay, N. X. Hoai, P. A. Whigham, Y. Shan, and M. O’neill, “Grammar-based genetic programming: a survey,”Genetic Programming and Evolvable Machines, vol. 11, no. 3-4, pp. 365–396, 2010. [18] A. Tsakonas, G. Dounias, M. Doumpos, and C. Zopounidis, “Bankruptcy

prediction with neural logic networks by means of grammar-guided genetic programming,”Expert Systems with Applications, vol. 30, no. 3, pp. 449–461, 2006.

[19] A. Zafra and S. Ventura, “Multi-instance genetic programming for predicting student performance in web based educational environments,”

Applied Soft Computing, vol. 12, no. 8, pp. 2693–2706, 2012. [20] J. M. Luna, J. R. Romero, C. Romero, and S. Ventura, “On the use

of genetic programming for mining comprehensible rules in subgroup discovery,”IEEE transactions on cybernetics, vol. 44, no. 12, pp. 2329– 2341, 2014.

[21] A. Tahmassebi and A. H. Gandomi, “Genetic programming based on error decomposition: A big data approach,” inGenetic Programming Theory and Practice XV. Springer, 2018, pp. 135–147.

[22] A. Cano, A. Zafra, E. L. Gibaja, and S. Ventura, “A grammar-guided genetic programming algorithm for multi-label classification,” inEuropean Conference on Genetic Programming. Springer, 2013, pp. 217–228. [23] A. G. C. de S´a, A. A. Freitas, and G. L. Pappa, “Automated selection

and configuration of multi-label classification algorithms with grammar- based genetic programming,” inParallel Problem Solving from Nature – PPSN XV, 2018, pp. 308–320.

[24] V. L´opez, A. Fern´andez, S. Garc´ıa, V. Palade, and F. Herrera, “An insight into classification with imbalanced data: Empirical results and current trends on using data intrinsic characteristics,”Information sciences, vol. 250, pp. 113–141, 2013.

[25] E. Gibaja and S. Ventura, “A tutorial on multilabel learning,” ACM Computing Surveys, vol. 47, no. 3, 2015.

[26] J. M. Moyano, E. L. Gibaja, and S. Ventura, “MLDA: A tool for analyzing multi-label datasets,”Knowledge-Based Systems, vol. 121, pp. 1–3, 2017.

[27] F. Charte, A. J. Rivera, M. J. del Jesus, and F. Herrera, “Mlenn: A first approach to heuristic multilabel undersampling,” inIntelligent Data Engineering and Automated Learning – IDEAL 2014, 2014, pp. 1–9. [28] S. Ventura, C. Romero, A. Zafra, J. A. Delgado, and C. Herv´as,

“JCLEC: a Java framework for evolutionary computation,”Soft Com- puting, vol. 12, no. 4, pp. 381–392, 2008.

[29] G. Tsoumakas, E. Spyromitros-Xioufis, J. Vilcek, and I. Vlahavas, “Mulan: A Java library for multi-label learning,” Journal of Machine Learning Research, vol. 12, pp. 2411–2414, 2011.

[30] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten, “The weka data mining software: An update,”SIGKDD Explor. Newsl., vol. 11, no. 1, pp. 10–18, 2009.

[31] J. R. Quinlan, C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers Inc., 1993.

[32] M. Friedman, “A comparison of alternative tests of significance for the problem ofmrankings,”Ann. Math. Statist., vol. 11, no. 1, pp. 86–92, 03 1940.

[33] S. Holm, “A simple sequentially rejective multiple test procedure,”

In document El plagio cinematográfico. Un método para su detección (página 153-156)