There are several avenues of investigation for future work on this research. The ANN structure and training process used in this study were limited by computational resources and simulation time constraints, and would benefit from the use of expanded parameters. The hidden layer of the ANN for both MNIST databases was rather small for the size of the input layer in either database. Additionally, the number of training iterations allocated to the search algorithms was also smaller than normal. Rerunning the study simulations with a larger hidden layer and much higher number of training iterations could yield different results and behaviors than the ones seen in this study. Another line of research would be a more thorough exploration of dimension reduction techniques. The PCA dimension reduction technique is robust, but it is an older technique that may be unintentionally sacrificing data important to the ANN classifiers even as it preserves the data variances of the original data. The difference between the PCA and standard MNIST databases was not as pronounced as it was hoped to be in this study. Therefore, testing other dimension reduction techniques along with PCA techniques in compressing the MNIST database to different dimensions could give some useful insight as to how those techniques could help search algorithms with such high dimension problems. Another interesting line of research would be investigating the CPSO algorithm’s performance in more detail. The CPSO algorithm did not work
well with either database in this study, so it would be interesting to investigate how the CPSO algorithm handles optimizing pre-trained ANN in general to see if the problems that the CPSO algorithm had in this study were unique or not. Additionally, it would be worthwhile to try and determine some general guidelines for selecting a CPSO dimension splitting architecture of a problem, as well as the number of sub-swarms that should be used with it. The last line of research would be comparing the performance of the conjugate- gradient BP algorithm against the search algorithms in this paper. This would provide a more thorough comparison between the performance of the BP algorithm and the search algorithms. The results of the BBFWA indicate that it could viably compete with the BP algorithm in training an ANN on the MNIST database, as its performance in this study shows that it was not unduly impacted by the optimization problem’s dimensionality unlike the PSO and CPSO algorithms.
BIBLIOGRAPHY
[1] Parul Shah, Sunil Karamchandani, Taskeen Nadkar, Nikita Gulechha, Kaushik Koli, and Ketan Lad, “Ocr-based chassis-number recognition using artificial neural networks,” 2009 IEEE International Conference on Vehicular Electronics and Safety (ICVES), Nov 2009.
[2] Michael Negnevitsky, Artificial Intelligence: A Guide to Intelligent Systems, chapter Artificial Neural Networks, pp. 165–217, Addison-Wesley, Harlow, 2005.
[3] Donald Knuth, The Art of Computer Programming, Volume 3: Sorting and Searching, Addison-Wesley, Redwood City,CA, 2 edition, 1998.
[4] Frans van den Bergh and Andries Engelbrecht, “Cooperative learning in neural networks using particle swarm optimizers,” South African Computer Journal, vol. 26, pp. 84–90, 01 2000.
[5] V. G. Gudise and G. K. Venayagamoorthy, “Comparison of particle swarm optimization and backpropagation as training algorithms for neural networks,” in Proceedings of the 2003 IEEE Swarm Intelligence Symposium. SIS’03 (Cat. No.03EX706), April 2003, pp. 110–117.
[6] Ram Kinkar Dutta, Nabin Kanti Karmakar, and Tapas Si, “Artificial neural network training using fireworks algorithm in medical data mining,” International Journal of Computer Applications, vol. 137, no. 1, pp. 975–8887, 04 2016.
[7] A. Rakitianskaia and A. Engelbrecht, “Training high-dimensional neural networks with cooperative particle swarm optimiser,” in 2014 International Joint Conference on Neu- ral Networks (IJCNN), July 2014, pp. 4011–4018.
[8] James Kennedy and Russell Eberhart, “Particle swarm optimization,” in Proceedings of ICNN’95 - International Conference on Neural Networks, Nov 1995, vol. 4, pp. 1942– 1948.
[9] Ying Tan and Junzhi Li, “The bare bones fireworks algorithm: A minimalist global optimizer,” Applied Soft Computing, vol. 62, pp. 454–462, 2018.
[10] S. J. Pan and Q. Yang, “A survey on transfer learning,” IEEE Transactions on Knowl- edge and Data Engineering, vol. 22, no. 10, pp. 1345–1359, 2010.
[11] Jorge Nocedal and J Stephen Wright, Numerical Optimization, Springer-Verlag GmbH, second edition, 2006.
[12] Abhijit Suresh, K.V. Harish, and N. Radhika, “Particle swarm optimization over back propagation neural network for length of stay prediction,” Procedia Computer Science, vol. 46, pp. 268 – 275, 2015, Proceedings of the International Conference on Information and Communication Technologies, ICICT 2014, 3-5 December 2014 at Bolgatty Palace & Island Resort, Kochi, India.
[13] Asaju Bolaji, Ali Aminu Ahmad, and Peter Shola, “Training of neural network for pattern classification using fireworks algorithm:,” International Journal of System As- surance Engineering and Management, vol. 9, 07 2016.
[14] Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, Nov 1998.
[15] Sara Sharifzadeh, Ali Ghodsi, Line H. Clemmensen, and Bjarne K. Ersbøll, “Sparse supervised principal component analysis (sspca) for dimension reduction and variable selection,” Engineering Applications of Artificial Intelligence, vol. 65, pp. 168 – 177, 2017.
[16] Chigozie Nwankpa, Winifred Ijomah, Anthony Gachagan, and Stephen Marshall, “Ac- tivation functions: Comparison of trends in practice and research for deep learning,” CoRR, vol. abs/1811.03378, 2018.
[17] G.E. Nasr, E. Badr, and C. Joun, “Cross entropy error function in neural networks: Forecasting gasoline demand.,” in Proceedings of the Fifteenth International Florida Artificial Intelligence Research Society Conference, Jan. 2002, pp. 381–384.
[18] Andrew Ng, “Feature selection, l 1 vs. l 2 regularization, and rotational invariance,” Proceedings of the Twenty-First International Conference on Machine Learning, 09 2004.
[19] Patrik Gilley and Yanjun Yan, “Comparison of search optimization algorithms in two- stage artificial neural network training for handwritten digits recognition,” in 2020 SoutheastCon. IEEE, 2020.
[20] M. Clerc, “The swarm and the queen: towards a deterministic and adaptive par- ticle swarm optimization,” in Proceedings of the 1999 Congress on Evolutionary Computation-CEC99 (Cat. No. 99TH8406), July 1999, vol. 3, pp. 1951–1957 Vol. 3.
APPENDIX A: COLLECTED DATA
A.1 Search Algorithm Error Bar Fitness Curves