Optimizing Neural Networks for Academic Performance Classification Using Feature Selection and Resampling Approach
Abstract
The features present in large datasets significantly affect the performance of machine learning models. Redundant and irrelevant features reduce model performance and should therefore be removed. This paper proposes HyFeS-ROS-ANN, a method that combines hybrid feature selection with resampling for binary classification using a multilayer perceptron (MLP) artificial neural network. The first stage of the approach combines two feature selection methods to select the essential features that are highly correlated with model performance. The second stage combines resampling methods to handle imbalanced data classes. Both stages are applied to an academic performance classification model built on the MLP neural network. The research dataset is obtained using a three-dimensional (3D) framework: the Big Five Personality model, which identifies the personality traits that affect academic performance from the student dimension; the Family Influence Scale (FIS), which measures factors that affect academic performance from the family dimension; and Higher Education Institutions Service Quality (HEISQUAL), which measures service quality and its influence on academic performance from the educational institution dimension. Previous research shows that the CoR-ANN algorithm has a model accuracy rate of 94%. The results on this dataset show that the proposed method improves accuracy by selecting the features that are most relevant to model performance. The number of features is reduced from 135 to 108, while the accuracy of the HyFeS-ROS-ANN model for binary classification increases to 100%.
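The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the two feature selection methods or the resampling methods, so the sketch assumes a single Pearson-correlation filter for the first stage and plain random oversampling (suggested by "ROS" in the method name) for the second. The function names, the 0.3 threshold, and the toy data are all hypothetical.

```python
import random
from collections import Counter
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def select_by_correlation(rows, labels, threshold=0.3):
    """Return indices of feature columns whose |r| with the label is >= threshold.

    Assumption: a simple filter method standing in for the paper's hybrid selection.
    """
    keep = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        if abs(pearson(column, labels)) >= threshold:
            keep.append(j)
    return keep


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [row for row, lab in zip(rows, labels) if lab == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data, made up for illustration: 3 features, 6 majority rows and 2 minority rows.
rows = [
    [1.0, 5.0, 0.2], [1.2, 3.0, 0.1], [0.9, 4.0, 0.3], [1.1, 5.0, 0.2],
    [1.0, 3.0, 0.1], [1.3, 4.0, 0.3], [3.0, 4.0, 0.2], [3.2, 4.0, 0.2],
]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

# Stage 1: keep only the features that correlate with the class label.
kept = select_by_correlation(rows, labels, threshold=0.3)
reduced = [[row[j] for j in kept] for row in rows]

# Stage 2: balance the classes before training a classifier such as an MLP.
balanced_rows, balanced_labels = random_oversample(reduced, labels, seed=42)
print(kept, Counter(balanced_labels))
```

In practice, oversampling is usually applied only to the training split, so that duplicated rows do not leak into the evaluation data.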
Copyright (c) 2023 MENDEL
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
MENDEL open access articles are normally published under a Creative Commons Attribution-NonCommercial-ShareAlike (CC BY-NC-SA 4.0) license: https://creativecommons.org/licenses/by-nc-sa/4.0/. Under the CC BY-NC-SA 4.0 license, third-party reuse is permitted only for non-commercial purposes. Articles published under this license allow users to share, copy, and redistribute the material in any medium or format, and to adapt, remix, transform, and build upon the material, for non-commercial purposes only. Reuse requires appropriate attribution to the source of the material, a link to the license, and an indication of any changes made to the original material.