Predicting Football Match Outcomes with Machine Learning Approaches
Abstract
The growing use of data-driven approaches has led to the development of models for predicting football match outcomes. However, accurate prediction remains a challenge because of the sport's inherent unpredictability. In this study, we investigated the use of different machine learning models to predict the outcomes of English Premier League (EPL) matches. We assessed the performance of random forest, logistic regression, linear support vector classifier, and extreme gradient boosting models for binary and multiclass classification. The models were trained on datasets obtained using different sampling techniques. The results showed that, for binary classification, the models performed better when trained on a dataset obtained using a balanced sampling technique. Additionally, the models' predictions were evaluated by simulating football betting profits for the 2022-2023 EPL season. The binary-class random forest achieved the highest accuracy, whereas the binary-class logistic regression yielded the highest betting profit.
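The gap between the most accurate model and the most profitable one can be illustrated with a small sketch. It is not the simulation used in the study: the abstract does not state the staking strategy, so the flat unit stake, the decimal odds, the outcome labels "H" and "NH", and every name and number below are assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Bet:
    """One simulated wager. All fields are hypothetical, not from the study."""
    predicted: str       # model's predicted outcome, e.g. "H" (home win) or "NH" (not home win)
    actual: str          # observed match outcome
    decimal_odds: float  # bookmaker decimal odds offered for the predicted outcome


def accuracy(bets):
    """Fraction of bets where the prediction matched the actual outcome."""
    return sum(b.predicted == b.actual for b in bets) / len(bets)


def simulate_flat_stake(bets, stake=1.0):
    """Total profit when the same stake is placed on every prediction (assumed strategy)."""
    profit = 0.0
    for bet in bets:
        if bet.predicted == bet.actual:
            # A winning bet returns stake * odds, so the net gain is stake * (odds - 1).
            profit += stake * (bet.decimal_odds - 1.0)
        else:
            # A losing bet forfeits the stake.
            profit -= stake
    return profit


# Made-up predictions: model A is right more often but only on short-odds outcomes,
# model B is right less often but on longer-odds outcomes.
model_a = [Bet("H", "H", 1.5), Bet("H", "H", 1.5), Bet("H", "H", 1.5), Bet("NH", "H", 2.0)]
model_b = [Bet("NH", "NH", 3.0), Bet("NH", "NH", 3.0), Bet("H", "NH", 1.5), Bet("H", "NH", 1.5)]

for name, bets in (("A", model_a), ("B", model_b)):
    print(name, accuracy(bets), simulate_flat_stake(bets))
```

With these made-up inputs, model A is correct on three of four matches yet earns less than model B, which is correct on only two, because profit depends on the odds attached to each correct prediction as well as on how often the prediction is right.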
Copyright (c) 2023 MENDEL
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
MENDEL open access articles are normally published under a Creative Commons Attribution-NonCommercial-ShareAlike (CC BY-NC-SA 4.0) license: https://creativecommons.org/licenses/by-nc-sa/4.0/. Under the CC BY-NC-SA 4.0 license, third-party reuse is permitted only for non-commercial purposes. Articles published under this license allow users to share, copy, and redistribute the material in any medium or format, and to adapt, remix, transform, and build upon the material for non-commercial purposes. Reuse under the CC BY-NC-SA 4.0 license requires appropriate attribution to the source of the material, a link to the license, and an indication of any changes made to the original material.