Sport Result Prediction Using Classification Methods

Document Type: Original Article


1 Department of Computer Engineering, Faculty of Engineering, Golestan University, Gorgan, Iran

2 Department of Computer Engineering, Babol Branch, Islamic Azad University, Babol, Iran

3 Department of Computer Engineering, Shiraz Branch, Islamic Azad University, Shiraz, Iran


Traditionally, sport relied on the ability of the players, and little attention was paid to science and knowledge. Today, however, sport has become a profession and an industry, so using technology and data analysis to achieve goals is very important. Classification is a technique for assigning new incoming samples to classes. Sports also generate considerable data about each season, team, match and player. Applying classification to sports data helps managers and coaches predict match results, evaluate player performance, predict player injuries, identify sporting talent and evaluate match strategy. Many algorithms exist for predicting basketball results, tracking the health of players and determining match strategy against different opponents, all of which are of great help to coaches. In addition, a preprocessing procedure improves the quality of a dataset. In this paper, we apply classification methods to a sports dataset both with and without preprocessing. The results show that preprocessing improves the classification results.
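The comparison described in the abstract, classifying the same data with and without preprocessing, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not name the preprocessing steps or the classifiers, so the sketch uses min-max scaling and a 1-nearest-neighbour classifier, and the eight match records (points scored, turnover rate, result) are invented toy values chosen so that the effect of feature scale is visible. It does not reproduce the paper's dataset or results.

```python
import math

# Hypothetical toy records: (points scored, turnover rate) -> result.
# The values are invented for illustration only.
FEATURES = [
    [95, 0.10], [110, 0.12], [88, 0.15], [102, 0.14],
    [96, 0.25], [109, 0.28], [89, 0.30], [101, 0.26],
]
LABELS = ["win", "win", "win", "win", "loss", "loss", "loss", "loss"]


def min_max_scale(rows):
    """Rescale every feature column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def predict_1nn(train_x, train_y, query):
    """Return the label of the training sample closest to the query."""
    nearest = min(range(len(train_x)),
                  key=lambda i: math.dist(train_x[i], query))
    return train_y[nearest]


def leave_one_out_accuracy(xs, ys):
    """Hold out each sample in turn and classify it from the rest."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += predict_1nn(train_x, train_y, xs[i]) == ys[i]
    return hits / len(xs)


# Without preprocessing, the points column dominates the distance.
raw_acc = leave_one_out_accuracy(FEATURES, LABELS)

# With preprocessing. For brevity the scaling uses all samples at once;
# a careful evaluation would fit the scaler on each training fold only.
scaled_acc = leave_one_out_accuracy(min_max_scale(FEATURES), LABELS)

print(f"accuracy without preprocessing: {raw_acc:.2f}")
print(f"accuracy with min-max scaling:  {scaled_acc:.2f}")
```

In this toy data the result depends on the turnover rate, whose numeric range is far smaller than that of the points column, which is why a distance-based classifier benefits from scaling. Other classifiers, such as decision trees, are largely insensitive to feature scale, so the size of any improvement depends on the method and the preprocessing chosen.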

