Abstract:
Data analytics and machine learning are becoming established in sport, particularly in professional football. A growing number of professional clubs have teams dedicated to this area, and their work increasingly informs decision-making.
These decisions range from sporting and club management to the analysis of opponents during match preparation. This paper presents a state-of-the-art review of the use of these techniques, covering the evaluation of player performance, the identification of patterns of play, the optimization of strategies, and the management of resources within clubs.
The core of this project is the application of an unsupervised machine learning algorithm to cluster teams and players with similar characteristics. The study follows the typical phases of such a project. First, the existing data providers were surveyed. Next, a data preprocessing stage was carried out to obtain and normalize the statistics and metrics to be evaluated. The correlation between these metrics was then studied, and the K-Means algorithm was applied. Finally, the resulting clusters were analyzed to draw conclusions about the performance of the teams and players.
Additionally, an interactive application was developed in Streamlit. It provides a platform for visualizing a variety of metrics and statistics, making them easier to access and understand. Intended as a tool for coaches and analysts, it serves as a complement to data-driven decision-making.
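
The preprocessing, correlation, and clustering phases summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the team names, the two metrics (goals per match and possession percentage), their values, and the choice of two clusters are all hypothetical, and K-Means is written out in plain Python (Lloyd's algorithm) so that the example has no external dependencies. In practice a library implementation such as scikit-learn's `KMeans` would be a more likely choice.

```python
import math
import random
import statistics

# Hypothetical data: (goals per match, possession %) for six made-up teams.
teams = {
    "Team A": (2.1, 62.0), "Team B": (1.9, 58.5), "Team C": (2.3, 60.0),
    "Team D": (0.8, 41.0), "Team E": (1.0, 44.5), "Team F": (0.7, 43.0),
}

def zscore_columns(rows):
    """Standardize each column to mean 0 and population standard deviation 1."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) or 1.0 for c in columns]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def pearson_from_zscores(z_rows, i, j):
    """Pearson correlation of columns i and j, given population z-scores."""
    return statistics.fmean(row[i] * row[j] for row in z_rows)

def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm: assign points to the nearest centroid, then update centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster becomes empty
                centroids[c] = [statistics.fmean(dim) for dim in zip(*members)]
    return labels, centroids

names = list(teams)
z_rows = zscore_columns([teams[n] for n in names])   # normalization step
r = pearson_from_zscores(z_rows, 0, 1)               # correlation between the two metrics
labels, centroids = kmeans(z_rows, k=2)              # clustering step
clusters = dict(zip(names, labels))

print(f"correlation between metrics: {r:.2f}")
for name, label in clusters.items():
    print(f"{name}: cluster {label}")
```

Normalizing before clustering matters because K-Means relies on Euclidean distance: without it, a metric measured on a larger scale (here, possession percentage) would dominate the distance computation. The cluster labels themselves are arbitrary identifiers; only the grouping they induce is meaningful.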