March Madness Tournament Predictions Model: A Mathematical Modeling Approach
Christian McIver, Karla Avalos, Nikhil Nayak
TL;DR
The paper addresses predicting March Madness outcomes with an objective statistical approach. It fits a logistic-regression model on four predictors derived from efficiency metrics and a power rating to estimate win probabilities for head-to-head matchups, then runs Monte Carlo simulations of entire brackets. Performance is evaluated with naive matchup accuracy and with the Spearman correlation between predicted and actual final rounds; the paper reports around 74.6% test accuracy and bracket-level correlations ranging from about 0.37 to 0.75. The work demonstrates that a compact, interpretable feature set can yield competitive predictive power and highlights directions for incorporating time-varying statistics and other factors.
Abstract
This paper proposes a model to predict the outcome of the March Madness tournament based on historical NCAA basketball data since 2013. The framework is a simplification of the FiveThirtyEight NCAA March Madness prediction model that uses only four predictors: Adjusted Offensive Efficiency (ADJOE), Adjusted Defensive Efficiency (ADJDE), Power Rating, and Two-Point Shooting Percentage Allowed. A logistic regression on these metrics generates the probability that a particular team wins each game. A tournament simulation is then developed and compared to real-world March Madness brackets to determine the accuracy of the model. Accuracy is measured in two ways: a naive approach and the Spearman rank correlation coefficient.
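
As a rough illustration of the pipeline summarized above, the following Python sketch chains a logistic win probability, a single-elimination Monte Carlo simulation, and a Spearman rank correlation. It is a minimal sketch under stated assumptions, not the fitted model: the coefficient values, the choice to feed the regression the difference in each predictor (team A minus team B), the dictionary keys, and every function name are hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical coefficients on predictor differences (team A minus team B).
# These are placeholders, NOT values fitted in the paper.
COEFS = {"adjoe": 0.10, "adjde": -0.10, "power": 3.0, "two_pt_allowed": -0.05}
INTERCEPT = 0.0


def win_probability(team_a, team_b):
    """Logistic probability that team_a beats team_b."""
    z = INTERCEPT + sum(c * (team_a[k] - team_b[k]) for k, c in COEFS.items())
    return 1.0 / (1.0 + math.exp(-z))


def simulate_bracket(teams, rng):
    """Play one single-elimination bracket; teams are listed in bracket order
    and their count must be a power of two. Returns games won per team."""
    wins = {t["name"]: 0 for t in teams}
    alive = list(teams)
    while len(alive) > 1:
        next_round = []
        for i in range(0, len(alive), 2):
            a, b = alive[i], alive[i + 1]
            winner = a if rng.random() < win_probability(a, b) else b
            wins[winner["name"]] += 1
            next_round.append(winner)
        alive = next_round
    return wins


def monte_carlo(teams, n_sims=1000, seed=0):
    """Average games won per team over n_sims simulated brackets."""
    rng = random.Random(seed)
    totals = {t["name"]: 0 for t in teams}
    for _ in range(n_sims):
        for name, w in simulate_bracket(teams, rng).items():
            totals[name] += w
    return {name: total / n_sims for name, total in totals.items()}


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors.
    Assumes neither input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

To compare a simulated bracket with a real one under this sketch, one would pass each team's average simulated wins and its actual number of wins, in the same team order, to `spearman`.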
