Predicting the Outcomes of NCAA Women’s Sports
Abstract
Sports competitions provide excellent opportunities for model building and for applying basic statistical methodology in an interesting way. Men’s sports have received more attention and more research than women’s sports. This paper focuses on three NCAA women’s sports: basketball, volleyball and soccer. Several ordinary least squares models were developed that help explain the variation in the point spread of a women’s basketball, volleyball or soccer game based on in-game statistics. Several logistic models were also developed that help estimate the probability that a particular team will win a game in the women’s basketball, volleyball and soccer tournaments. To predict the winners of tournament games, separate models were developed for Round 1, Round 2 and Rounds 3-6 of each tournament, using differences in ranks of seasonal averages and differences of seasonal averages as explanatory variables. The ordinary least squares models use point spread as the dependent variable, and the logistic models estimate the probability of a team winning the game. The prediction models were validated before being used for prediction. For basketball, the least squares model developed by using differences in ranks of seasonal averages with a double scoring system variable correctly predicted the results of 76.2% of the games for the entire tournament, with all predictions made before the start of the tournament. For volleyball, the logistic model developed by using differences of seasonal averages correctly predicted 65.1% of the games for the entire tournament.
For soccer, the logistic regression model developed by using differences of seasonal averages correctly predicted 45% of all games in the tournament when all 6 rounds were predicted before the tournament began. In this case, a team predicted to win in the second round or later might not even have reached that round, since all predictions were made ahead of time.
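The two model types and the ahead-of-time bracket prediction described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the statistics (`fg_pct`, `rebounds`), the coefficient values, the four-team bracket and all function names are hypothetical, chosen only to show the mechanics. It also uses raw differences of seasonal averages and a single set of coefficients, whereas the paper also considers differences in ranks and fits separate models by round.

```python
import math

# Hypothetical coefficients applied to differences of seasonal averages.
# The source does not report fitted values; these are for illustration only.
OLS_COEFS = {"intercept": 0.0, "fg_pct": 1.2, "rebounds": 0.4}
LOGIT_COEFS = {"intercept": 0.0, "fg_pct": 0.15, "rebounds": 0.05}

# Hypothetical seasonal averages for a four-team bracket.
stats = {
    "A": {"fg_pct": 45.0, "rebounds": 40.0},
    "B": {"fg_pct": 41.0, "rebounds": 36.0},
    "C": {"fg_pct": 43.0, "rebounds": 39.0},
    "D": {"fg_pct": 44.0, "rebounds": 35.0},
}


def differences(team_a, team_b):
    """Difference of seasonal averages, team A minus team B."""
    return {k: team_a[k] - team_b[k] for k in team_a}


def linear_score(coefs, diffs):
    """Intercept plus the weighted sum of the difference variables."""
    return coefs["intercept"] + sum(coefs[k] * v for k, v in diffs.items())


def predicted_spread(diffs):
    """Least-squares style prediction: a positive spread favors team A."""
    return linear_score(OLS_COEFS, diffs)


def win_probability(diffs):
    """Logistic style prediction: estimated probability that team A wins."""
    return 1.0 / (1.0 + math.exp(-linear_score(LOGIT_COEFS, diffs)))


def predict_bracket(teams, stats):
    """Predict every round before play starts; winners advance on paper only."""
    rounds = []
    while len(teams) > 1:
        winners = []
        for a, b in zip(teams[0::2], teams[1::2]):
            p = win_probability(differences(stats[a], stats[b]))
            winners.append(a if p >= 0.5 else b)
        rounds.append(winners)
        teams = winners
    return rounds


if __name__ == "__main__":
    d = differences(stats["A"], stats["B"])
    print("Predicted spread, A vs B:", predicted_spread(d))
    print("Win probability, A vs B:", win_probability(d))
    print("Predicted winners by round:", predict_bracket(["A", "B", "C", "D"], stats))
```

Because `predict_bracket` advances predicted winners rather than actual winners, an upset in an early round makes every later prediction involving the eliminated team wrong by construction, which is the effect noted for the soccer results.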