The Impacts of Increasingly Complex Matchup Models on Baseball Win Probability

Tristan Mott, Caleb Bradshaw, David Grimsman, Christopher Archibald

TL;DR

The paper addresses how increasingly accurate matchup predictions can inform in-game baseball decisions and improve win probability. It develops four hierarchical Bayesian, log5-inspired frameworks that integrate pitcher and batter tendencies, recency of data, and batter-specific base running to predict plate-appearance outcomes and base-state transitions, then evaluates these within a game-theoretic setting to approximate subgame perfect Nash equilibria. Key findings show measurable win-probability gains from refined projections (roughly one additional win per 162 games in simulations) and demonstrate that recency weighting can have a sizable effect on optimal decisions, while base-running refinements mainly affect predictive alignment with markets. The work also compares model-driven win probabilities to sportsbook lines, finding broad market alignment and practical implications for on-field strategy and predictive analytics, with avenues for future work including home-field effects and fatigue modeling.

Abstract

Baseball is a game of strategic decisions including bullpen usage, pinch-hitting and intentional walks. Managers must adjust their strategies based on the changing state of the game in order to give their team the best chance of winning. In this thesis, we investigate how matchup models -- tools that predict the probabilities of plate appearance outcomes -- impact in-game strategy and ultimately affect win probability. We develop four progressively complex, hierarchical Bayesian models that predict plate appearance outcomes by combining information from both pitchers and batters, their handedness, and recent data, along with base running probabilities calibrated to a player's base-stealing tendencies. Using each model within a game-theoretic framework, we approximate subgame perfect Nash equilibria for in-game decisions, including substitutions and intentional walks. Simulations of the 2024 MLB postseason show that more accurate matchup models can yield tangible gains in win probability -- as much as one additional victory per 162-game season. Furthermore, employing the most detailed model to generate win predictions for actual playoff games demonstrates alignment with market expectations, underscoring both the power and potential of advanced matchup modeling for on-field strategy and prediction.
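
To make the idea of a matchup model concrete, the sketch below shows the classic multi-outcome log5 combination that the frameworks summarized above are described as being inspired by. It is not the authors' hierarchical Bayesian model. The function name `log5_multinomial`, the three-outcome simplification (the paper works with 9 plate appearance outcomes), and all of the rates are illustrative assumptions.

```python
def log5_multinomial(batter, pitcher, league):
    """Combine batter and pitcher outcome rates with the generalized log5 rule.

    Each argument maps an outcome name to a probability. The unnormalized
    weight for an outcome is batter * pitcher / league, and the weights are
    then rescaled so they sum to 1.
    """
    weights = {k: batter[k] * pitcher[k] / league[k] for k in league}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Hypothetical rates, chosen only for illustration (not taken from the paper).
batter = {"out": 0.65, "walk": 0.10, "hit": 0.25}
pitcher = {"out": 0.70, "walk": 0.08, "hit": 0.22}
league = {"out": 0.68, "walk": 0.09, "hit": 0.23}

matchup = log5_multinomial(batter, pitcher, league)
for outcome, prob in matchup.items():
    print(f"{outcome}: {prob:.3f}")
```

One useful sanity check on this rule: a batter with exactly league-average rates leaves the pitcher's rates unchanged, so any deviation in the matchup prediction comes from how the two players differ from the league.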

Paper Structure

This paper contains 11 sections, 6 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Base probability prior distributions for the 9 plate appearance outcomes, estimated from pitchers with more than 100 plate appearances since 2015.
  • Figure 2: Handedness offset prior distributions, capturing how pitcher performance differs between same-handed and opposite-handed matchups.
  • Figure 3: Posterior distributions of Added Wins Per Season for each model, estimated via Monte Carlo simulation.
  • Figure 4: Actual ROI that would have been made at different cushion thresholds, along with 90% confidence intervals, assuming our model's predictions are correct.
  • Figure 5: Betting lines vs. our predicted win rates. Blue dots mark the overround, white dots mark unplaced bets, green dots represent winning bets, and red dots represent losing bets. Dotted lines denote cushion thresholds. Teams that actually won are shown in bold.
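
The overround mentioned in the Figure 5 caption is the amount by which a sportsbook's implied probabilities for the two sides of a game sum to more than 1. The sketch below shows one common way to compute it and to rescale the line into win probabilities that can be compared with a model's predictions. The American odds format, the proportional rescaling, and the function names are assumptions made for illustration; the page does not state which convention the authors used.

```python
def implied_probability(american_odds):
    """Convert American moneyline odds to the bookmaker's implied probability."""
    if american_odds < 0:
        # Favorite: risk |odds| to win 100.
        return -american_odds / (-american_odds + 100.0)
    # Underdog: risk 100 to win odds.
    return 100.0 / (american_odds + 100.0)


def remove_overround(home_odds, away_odds):
    """Return (home_fair, away_fair, overround) using proportional rescaling."""
    home_raw = implied_probability(home_odds)
    away_raw = implied_probability(away_odds)
    book_sum = home_raw + away_raw
    overround = book_sum - 1.0
    return home_raw / book_sum, away_raw / book_sum, overround


# Hypothetical line, chosen only for illustration.
home_fair, away_fair, overround = remove_overround(-150, 130)
print(f"home {home_fair:.3f}, away {away_fair:.3f}, overround {overround:.3f}")
```

Proportional rescaling is the simplest choice; other approaches distribute the bookmaker's margin unevenly between favorites and underdogs.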