Improving S&P 500 Volatility Forecasting through Regime-Switching Methods
Ava C. Blake, Nivika A. Gandhi, Anurag R. Jakkula
TL;DR
The paper tackles volatility forecasting for the S&P 500 by introducing regime-switching HAR frameworks that use soft regime probabilities and clustering to capture structural market shifts. It analyzes three approaches—Markov regime-switching, distributional clustering via Wasserstein-based segmentation, and coefficient-based soft clustering—with rolling-window training and recursive forecasting, including a dual-recursive HAR-VIX architecture to exploit forward-looking sentiment. Across pre-, during-, and post-COVID periods, the coefficient-based soft clustering consistently yields the lowest forecasting errors, with dual-recursive implementations outperforming single-recursive ones and VIX enhancements improving responsiveness to regime changes. The findings highlight the value of regime-aware, soft-clustering methods for short-horizon volatility forecasting and suggest potential generalization to other asset classes and regimes.
Abstract
Accurate prediction of financial market volatility is critical for risk management, derivatives pricing, and investment strategy. In this study, we propose several regime-switching methods that improve the prediction of S&P 500 volatility by capturing structural changes in the market over time. We use eleven years of SPX data, from May 1, 2014 to May 27, 2025, to compute daily realized volatility (RV) from 5-minute intraday log returns, adjusted for irregular trading days. To enhance forecast accuracy, we engineer features that capture both historical dynamics and forward-looking market sentiment across regimes. The regime-switching methods are: a Markov switching algorithm that estimates soft regime probabilities; a distributional spectral clustering method that uses XGBoost to assign clusters at prediction time; and a coefficient-based soft regime algorithm that segments the series with the Mood test, extracts HAR coefficients from each segment, clusters those coefficients with a Bayesian GMM to obtain soft regime weights, and uses XGBoost to predict regime probabilities. We evaluate the models over three time periods: before, during, and after the COVID-19 pandemic. The coefficient-based clustering algorithm outperformed all other models, including the baseline autoregressive model, in every period. We also evaluate each model's recursive forecasting performance at 5- and 10-day horizons in each period. The findings demonstrate the value of regime-aware modeling frameworks and soft clustering approaches in improving volatility forecasting, especially during periods of heightened uncertainty and structural change.
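As a rough illustration of the quantities named above, the following Python sketch computes a daily RV from 5-minute prices, builds HAR-style lagged features, and blends per-regime HAR forecasts using soft regime probabilities. It is a minimal sketch under stated assumptions, not the implementation used in this study: RV is taken as the square root of the sum of squared 5-minute log returns, the HAR windows are the conventional 1, 5, and 22 trading days, the blend is a simple probability-weighted average, and the adjustment for irregular trading days is omitted. All function names are hypothetical.

```python
import numpy as np

def realized_volatility(prices):
    """Daily RV from one day's 5-minute prices (assumed definition):
    square root of the sum of squared intraday log returns."""
    prices = np.asarray(prices, dtype=float)
    log_returns = np.diff(np.log(prices))
    return float(np.sqrt(np.sum(log_returns ** 2)))

def har_features(rv, weekly=5, monthly=22):
    """Build HAR regressors from a daily RV series.
    Each row is [RV_t, mean RV over the last 5 days, mean RV over the
    last 22 days]; the target is RV_{t+1}. Window lengths are the
    conventional HAR choices, assumed here."""
    rv = np.asarray(rv, dtype=float)
    rows, targets = [], []
    for t in range(monthly - 1, len(rv) - 1):
        rows.append([
            rv[t],
            rv[t - weekly + 1 : t + 1].mean(),
            rv[t - monthly + 1 : t + 1].mean(),
        ])
        targets.append(rv[t + 1])
    return np.array(rows), np.array(targets)

def soft_regime_forecast(x, regime_coefs, regime_probs):
    """Probability-weighted blend of per-regime HAR forecasts (assumed form).
    regime_coefs has shape (K, 4): intercept plus three HAR slopes per regime.
    regime_probs has shape (K,) and should sum to 1."""
    x1 = np.concatenate(([1.0], np.asarray(x, dtype=float)))
    per_regime = np.asarray(regime_coefs, dtype=float) @ x1
    return float(np.asarray(regime_probs, dtype=float) @ per_regime)
```

In this sketch, `regime_coefs` would come from HAR regressions fitted per regime and `regime_probs` from whichever regime model is in use (Markov switching, or a classifier such as XGBoost); both are left as inputs because their estimation is not specified at this level of detail here.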
