Hybrid Hidden Markov Model for Modeling Equity Excess Growth Rate Dynamics: A Discrete-State Approach with Jump-Diffusion

Abdulrahman Alswaidan, Jeffrey D. Varner

TL;DR

A hybrid hidden Markov framework discretizes continuous excess growth rates into Laplace quantile-defined market states and augments regime switching with a Poisson-driven jump-duration mechanism to enforce realistic tail-state dwell times. Among the models compared, it offers the best joint quality profile across distributional, temporal, and tail-coverage metrics.

Abstract

Generating synthetic financial time series that preserve statistical properties of real market data is essential for stress testing, risk model validation, and scenario design. Existing approaches, from parametric models to deep generative networks, struggle to simultaneously reproduce heavy-tailed distributions, negligible linear autocorrelation, and persistent volatility clustering. We propose a hybrid hidden Markov framework that discretizes continuous excess growth rates into Laplace quantile-defined market states and augments regime switching with a Poisson-driven jump-duration mechanism to enforce realistic tail-state dwell times. Parameters are estimated by direct transition counting, bypassing the Baum-Welch EM algorithm. Synthetic data quality is evaluated using Kolmogorov-Smirnov and Anderson-Darling pass rates for distributional fidelity, and ACF mean absolute error for temporal structure. Applied to ten years of SPY data across 1,000 simulated paths, the framework achieves KS and AD pass rates exceeding 97% and 91% in-sample and 94% out-of-sample (calendar year 2025), partially reproducing the ARCH effect that standard regime-switching models miss. No single model dominates all quality dimensions: GARCH(1,1) reproduces volatility clustering more accurately but fails distributional tests (5.5% KS pass rate), while the standard HMM without jumps achieves higher distributional fidelity but cannot generate persistent high-volatility regimes. The proposed framework offers the best joint quality profile across distributional, temporal, and tail-coverage metrics. A Single-Index Model extension propagates the SPY factor path to a 424-asset universe, enabling scalable correlated synthetic path generation while preserving cross-sectional correlation structure.
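The discretization and counting steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes equal-probability Laplace quantile bins, a maximum-likelihood Laplace fit (median and mean absolute deviation), and a uniform fallback for states that are never visited. The function names and the toy input series are hypothetical.

```python
import bisect
import math
import random
import statistics


def laplace_quantile(p, mu, b):
    """Inverse CDF of a Laplace(mu, b) distribution for 0 < p < 1."""
    if p < 0.5:
        return mu + b * math.log(2.0 * p)
    return mu - b * math.log(2.0 * (1.0 - p))


def fit_laplace(x):
    """Maximum-likelihood fit: median as location, mean absolute deviation as scale."""
    mu = statistics.median(x)
    b = sum(abs(v - mu) for v in x) / len(x)
    return mu, b


def discretize(x, n_states):
    """Assign each observation to one of n_states Laplace quantile bins.

    Assumption: bins have equal probability under the fitted Laplace law.
    """
    mu, b = fit_laplace(x)
    edges = [laplace_quantile(k / n_states, mu, b) for k in range(1, n_states)]
    return [bisect.bisect_right(edges, v) for v in x]


def count_transitions(states, n_states):
    """Row-normalized transition matrix from counts of consecutive state pairs."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(states[:-1], states[1:]):
        counts[i][j] += 1
    T = []
    for row in counts:
        total = sum(row)
        # Assumption: a never-visited state gets a uniform row.
        T.append([c / total for c in row] if total else [1.0 / n_states] * n_states)
    return T


rng = random.Random(0)
# Toy Laplace-distributed series (difference of two exponentials), not SPY data.
growth = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(2000)]
states = discretize(growth, n_states=10)
T = count_transitions(states, n_states=10)
```

Because the states are read directly from the quantile bins, the transition matrix follows from a single pass over the data, which is why no EM iteration is needed in this kind of scheme.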

Paper Structure (26 sections, 10 equations, 7 figures, 6 tables, 4 algorithms)

Figures (7)

  • Figure 1: Empirical stylized facts of SPY daily excess growth rates (2014--2024). Panel (a) shows the leptokurtic marginal distribution; the Laplace fit substantially outperforms the Gaussian. Panel (b) confirms heavy tails via a normal Q-Q plot. Panels (c) and (d) contrast the near-zero autocorrelation of raw returns (consistent with the efficient markets hypothesis) against the persistent autocorrelation of absolute returns, motivating the jump-duration extension.
  • Figure 2: Architecture of the Hybrid Hidden Markov Model with Jump-Diffusion (HMM-WJ). At each step the chain transitions according to the empirical transition matrix $\mathbf{T}$ (probability $1-\epsilon$, thin bidirectional arrows) or enters a Poisson jump episode (probability $\epsilon$, dashed upward arrow to the Poisson clock). During a jump episode the active state is forced to the bottom tail set $\mathcal{S}_{-}$ (red, lowest-valued states) or the top tail set $\mathcal{S}_{+}$ (blue, highest-valued states) for $K\!\sim\!\mathrm{Poisson}(\lambda)$ consecutive steps before normal Markovian evolution resumes. Small bell curves above each state depict the state-conditional location-scale Student-t ($\nu=5$) emission distribution. Tail sets are defined as the $N_{\rm tail}$ lowest- and highest-quantile states under the Laplace quantile partition.
  • Figure 3: Head-to-head in-sample model comparison for SPY ($N=100$, 1,000 simulated paths). Panel (a): marginal density of excess growth rates with IS KS pass rates annotated. GARCH(1,1) fails the two-sample KS test on roughly 95% of paths (pass rate 5.5%), indicating a systematic shape mismatch with the empirical distribution. Both HMM variants pass at rates $\geq\!97\%$, confirming that the Student-t emission structure adequately captures the observed heavy tails. Out-of-sample, HMM-WJ maintains a 94% KS pass rate, though HMM-NJ achieves a higher 97%; GARCH recovers partially (OoS 80%). Panel (b): autocorrelation function of absolute excess growth rates, $\mathrm{ACF}(|G_t|)$, at lags 1--252 (shaded bands: 10th--90th percentile across paths). HMM-NJ is structurally incapable of producing persistent volatility clustering: without a jump mechanism the latent states are i.i.d. conditional on the Markov chain, so $\mathrm{ACF}(|G_t|)\approx 0$ for all lags beyond one (dotted curve). HMM-WJ generates a mixture of two path families under the same model: approximately 76% of IS paths contain no Poisson jump and therefore behave identically to HMM-NJ (dashed curve, near zero); the remaining $\sim$24% of paths contain at least one jump event and sustain substantial $\mathrm{ACF}(|G_t|)$ decay across all lags (solid navy curve $\pm$ band). The jump frequency, and hence the fraction of the ensemble exhibiting slow ACF decay, is controlled by the tail-entry probability $\epsilon$, making the volatility-clustering strength directly tunable. Panel (c): tail Q-Q plot over the 0.1st--99.9th percentile region; HMM-WJ provides the closest mean quantile match in the extreme tails.
  • Figure 4: Statistical validation of HMM-WJ ($N=100$, 1,000 simulated paths, $\alpha=0.05$). Panels (a) and (b) show the distribution of two-sample KS $p$-values in-sample and out-of-sample, respectively; a well-calibrated generative model produces $p$-values that are approximately uniform above the significance threshold. The leftward shift from (a) to (b) reflects the expected distributional degradation when stationary parameters are applied to a regime with elevated macro uncertainty. Panel (c) shows the out-of-sample marginal density fan chart; the observed density (dashed red) falls within the 10th--90th percentile simulation envelope, confirming adequate distributional coverage despite the regime shift. Panel (d) shows the ACF of $|G_t|$ in the out-of-sample window. The observed ACF (dashed red) exceeds the simulation band at medium-to-long lags, indicating that the 2025 test period exhibited stronger volatility clustering than the IS-calibrated jump parameters $(\epsilon^*,\lambda^*)$ predict. This gap provides direct evidence that the jump hyperparameters are regime-dependent and motivates the time-varying extensions discussed in Section \ref{sec:discussion}.
  • Figure 5: Multi-asset extension via the Single-Index Model (SIM) across 424 S&P 500 constituents. Panel (a) summarizes the SIM regression fit quality by GICS sector. Panel (b) shows that KS distributional consistency is maintained across the full asset universe when paths are generated via the factor structure $\hat{G}_{i,t} = \hat{\alpha}_i + \hat{\beta}_i G_{{\rm SPY},t} + \hat{\eta}_{i,t}$. Panel (c) illustrates representative marginal density comparisons for three assets spanning a wide range of systematic risk exposure.
  • ...and 2 more figures
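
The state dynamics described in the Figure 2 caption can be sketched as follows. This is an illustration under stated assumptions rather than the paper's algorithm: the caption does not say how the tail side or the state within a tail set is chosen, so the sketch picks the side with probability 1/2 and redraws the state uniformly within the chosen set at each forced step. Emissions are omitted, and `simulate_states`, `sample_poisson`, and the uniform toy matrix are hypothetical.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw K ~ Poisson(lam) with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_states(T, n_steps, eps, lam, n_tail, rng):
    """Simulate a state path with Poisson-duration tail episodes.

    Returns the path and a parallel list flagging steps forced by a jump episode.
    """
    n = len(T)
    lower = list(range(n_tail))          # S_-: lowest-valued states
    upper = list(range(n - n_tail, n))   # S_+: highest-valued states
    state = rng.randrange(n)             # assumption: uniform initial state
    path, in_jump = [], []
    remaining, tail = 0, lower
    for _ in range(n_steps):
        if remaining == 0 and rng.random() < eps:
            # Enter a jump episode lasting K ~ Poisson(lam) steps.
            remaining = sample_poisson(lam, rng)
            # Assumption: bottom and top tail sets are equally likely.
            tail = lower if rng.random() < 0.5 else upper
        if remaining > 0:
            # Assumption: state is redrawn uniformly within the tail set.
            state = rng.choice(tail)
            remaining -= 1
            in_jump.append(True)
        else:
            # Ordinary Markov step using row `state` of the transition matrix.
            state = rng.choices(range(n), weights=T[state])[0]
            in_jump.append(False)
        path.append(state)
    return path, in_jump


# Toy uniform transition matrix; a fitted matrix would be used in practice.
T_toy = [[0.1] * 10 for _ in range(10)]
path, in_jump = simulate_states(T_toy, n_steps=500, eps=0.02, lam=5.0,
                                n_tail=2, rng=random.Random(42))
```

In this sketch a draw of $K=0$ simply falls through to an ordinary Markov step, and setting `eps=0` recovers a plain Markov chain with no forced tail episodes.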