Modeling Market States with Clustering and State Machines
Christian Oliva, Silviu Gabriel Tinjala
TL;DR
The paper proposes an interpretable and robust framework for modeling financial market states. It clusters multi-horizon momentum and risk features to identify market regimes, then builds a probabilistic state machine from a transition matrix $M \in \mathbb{Z}_+^{K\times K}$. Returns are generated as a state-weighted Gaussian mixture $R \sim \sum_{i=1}^K c_i \mathcal{N}(\mu_i,\sigma_i)$ with weights $c_i$ derived from state frequencies, which lets the model capture higher moments. Empirically, the state-machine approach matches skewness and kurtosis better than a normal model and yields lower distributional distances (KL, KS, Wasserstein) to real returns. The results are robust across assets and time periods, with the best performance around $K\approx 10$. The framework offers an interpretable, regime-aware tool for signal generation and risk management, with potential extensions to volume, macro indicators, and multi-asset regime-aware allocation.
Abstract
This work introduces a new framework for modeling financial markets through an interpretable probabilistic state machine. By clustering historical returns based on momentum and risk features across multiple time horizons, we identify distinct market states that capture underlying regimes, such as expansion, contraction, crisis, or recovery. From a transition matrix representing the dynamics between these states, we construct a probabilistic state machine that models the temporal evolution of the market. This state machine enables the generation of a custom distribution of returns based on a mixture of Gaussian components weighted by state frequencies. We show that the proposed benchmark significantly outperforms the traditional approach in capturing key statistical properties of asset returns, including skewness and kurtosis, and our experiments across randomly selected assets and time periods confirm its robustness.
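The pipeline described above can be illustrated with a minimal sketch. It assumes that state labels have already been produced by some clustering step, and it takes the mixture weights $c_i$ to be the empirical state frequencies and $\sigma_i$ to be a per-state standard deviation; both are illustrative assumptions, and the paper's exact estimators may differ. All function names, the toy labels, and the toy returns are hypothetical.

```python
import numpy as np

def transition_counts(states, k):
    """Count observed transitions i -> j between consecutive state labels."""
    m = np.zeros((k, k), dtype=np.int64)
    # np.add.at accumulates correctly even when an (i, j) pair repeats.
    np.add.at(m, (states[:-1], states[1:]), 1)
    return m

def row_normalize(counts):
    """Turn a count matrix into row-stochastic transition probabilities."""
    totals = counts.sum(axis=1, keepdims=True)
    out = np.zeros(counts.shape, dtype=float)
    # Rows with no outgoing transitions are left as zeros.
    return np.divide(counts, totals, out=out, where=totals > 0)

def fit_mixture(returns, states, k):
    """Per-state Gaussian parameters and frequency-based weights (assumed)."""
    weights = np.bincount(states, minlength=k) / len(states)
    mu = np.zeros(k)
    sigma = np.zeros(k)
    for i in range(k):
        r = returns[states == i]
        if r.size > 0:
            mu[i] = r.mean()
            sigma[i] = r.std()
    return weights, mu, sigma

def sample_mixture(weights, mu, sigma, n, rng):
    """Draw n returns: pick a component by weight, then sample its Gaussian."""
    comp = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(mu[comp], sigma[comp])

# Toy data: K = 3 hypothetical states and made-up daily returns.
K = 3
states = np.array([0, 0, 1, 2, 1, 0, 0, 2, 2, 1])
returns = np.array([0.010, 0.012, -0.004, -0.030, -0.002,
                    0.008, 0.011, -0.025, -0.040, 0.001])

M = transition_counts(states, K)   # integer count matrix, as in M above
P = row_normalize(M)               # transition probabilities
weights, mu, sigma = fit_mixture(returns, states, K)

rng = np.random.default_rng(0)
samples = sample_mixture(weights, mu, sigma, 1000, rng)
```

Because the components have different means and variances, the sampled distribution can be skewed and heavy-tailed even though each component is Gaussian, which is the property a single normal model lacks.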
