End-to-End Policy Learning of a Statistical Arbitrage Autoencoder Architecture
Fabian Krause, Jan-Peter Calliess
TL;DR
This work generalises statistical arbitrage (StatArb) by replacing the residuals of traditional PCA-based factor models with Autoencoder-derived residuals and embedding the Autoencoder in an end-to-end neural trading policy. It systematically compares Autoencoder residual generation against Fama-French and PCA baselines and introduces an end-to-end AE-StatArb policy that jointly optimises the residual representation and the risk-adjusted return. On US equity data (2000–2022), the Autoencoder approaches achieve competitive pre-cost performance, and the learned policy shows superior risk-adjusted returns. The findings are also consistent with the literature on the optimal number of latent factors (roughly 10–15). The study highlights the potential of integrated, end-to-end training to reduce modelling risk and streamline StatArb pipeline design, and it outlines concrete avenues for improving after-cost performance and intraday applicability.
Abstract
In Statistical Arbitrage (StatArb), classical mean-reversion trading strategies typically hinge on asset-pricing or PCA-based models to identify the mean of a synthetic asset. Once such a (linear) model is identified, a separate mean-reversion strategy is devised to generate a trading signal. With a view to generalising this approach and making it truly data-driven, we study the utility of Autoencoder architectures in StatArb. As a first approach, we employ a standard Autoencoder trained on US stock returns to derive trading strategies based on the Ornstein-Uhlenbeck (OU) process. To further enhance this model, we take a policy-learning approach and embed the Autoencoder network into a neural network representation of a space of portfolio trading policies. This integrated network outputs portfolio allocations directly and is end-to-end trainable by backpropagation of the risk-adjusted returns of the neural policy. Our findings demonstrate that this innovative end-to-end policy learning approach not only simplifies the strategy development process, but also yields higher gross returns than its competitors, illustrating the potential of end-to-end training over classical two-stage approaches.
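To make the OU-based second stage of the classical two-stage pipeline more concrete, the following is a minimal sketch of how a mean-reversion signal might be derived from a residual series. It assumes the common approach of fitting an AR(1) regression to the cumulative residual and mapping the coefficients to OU parameters. The abstract does not specify the estimation procedure, so this choice, along with the names `fit_ou`, `s_score`, and `simulate_ou`, is an illustrative assumption rather than the authors' implementation. The residual series here is simulated; in practice it would come from a factor model or an Autoencoder.

```python
import math
import random


def fit_ou(x, dt=1.0):
    """Fit X[t+1] = a + b*X[t] + eps by least squares and map to OU parameters.

    Assumed model (not taken from the paper): dX = kappa*(mu - X)*dt + sigma*dW,
    whose exact discretisation has b = exp(-kappa*dt) and a = mu*(1 - b).
    """
    prev, nxt = x[:-1], x[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_next = sum(nxt) / n
    cov = sum((p - mean_prev) * (q - mean_next) for p, q in zip(prev, nxt))
    var = sum((p - mean_prev) ** 2 for p in prev)
    b = cov / var
    a = mean_next - b * mean_prev
    if not 0.0 < b < 1.0:
        raise ValueError("no mean reversion detected (need 0 < b < 1)")
    # Unbiased variance of the regression residuals (two fitted coefficients).
    resid_var = sum((q - a - b * p) ** 2 for p, q in zip(prev, nxt)) / (n - 2)
    kappa = -math.log(b) / dt                        # mean-reversion speed
    mu = a / (1.0 - b)                               # long-run mean
    sigma_eq = math.sqrt(resid_var / (1.0 - b * b))  # stationary std. dev.
    return {"kappa": kappa, "mu": mu, "sigma_eq": sigma_eq}


def s_score(x, params):
    """Standardised distance of the latest value from the estimated mean."""
    return (x[-1] - params["mu"]) / params["sigma_eq"]


def simulate_ou(n, kappa, mu, sigma, dt=1.0, seed=0):
    """Simulate an OU path with its exact discretisation (synthetic test data)."""
    rng = random.Random(seed)
    b = math.exp(-kappa * dt)
    step_std = sigma * math.sqrt((1.0 - b * b) / (2.0 * kappa))
    x = [mu]
    for _ in range(n - 1):
        x.append(mu + b * (x[-1] - mu) + step_std * rng.gauss(0.0, 1.0))
    return x


if __name__ == "__main__":
    path = simulate_ou(n=5000, kappa=0.5, mu=0.0, sigma=1.0)
    params = fit_ou(path)
    print(params, s_score(path, params))
```

In a two-stage strategy of this kind, a strongly negative s-score would typically be read as a signal to go long the residual portfolio and a strongly positive one as a signal to go short, with thresholds chosen separately. The end-to-end policy described in the abstract replaces this hand-designed step with a network that outputs allocations directly.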
