Attention Factors for Statistical Arbitrage
Elliot L. Epstein, Rose Wang, Jaewon Choi, Markus Pelger
TL;DR
The paper proposes an end-to-end Attention Factor Model for statistical arbitrage that jointly learns tradable factors and arbitrage portfolio allocations from firm-characteristic embeddings. It combines conditional latent factors with a LongConv sequence model that extracts time-series mispricing signals from the factor residuals, and it trains the whole pipeline on an objective measured after transaction costs. On 24 years of U.S. equity data, the model reaches an out-of-sample annualized Sharpe ratio above 4 before trading costs and about 2.3 net of costs, outperforming PCA-based and OU-threshold benchmarks. The findings highlight the importance of weak factors, show that end-to-end optimization with a cost-aware objective improves profitability, and indicate that the learned factors are interpretable because they align with industry structure.
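To make the before-cost and net-of-cost Sharpe ratios concrete, here is a minimal sketch of how such a figure could be computed. It assumes daily data, a 252-day annualization factor, and a trading cost proportional to turnover. The function names, the toy numbers, and the `cost_rate` value are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, stdev

TRADING_DAYS = 252  # assumed annualization factor for daily returns


def net_returns(weights, asset_returns, cost_rate=0.0005):
    """Daily portfolio returns after a proportional cost on turnover.

    weights[t] is the allocation held over day t and asset_returns[t]
    holds the asset returns realized on that day. cost_rate is an
    assumed cost per unit of absolute weight change.
    """
    out = []
    prev = [0.0] * len(weights[0])  # start from a flat position
    for w, r in zip(weights, asset_returns):
        gross = sum(wi * ri for wi, ri in zip(w, r))
        turnover = sum(abs(wi - pi) for wi, pi in zip(w, prev))
        out.append(gross - cost_rate * turnover)
        prev = w
    return out


def annualized_sharpe(returns):
    """Mean over sample standard deviation, scaled by sqrt(252)."""
    return mean(returns) / stdev(returns) * math.sqrt(TRADING_DAYS)


# Toy long-short example with two assets over three days (made-up numbers).
weights = [[0.5, -0.5], [0.5, -0.5], [-0.5, 0.5]]
asset_returns = [[0.01, -0.01], [0.0, 0.02], [0.01, 0.03]]

gross = net_returns(weights, asset_returns, cost_rate=0.0)
net = net_returns(weights, asset_returns, cost_rate=0.001)
```

In this toy example the net series is lower than the gross series on every day with a position change, so the net Sharpe ratio falls below the gross one. A cost-aware objective of this kind penalizes high-turnover allocations during training rather than after the fact.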
Abstract
Statistical arbitrage exploits temporal price differences between similar assets. We develop a framework to jointly identify similar assets through factors, identify mispricing, and form a trading policy that maximizes risk-adjusted performance after trading costs. Our Attention Factors are conditional latent factors that are the most useful for arbitrage trading. They are learned from firm characteristic embeddings that allow for complex interactions. We identify time-series signals from the residual portfolios of our factors with a general sequence model. Estimating factors and the arbitrage trading strategy jointly is crucial to maximize profitability after trading costs. In a comprehensive empirical study, we show that our Attention Factor model achieves an out-of-sample Sharpe ratio above 4 on the largest U.S. equities over a 24-year period. Our one-step solution yields an unprecedented Sharpe ratio of 2.3 net of transaction costs. We show that weak factors are important for arbitrage trading.
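The notion of a residual portfolio can be sketched as the part of asset returns that the factors do not explain. The example below assumes a plain linear factor model with fixed factor-portfolio weights `W` and loadings `B` (hypothetical names and toy values). It is not the paper's attention-based construction, which learns these quantities from firm characteristic embeddings.

```python
def factor_returns(factor_weights, asset_returns):
    """Factor returns as portfolios of assets: f = W r, with W of shape K x N."""
    return [sum(w * r for w, r in zip(row, asset_returns))
            for row in factor_weights]


def residual_returns(loadings, factor_weights, asset_returns):
    """Residuals eps = r - B f, with loadings B of shape N x K."""
    f = factor_returns(factor_weights, asset_returns)
    return [r - sum(b * fk for b, fk in zip(row, f))
            for r, row in zip(asset_returns, loadings)]


# Toy example (assumed): one equal-weighted "market" factor with unit loadings.
W = [[1 / 3, 1 / 3, 1 / 3]]
B = [[1.0], [1.0], [1.0]]
r = [0.02, 0.01, 0.03]

eps = residual_returns(B, W, r)  # approximately [0.0, -0.01, 0.01]
```

In this toy setup the residuals have zero exposure to the factor, since applying `W` to `eps` gives approximately zero. A sequence model would then look for time-series patterns in a history of such residuals.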
