Large (and Deep) Factor Models
Bryan Kelly, Boris Kuznetsov, Semyon Malamud, Teng Andrea Xu
TL;DR
The paper uses Neural Tangent Kernel (NTK) theory to connect large factor models with wide, deep neural networks (DNNs) trained to maximize the Sharpe ratio of the stochastic discount factor (SDF). In the infinite-width limit, the DNN-SDF admits a closed-form, kernel-based representation (the LFM-SDF) built from a large collection of non-linear characteristics and past market states. Depth and initialization encode inductive biases that interact with data availability to determine out-of-sample performance, and spectral regularization emerges from the gradient descent dynamics. Empirically, deeper networks deliver meaningful out-of-sample improvements when data are ample (e.g., longer rolling windows), consistent with the virtue of depth complexity, while shallower models may suffice when data are limited. In the kernel regime, the DNN-SDF behaves like an analyzable factor model with interpretable kernel-based features.
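The statement that spectral regularization emerges from gradient descent can be illustrated in the simplest linear case. The sketch below is not the paper's derivation. It assumes a fixed panel of factor returns `F` (a hypothetical name), the quadratic objective `0.5 * mean((1 - F @ lam)**2)`, and plain gradient descent started at zero. Under those assumptions, stopping after `k` steps is equivalent to applying the filter `(1 - (1 - lr*s)**k) / s` to each eigenvalue `s` of the factor second-moment matrix, so directions with small eigenvalues are shrunk the most relative to the unregularized solution.

```python
import numpy as np

def gd_factor_weights(F, lr, n_steps):
    """Gradient descent from zero on 0.5 * mean((1 - F @ lam)**2).

    F is a (T, P) panel of factor returns (illustrative input)."""
    T, P = F.shape
    lam = np.zeros(P)
    for _ in range(n_steps):
        # Gradient equals A @ lam - b with A = F'F/T and b = mean(F).
        grad = -F.T @ (1.0 - F @ lam) / T
        lam -= lr * grad
    return lam

def spectral_filter_weights(F, lr, n_steps):
    """Same weights, written as a filter on the eigenvalues of A = F'F/T."""
    T = F.shape[0]
    A = F.T @ F / T
    b = F.mean(axis=0)
    s, U = np.linalg.eigh(A)
    shrink = 1.0 - (1.0 - lr * s) ** n_steps
    # As s -> 0 the filter tends to lr * n_steps, so guard the division.
    filt = np.where(s > 1e-12, shrink / np.maximum(s, 1e-12), lr * n_steps)
    return U @ (filt * (U.T @ b))

# Usage on simulated data (assumed, for illustration only).
rng = np.random.default_rng(0)
F_demo = 0.01 + 0.1 * rng.standard_normal((200, 10))
lam_gd = gd_factor_weights(F_demo, lr=1.0, n_steps=50)
lam_sf = spectral_filter_weights(F_demo, lr=1.0, n_steps=50)
```

In this toy setting the number of steps plays the role of an inverse regularization strength: few steps shrink heavily, and many steps approach the unregularized weights.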
Abstract
We open up the black box behind Deep Learning for portfolio optimization and prove that a sufficiently wide and arbitrarily deep neural network (DNN) trained to maximize the Sharpe ratio of the Stochastic Discount Factor (SDF) is equivalent to a large factor model (LFM): A linear factor pricing model that uses many non-linear characteristics. The nature of these characteristics depends on the architecture of the DNN in an explicit, tractable fashion. This makes it possible to derive end-to-end trained DNN-based SDFs in closed form for the first time. We evaluate LFMs empirically and show how various architectural choices impact SDF performance. We document the virtue of depth complexity: With enough data, the out-of-sample performance of DNN-SDF is increasing in the NN depth, saturating at huge depths of around 100 hidden layers.
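To make "a linear factor pricing model that uses many non-linear characteristics" concrete, here is a minimal sketch. It is not the paper's closed-form NTK construction. It assumes that random ReLU features stand in for the architecture-dependent characteristics, that factors are characteristic-managed portfolios, and that SDF weights come from a ridge-penalized Markowitz problem. The function name, arguments, and the ridge penalty are all hypothetical choices made for illustration.

```python
import numpy as np

def lfm_sdf_weights(X, R, n_features=256, ridge=1e-3, seed=0):
    """Toy large factor model.

    X: (T, N, d) stock characteristics observed at time t.
    R: (T, N) excess returns realized at time t+1.
    Returns the factor weights and the in-sample SDF portfolio returns."""
    rng = np.random.default_rng(seed)
    T, N, d = X.shape
    # Assumption: a random first layer generates the non-linear characteristics.
    W = rng.standard_normal((d, n_features)) / np.sqrt(d)
    S = np.maximum(X @ W, 0.0)                  # (T, N, P) ReLU features
    # Factor p at time t+1 is the return of a portfolio weighted by feature p.
    F = np.einsum("tnp,tn->tp", S, R) / N       # (T, P)
    mean = F.mean(axis=0)
    second_moment = F.T @ F / T
    # Ridge-regularized mean-variance weights on the factors.
    lam = np.linalg.solve(second_moment + ridge * np.eye(n_features), mean)
    return lam, F @ lam
```

The resulting SDF is linear in the factors even though each factor is built from a non-linear transformation of the raw characteristics, which is the sense in which the model is both linear and highly parameterized.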
