NeuralFactors: A Novel Factor Learning Approach to Generative Modeling of Equities

Achintya Gopal

TL;DR

NeuralFactors introduces a probabilistic factor-learning framework for equities that discovers latent market factors and stock exposures from rich features, using a conditional variational autoencoder trained with an importance-weighted autoencoder (IWAE) objective. The linear decoder ties stock returns to latent factors via time-varying exposures, while the encoder uses a Normal approximation to a CIWAE posterior, enabling end-to-end learning and fast covariance-based risk assessment. Empirical results on S&P 500 constituents show higher joint likelihood and better covariance forecasts, competitive VaR calibration, and improved portfolio performance compared to the BDG and PPCA baselines, and qualitative analysis of the factor-exposure embeddings shows clustering by sector. The approach offers a scalable, interpretable alternative to fixed-factor models and paves the way for incorporating novel data sources into factor-driven risk management and portfolio construction.
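To illustrate why a linear decoder makes covariance-based risk assessment cheap, here is a minimal sketch in plain Python. It is not the paper's implementation: it assumes standard-Normal factors with identity covariance and independent idiosyncratic noise (the paper's outline also lists a Student's T section), and the exposures, variances, and function names are hypothetical. Under those assumptions, returns $r = \alpha + B z + \epsilon$ have covariance $B B^\top + \mathrm{diag}(\sigma^2)$.

```python
# Sketch only: Gaussian linear factor model with made-up numbers.
# Assumes z ~ N(0, I) and independent idiosyncratic noise per stock.

def factor_covariance(betas, idio_vars):
    """Return Cov[r] = B B^T + diag(idio_vars) as a list of lists."""
    n = len(betas)
    cov = [
        [sum(bi * bj for bi, bj in zip(betas[i], betas[j])) for j in range(n)]
        for i in range(n)
    ]
    for i in range(n):
        cov[i][i] += idio_vars[i]  # idiosyncratic variance only on the diagonal
    return cov


def portfolio_variance(weights, cov):
    """Return w^T Cov w for portfolio weights w."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j]
               for i in range(n) for j in range(n))


# Hypothetical exposures of 3 stocks to 2 latent factors.
betas = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
idio_vars = [0.1, 0.2, 0.3]

cov = factor_covariance(betas, idio_vars)
weights = [1 / 3, 1 / 3, 1 / 3]
var_p = portfolio_variance(weights, cov)
```

Because the covariance follows in closed form from the exposures, no sampling is needed to estimate portfolio variance in this simplified setting.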

Abstract

The use of machine learning for statistical modeling (and thus, generative modeling) has grown in popularity with the proliferation of time series models, text-to-image models, and especially large language models. Fundamentally, the goal of classical factor modeling is statistical modeling of stock returns, and in this work, we explore using deep generative modeling to enhance classical factor models. Prior work has explored the use of deep generative models to model hundreds of stocks, leading to accurate risk forecasting and alpha portfolio construction; however, that specific model does not lend itself to a factor-model interpretation because the factor exposures cannot be deduced. In this work, we introduce NeuralFactors, a novel machine-learning-based approach to factor analysis in which a neural network outputs factor exposures and factor returns, trained using the same methodology as variational autoencoders. We show that this model outperforms prior approaches both in terms of log-likelihood performance and computational efficiency. Further, we show that this method is competitive with prior work in generating realistic synthetic data, covariance estimation, risk analysis (e.g., value at risk, or VaR, of portfolios), and portfolio optimization. Finally, due to the connection to classical factor analysis, we analyze how the factors our model learns cluster together and show that the factor exposures could be used for embedding stocks.
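As a small illustration of the value-at-risk quantity mentioned in the abstract, the sketch below computes a one-day VaR for a portfolio whose return is assumed to be Normal. This is a simplifying assumption for illustration, not the paper's method; the mean, standard deviation, and function name are hypothetical.

```python
from statistics import NormalDist


def gaussian_var(mean, std, level=0.95):
    """One-day VaR at confidence `level`, reported as a positive loss.

    Assumes the portfolio return is Normal(mean, std), which is a
    simplification chosen for this sketch.
    """
    # The (1 - level) quantile of the return distribution is the loss threshold.
    quantile = NormalDist(mean, std).inv_cdf(1.0 - level)
    return -quantile


# Hypothetical portfolio: zero mean return, 1% daily volatility.
var_95 = gaussian_var(0.0, 0.01, level=0.95)
```

A VaR forecast is well calibrated when realized losses exceed it at roughly the nominal rate, here about 5% of days.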

Paper Structure (36 sections, 19 equations, 4 figures, 7 tables)

Introduction
Background
Student's T Distribution
VAE and Conditional VAE (CVAE)
CIWAE
Methodology
Problem Formulation
Linear Decoder
Approximating the Encoder
Features
Architecture and Optimization
Time Complexity
Usage
Mean and Covariance
One-day sampling
...and 21 more sections

Figures (4)

Figure 1: We show a high-level diagram of our final model architecture. $\mathbf{r_{i, t}}$ denotes the returns of security $i$ at time $t$. Note that the "Neural Network" is the same across all stocks.
Figure 2: A diagrammatic representation of the stock embedder. $l$ refers to the lookback window size and $||$ denotes concatenation.
Figure 3: Comparison of returns of long-only $L=1$ portfolios. To make all of our final results comparable in scale, we lever the returns to match the volatility of the S&P 500.
Figure 4: t-SNE embedding of $\bm\beta_{i,t}$ for $t=$ 03 Jan 2019.
