Empirical Asset Pricing via Ensemble Gaussian Process Regression
Damir Filipović, Puneet Pasricha
TL;DR
This work develops an ensemble Gaussian Process Regression framework to predict conditional stock returns from a rich feature set, delivering predictive distributions rather than point estimates. By partitioning data into monthly subsets and mixing GP posteriors, the method achieves strong out-of-sample $R^2$ and information coefficients, and uses predictive uncertainty to construct uncertainty-weighted portfolios with superior Sharpe ratios. The empirical study on US equities (1962–2016) shows that a non-linear kernel combined with uncertainty-aware weighting substantially outperforms linear benchmarks at both the stock and portfolio levels, with economically meaningful gains and robust cross-sectional patterns. The approach is also computationally scalable and supports online learning, enabling practical deployment and future exploration of kernel-based methods for financial risk modeling.
Abstract
We introduce an ensemble learning method based on Gaussian Process Regression (GPR) for predicting conditional expected stock returns given stock-level and macro-economic information. Our ensemble learning approach significantly reduces the computational complexity inherent in GPR inference and lends itself to general online learning tasks. We conduct an empirical analysis on a large cross-section of US stocks from 1962 to 2016. We find that our method dominates existing machine learning models statistically and economically in terms of out-of-sample $R^2$ and Sharpe ratio of prediction-sorted portfolios. Exploiting the Bayesian nature of GPR, we introduce the mean-variance optimal portfolio with respect to the prediction uncertainty distribution of the expected stock returns. It appeals to an uncertainty-averse investor and significantly dominates the equal- and value-weighted prediction-sorted portfolios, which in turn outperform the S&P 500.
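To make the two ideas in the abstract concrete, the following is a minimal sketch on synthetic data, not the authors' implementation. It fits one exact GP per data subset, combines the posteriors, and forms portfolio weights from the resulting mean and uncertainty. Several choices are assumptions made only for illustration: the squared-exponential kernel, the fixed hyperparameters, the equal mixture weights, the diagonal uncertainty covariance, and all function names (`gp_posterior`, `ensemble_posterior`, `uncertainty_weights`).

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    # Squared-exponential kernel between rows of A and rows of B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gp_posterior(X, y, X_new, noise_var=0.1, length_scale=1.0):
    # Exact GP regression with zero prior mean: posterior mean and
    # variance of the latent function at X_new.
    K = rbf_kernel(X, X, length_scale) + noise_var * np.eye(len(X))
    K_s = rbf_kernel(X, X_new, length_scale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha
    var = 1.0 - np.sum(v**2, axis=0)  # prior variance k(x, x) = 1 for this kernel
    return mean, np.maximum(var, 1e-12)

def ensemble_posterior(subsets, X_new, **gp_kwargs):
    # Equal-weight mixture of the per-subset GP posteriors.
    # (Equal weights are an assumption of this sketch.)
    means, variances = zip(*(gp_posterior(X, y, X_new, **gp_kwargs) for X, y in subsets))
    means, variances = np.array(means), np.array(variances)
    mix_mean = means.mean(axis=0)
    # Law of total variance for a mixture: E[var] + Var[mean].
    mix_var = variances.mean(axis=0) + means.var(axis=0)
    return mix_mean, mix_var

def uncertainty_weights(mean, var):
    # Mean-variance weights under a diagonal uncertainty covariance,
    # rescaled to unit gross exposure (an illustrative normalization).
    raw = mean / var
    return raw / np.sum(np.abs(raw))

# Synthetic data: twelve hypothetical "monthly" subsets of 50 stocks, 3 features.
rng = np.random.default_rng(0)
subsets = []
for _ in range(12):
    X = rng.normal(size=(50, 3))
    y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=50)
    subsets.append((X, y))

X_new = rng.normal(size=(5, 3))
mix_mean, mix_var = ensemble_posterior(subsets, X_new)
weights = uncertainty_weights(mix_mean, mix_var)
```

Exact GP inference on $n$ observations costs $O(n^3)$ because of the Cholesky factorization, so fitting $K$ subsets of size $m$ costs $O(K m^3)$ rather than $O((Km)^3)$ for the pooled data. A new subset can also be added without refitting the existing ones, which is what makes this kind of ensemble convenient for online updating.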
