Empirical Asset Pricing via Ensemble Gaussian Process Regression

Damir Filipović, Puneet Pasricha

TL;DR

This work develops an ensemble Gaussian Process Regression framework to predict conditional stock returns from a rich feature set, delivering predictive distributions rather than point estimates. By partitioning data into monthly subsets and mixing GP posteriors, the method achieves strong out-of-sample $R^2$ and information coefficients, and leverages predictive uncertainty to construct uncertainty-weighted portfolios with superior Sharpe ratios. The empirical study on US equities (1962–2016) shows that a non-linear kernel plus uncertainty-aware weighting substantially outperforms linear benchmarks across stock and portfolio levels, with economically meaningful gains and robust cross-sectional patterns. The approach also offers scalable computation and online-learning capabilities, enabling practical deployment and future exploration of kernel-based methods for financial risk modeling.

Abstract

We introduce an ensemble learning method based on Gaussian Process Regression (GPR) for predicting conditional expected stock returns given stock-level and macro-economic information. Our ensemble learning approach significantly reduces the computational complexity inherent in GPR inference and lends itself to general online learning tasks. We conduct an empirical analysis on a large cross-section of US stocks from 1962 to 2016. We find that our method dominates existing machine learning models statistically and economically in terms of out-of-sample $R$-squared and Sharpe ratio of prediction-sorted portfolios. Exploiting the Bayesian nature of GPR, we introduce the mean-variance optimal portfolio with respect to the prediction uncertainty distribution of the expected stock returns. It appeals to an uncertainty-averse investor and significantly dominates the equal- and value-weighted prediction-sorted portfolios, which outperform the S&P 500.
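To make the ensemble idea concrete, the following is a minimal sketch, not the authors' implementation. It fits one zero-mean GP per data subset (for instance, one month of the cross-section) and mixes the resulting posteriors into a single predictive mean and variance. The function names, the fixed hyperparameters (`length_scale`, `gamma`, `noise`), the equal default weights, and the synthetic data are all assumptions made for illustration; the paper's calibration and weighting choices may differ. Because each Cholesky factorization only involves one subset, the cubic cost of GPR scales with the subset size rather than with the full sample size.

```python
import numpy as np

def gamma_exp_kernel(A, B, length_scale=1.0, gamma=1.0):
    """gamma-exponential kernel exp(-(||a-b||/l)^gamma), valid for 0 < gamma <= 2."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    dist = np.sqrt(np.maximum(d2, 0.0))  # clip tiny negatives from round-off
    return np.exp(-(dist / length_scale) ** gamma)

def gp_posterior(X, y, X_new, noise=0.1, **kern):
    """Posterior mean and variance of the latent function at X_new."""
    K = gamma_exp_kernel(X, X, **kern) + noise**2 * np.eye(len(X))
    Ks = gamma_exp_kernel(X_new, X, **kern)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = Ks @ alpha
    V = np.linalg.solve(L, Ks.T)
    # Prior variance k(x, x) equals 1 for this kernel.
    var = np.maximum(1.0 - (V**2).sum(0), 0.0)
    return mean, var

def ensemble_predict(subsets, X_new, weights=None, **kw):
    """Mix per-subset GP posteriors; equal weights unless given (assumption)."""
    n = len(subsets)
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights, float)
    stats = [gp_posterior(X, y, X_new, **kw) for X, y in subsets]
    means = np.stack([m for m, _ in stats])
    variances = np.stack([v for _, v in stats])
    mix_mean = w @ means
    # Variance of a mixture: E[var] + E[mean^2] - (E[mean])^2.
    mix_var = w @ (variances + means**2) - mix_mean**2
    return mix_mean, mix_var

# Synthetic stand-in for three monthly cross-sections (hypothetical data).
rng = np.random.default_rng(0)
subsets = []
for _ in range(3):
    X = rng.normal(size=(50, 4))
    y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=50)
    subsets.append((X, y))
X_new = rng.normal(size=(5, 4))
mean, var = ensemble_predict(subsets, X_new)
```

The mixture variance combines the average posterior variance with the disagreement between the subset means, so a stock on which the monthly models disagree receives a wider predictive distribution.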

Paper Structure

This paper contains 23 sections, 24 equations, 22 figures, 10 tables.

Figures (22)

  • Figure 1: This figure shows the size of the cross-section of stocks in each month of the sample. The full sample is split into a training sample (green, Feb 1962 to Dec 1981), a validation sample (yellow, Jan 1982 to Dec 1986), and a test sample (red, Jan 1987 to Dec 2016).
  • Figure 2: This figure describes the mechanism of the rolling scheme with a training window of length $K$, which includes a calibration month.
  • Figure 3: This figure presents $R^2_{pool}$ over the validation sample, Jan 1982 to Dec 1986, for the MSE- and equal-weighting schemes, plotted against the length of the training window.
  • Figure 4: This figure shows the evolution of $R^2_{pool}$ (solid lines) and $R^2_{avg}$ (dotted lines) over an expanding test subsample for our model, E-GPR ($\gamma$-exp), against the linear benchmark models, E-GPR (affine), E-LR and LR. The shaded periods indicate NBER recessions.
  • Figure 5: This figure shows the evolution of Spearman's rank correlation $\rho_t$ between the realized and predicted returns over the test sample. The flat line gives the information coefficient. The shaded periods indicate NBER recessions.
  • ...and 17 more figures
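
The rank-correlation diagnostic described for Figure 5 can be sketched as follows. This is an illustrative computation under stated assumptions, not the paper's code: the helper names are hypothetical, the information coefficient is taken to be the time average of the monthly $\rho_t$ (the caption only says it is the flat line), and ties are not handled, which is harmless for continuous returns.

```python
import numpy as np

def spearman_rho(realized, predicted):
    """Spearman rank correlation, assuming no ties in either input."""
    r = np.argsort(np.argsort(realized)).astype(float)
    p = np.argsort(np.argsort(predicted)).astype(float)
    return float(np.corrcoef(r, p)[0, 1])

def information_coefficient(monthly_pairs):
    """Monthly rho_t and their time average (assumed definition of the IC)."""
    rhos = np.array([spearman_rho(r, p) for r, p in monthly_pairs])
    return rhos, float(rhos.mean())

# Hypothetical two-month example: one perfectly ranked month, one reversed.
month_a = (np.array([0.01, 0.03, -0.02, 0.05]), np.array([0.1, 0.2, 0.0, 0.4]))
month_b = (np.array([0.01, 0.03, -0.02, 0.05]), np.array([0.2, 0.1, 0.4, 0.0]))
rhos, ic = information_coefficient([month_a, month_b])
```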