Joint Estimation of Conditional Mean and Covariance for Unbalanced Panels

Damir Filipovic, Paul Schneider

TL;DR

The paper addresses the joint estimation of conditional means and covariances in large, unbalanced panels by introducing the COCO (joint conditional mean and covariance) estimator, a kernel-based nonparametric framework whose covariance estimates are symmetric and positive semidefinite by construction. It builds a moment kernel from RKHS feature maps to represent conditional moments, shows that the framework is universal, proves a Representer Theorem, and uses a Nyström low-rank approach to reduce computation to a convex optimization over low-dimensional matrices. The COCO estimator is consistent, comes with finite-sample guarantees, and is validated on a long-span panel of US stock returns, where idiosyncratic risk dominates cross-sectional variance yet the joint mean–variance structure substantially improves out-of-sample Sharpe ratios for conditional mean-variance efficient (cMVE) portfolios. The empirical results also indicate that increasing the number of systematic factors strengthens predictive performance while reducing alignment with traditional factor models.
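To make the Nyström step more concrete, the following is a minimal sketch of a generic Nyström low-rank approximation of a kernel Gram matrix. It is not the paper's estimator: the Gaussian kernel, the `bandwidth` value, the random choice of landmark points, and the helper names `gaussian_kernel` and `nystrom_features` are all assumptions made for illustration. The sketch only shows why a low-rank feature matrix keeps the approximate Gram matrix symmetric and positive semidefinite.

```python
import numpy as np

def gaussian_kernel(X, Y, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between rows of X and rows of Y (illustrative choice)."""
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def nystrom_features(X, landmarks, bandwidth=1.0, tol=1e-10):
    """Return Phi such that Phi @ Phi.T = K_nm K_mm^+ K_mn (Nystrom approximation)."""
    K_mm = gaussian_kernel(landmarks, landmarks, bandwidth)
    K_nm = gaussian_kernel(X, landmarks, bandwidth)
    # Eigendecomposition of the small landmark Gram matrix.
    w, V = np.linalg.eigh(K_mm)
    keep = w > tol  # drop numerically zero directions
    return K_nm @ V[:, keep] / np.sqrt(w[keep])

# Hypothetical toy data: 200 observations of 3 covariates, 20 landmarks.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
idx = rng.choice(200, size=20, replace=False)
landmarks = X[idx]

Phi = nystrom_features(X, landmarks)   # shape (200, at most 20)
K_approx = Phi @ Phi.T                 # symmetric PSD by construction
K_mm = gaussian_kernel(landmarks, landmarks)

# Any matrix of the form Phi @ U @ Phi.T with U PSD is again PSD.
A = rng.normal(size=(Phi.shape[1], Phi.shape[1]))
U = A @ A.T
Sigma = Phi @ U @ Phi.T
```

Downstream computations then only involve matrices whose size is set by the number of landmarks rather than by the number of observations, which is the general reason low-rank approaches of this kind scale to large panels.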

Abstract

We develop a nonparametric, kernel-based joint estimator for conditional mean and covariance matrices in large and unbalanced panels. The estimator is supported by rigorous consistency results and finite-sample guarantees, ensuring its reliability for empirical applications. We apply it to an extensive panel of monthly US stock excess returns from 1962 to 2021, using macroeconomic and firm-specific covariates as conditioning variables. The estimator effectively captures time-varying cross-sectional dependencies, demonstrating robust statistical and economic performance. We find that idiosyncratic risk explains, on average, more than 75% of the cross-sectional variance.

Paper Structure

This paper contains 23 sections, 10 theorems, 66 equations, 13 figures, 1 table.

Key Result

Theorem 2.1

The proposed framework is universal in the following sense:

Figures (13)

  • Figure 1: Size of cross section. The blue line shows the number of assets $N_t$ over time. The red line shows a running average. The sample consists of stock data compiled by Gu, Kelly, and Xiu (2020), covering the period from 1962 to 2021.
  • Figure 2: Out-of-sample predictive $R^{2}$ performance. The panels display the rolling $R^{2}_{t-r,t,\text{OOS}}$ (over $r = 24$ months) and expanding $R^{2}_{0,t,\text{OOS}}$ as defined in \ref{eq_kellyoos}, using the COCO model with $m = 5,\, 10,\, 20,\, 40$ systematic factors. The analysis is based on unbalanced US common stock excess returns and associated covariates from 1962 to 2021. Shaded areas indicate major market crashes: the 1987 Crash, the Dot-Com Bubble, the Global Financial Crisis, and the COVID-19 Pandemic.
  • Figure 3: Out-of-sample predictive $R^{2,2}$ performance. The panels display the rolling $R^{2,2}_{t-r,t,\text{OOS}}$ (over $r = 24$ months) and expanding $R^{2,2}_{0,t,\text{OOS}}$ as defined in \ref{eq_kellysecmomoos}, using the COCO model with $m = 5,\, 10,\, 20,\, 40$ systematic factors. The analysis is based on unbalanced US common stock excess returns and associated covariates from 1962 to 2021. Shaded areas indicate major market crashes: the 1987 Crash, the Dot-Com Bubble, the Global Financial Crisis, and the COVID-19 Pandemic.
  • Figure 4: Out-of-sample scoring loss differential performance. The panels display the rolling ${\mathcal{S}}_{t-r,t,\text{OOS}}$ (over $r = 24$ months) and expanding ${\mathcal{S}}_{0,t,\text{OOS}}$ from \ref{eqRcaltT}, using the COCO model with $m = 5,\, 10,\, 20,\, 40$ systematic factors. The analysis is based on unbalanced US common stock excess returns and associated covariates from 1962 to 2021. Shaded areas indicate major market crashes: the 1987 Crash, the Dot-Com Bubble, the Global Financial Crisis, and the COVID-19 Pandemic.
  • Figure 5: Out-of-sample explained variation by portfolio factors. The panels display the rolling $R^{2,\bm{f}}_{t-r,t,\text{OOS}}$ (over $r = 24$ months) and expanding $R^{2,\bm{f}}_{0,t,\text{OOS}}$ as defined in \ref{eq_totalrsqr}, using the COCO model with $m = 5,\, 10,\, 20,\, 40$ systematic factors. The analysis is based on unbalanced US common stock excess returns and associated covariates from 1962 to 2021. Shaded areas indicate major market crashes: the 1987 Crash, the Dot-Com Bubble, the Global Financial Crisis, and the COVID-19 Pandemic.
  • ...and 8 more figures
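
The rolling and expanding out-of-sample $R^{2}$ curves described in the captions above can be sketched as follows. The paper's exact definitions sit in equations that this summary does not reproduce, so the sketch assumes the common convention of a zero-forecast benchmark, $1 - \sum (r - \hat r)^2 / \sum r^2$, pooled over all asset-months in the window. The function names, the window convention (the last `window` months), and the toy data are hypothetical.

```python
import numpy as np

def oos_r2(realized, predicted):
    """Pooled OOS R^2 against a zero forecast (assumed convention).

    realized, predicted: lists of 1-D arrays, one per month; lengths may
    differ across months because the panel is unbalanced.
    """
    num = sum(np.sum((r - p) ** 2) for r, p in zip(realized, predicted))
    den = sum(np.sum(r ** 2) for r in realized)
    return 1.0 - num / den

def rolling_and_expanding_r2(realized, predicted, window=24):
    """Rolling R^2 over the last `window` months and expanding R^2 from month 0."""
    T = len(realized)
    rolling = np.array([
        oos_r2(realized[t - window:t], predicted[t - window:t])
        for t in range(window, T + 1)
    ])
    expanding = np.array([
        oos_r2(realized[:t], predicted[:t]) for t in range(1, T + 1)
    ])
    return rolling, expanding

# Hypothetical toy panel: 36 months, cross-section size N_t varies by month.
rng = np.random.default_rng(1)
sizes = rng.integers(50, 100, size=36)
realized = [rng.normal(scale=0.1, size=n) for n in sizes]
# Noisy forecasts, for illustration only.
predicted = [0.5 * r + rng.normal(scale=0.05, size=r.shape) for r in realized]

rolling, expanding = rolling_and_expanding_r2(realized, predicted, window=24)
```

Storing each month as its own array avoids padding the unbalanced panel, since the pooled sums do not require the same assets to appear in every month.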

Theorems & Definitions (12)

  • Theorem 2.1
  • Theorem 3.1: Representer Theorem
  • Proposition 3.2
  • Theorem 3.3
  • Example 3.4
  • Lemma 4.1
  • Lemma 4.2
  • Theorem 4.3
  • Lemma 4.4
  • Lemma A.1
  • ...and 2 more