Growing the Efficient Frontier on Panel Trees
Lin William Cong, Guanhao Feng, Jingyu He, Xin He
TL;DR
The paper tackles the challenge of estimating the mean-variance efficient (MVE) frontier from large, unbalanced panels of asset returns using a new Panel Tree (P-Tree) framework. P-Trees impose a time-invariant tree structure and grow it under a global objective: maximizing the Sharpe ratio of the MVE portfolio. The result is a set of leaf basis portfolios and a stochastic discount factor (SDF), obtained while limiting overfitting and allowing interpretable nonlinear interactions among high-dimensional characteristics. Empirically, P-Tree test assets span the frontier substantially better than traditional sorted portfolios and common factor models, with significant out-of-sample alphas and robust performance from boosted, multi-factor variants. Random-forest extensions point to characteristics whose importance persists, such as SUE, DOLVOL, and BM_IA. The framework offers a transparent, scalable alternative to deep-learning approaches for asset pricing and investment, delivering sparse, economically interpretable clustering and multiple traded factors that improve portfolio efficiency and cross-sectional pricing. Overall, P-Trees broaden the toolkit by combining an economic objective with machine-learning-style search in a parsimonious, interpretable structure that holds up across regimes and in out-of-sample tests.
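To make the split criterion described above concrete, here is a minimal Python sketch of scoring one candidate split by the squared Sharpe ratio of the MVE portfolio of the resulting leaf portfolios, using the standard tangency result that weights are proportional to the inverse covariance times the mean. It is an illustration under stated assumptions, not the authors' implementation: the function names (`leaf_returns`, `mve_score`, `best_split`), the equal-weighted leaf portfolios, the small `ridge` term, the normalization of weights to unit absolute sum, the cutoff grid, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np


def leaf_returns(ret, char, cutoff):
    """Per-period returns of the two leaves created by one split (assumed equal-weighted).

    ret, char: (T, N) arrays. NaN in `ret` marks an asset that is missing in
    that period, which is how this sketch represents an unbalanced panel.
    """
    left = np.where(char <= cutoff, ret, np.nan)
    right = np.where(char > cutoff, ret, np.nan)
    leaves = np.column_stack([np.nanmean(left, axis=1), np.nanmean(right, axis=1)])
    # Drop periods in which either leaf has no observed asset.
    return leaves[~np.isnan(leaves).any(axis=1)]


def mve_score(leaves, ridge=1e-6):
    """Squared Sharpe ratio mu' S^{-1} mu of the MVE portfolio, plus its weights.

    `ridge` is an assumed small regularizer that keeps the covariance invertible.
    """
    mu = leaves.mean(axis=0)
    cov = np.cov(leaves, rowvar=False) + ridge * np.eye(leaves.shape[1])
    raw = np.linalg.solve(cov, mu)  # tangency weights up to scale
    return float(mu @ raw), raw / np.abs(raw).sum()


def best_split(ret, char, cutoffs):
    """Pick the cutoff whose two leaves give the highest MVE squared Sharpe ratio."""
    scored = [(mve_score(leaf_returns(ret, char, c))[0], c) for c in cutoffs]
    score, cutoff = max(scored)
    return cutoff, score


# Synthetic panel: assets with characteristic above 0.6 earn a return premium.
rng = np.random.default_rng(0)
T, N = 360, 200
char = rng.uniform(size=(T, N))
ret = 0.03 * (char > 0.6) + 0.05 * rng.standard_normal((T, N))
ret[rng.uniform(size=(T, N)) < 0.1] = np.nan  # make the panel unbalanced

cutoffs = [0.2, 0.4, 0.6, 0.8]
best_cutoff, best_score = best_split(ret, char, cutoffs)
_, best_weights = mve_score(leaf_returns(ret, char, best_cutoff))
print(best_cutoff, round(best_score, 2), best_weights)
```

The point of the sketch is that every candidate split is judged by one portfolio-level number rather than by a local prediction error within a node. A full tree would repeat this search over all characteristics and existing leaves, re-evaluating the objective on the complete set of leaf portfolios after each split.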
Abstract
We introduce a new class of tree-based models, P-Trees, for analyzing (unbalanced) panels of individual asset returns, generalizing high-dimensional sorting with economic guidance and interpretability. Under the mean-variance efficient framework, P-Trees construct test assets that significantly advance the efficient frontier compared to commonly used test assets, with alphas unexplained by benchmark pricing models. P-Tree tangency portfolios also constitute traded factors, recovering the pricing kernel and outperforming popular observable and latent factor models for investments and cross-sectional pricing. Finally, P-Trees capture the complexity of asset returns with sparsity, achieving out-of-sample Sharpe ratios close to those attained only by over-parameterized large models.
