Return Prediction for Mean-Variance Portfolio Selection: How Decision-Focused Learning Shapes Forecasting Models

Junhyeong Lee; Haeun Jeon; Hyunglip Bae; Yongjae Lee

TL;DR

Standard prediction losses such as MSE can be misaligned with the downstream portfolio decision in mean-variance optimization (MVO) under uncertainty. This paper analyzes Decision-Focused Learning (DFL) and shows that the DFL gradient effectively tilts prediction errors by the inverse covariance $\Sigma^{-1}$, embedding inter-asset correlations into the learning process. Empirically, DFL produces highly concentrated portfolios and systematic prediction biases aligned with portfolio relevance, yet yields better decision quality (e.g., higher Sharpe ratios), especially for intermediate values of $\alpha$, the weight placed on the decision loss in the combined training loss. The work gives a theoretical and empirical explanation for why decision-focused training improves portfolio outcomes and shows that prediction accuracy metrics can be misleading when predictions feed an optimizer.

Abstract

Markowitz laid the foundation of portfolio theory through the mean-variance optimization (MVO) framework. However, the effectiveness of MVO is contingent on the precise estimation of expected returns, variances, and covariances of asset returns, which are typically uncertain. Machine learning models are increasingly used to estimate these uncertain parameters, and such models are trained to minimize a prediction error such as mean squared error (MSE), which treats prediction errors uniformly across assets. Recent studies have pointed out that this approach can lead to suboptimal decisions and have proposed Decision-Focused Learning (DFL) as a solution, integrating prediction and optimization to improve decision-making outcomes. While studies have shown DFL's potential to enhance portfolio performance, the detailed mechanisms of how DFL modifies prediction models for MVO remain unexplored. This study investigates how DFL adjusts stock return prediction models to optimize decisions in MVO. Theoretically, we show that DFL's gradient can be interpreted as tilting the MSE-based prediction errors by the inverse covariance matrix, effectively incorporating inter-asset correlations into the learning process, while MSE treats each asset's error independently. This tilting mechanism leads to systematic prediction biases in which DFL overestimates returns for assets included in portfolios and underestimates returns for excluded assets. Our findings reveal why DFL achieves superior portfolio performance despite higher prediction errors. The strategic biases are features, not flaws.
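The origin of the $\Sigma^{-1}$ tilt can be seen in a simplified setting. The derivation below is a sketch under assumptions that are not taken from the paper: an unconstrained MVO problem with risk-aversion coefficient $\lambda$, a known covariance $\Sigma$, a decision loss defined as regret, and $N$ assets. The paper's own formulation may include constraints that change the exact expression.

```latex
\begin{align*}
w^*(\hat{\mu}) &= \arg\max_{w}\; \hat{\mu}^\top w - \tfrac{\lambda}{2}\, w^\top \Sigma w
  = \tfrac{1}{\lambda}\, \Sigma^{-1} \hat{\mu}, \\
\mathcal{L}_{MVO}(\hat{\mu}) &= f\big(w^*(\mu)\big) - f\big(w^*(\hat{\mu})\big),
  \qquad f(w) = \mu^\top w - \tfrac{\lambda}{2}\, w^\top \Sigma w, \\
\nabla_{\hat{\mu}} \mathcal{L}_{MVO}
  &= -\Big(\frac{dw^*(\hat{\mu})}{d\hat{\mu}}\Big)^{\top} \nabla_w f\big(w^*(\hat{\mu})\big)
  = -\tfrac{1}{\lambda}\, \Sigma^{-1} (\mu - \hat{\mu})
  = \tfrac{1}{\lambda}\, \Sigma^{-1} (\hat{\mu} - \mu), \\
\nabla_{\hat{\mu}} \mathcal{L}_{MSE} &= \tfrac{2}{N}\, (\hat{\mu} - \mu),
  \qquad \mathcal{L}_{MSE} = \tfrac{1}{N}\, \lVert \hat{\mu} - \mu \rVert^2 .
\end{align*}
```

Under these assumptions the two gradients differ only by a scalar and the factor $\Sigma^{-1}$. With a diagonal $\Sigma$, the DFL gradient rescales each asset's error by its inverse variance; off-diagonal entries let the error on one asset move the prediction for another, which MSE never does.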

Paper Structure (14 sections, 12 equations, 5 figures, 2 tables)

Introduction
Background
Decision-Focused Learning (DFL)
Mean-Variance Optimization (MVO)
Mechanism of DFL for MVO
Experiment
Dataset
Loss Functions
Experimental Setup
Experimental Results
Model Performance
Prediction Bias
Conclusion
Limitations and Future Work

Figures (5)

Figure 1: DFL training procedure for MVO. The prediction model outputs $\hat{\mu}$, which determines portfolio weights $w^*(\hat{\mu})$ through the optimization layer. The combined loss $\mathcal{L}_{combined} = \alpha\mathcal{L}_{MVO} + (1-\alpha)\mathcal{L}_{MSE}$ balances prediction accuracy and decision quality. The optimization layer enables computation of $\frac{dw^*(\hat{\mu})}{d\hat{\mu}}$, which is necessary for calculating the MVO loss gradient.
Figure 2: MSE and regret losses on the DOW30 test set for varying $\alpha$. The boxes show the distribution across 5 random seeds. MSE increases exponentially while regret decreases as $\alpha$ increases, showing DFL's trade-off between prediction accuracy and portfolio performance. The regret improvement diminishes at $\lambda = 5.0$, where the risk penalty dominates the optimization.
Figure 3: Prediction bias across Up/Down assets in DOW30. As $\alpha$ increases, the Up group becomes increasingly overestimated while the Down group becomes increasingly underestimated, reaching extreme polarization at $\alpha = 1$.
Figure 4: Predicted return distributions for IN/OUT portfolio groups across different $\lambda$ and $\alpha$ values. As $\alpha$ increases, the separation between IN and OUT group distributions widens. The case of $\alpha = 1$ is excluded due to extreme distribution separation.
Figure 5: Prediction bias patterns for portfolio assets under MSE loss ($\alpha=0$) versus MVO loss ($\alpha=1$). With MSE loss, prediction biases are randomly distributed across all assets regardless of portfolio inclusion. With MVO loss, prediction biases exhibit clear polarization: assets with high portfolio weights (Top) show positive bias while assets with low weights (Bottom) show negative bias, demonstrating how DFL induces strategic differentiation based on portfolio relevance.
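The combined loss and the role of $\frac{dw^*(\hat{\mu})}{d\hat{\mu}}$ described in the Figure 1 caption can be illustrated with a small NumPy sketch. Everything here is an assumption made for illustration and is not the paper's implementation: the optimization layer is replaced by the closed-form solution of an unconstrained MVO problem, $\mathcal{L}_{MVO}$ is taken to be regret, and the function names (`mvo_layer`, `combined_loss`, `combined_grad`, `numeric_grad`) and the three-asset toy numbers are hypothetical. A real optimization layer would typically also handle portfolio constraints.

```python
import numpy as np

def mvo_layer(mu_hat, sigma, lam):
    """Stand-in optimization layer (assumption: unconstrained MVO).

    Returns w* = Sigma^{-1} mu_hat / lam and the Jacobian dw*/dmu_hat.
    """
    sigma_inv = np.linalg.inv(sigma)
    weights = sigma_inv @ mu_hat / lam
    jacobian = sigma_inv / lam
    return weights, jacobian

def combined_loss(mu_hat, mu, sigma, lam, alpha):
    """alpha * L_MVO + (1 - alpha) * L_MSE, with L_MVO taken as regret (assumption)."""
    def objective(w):
        return mu @ w - 0.5 * lam * w @ sigma @ w
    w_hat, _ = mvo_layer(mu_hat, sigma, lam)
    w_true, _ = mvo_layer(mu, sigma, lam)
    regret = objective(w_true) - objective(w_hat)
    mse = np.mean((mu_hat - mu) ** 2)
    return alpha * regret + (1.0 - alpha) * mse

def combined_grad(mu_hat, mu, sigma, lam, alpha):
    """Gradient w.r.t. mu_hat via the chain rule through the layer."""
    w_hat, jacobian = mvo_layer(mu_hat, sigma, lam)
    grad_w = -(mu - lam * sigma @ w_hat)      # d(regret)/dw at w*(mu_hat)
    grad_mvo = jacobian.T @ grad_w            # needs dw*/dmu_hat from the layer
    grad_mse = 2.0 * (mu_hat - mu) / mu.size
    return alpha * grad_mvo + (1.0 - alpha) * grad_mse

def numeric_grad(fn, x, eps=1e-5):
    """Central finite differences, used only to sanity-check combined_grad."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (fn(x + step) - fn(x - step)) / (2.0 * eps)
    return grad

# Hypothetical toy inputs: three assets, the first two positively correlated.
sigma = np.array([[0.0400, 0.0180, 0.0000],
                  [0.0180, 0.0900, 0.0000],
                  [0.0000, 0.0000, 0.0225]])
mu = np.array([0.05, 0.03, 0.02])        # "true" expected returns
mu_hat = np.array([0.06, 0.02, 0.025])   # model predictions
lam = 5.0                                # risk-aversion coefficient

for alpha in (0.0, 0.5, 1.0):
    grad = combined_grad(mu_hat, mu, sigma, lam, alpha)
    check = numeric_grad(lambda x: combined_loss(x, mu, sigma, lam, alpha), mu_hat)
    print(alpha, grad, np.allclose(grad, check, atol=1e-7))
```

At $\alpha = 0$ the gradient is the plain per-asset MSE gradient; at $\alpha = 1$ it is the prediction error multiplied by $\Sigma^{-1}/\lambda$, so the error on one correlated asset contributes to the gradient of the other. Intermediate $\alpha$ values interpolate linearly between the two.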
