Beyond Expectations: Learning with Stochastic Dominance Made Practical

Shicong Cen, Jincheng Mei, Hanjun Dai, Dale Schuurmans, Yuejie Chi, Bo Dai

TL;DR

The paper tackles the challenge of incorporating stochastic dominance (SD) into practical learning settings, where SD provides a richer, risk-sensitive criterion than simple expectations. It introduces Learning with Stochastic Dominance (LSD), which generalizes SD via a fixed-point formulation using the functional $\Omega_k(X,Y)=\max_{\eta\in[a,b]}[F_X^k(\eta)-F_Y^k(\eta)]$ and derives a tractable first-order method with a nested inner loop to obtain subgradients. The authors prove convergence to an $\epsilon$-approximate non-dominated solution in $\tilde{O}(\epsilon^{-2})$ iterations and connect the approach to distributionally robust optimization, enabling compatibility with various risk measures. Empirically, LSD achieves competitive performance against risk-neutral baselines in supervised learning, while delivering improved risk profiles in reinforcement learning and portfolio optimization, demonstrating its practical value for risk-averse decision-making under uncertainty.
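
As a rough illustration of the functional above, the sketch below estimates $\Omega_k(X,Y)$ from samples. It assumes the standard recursive definition of the $k$-th order CDF ($F_X^1$ is the ordinary CDF and $F_X^k(\eta)=\int_{-\infty}^{\eta}F_X^{k-1}(t)\,dt$), which for $k\ge 2$ equals $\mathbb{E}[(\eta-X)_+^{k-1}]/(k-1)!$; the paper's own normalization may differ. The names `empirical_cdf_k` and `omega_k`, the uniform grid over $[a,b]$, and the `grid_size` parameter are hypothetical choices for this example, not the authors' implementation.

```python
import math
import numpy as np

def empirical_cdf_k(samples, eta, k=2):
    """Empirical k-th order CDF evaluated at the points in eta.

    Assumed convention: k=1 is the ordinary empirical CDF; for k>=2 we use
    mean((eta - x)_+^(k-1)) / (k-1)!, the sample version of the recursive
    integral definition.
    """
    x = np.asarray(samples, dtype=float)[:, None]               # shape (n, 1)
    eta = np.atleast_1d(np.asarray(eta, dtype=float))[None, :]  # shape (1, m)
    if k == 1:
        return (x <= eta).mean(axis=0)
    shortfall = np.clip(eta - x, 0.0, None)                     # (eta - x)_+
    return (shortfall ** (k - 1)).mean(axis=0) / math.factorial(k - 1)

def omega_k(x_samples, y_samples, a, b, k=2, grid_size=201):
    """Grid approximation of max over eta in [a, b] of F_X^k(eta) - F_Y^k(eta)."""
    grid = np.linspace(a, b, grid_size)  # hypothetical discretization of [a, b]
    gap = empirical_cdf_k(x_samples, grid, k) - empirical_cdf_k(y_samples, grid, k)
    return float(gap.max())
```

Under this convention, a non-positive value indicates that $F_X^k$ does not exceed $F_Y^k$ anywhere on the grid. For instance, with a point mass at 1 as $X$ and a fair coin on $\{0, 2\}$ as $Y$, `omega_k([1, 1], [0, 2], -1, 3)` is approximately 0 while `omega_k([0, 2], [1, 1], -1, 3)` is approximately 0.5, so the comparison is asymmetric.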

Abstract

Stochastic dominance serves as a general framework for modeling a broad spectrum of decision preferences under uncertainty, with risk aversion as one notable example, because it naturally captures the intrinsic structure of the underlying uncertainty rather than simply resorting to expectations. Despite its theoretical appeal, stochastic dominance has rarely been applied in machine learning, owing to two challenges: (i) the original concept of stochastic dominance provides only a partial order, and is therefore not amenable to serving as a general optimality criterion; and (ii) an efficient computational recipe remains lacking due to the continuum nature of evaluating stochastic dominance. In this work, we make the first attempt towards establishing a general framework of learning with stochastic dominance. We first generalize the stochastic dominance concept to enable feasible comparisons between any pair of random variables. We next develop a simple and computationally efficient approach for finding the optimal solution in terms of stochastic dominance, which can be seamlessly plugged into many learning tasks. Numerical experiments demonstrate that the proposed method achieves performance comparable to standard risk-neutral strategies and obtains better trade-offs against risk across a variety of applications including supervised learning, reinforcement learning, and portfolio optimization.

Paper Structure

This paper contains 37 sections, 6 theorems, 58 equations, 5 figures, 2 tables, 3 algorithms.

Key Result

Proposition 1

A non-dominated solution $\theta^\star$ is guaranteed to exist as long as $\Theta$ is compact and $F_{X_\theta}^k(\eta)$ is continuous with respect to $\theta$ for every $\eta \in \mathbb{R}$.

Figures (5)

  • Figure 1: Probability density and second-order CDF of $\mathcal{N}(0;1)$ and $\mathcal{N}(0;2)$.
  • Figure 2: Illustration of the CliffWalking environment.
  • Figure 3: The $F_2$ (left panel) and density (right panel) of the cumulative return in the CliffWalking environment, by executing the policy learned by REINFORCE and LSD-PG, respectively.
  • Figure 4: The $F_2$ (left panel) and density (middle panel) of the cumulative return, as well as the density of the visited cart position (right panel) in the CartPole environment, by executing the policy learned by REINFORCE, LSD-PG and CVaR$_{\alpha}$PG, respectively.
  • Figure 5: Density of the portfolio returns achieved by different methods.
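
The second-order CDF shown in Figure 1 has a closed form for Gaussians, which the sketch below evaluates. It assumes the usual definition $F^2(\eta)=\int_{-\infty}^{\eta}F(t)\,dt$, for which a normal variable with mean $\mu$ and standard deviation $\sigma$ gives $F^2(\eta)=\sigma\,[z\Phi(z)+\varphi(z)]$ with $z=(\eta-\mu)/\sigma$. Whether the second argument of $\mathcal{N}(0;2)$ in the caption denotes a variance or a standard deviation is not stated here; the example treats it as a standard deviation, and the function names are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(z):
    """Standard normal density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_f2(eta, mu=0.0, sigma=1.0):
    """Second-order CDF of N(mu, sigma^2): integral of the CDF up to eta."""
    z = (eta - mu) / sigma
    return sigma * (z * normal_cdf(z) + normal_pdf(z))

# Compare the two curves on a grid; sigma=2.0 is an assumed reading of N(0;2).
grid = [-4.0 + 0.1 * i for i in range(81)]
narrow = [normal_f2(eta, 0.0, 1.0) for eta in grid]
wide = [normal_f2(eta, 0.0, 2.0) for eta in grid]
narrow_never_above = all(n <= w for n, w in zip(narrow, wide))
```

With equal means, the narrower Gaussian's second-order CDF lies below the wider one's at every grid point (`narrow_never_above` is `True`), which is the second-order dominance relation that a risk-averse criterion rewards even though the two expectations coincide.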

Theorems & Definitions (6)

  • Proposition 1
  • Theorem 2
  • Theorem 3
  • Theorem 4 (Ogryczak and Ruszczyński, 2001)
  • Lemma 5
  • Lemma 6