Contextual Learning for Stochastic Optimization

Anna Heuser, Thomas Kesselheim

TL;DR

We study learning across contexts when the reward distribution depends on the context. We introduce contextual value distributions and a convex surrogate loss whose minimization from samples yields distributions at small Lévy distance from the true ones. The core idea is to minimize a capped loss over discretized values $C_\epsilon$. This loss is convex and Lipschitz, and a small empirical loss implies proximity in the Lévy metric, which gives polynomial sample complexity for a broad class of stochastic optimization problems. The framework yields problem-specific bounds for contextual Single-item Revenue Maximization, Pandora's Box, and Optimal Stopping, all of which are strongly monotone and stable, together with a general Lévy-stability result that connects distributional learning to near-optimal policies. In practice, this allows approximately optimal policies to be learned across diverse contexts from a tractable number of samples per distribution, even when the full distribution is unknown.

Abstract

Motivated by stochastic optimization, we introduce the problem of learning from samples of contextual value distributions. A contextual value distribution can be understood as a family of real-valued distributions, where each sample consists of a context $x$ and a random variable drawn from the corresponding real-valued distribution $D_x$. By minimizing a convex surrogate loss, we learn an empirical distribution $D'_x$ for each context, ensuring a small Lévy distance to $D_x$. We apply this result to obtain sample complexity bounds for learning an $\epsilon$-optimal policy for stochastic optimization problems defined on an unknown contextual value distribution. The sample complexity is shown to be polynomial for the general case of strongly monotone and stable optimization problems, including Single-item Revenue Maximization, Pandora's Box, and Optimal Stopping.
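The abstract measures closeness of $D'_x$ to $D_x$ in the Lévy distance. As a rough illustration, the sketch below computes the standard Lévy distance, $\inf\{\varepsilon > 0 : F(t-\varepsilon)-\varepsilon \le G(t) \le F(t+\varepsilon)+\varepsilon \text{ for all } t\}$, between two empirical distributions given as sample lists. The function names (`ecdf`, `levy_ok`, `levy_distance`) and the bisection tolerance are assumptions made for this example. It is not the paper's learning procedure, and the paper's exact conventions may differ.

```python
from bisect import bisect_right


def ecdf(sorted_samples, t):
    """Empirical CDF of an already sorted sample list, evaluated at t."""
    return bisect_right(sorted_samples, t) / len(sorted_samples)


def levy_ok(f, g, eps):
    """Check the two-sided Levy condition for a given eps.

    For step CDFs it suffices to test the condition at the atoms:
    G(b) <= F(b + eps) + eps at atoms b of G, and
    F(a) <= G(a + eps) + eps at atoms a of F.
    """
    return (all(ecdf(g, b) <= ecdf(f, b + eps) + eps for b in g)
            and all(ecdf(f, a) <= ecdf(g, a + eps) + eps for a in f))


def levy_distance(samples_f, samples_g, tol=1e-9):
    """Approximate Levy distance between two empirical distributions."""
    f, g = sorted(samples_f), sorted(samples_g)
    if levy_ok(f, g, 0.0):
        return 0.0
    lo, hi = 0.0, 1.0  # eps = 1 always satisfies the condition
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if levy_ok(f, g, mid):
            hi = mid
        else:
            lo = mid
    return hi
```

For two point masses at $0$ and $0.3$, this returns approximately $0.3$. For point masses far apart the distance saturates at $1$, because the Lévy distance between probability distributions never exceeds $1$.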

Paper Structure

This paper contains 18 sections, 20 theorems, 58 equations.

Key Result

Theorem 1

Using $m \geq \frac{32 d \xi^2 c_{\max}^4}{\epsilon^4 \delta^2}$ samples, we can learn a weight distribution $V'$ such that, with probability at least $1-\delta$, we have [...] if $x$ is a context drawn from $X$.
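To get a feel for how the sample requirement in Theorem 1 scales, the following sketch evaluates the right-hand side $\frac{32 d \xi^2 c_{\max}^4}{\epsilon^4 \delta^2}$ numerically. The function name `sample_bound` and its parameter names are assumptions made for this example. The quantities $d$, $\xi$, and $c_{\max}$ are defined in the paper rather than in this summary, so they are treated here as plain positive numbers.

```python
import math


def sample_bound(d, xi, c_max, eps, delta):
    """Smallest integer m with m >= 32 * d * xi^2 * c_max^4 / (eps^4 * delta^2).

    All arguments are treated as positive numbers; their precise meaning
    is given in the paper, not in this sketch.
    """
    if min(d, xi, c_max, eps, delta) <= 0:
        raise ValueError("all parameters must be positive")
    bound = 32 * d * xi**2 * c_max**4 / (eps**4 * delta**2)
    return math.ceil(bound)
```

Because the bound scales as $\epsilon^{-4}\delta^{-2}$, halving both $\epsilon$ and $\delta$ multiplies the required number of samples by $2^4 \cdot 2^2 = 64$.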

Theorems & Definitions (39)

  • Theorem 1
  • Lemma 2
  • Lemma 3
  • proof
  • Lemma 4
  • proof: Proof of Theorem \ref{max_expectation:high-prob}
  • Lemma 5
  • Corollary 1
  • Lemma 6
  • proof
  • ...and 29 more