A Minimalist Bayesian Framework for Stochastic Optimization

Kaizheng Wang

TL;DR

The paper addresses the difficulty of applying Bayesian methods to structured stochastic optimization. It proposes a minimalist framework that places a prior only on the parameter of interest (such as the location of the optimum) and eliminates nuisance parameters via the profile likelihood. The result is a generalized posterior that accommodates constraints and structure, which lets the MINTS algorithm operate in settings such as continuum-armed Lipschitz bandits and dynamic pricing. The author establishes near-optimal regret guarantees for MINTS in multi-armed bandits and gives probabilistic interpretations of classical convex optimization methods, namely the center of gravity and ellipsoid methods. Overall, the approach offers flexible, structure-aware Bayesian decision-making, supported by regret theory and by experiments comparing MINTS with Thompson sampling and UCB1.

Abstract

The Bayesian paradigm offers principled tools for sequential decision-making under uncertainty, but its reliance on a probabilistic model for all parameters can hinder the incorporation of complex structural constraints. We introduce a minimalist Bayesian framework that places a prior only on the component of interest, such as the location of the optimum. Nuisance parameters are eliminated via profile likelihood, which naturally handles constraints. As a direct instantiation, we develop a MINimalist Thompson Sampling (MINTS) algorithm. Our framework accommodates structured problems, including continuum-armed Lipschitz bandits and dynamic pricing. It also provides a probabilistic lens on classical convex optimization algorithms such as the center of gravity and ellipsoid methods. We further analyze MINTS for multi-armed bandits and establish near-optimal regret guarantees.
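
The idea in the abstract can be illustrated with a small sketch for a multi-armed bandit. It is not taken from the paper: it assumes Gaussian rewards with a known noise level `sigma`, a uniform prior over which arm is optimal, and one initial pull of every arm. The function names (`profile_loglik`, `generalized_posterior`, `run_sketch`) are hypothetical. For each arm `j`, the profile likelihood of the hypothesis "arm `j` is optimal" is obtained by maximizing the Gaussian likelihood over all mean vectors in which arm `j` has the largest mean; the generalized posterior is the prior times this profile likelihood, and an arm is then sampled from it.

```python
import numpy as np

def profile_loglik(means, counts, j, sigma=1.0):
    """Profile log-likelihood (up to an additive constant) of 'arm j is optimal'.

    Solves min over mu of sum_i n_i (mu_i - xbar_i)^2 subject to mu_j >= mu_i
    for all i, by pooling arm j with every arm whose sample mean exceeds the
    pooled mean. Assumes counts[j] > 0.
    """
    means = np.asarray(means, dtype=float)
    counts = np.asarray(counts, dtype=float)
    weight = counts[j]
    total = counts[j] * means[j]
    pooled = [j]
    # Visit the other arms from highest to lowest sample mean.
    for i in np.argsort(-means):
        if i == j:
            continue
        if means[i] > total / weight:
            weight += counts[i]
            total += counts[i] * means[i]
            pooled.append(i)
        else:
            break
    level = total / weight  # common fitted mean of the pooled arms
    rss = sum(counts[i] * (means[i] - level) ** 2 for i in pooled)
    return -rss / (2.0 * sigma ** 2)

def generalized_posterior(means, counts, prior=None, sigma=1.0):
    """Prior on the optimal arm times the profile likelihood, normalized."""
    k = len(means)
    prior = np.full(k, 1.0 / k) if prior is None else np.asarray(prior, dtype=float)
    logp = np.log(prior) + np.array(
        [profile_loglik(means, counts, j, sigma) for j in range(k)]
    )
    p = np.exp(logp - logp.max())  # subtract the max for numerical stability
    return p / p.sum()

def run_sketch(true_means, horizon, sigma=1.0, seed=0):
    """Pull each arm once, then sample arms from the generalized posterior."""
    rng = np.random.default_rng(seed)
    k = len(true_means)
    counts = np.zeros(k)
    sums = np.zeros(k)
    for t in range(horizon):
        if t < k:
            arm = t
        else:
            arm = rng.choice(k, p=generalized_posterior(sums / counts, counts, sigma=sigma))
        sums[arm] += rng.normal(true_means[arm], sigma)
        counts[arm] += 1
    return counts
```

In this sketch the arm with the highest sample mean always has profile log-likelihood zero, because the unconstrained fit already satisfies its constraint; every other arm is penalized by how far the data must be moved to make it look optimal. No prior on the reward means themselves is ever specified, which is the sense in which the construction is minimalist.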

Paper Structure

This paper contains 44 sections, 7 theorems, 104 equations, 2 figures, and 2 algorithms.

Key Result

Lemma 4.1

The profile likelihood $\bar{\mathcal{L}} ( {x}; \mathcal{D}_{t} )$ is $1$ if the convex polytope is non-empty, and $0$ otherwise.

Figures (2)

  • Figure 1: Dynamic pricing experiments. $x$-axis: time. $y$-axis: regret (left panel) and relative regret (right panel). Blue: MINTS with Bernoulli likelihood. Cyan: MINTS with Gaussian likelihood. Red: Thompson sampling. Black: UCB1.
  • Figure 2: Multi-armed bandit experiments. $x$-axis: time. $y$-axis: regret (left panel) and relative regret (right panel). Blue: MINTS. Red: Thompson sampling. Black: UCB1.

Theorems & Definitions (24)

  • Example 2.1: Multi-armed bandit
  • Example 2.2: Lipschitz bandit [KSU08]
  • Example 2.3: Dynamic pricing [Den15]
  • Example 2.4: Continuous optimization
  • Example 3.1: Multi-armed bandit with Gaussian rewards
  • Remark 1: Computational efficiency
  • Remark 2: Translation invariance
  • Remark 3: Structured bandits
  • Remark 4: Other reward distributions
  • Lemma 4.1
  • ...and 14 more