Zeroth-order gradient estimators for stochastic problems with decision-dependent distributions

Yuya Hikima, Akiko Takeda

TL;DR

This work tackles stochastic optimization in which the data distribution depends on the decision and is unknown, a setting common in performative prediction. It analyzes zeroth-order gradient estimators built from coordinate, sphere, or Gaussian search directions, and shows that averaging over multiple random directions (N>1) achieves the smallest sample complexity for nonconvex, unbounded objectives. Under mild assumptions, the sphere and Gaussian random-direction estimators substantially outperform coordinate-wise ones, with sample complexities scaling as O(d^2 ε^{-6}) or O(d^2 ε^{-5}) depending on smoothness, improving on prior bounds. Experiments on multi-product pricing and strategic classification corroborate the theory, showing practical gains from random directions.

Abstract

Stochastic optimization problems with unknown decision-dependent distributions have attracted increasing attention in recent years due to their importance in applications. Since the gradient of the objective function is inaccessible as a result of the unknown distribution, various zeroth-order methods have been developed to solve the problem. However, it remains unclear which search direction is most appropriate for constructing a gradient estimator and how to set the algorithmic parameters. In this paper, we conduct a unified sample complexity analysis of zeroth-order methods across gradient estimators with different search directions. As a result, we show that gradient estimators that average over multiple directions, drawn either uniformly from the unit sphere or from a Gaussian distribution, achieve the lowest sample complexity. The attained sample complexities improve on those of existing zeroth-order methods in the problem setting that allows nonconvexity and unboundedness of the objective function. Moreover, through simulation experiments on multi-product pricing and strategic classification applications, we demonstrate the practical performance of zeroth-order methods with various gradient estimators.
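To make the idea of averaging over random search directions concrete, the following is a minimal sketch, not the paper's exact estimator. It assumes a generic forward-difference form in which each query point receives its own sample drawn at that decision. The names `zo_gradient`, `sample_direction`, `toy_sample`, and `toy_loss`, the smoothing parameter `mu`, and the toy pricing instance are all hypothetical and introduced here for illustration only.

```python
import math
import random


def sample_direction(d, kind, rng):
    """Draw one search direction in R^d."""
    u = [rng.gauss(0.0, 1.0) for _ in range(d)]
    if kind == "gaussian":
        return u                      # u ~ N(0, I)
    if kind == "sphere":
        norm = math.sqrt(sum(v * v for v in u))
        return [v / norm for v in u]  # uniform on the unit sphere
    raise ValueError(f"unknown direction kind: {kind}")


def zo_gradient(loss, sample, x, mu, n_dirs, kind, rng):
    """Average a forward-difference estimate over n_dirs random directions.

    loss(x, xi) and sample(x, rng) are hypothetical callables: only samples
    xi ~ D(x) are observed, never the distribution D itself.
    """
    d = len(x)
    scale = d if kind == "sphere" else 1.0  # sphere directions need a factor d
    g = [0.0] * d
    for _ in range(n_dirs):
        u = sample_direction(d, kind, rng)
        x_plus = [xj + mu * uj for xj, uj in zip(x, u)]
        # Each query point gets its own sample, drawn at that decision.
        diff = loss(x_plus, sample(x_plus, rng)) - loss(x, sample(x, rng))
        coef = scale * diff / (mu * n_dirs)
        for i in range(d):
            g[i] += coef * u[i]
    return g


# Toy pricing-style instance (an assumption for illustration only):
# demand q_i ~ N(10 - 2 * price_i, 0.1^2), loss = negative revenue.
def toy_sample(x, rng):
    return [rng.gauss(10.0 - 2.0 * price, 0.1) for price in x]


def toy_loss(x, xi):
    return -sum(p * q for p, q in zip(x, xi))
```

For this toy instance the expected loss is F(x) = Σ_i (2x_i² − 10x_i), so the true gradient at x = (1, 1, 1) is (−6, −6, −6). A call such as `zo_gradient(toy_loss, toy_sample, [1.0, 1.0, 1.0], 0.1, 20000, "sphere", random.Random(0))` should land close to that value, and a smaller `n_dirs` gives a noisier estimate.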

Paper Structure

This paper contains 42 sections, 24 theorems, 111 equations, 1 figure, 3 tables, 1 algorithm.

Key Result

Lemma 1

Suppose that Assumption asm:F_smooth holds and $\eta\le \frac{1}{4M}$. Then Algorithm alg:simple obtains $\bar{\bm{x}}$ such that where $F^* := \min_{\bm{x}} F(\bm{x})$ and $\bm{g}_{[t]}:=\{\bm{g}_1, \dots, \bm{g}_t\}$.
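The algorithm referenced in the lemma is not reproduced on this page, so the following is only a rough sketch of the kind of iteration such a statement typically covers. It assumes updates of the form x_{t+1} = x_t − η g_t with a stochastic gradient estimate g_t, and it assumes that the output $\bar{\bm{x}}$ is drawn uniformly from the iterates. Both are assumptions, as are the names `zo_descent` and `noisy_grad` and the toy objective; any gradient estimator can be plugged in where the stand-in is used.

```python
import random


def zo_descent(grad_estimator, x0, eta, n_iters, rng):
    """Run x_{t+1} = x_t - eta * g_t and return (x_bar, iterates)."""
    x = list(x0)
    iterates = [list(x)]
    for _ in range(n_iters):
        g = grad_estimator(x, rng)
        x = [xj - eta * gj for xj, gj in zip(x, g)]
        iterates.append(list(x))
    # Assumed output rule: x_bar is drawn uniformly from the iterates.
    x_bar = rng.choice(iterates)
    return x_bar, iterates


# Stand-in estimator (hypothetical): true gradient of the toy objective
# F(x) = sum(2 * x_i^2 - 10 * x_i) plus Gaussian noise. F is 4-smooth.
def noisy_grad(x, rng):
    return [4.0 * xj - 10.0 + rng.gauss(0.0, 0.5) for xj in x]


M = 4.0
eta = 1.0 / (4.0 * M)  # largest step size allowed by eta <= 1/(4M)
```

Calling `zo_descent(noisy_grad, [0.0, 0.0], eta, 200, random.Random(1))` drives the iterates toward the toy minimizer x_i = 2.5. The step size is set at the boundary of the condition $\eta\le \frac{1}{4M}$ using the toy objective's smoothness constant M = 4.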

Figures (1)

  • Figure 1: Change in obj in the simulation experiment with real data. Each graph shows the result in one problem instance for each week. The horizontal axis indicates the number of samples, and the vertical axis indicates obj.

Theorems & Definitions (42)

  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Lemma 3
  • proof
  • Theorem 4
  • proof
  • Theorem 5
  • proof
  • ...and 32 more