Zeroth-Order Methods for Nonconvex Stochastic Problems with Decision-Dependent Distributions

Yuya Hikima, Akiko Takeda

TL;DR

This work tackles nonconvex optimization with decision-dependent uncertainty, where the distribution $D(\bm{x})$ is unknown and gradients are inaccessible. It introduces two zeroth-order methods: a one-point gradient estimator with a variance-reduction parameter and a two-point gradient estimator, both analyzed under Gaussian smoothing and mild regularity assumptions to ensure convergence to stationary points. The authors establish worst-case iteration and sample complexity bounds, notably $O(d^{5/2}\varepsilon^{-4})$ iterations and $O(d^{9/2}\varepsilon^{-6})$ samples, with improvements over prior zeroth-order schemes when the function range is large or unbounded. Empirical results on a retail pricing application demonstrate that these methods achieve lower objective values than conventional zeroth-order techniques, highlighting practical relevance for decision-dependent problems.
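
As a rough reading of these rates, and assuming the bounds are tight up to constants (an assumption; $O(\cdot)$ only gives upper bounds), doubling the dimension $d$ or halving the tolerance $\varepsilon$ scales the stated iteration and sample counts as follows:

$$
\frac{(2d)^{5/2}}{d^{5/2}} = 2^{5/2} \approx 5.7, \qquad
\frac{(2d)^{9/2}}{d^{9/2}} = 2^{9/2} \approx 22.6, \qquad
\frac{(\varepsilon/2)^{-4}}{\varepsilon^{-4}} = 2^{4} = 16, \qquad
\frac{(\varepsilon/2)^{-6}}{\varepsilon^{-6}} = 2^{6} = 64.
$$

Under that reading, the sample bound grows faster than the iteration bound in both $d$ and $1/\varepsilon$.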

Abstract

In this study, we consider an optimization problem with uncertainty dependent on decision variables, which has recently attracted attention due to its importance in machine learning and pricing applications. In this problem, the gradient of the objective function cannot be obtained explicitly because the decision-dependent distribution is unknown. Therefore, several zeroth-order methods have been proposed, which obtain noisy objective values by sampling and update the iterates. Although these existing methods have theoretical convergence for optimization problems with decision-dependent uncertainty, they require strong assumptions about the function and distribution or exhibit large variances in their gradient estimators. To overcome these issues, we propose two zeroth-order methods under mild assumptions. First, we develop a zeroth-order method with a new one-point gradient estimator including a variance reduction parameter. The proposed method updates the decision variables while adjusting the variance reduction parameter. Second, we develop a zeroth-order method with a two-point gradient estimator. There are situations where only one-point estimators can be used, but if both one-point and two-point estimators are available, it is more practical to use the two-point estimator. As theoretical results, we show the convergence of our methods to stationary points and provide the worst-case iteration and sample complexity analysis. Our simulation experiments with real data on a retail service application show that our methods output solutions with lower objective values than the conventional zeroth-order methods.
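
To make the two estimator types concrete, here is a minimal sketch of textbook Gaussian-smoothing one-point and two-point estimators on a toy decision-dependent problem. It is not the paper's algorithm: the abstract does not give the estimator formulas or the rule for adjusting the variance reduction parameter, so the forms below are standard ones, and `sample_xi`, `f`, the baseline `c`, and all constants are hypothetical choices made for illustration. The toy objective is also convex, unlike the nonconvex setting the paper targets.

```python
import numpy as np

def sample_xi(x, rng):
    """Hypothetical D(x): xi = 0.5 * x + standard Gaussian noise."""
    return 0.5 * x + rng.normal(size=x.size)

def f(x, xi):
    """Hypothetical toy loss; its mean over D(x) is ||x - 1||^2 + 0.5 ||x||^2."""
    return float(np.sum((x - 1.0) ** 2) + x @ xi)

def one_point(x, mu, c, rng):
    """One-point estimate: a single noisy value, shifted by a scalar baseline c."""
    u = rng.normal(size=x.size)          # Gaussian smoothing direction
    y = x + mu * u                       # perturbed decision
    value = f(y, sample_xi(y, rng))      # sample is drawn from D(y), not D(x)
    return (value - c) / mu * u

def two_point(x, mu, rng):
    """Two-point estimate: symmetric difference of two noisy values."""
    u = rng.normal(size=x.size)
    y_plus, y_minus = x + mu * u, x - mu * u
    diff = f(y_plus, sample_xi(y_plus, rng)) - f(y_minus, sample_xi(y_minus, rng))
    return diff / (2.0 * mu) * u

def mean_and_total_variance(estimator, n, seed=0):
    """Empirical mean and summed per-coordinate variance over n draws."""
    rng = np.random.default_rng(seed)
    samples = np.array([estimator(rng) for _ in range(n)])
    return samples.mean(axis=0), samples.var(axis=0).sum()

x0, mu, n = np.zeros(3), 0.5, 10000
true_grad = 3.0 * x0 - 2.0   # gradient of the toy expected objective at x0

mean_2p, var_2p = mean_and_total_variance(lambda r: two_point(x0, mu, r), n)
# Baseline c = 3.0 equals the toy expected objective at x0; c = 0.0 means no baseline.
_, var_no_baseline = mean_and_total_variance(lambda r: one_point(x0, mu, 0.0, r), n)
_, var_baseline = mean_and_total_variance(lambda r: one_point(x0, mu, 3.0, r), n)

print("two-point mean:", mean_2p, "true gradient:", true_grad)
print("one-point variance without / with baseline:", var_no_baseline, var_baseline)
```

In this toy setting, subtracting a baseline close to the objective value leaves the one-point estimator's mean unchanged (the direction `u` has zero mean) while shrinking its variance, which illustrates the general idea behind a variance reduction parameter. The two-point form needs two samples at two different decisions per step, which is not always possible in practice.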

Paper Structure

This paper contains 48 sections, 28 theorems, 79 equations, 1 figure, 1 table, 2 algorithms.

Key Result

Lemma 1

[ray2022decision] Suppose that there exist a matrix $\bm{A}$ and a distribution $D'$ such that [...], where $\bm{\nu}$ has mean $\bar{\bm{\nu}} := \mathbb{E}_{\bm{\nu} \sim D'}[\bm{\nu}]$ and covariance $\mathbb{E}_{\bm{\nu} \sim D'}[(\bm{\nu}-\bar{\bm{\nu}})(\bm{\nu}-\bar{\bm{\nu}})^{\top}]$. Moreover, suppose that $f(\bm{x},\bm{\xi})$ is $\rho$-smooth with respect to both $\bm{x}$ and $\bm{\xi}$. [...]

Figures (1)

  • Figure 1: Change in obj in the first 3000 samples in the simulation experiment with real data. Each graph shows the result in one problem instance for each week. The horizontal axis indicates the number of samples, and the vertical axis indicates obj.

Theorems & Definitions (39)

  • Lemma 1
  • Definition 1
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • Lemma 5
  • Lemma 6
  • Theorem 1
  • Lemma 7
  • Lemma 8
  • ...and 29 more