Zeroth-Order Methods for Nonconvex Stochastic Problems with Decision-Dependent Distributions

Yuya Hikima; Akiko Takeda

TL;DR

This work tackles nonconvex optimization with decision-dependent uncertainty, where the distribution $D(\bm{x})$ is unknown and gradients are inaccessible. It introduces two zeroth-order methods: a one-point gradient estimator with a variance-reduction parameter and a two-point gradient estimator, both analyzed under Gaussian smoothing and mild regularity assumptions to ensure convergence to stationary points. The authors establish worst-case iteration and sample complexity bounds, notably $O(d^{5/2}\varepsilon^{-4})$ iterations and $O(d^{9/2}\varepsilon^{-6})$ samples, with improvements over prior zeroth-order schemes when the function range is large or unbounded. Empirical results on a retail pricing application demonstrate that these methods achieve lower objective values than conventional zeroth-order techniques, highlighting practical relevance for decision-dependent problems.

Abstract

In this study, we consider an optimization problem with uncertainty dependent on decision variables, which has recently attracted attention due to its importance in machine learning and pricing applications. In this problem, the gradient of the objective function cannot be obtained explicitly because the decision-dependent distribution is unknown. Therefore, several zeroth-order methods have been proposed, which obtain noisy objective values by sampling and use them to update the iterates. Although these existing methods have theoretical convergence guarantees for optimization problems with decision-dependent uncertainty, they require strong assumptions about the function and distribution or exhibit large variances in their gradient estimators. To overcome these issues, we propose two zeroth-order methods under mild assumptions. First, we develop a zeroth-order method with a new one-point gradient estimator that includes a variance reduction parameter. The proposed method updates the decision variables while adjusting the variance reduction parameter. Second, we develop a zeroth-order method with a two-point gradient estimator. There are situations where only one-point estimators can be used, but if both one-point and two-point estimators are available, it is more practical to use the two-point estimator. As theoretical results, we show the convergence of our methods to stationary points and provide worst-case iteration and sample complexity analyses. Our simulation experiments with real data on a retail service application show that our methods output solutions with lower objective values than conventional zeroth-order methods.
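To make the two estimator types concrete, here is a minimal sketch in Python of generic Gaussian-smoothing one-point and two-point gradient estimates. It is an illustration under stated assumptions, not the authors' algorithms: the toy distribution `sample_xi`, the toy objective `f`, the constant `NOISE_STD`, and the fixed baseline `c` (standing in for the variance reduction parameter, which the paper adjusts during the run) are all hypothetical choices made for this example.

```python
import random

# Hypothetical toy problem (not from the paper): xi ~ D(x) is drawn as
# xi = -0.5 * x + Gaussian noise, and f(x, xi) = sum((x_i - 1)^2 + x_i * xi_i).
# Then F(x) = E[f(x, xi)] = sum(0.5 * x_i^2 - 2 * x_i + 1) and grad F(x) = x - 2.
NOISE_STD = 0.1


def sample_xi(x, rng):
    """Draw one sample from the assumed decision-dependent distribution D(x)."""
    return [-0.5 * a + rng.gauss(0.0, NOISE_STD) for a in x]


def f(x, xi):
    """Assumed toy objective f(x, xi)."""
    return sum((a - 1.0) ** 2 + a * b for a, b in zip(x, xi))


def noisy_value(x, rng):
    """One zeroth-order query: sample xi ~ D(x) and return f(x, xi)."""
    return f(x, sample_xi(x, rng))


def one_point_estimate(x, mu, c, rng):
    """One-point estimate ((f(x + mu*u, xi) - c) / mu) * u with baseline c."""
    u = [rng.gauss(0.0, 1.0) for _ in x]
    shifted = [a + mu * b for a, b in zip(x, u)]
    scale = (noisy_value(shifted, rng) - c) / mu
    return [scale * b for b in u]


def two_point_estimate(x, mu, rng):
    """Two-point estimate using independent queries at x + mu*u and at x."""
    u = [rng.gauss(0.0, 1.0) for _ in x]
    shifted = [a + mu * b for a, b in zip(x, u)]
    scale = (noisy_value(shifted, rng) - noisy_value(x, rng)) / mu
    return [scale * b for b in u]


def mean_and_variance(samples):
    """Per-coordinate mean and total (summed over coordinates) variance."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(d)]
    var = sum((s[i] - mean[i]) ** 2 for s in samples for i in range(d)) / n
    return mean, var


if __name__ == "__main__":
    rng = random.Random(0)
    x, mu = [1.0, 0.0], 0.1
    draws = [two_point_estimate(x, mu, rng) for _ in range(5000)]
    print(mean_and_variance(draws))
```

In this toy setting both estimates average to the gradient of the smoothed objective for any baseline `c` that does not depend on `u`; the baseline only affects the variance, which is why a value of `c` close to the current objective value helps the one-point estimate.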

Paper Structure (48 sections, 28 theorems, 79 equations, 1 figure, 1 table, 2 algorithms)


Introduction
Notation.
Related Work
Zeroth-order Methods
Other Methods for Stochastic Problems with Decision-dependent Uncertainty
Retraining methods [perdomo2020performative, mendler2020stochastic].
Stochastic gradient descent methods [hikima2023stochastic, sutton2018reinforcement].
Two-stage approach [miller2021outside].
Bayesian optimization [brochu2010tutorial, frazier2018tutorial].
Preliminaries
Problem Definition
Assumptions
Gaussian Smoothed Function
Proposed Method with One-point Gradient Estimator
One-point Gradient Estimator
...and 33 more sections

Key Result

Lemma 1

[ray2022decision] Suppose that there exist a matrix $\bm{A}$ and a distribution $D'$ such that … where $\bm{\nu}$ has mean $\bar{\bm{\nu}} := \mathbb{E}_{\bm{\nu} \sim D'}[\bm{\nu}]$ and covariance $\mathbb{E}_{\bm{\nu} \sim D'}[(\bm{\nu}-\bar{\bm{\nu}})(\bm{\nu}-\bar{\bm{\nu}})^{\top}]$. Moreover, suppose that $f(\bm{x},\bm{\xi})$ is $\rho$-smooth with respect to both $\bm{x}$ and $\bm{\xi}$. …

Figures (1)

Figure 1: Change in obj in the first 3000 samples in the simulation experiment with real data. Each graph shows the result in one problem instance for each week. The horizontal axis indicates the number of samples, and the vertical axis indicates obj.

Theorems & Definitions (39)

Lemma 1
Definition 1
Lemma 2
Lemma 3
Lemma 4
Lemma 5
Lemma 6
Theorem 1
Lemma 7
Lemma 8
...and 29 more
