Enhanced Derivative-Free Optimization Using Adaptive Correlation-Induced Finite Difference Estimators

Guo Liang, Guangwu Liu, Kun Zhang

TL;DR

This work tackles stochastic derivative-free optimization when only noisy function evaluations are available. It combines a batch-based correlation-induced finite-difference (Cor-CFD) gradient surrogate with an adaptive sampling rule and a stochastic line search to improve gradient estimation and sample efficiency, while preserving convergence guarantees. The authors establish consistency, linear convergence with a constant step size, and an $\mathcal{S}(\epsilon)=\mathcal{O}(\epsilon^{-3/2})$ sample complexity, matching classical KW/SPSA rates, and demonstrate superior empirical performance on noisy, high-dimensional problems. The proposed AdaDFO framework provides a practical, robust approach for black-box optimization in simulation and related settings, particularly when function evaluations are expensive or noisy.

Abstract

Gradient-based methods are well-suited for derivative-free optimization (DFO), where finite-difference (FD) estimates are commonly used as gradient surrogates. Traditional stochastic approximation methods, such as Kiefer-Wolfowitz (KW) and simultaneous perturbation stochastic approximation (SPSA), typically utilize only two samples per iteration, resulting in imprecise gradient estimates and necessitating diminishing step sizes for convergence. In this paper, we first explore an efficient FD estimate, referred to as correlation-induced FD estimate, which is a batch-based estimate. Then, we propose an adaptive sampling strategy that dynamically determines the batch size at each iteration. By combining these two components, we develop an algorithm designed to enhance DFO in terms of both gradient estimation efficiency and sample efficiency. Furthermore, we establish the consistency of our proposed algorithm and demonstrate that, despite using a batch of samples per iteration, it achieves the same convergence rate as the KW and SPSA methods. Additionally, we propose a novel stochastic line search technique to adaptively tune the step size in practice. Finally, comprehensive numerical experiments confirm the superior empirical performance of the proposed algorithm.
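To make the contrast in the abstract concrete, the following is a minimal sketch, not the paper's Cor-CFD estimator. It compares a two-evaluation SPSA gradient estimate with a plain batch-averaged central finite-difference estimate. The noisy quadratic objective, the function names (`noisy_f`, `spsa_gradient`, `batch_cfd_gradient`), and all parameter values are assumptions chosen for illustration only.

```python
import random


def noisy_f(x, rng, sigma=0.1):
    """Hypothetical noisy oracle: F(x) = sum(x_i^2) plus Gaussian noise."""
    return sum(xi * xi for xi in x) + rng.gauss(0.0, sigma)


def spsa_gradient(f, x, h, rng):
    """SPSA estimate: two evaluations along one random +/-1 direction."""
    delta = [rng.choice((-1.0, 1.0)) for _ in x]
    x_plus = [xi + h * di for xi, di in zip(x, delta)]
    x_minus = [xi - h * di for xi, di in zip(x, delta)]
    diff = f(x_plus, rng) - f(x_minus, rng)
    return [diff / (2.0 * h * di) for di in delta]


def batch_cfd_gradient(f, x, h, batch, rng):
    """Coordinate-wise central FD, averaging `batch` sample pairs per coordinate."""
    grad = []
    for i in range(len(x)):
        total = 0.0
        for _ in range(batch):
            x_plus = list(x)
            x_minus = list(x)
            x_plus[i] += h
            x_minus[i] -= h
            total += (f(x_plus, rng) - f(x_minus, rng)) / (2.0 * h)
        grad.append(total / batch)
    return grad


def squared_error(estimate, truth):
    """Squared Euclidean distance between an estimate and the true gradient."""
    return sum((a - b) ** 2 for a, b in zip(estimate, truth))


rng = random.Random(0)            # fixed seed so the run is reproducible
x0 = [1.0, -2.0, 0.5]
true_grad = [2.0 * xi for xi in x0]
trials = 200

# Average squared error of each estimator over repeated trials.
spsa_mse = sum(
    squared_error(spsa_gradient(noisy_f, x0, 0.1, rng), true_grad)
    for _ in range(trials)
) / trials
batch_mse = sum(
    squared_error(batch_cfd_gradient(noisy_f, x0, 0.1, 20, rng), true_grad)
    for _ in range(trials)
) / trials

print("SPSA mean squared error:     ", spsa_mse)
print("Batch CFD mean squared error:", batch_mse)
```

The comparison is deliberately uneven in cost: each SPSA estimate here uses 2 evaluations, while each batch estimate uses 2 × 3 × 20 = 120. That trade-off between per-iteration accuracy and sample cost is what the adaptive batch-size rule described in the abstract is meant to manage.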

Paper Structure

This paper contains 23 sections, 5 theorems, 58 equations, 2 figures, 2 tables.

Key Result

Proposition 2.2

Assume that $F(x)$ is five times differentiable at $x_0$ with non-zero fifth derivative, and the noise $\sigma(x) > 0$ is continuous at $x_0$. For any $k = 1,\ldots,K$ ($K \geq 2$), let $h_k = c_k n_b^{-1/10}$ with $c_k \neq 0$, and $c_j \neq c_k$ for any $j \neq k$. If $n \to \infty$, then we have $\ldots$, where ${\boldsymbol{c}} = [|c_1|,\ldots,|c_K|]^{\top}$, ${\boldsymbol{c}^4} = \left[c_1^4,\ldots,c_K^4\right]^{\top}$,
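The perturbation rule $h_k = c_k n_b^{-1/10}$ in the proposition can be sketched as follows. This is only an illustration of that one formula together with the stated conditions ($K \geq 2$, $c_k \neq 0$, pairwise distinct $c_k$). The function name `perturbation_sizes` and the sample constants are assumptions, not taken from the paper.

```python
def perturbation_sizes(c, n_b):
    """Return h_k = c_k * n_b**(-1/10) for each constant c_k.

    Enforces the conditions stated in the proposition:
    at least two constants, all non-zero, and pairwise distinct.
    """
    if len(c) < 2:
        raise ValueError("need K >= 2 constants")
    if any(ck == 0 for ck in c):
        raise ValueError("every c_k must be non-zero")
    if len(set(c)) != len(c):
        raise ValueError("the c_k must be pairwise distinct")
    scale = n_b ** (-1.0 / 10.0)
    return [ck * scale for ck in c]


# Illustrative constants; a batch size of 1024 = 2**10 gives a scale near 0.5.
h_small = perturbation_sizes([1.0, -1.0, 2.0], 1024)
# Squaring the batch size only halves the perturbations again (scale near 0.25).
h_large = perturbation_sizes([1.0, -1.0, 2.0], 1024 * 1024)
print(h_small)
print(h_large)
```

The exponent $-1/10$ makes the perturbations shrink slowly: in this example the batch size has to be squared to halve each $h_k$.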

Figures (2)

  • Figure 1: Comparison of the pilot samples, transformed samples and optimal samples for $f(x) = 10\sin(x) + \hbox{noise}$.
  • Figure 2: Comparison of the average OG of Algorithm \ref{alg:cor_cfd_dfo_ls} and SPSA for the second type of problem.

Theorems & Definitions (8)

  • Example 2.1
  • Remark 2.1
  • Proposition 2.2: Theorem 4.1 in liang2024cor
  • Remark 3.1
  • Theorem 4.1
  • Corollary 4.2: A weak version of Theorem \ref{thm:Converge_Iteration}
  • Theorem 4.3: Sample complexity
  • Theorem 4.4