Enhancing finite-difference based derivative-free optimization methods with machine learning

Timothé Taminiau, Estelle Massart, Geovani Nunes Grapiglia

TL;DR

This work tackles black-box, possibly nonconvex optimization by augmenting finite-difference derivative-free methods with a surrogate-based heuristic. A Sobolev-learning surrogate is trained on accumulated data and approximate gradients, then refined via gradient steps on the surrogate with an Armijo-type check against the true objective, allowing early surrogate-guided progress before reverting to the base method. The authors provide a worst-case complexity bound $O(n\epsilon^{-2})$ for finding an $\epsilon$-approximate stationary point, with the bound improved by the surrogate gain $\eta(S(T))$ when surrogate steps succeed often. Numerical experiments on a CUTEst subset show substantial performance gains, especially when Sobolev learning leverages gradient information, with SoftPlus-based neural surrogates and Gaussian RBFs delivering the strongest improvements and robust behavior across models. The framework offers a practical, general enhancement to a wide class of finite-difference-based DFO methods, potentially reducing expensive function evaluations in simulation- or experiment-driven optimization tasks.
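To make the Sobolev-learning idea concrete, here is a minimal sketch; it is not the authors' implementation. It assumes a Gaussian RBF surrogate $s(x)=\sum_j w_j \exp(-\gamma\|x-c_j\|^2)$ centered at the sampled points, and fits the weights by regularized linear least squares on both the function values and the approximate gradients. The function names, the kernel width `gamma`, the weighting `grad_weight`, and the ridge term `reg` are all illustrative assumptions.

```python
import numpy as np

def rbf_features(X, C, gamma):
    """Gaussian RBF features and their gradients with respect to the inputs.

    X: (m, n) evaluation points, C: (k, n) centers.
    Returns Phi of shape (m, k) and dPhi of shape (m, k, n).
    """
    diff = X[:, None, :] - C[None, :, :]           # (m, k, n)
    Phi = np.exp(-gamma * np.sum(diff**2, axis=2))  # (m, k)
    dPhi = -2.0 * gamma * diff * Phi[:, :, None]    # d Phi[i, j] / d X[i]
    return Phi, dPhi

def fit_sobolev_rbf(X, f_vals, grads, gamma=1.0, grad_weight=1.0, reg=1e-8):
    """Fit RBF weights to values AND gradients (assumed Sobolev-type loss).

    Minimizes ||Phi w - f||^2 + grad_weight * ||dPhi w - grads||^2 + reg * ||w||^2,
    with one center per data point (an assumption of this sketch).
    """
    m, n = X.shape
    Phi, dPhi = rbf_features(X, X, gamma)
    # Stack gradient equations: row i*n + d corresponds to component d at point i.
    G = np.transpose(dPhi, (0, 2, 1)).reshape(m * n, m)
    A = np.vstack([Phi, np.sqrt(grad_weight) * G])
    b = np.concatenate([f_vals, np.sqrt(grad_weight) * grads.reshape(m * n)])
    # Ridge-regularized normal equations.
    return np.linalg.solve(A.T @ A + reg * np.eye(m), A.T @ b)

def surrogate(x, C, w, gamma):
    """Return the surrogate value and its analytic gradient at a single point x."""
    Phi, dPhi = rbf_features(x[None, :], C, gamma)
    return Phi[0] @ w, dPhi[0].T @ w
```

In this sketch, setting `grad_weight=0` drops the gradient equations and recovers value-only (standard) learning, so the same routine can serve as a baseline when comparing the two training modes.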

Abstract

Derivative-Free Optimization (DFO) involves methods that rely solely on evaluations of the objective function. One of the earliest strategies for designing DFO methods is to adapt first-order methods by replacing gradients with finite-difference approximations. The execution of such methods generates a rich dataset about the objective function, including iterate points, function values, approximate gradients, and successful step sizes. In this work, we propose a simple auxiliary procedure to leverage this dataset and enhance the performance of finite-difference-based DFO methods. Specifically, our procedure trains a surrogate model using the available data and applies the gradient method with Armijo line search to the surrogate until it fails to ensure sufficient decrease in the true objective function, in which case we revert to the original algorithm and improve our surrogate based on the new available information. As a proof of concept, we integrate this procedure with the derivative-free method proposed in (Optim. Lett. 18: 195–213, 2024). Numerical results demonstrate significant performance improvements, particularly when the approximate gradients are also used to train the surrogates.
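The surrogate phase described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's algorithm: the Armijo backtracking is run on the surrogate `s` only, each resulting trial point costs one evaluation of the true objective `f`, and the sufficient-decrease test on `f` is assumed to take the form $f(x^+) \le f(x) - c_2\,\alpha\,\|\nabla s(x)\|^2$. The function name `surrogate_phase` and the parameters `c1`, `c2`, `shrink`, `max_steps`, and `max_backtracks` are hypothetical.

```python
import numpy as np

def surrogate_phase(f, x, fx, s, grad_s, c1=1e-4, c2=1e-4,
                    shrink=0.5, max_steps=50, max_backtracks=30):
    """Take gradient steps on the surrogate while the true objective keeps decreasing.

    f: true (expensive) objective; s, grad_s: surrogate value and gradient.
    Returns the last accepted point, its true value, and the number of
    true-objective evaluations spent in this phase.
    """
    n_true_evals = 0
    for _ in range(max_steps):
        g = grad_s(x)
        g2 = float(g @ g)
        if g2 == 0.0:
            break  # surrogate is stationary here; nothing more to gain

        # Armijo backtracking on the surrogate (cheap, no true evaluations).
        alpha, sx = 1.0, s(x)
        for _ in range(max_backtracks):
            if s(x - alpha * g) <= sx - c1 * alpha * g2:
                break
            alpha *= shrink
        else:
            break  # line search on the surrogate failed

        # One true evaluation to check sufficient decrease in f (assumed test).
        x_trial = x - alpha * g
        f_trial = f(x_trial)
        n_true_evals += 1
        if f_trial > fx - c2 * alpha * g2:
            break  # surrogate step rejected: hand control back to the base method
        x, fx = x_trial, f_trial
    return x, fx, n_true_evals
```

When the phase ends, the caller would resume the base finite-difference method from the returned point and retrain the surrogate on the enlarged dataset; that outer loop is omitted here.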

Paper Structure

This paper contains 9 sections, 5 theorems, 58 equations, 6 figures, 1 table, 3 algorithms.

Key Result

Lemma 2.1

Suppose that A1 holds. Then Algorithm 3 has finite termination.

Figures (6)

  • Figure 1: Data profiles comparing the base method to the surrogate-accelerated methods based on SoftPlus shallow neural networks with standard learning and Sobolev learning, for a budget of 100 simplex gradients.
  • Figure 2: Data profiles comparing the base method to the surrogate-accelerated methods based on Gaussian RBF with standard learning and Sobolev learning, for a budget of 100 simplex gradients.
  • Figure 3: Box plots for the distribution of the surrogate gain over the full test set with a budget of 100 simplex gradients.
  • Figure 4: Data profiles comparing the base method to the surrogate-accelerated methods based on SoftPlus shallow NN and Gaussian RBF with Sobolev learning, for a budget of 100 simplex gradients.
  • Figure 5: Data profiles for the tolerance $\tau=10^{-4}$ and budget of 100 simplex gradients. We compare different choices of NN models for the accelerated method with Sobolev learning.
  • ...and 1 more figure

Theorems & Definitions (10)

  • Lemma 2.1
  • proof
  • Lemma 2.2
  • proof
  • Lemma 2.3
  • proof
  • Theorem 2.4
  • proof
  • Proposition 3.1
  • proof