
Improving the Flexibility and Robustness of Model-Based Derivative-Free Optimization Solvers

Coralia Cartis, Jan Fiala, Benjamin Marteau, Lindon Roberts

TL;DR

The paper presents DFO-LS, a derivative-free solver for nonlinear least-squares that uses regression-based residual models within a trust-region framework, enabling meaningful progress from as few as $2$ objective evaluations and robust performance under noise through multiple restarts, sample averaging, and optional regression. It also introduces Py-BOBYQA for general objective problems, sharing robustness features with DFO-LS and incorporating simplifications for practical performance. A key contribution is an adaptive testing framework with problem-specific accuracy levels $\tau_p$ to fairly benchmark noisy solvers via data profiles. Empirical results show both solvers achieve comparable or superior robustness to existing methods across expensive and noisy regimes, with reduced initialization costs and effective restart strategies that do not harm early performance. The work highlights practical guidelines for parameter choices and restarts, offering accessible tools for practitioners handling expensive or noisy evaluations in LS and general optimization tasks.

Abstract

We present DFO-LS, a software package for derivative-free optimization (DFO) for nonlinear Least-Squares (LS) problems, with optional bound constraints. Inspired by the Gauss-Newton method, DFO-LS constructs simplified linear regression models for the residuals. DFO-LS allows flexible initialization for expensive problems, whereby it can begin making progress from as few as two objective evaluations. Numerical results show DFO-LS can gain reasonable progress on some medium-scale problems with fewer objective evaluations than is needed for one gradient evaluation. DFO-LS has improved robustness to noise, allowing sample averaging, the construction of regression-based models, and multiple restart strategies together with an auto-detection mechanism. Our extensive numerical experimentation shows that restarting the solver when stagnation is detected is a cheap and effective mechanism for achieving robustness, with superior performance over both sampling and regression techniques. We also present our package Py-BOBYQA, a Python implementation of BOBYQA (Powell, 2009), which also implements robustness to noise strategies. Our numerical experiments show that Py-BOBYQA is comparable to or better than existing general DFO solvers for noisy problems. In our comparisons, we introduce a new adaptive measure of accuracy for the data profiles of noisy functions that strikes a balance between measuring the true and the noisy objective improvement.

Paper Structure

This paper contains 56 sections, 5 theorems, 55 equations, 16 figures, 1 table, 2 algorithms.

Key Result

Lemma 2.2

Suppose $\mathbf{m}_k$ (\ref{eq_linear_models}) is constructed using (\ref{eq_growing_min_norm}) with $p<n$, and where $\{\mathbf{y}_0,\ldots,\mathbf{y}_p\}$ are affinely independent. Then $J_k$ has column rank $p$.
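
The following is a small numerical illustration of the lemma, not the authors' implementation. It assumes that eq_growing_min_norm denotes the minimum Frobenius-norm solution of the interpolation conditions $J_k(\mathbf{y}_t-\mathbf{y}_0)=\mathbf{r}(\mathbf{y}_t)-\mathbf{r}(\mathbf{y}_0)$ for $t=1,\ldots,p$; the equation itself is not reproduced in this summary. The residual function, the points, and all helper names are hypothetical. Exact rational arithmetic is used so that the rank computation is not affected by rounding.

```python
from fractions import Fraction


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def rank(A):
    """Rank by exact Gaussian elimination (entries are Fractions)."""
    rows = [row[:] for row in A]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


def residuals(x):
    """Hypothetical residual vector r(x) with n = 3 inputs and m = 4 outputs."""
    x0, x1, x2 = x
    return [x0 - 1, x1 - 2, x0 + x1 + x2, x0 * x1 + 3 * x2]


# p + 1 = 3 affinely independent points in R^3, so p = 2 < n = 3.
points = [[Fraction(v) for v in y] for y in ([0, 0, 0], [1, 1, 0], [0, 1, 2])]
y0 = points[0]
r0 = residuals(y0)

# Rows of W are y_t - y_0 (p x n); rows of D are r(y_t) - r(y_0) (p x m).
W = [[a - b for a, b in zip(y, y0)] for y in points[1:]]
D = [[a - b for a, b in zip(residuals(y), r0)] for y in points[1:]]

# Minimum Frobenius-norm solution of J W^T = D^T: J = D^T (W W^T)^{-1} W.
J = matmul(matmul(transpose(D), inverse_2x2(matmul(W, transpose(W)))), W)

print("rank(J) =", rank(J))
```

In this instance the $4\times 3$ matrix `J` satisfies the interpolation conditions exactly and has rank $2=p$. Under the assumed construction the rank can never exceed $p$, because `J` factors through the $p\times n$ matrix `W`, so the model carries no information in directions orthogonal to the sampled displacements.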

Figures (16)

  • Figure 1: Normalized objective decrease achieved by DFO-LS, measured in both the noisy and true objective, and convergence information for test problem 'Osborne 1' with $(n,m)=(5,33)$ and unbiased multiplicative Gaussian noise of size $\sigma=10^{-2}$. The vertical black lines indicate when restarts occurred. Restart type was 'soft (moving $\mathbf{x}_k$)', and the budget was $300(n+1)$ evaluations; the (a) and (b) runs terminated early on small trust-region radius.
  • Figure 2: Normalized objective decrease achieved by DFO-LS, measured in both the noisy and true objective, and convergence information as per Figure \ref{fig_restarts_motivation}, but allowing auto-detection of restarts. The budget was $100(n+1)$ objective evaluations. The difference between the 'noisy' and 'true' objective reduction measures is discussed in Section \ref{sec_testing_methodology}.
  • Figure 3: A comparison of the results in Figure \ref{fig_basic_noise2} with additive Gaussian noise, using the data profiles $d_{\mathcal{S}}$ and $\widetilde{d}_{\mathcal{S}}$ (\ref{eq_data_profile}), and either choosing $\tau_p=10^{-5}$ for all problems, or applying the per-problem threshold (\ref{eq_tau_modification}).
  • Figure 4: Data profiles showing the impact of the reduced initialization cost of DFO-LS (using $n+1$ interpolation points) against using the full initial set, for smooth objectives. Results are an average of 10 runs in each case. The problem collection is (CR).
  • Figure 5: Comparison of different sample averaging methods for DFO-LS (using $n+1$ interpolation points). We are using noisy objective evaluations with $\sigma=10^{-2}$, high accuracy $\tau=10^{-5}$, and an average of 10 runs for each solver. The problem collection is (MW).
  • ...and 11 more figures
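
Several of the captions above refer to data profiles. As a rough sketch of how such a profile is computed, the code below uses the standard fixed-accuracy form: a problem counts as solved once $f(\mathbf{x}) \le f^* + \tau\,(f(\mathbf{x}_0) - f^*)$, and the profile at $\alpha$ is the fraction of problems solved within $\alpha(n_p+1)$ evaluations. The budget normalization by $n_p+1$ is assumed to match the paper's convention. The per-problem threshold $\tau_p$ (eq_tau_modification) is not reproduced in this summary, so it is not implemented here; the function names and the two test problems are made up for illustration.

```python
import math


def evals_to_solve(history, f0, f_star, tau):
    """Number of evaluations until f <= f_star + tau * (f0 - f_star), else inf."""
    target = f_star + tau * (f0 - f_star)
    for count, value in enumerate(history, start=1):
        if value <= target:
            return count
    return math.inf


def data_profile(problems, tau, alpha):
    """Fraction of problems solved within alpha * (n + 1) evaluations."""
    solved = 0
    for prob in problems:
        needed = evals_to_solve(prob["history"], prob["f0"], prob["f_star"], tau)
        if needed <= alpha * (prob["n"] + 1):
            solved += 1
    return solved / len(problems)


# Hypothetical objective histories for one solver on two problems.
problems = [
    {"n": 2, "f0": 100.0, "f_star": 0.0,
     "history": [100.0, 50.0, 9.0, 1.0, 0.1, 0.0005]},
    {"n": 1, "f0": 10.0, "f_star": 2.0,
     "history": [10.0, 6.0, 3.0, 2.5, 2.1]},
]

for tau in (1e-1, 1e-5):
    for alpha in (1, 2):
        print(f"tau={tau:g}, alpha={alpha}: {data_profile(problems, tau, alpha)}")
```

With a single fixed $\tau$, the second problem is never counted as solved at $\tau=10^{-5}$ regardless of budget, which is the kind of behaviour a per-problem accuracy level is meant to handle when noise limits the attainable decrease.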

Theorems & Definitions (11)

  • Remark 2.1
  • Lemma 2.2
  • proof
  • Definition 2.3: $\Lambda$-poised, regression sense
  • Definition A.1: Fully linear, scalar model
  • Definition A.2: Fully linear, vector model
  • Lemma A.4
  • proof
  • Theorem A.7
  • Theorem A.8
  • ...and 1 more