A general framework for floating point error analysis of simplex derivatives

Yiwen Chen, Warren Hare, Amy Wiebe

TL;DR

This work develops a general framework for floating point (FP) error analysis of simplex derivatives used in derivative-free optimization, showing that any simplex derivative of the form $\nabla_X f(\mathbb{Y})=(A^\top)^{\dagger} B f(\mathbb{Y})$ can be analyzed for FP errors. It derives non-FP gradient error bounds (e.g., $\mathcal{O}(\Delta)$ for GSG and $\mathcal{O}(\Delta^2)$ for GCSG) and couples them with FP error bounds arising from pseudo-inverse computation and function evaluations, resulting in concrete bounds and guidance on selecting the sample diameter $\Delta$. The analysis applies to the generalized simplex gradient (GSG), generalized centred simplex gradient (GCSG), and generalized adapted centred simplex gradient (GACSG), including cases with misalignment or distortion between sample sets. A key contribution is the explicit derivation of minimal $\Delta$ values that balance discretization accuracy against FP error, enabling more reliable gradient approximations in practice. The framework also highlights that tighter bounds may be achievable when the underlying structure of a specific simplex derivative is exploited.
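To make the general form concrete, the following is a minimal NumPy sketch of the GSG and GCSG. It assumes the definitions commonly used in the literature, which this summary does not state: $L=[d_1\ \cdots\ d_p]$, the GSG is $(L^\top)^{\dagger}\delta_f$ with $\delta_f$ the vector of forward differences $f(y_0+d_i)-f(y_0)$, and the GCSG replaces $\delta_f$ with the centred differences $\tfrac12\,(f(y_0+d_i)-f(y_0-d_i))$. The function names `gsg` and `gcsg`, the test function, and the sample set are illustrative choices, not taken from the paper.

```python
import numpy as np

def gsg(f, y0, L):
    """Generalized simplex gradient (assumed form): (L^T)^+ @ delta_f."""
    y0 = np.asarray(y0, dtype=float)
    L = np.asarray(L, dtype=float)      # columns are the directions d_1..d_p
    f0 = f(y0)
    # Forward differences f(y0 + d_i) - f(y0), one per column of L.
    delta_f = np.array([f(y0 + d) - f0 for d in L.T])
    return np.linalg.pinv(L.T) @ delta_f

def gcsg(f, y0, L):
    """Generalized centred simplex gradient (assumed form)."""
    y0 = np.asarray(y0, dtype=float)
    L = np.asarray(L, dtype=float)
    # Centred differences (f(y0 + d_i) - f(y0 - d_i)) / 2.
    delta_c = np.array([0.5 * (f(y0 + d) - f(y0 - d)) for d in L.T])
    return np.linalg.pinv(L.T) @ delta_c

# Illustrative smooth test function with a known gradient at y0.
f = lambda x: np.exp(x[0]) + x[1] ** 2
y0 = np.array([0.0, 1.0])
grad_true = np.array([1.0, 2.0])

Delta = 1e-3                  # every direction has norm Delta
L = Delta * np.eye(2)         # coordinate directions scaled by Delta

err_gsg = np.linalg.norm(gsg(f, y0, L) - grad_true)
err_gcsg = np.linalg.norm(gcsg(f, y0, L) - grad_true)
print(f"GSG error:  {err_gsg:.3e}")
print(f"GCSG error: {err_gcsg:.3e}")
```

Under these assumptions, one reading of the general form is $A=L$ with $B$ the matrix that maps the stacked function values to the difference vector. At this moderate $\Delta$ the centred variant should show a markedly smaller error than the forward variant, in line with the $\mathcal{O}(\Delta^2)$ versus $\mathcal{O}(\Delta)$ bounds quoted above.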

Abstract

Gradient approximations are a class of numerical approximation techniques that are of central importance in numerical optimization. In derivative-free optimization, most of the gradient approximations, including the simplex gradient, centred simplex gradient, and adapted centred simplex gradient, are in the form of simplex derivatives. Owing to machine precision, the approximation accuracy of any numerical approximation technique is subject to the influence of floating point errors. In this paper, we provide a general framework for floating point error analysis of simplex derivatives. Our framework is independent of the choice of the simplex derivative as long as it satisfies a general form. We review the definition and approximation accuracy of the generalized simplex gradient and generalized centred simplex gradient. We define and analyze the accuracy of a generalized version of the adapted centred simplex gradient. As examples, we apply our framework to the generalized simplex gradient, generalized centred simplex gradient, and generalized adapted centred simplex gradient. Based on the results, we give suggestions on the minimal choice of approximate diameter of the sample set.
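The tension between discretization error and floating point error can be observed numerically. The sketch below sweeps the sample diameter $\Delta$ over several orders of magnitude and records the error of a forward-difference simplex gradient in double precision. It assumes the commonly used GSG form $(L^\top)^{\dagger}\delta_f$ with $L=\Delta I$; the helper `gsg`, the test function, and the grid of $\Delta$ values are hypothetical choices for illustration and do not reproduce the paper's bounds.

```python
import numpy as np

def gsg(f, y0, L):
    """Forward-difference simplex gradient (assumed form): (L^T)^+ @ delta_f."""
    f0 = f(y0)
    delta_f = np.array([f(y0 + d) - f0 for d in L.T])
    return np.linalg.pinv(L.T) @ delta_f

# Illustrative smooth function and its exact gradient at y0.
f = lambda x: np.exp(x[0]) + np.sin(x[1])
y0 = np.array([1.0, 0.5])
grad_true = np.array([np.exp(1.0), np.cos(0.5)])

# Sweep Delta from 1e-1 down to 1e-15 (double precision throughout).
deltas = [10.0 ** (-k) for k in range(1, 16)]
errs = []
for Delta in deltas:
    L = Delta * np.eye(2)
    errs.append(np.linalg.norm(gsg(f, y0, L) - grad_true))

best = int(np.argmin(errs))
best_delta, best_err = deltas[best], errs[best]

for Delta, err in zip(deltas, errs):
    print(f"Delta = {Delta:.0e}   error = {err:.3e}")
print(f"smallest error at Delta = {best_delta:.0e}")
```

In such a sweep the error typically falls as $\Delta$ shrinks, then rises again once rounding in the function differences is amplified by the factor $1/\Delta$. This is the behaviour that motivates choosing $\Delta$ no smaller than a threshold tied to machine precision.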

Paper Structure

This paper contains 9 sections, 10 theorems, 56 equations.

Key Result

Theorem 1 (Error bound of the GSG)

Suppose that $\mathbb{Y}=\{y_0,y_0+d_1,\ldots,y_0+d_p\}\subseteq\mathbb{R}^n$. Let $\bar{\Delta}>0$. Suppose that $f\in\mathcal{C}^{1+}$ on $B_{\bar{\Delta}}(y_0)$ with constant $\nu$ and $L\in\mathbb{R}^{n\times p}$ has full rank. Suppose that $\Delta=\overline{\mathrm{diam}}(\mathbb{Y})\le\bar{\Delta}$ … where $\widehat{L}=L/\Delta$. In particular, …

Theorems & Definitions (27)

  • definition: Moore–Penrose pseudo-inverse
  • definition: Generalized Simplex Gradient, GSG
  • definition: Generalized Centred Simplex Gradient, GCSG
  • remark
  • definition: Stretching parameter
  • definition: Rotation angle and rotation matrix
  • definition: Generalized Adapted Centred Simplex Gradient, GACSG
  • theorem 1: Error bound of the GSG
  • theorem 2: Error bound of the GCSG
  • lemma
  • ...and 17 more