A general framework for floating point error analysis of simplex derivatives

Yiwen Chen; Warren Hare; Amy Wiebe

TL;DR

This work develops a general framework for floating point error analysis of simplex derivatives used in derivative-free optimization, showing that any simplex derivative fitting the form $\nabla_X f(\mathbb{Y})=(A^\top)^{\dagger} B f(\mathbb{Y})$ can be analyzed for FP errors. It derives non-FP gradient error bounds (e.g., $\mathcal{O}(\Delta)$ for GSG and $\mathcal{O}(\Delta^2)$ for GCSG) and couples them with FP error bounds arising from pseudo-inverse computation and function evaluations, resulting in concrete bounds and guidance on selecting the sample-diameter $\Delta$. The analysis applies to the generalized simplex gradient (GSG), generalized centred simplex gradient (GCSG), and generalized adapted centred simplex gradient (GACSG), including cases with misalignment or distortion between sample sets. A key contribution is the explicit derivation of minimal $\Delta$ values that balance discretization accuracy against FP error, enabling more reliable gradient approximations in practice. The framework also highlights that tighter bounds may be achievable when the underlying structure of a specific simplex derivative is exploited.
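The general form $(A^\top)^{\dagger} B f(\mathbb{Y})$ can be illustrated with a small NumPy sketch. The choices of $A$ and $B$ below are one reading of how the GSG and GCSG fit this form (with $A$ the matrix of directions $d_i$ and $B$ a differencing matrix); they are assumptions made for illustration and are not taken from the paper, and the function names are hypothetical.

```python
import numpy as np

def simplex_derivative(A, B, f_values):
    """Evaluate (A^T)^+ B f(Y) for given matrices and stacked function values."""
    return np.linalg.pinv(A.T) @ (B @ f_values)

def gsg(f, y0, L):
    """Generalized simplex gradient over Y = {y0, y0 + d_1, ..., y0 + d_p}."""
    p = L.shape[1]
    points = [y0] + [y0 + L[:, i] for i in range(p)]
    f_values = np.array([f(y) for y in points])
    # Assumed B: maps f(Y) to the forward differences f(y0 + d_i) - f(y0).
    B = np.hstack([-np.ones((p, 1)), np.eye(p)])
    return simplex_derivative(L, B, f_values)

def gcsg(f, y0, L):
    """Generalized centred simplex gradient over the points y0 + d_i and y0 - d_i."""
    p = L.shape[1]
    points = [y0 + L[:, i] for i in range(p)] + [y0 - L[:, i] for i in range(p)]
    f_values = np.array([f(y) for y in points])
    # Assumed B: maps f(Y) to the centred differences (f(y0 + d_i) - f(y0 - d_i)) / 2.
    B = 0.5 * np.hstack([np.eye(p), -np.eye(p)])
    return simplex_derivative(L, B, f_values)

# Illustrative data: n = 3, p = 4 directions, L has full row rank.
c = np.array([2.0, -1.0, 0.5])
y0 = np.array([0.1, 0.2, 0.3])
L = 1e-2 * np.array([[1.0, 0.0, 0.0, 1.0],
                     [0.0, 1.0, 0.0, 1.0],
                     [0.0, 0.0, 1.0, 1.0]])

linear = lambda x: c @ x + 3.0
Q = np.diag([1.0, 2.0, 3.0])
quadratic = lambda x: 0.5 * x @ Q @ x + c @ x

g_linear = gsg(linear, y0, L)           # forward differences are exact for linear f
g_quadratic = gcsg(quadratic, y0, L)    # centred differences are exact for quadratic f
print(g_linear, g_quadratic)
```

With a full-row-rank `L`, the forward differences of a linear function and the centred differences of a quadratic carry no truncation error, so both calls should recover the true gradient up to rounding. That makes them a convenient sanity check before studying FP effects.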

Abstract

Gradient approximations are a class of numerical approximation techniques that are of central importance in numerical optimization. In derivative-free optimization, most of the gradient approximations, including the simplex gradient, centred simplex gradient, and adapted centred simplex gradient, are in the form of simplex derivatives. Owing to machine precision, the approximation accuracy of any numerical approximation technique is subject to the influence of floating point errors. In this paper, we provide a general framework for floating point error analysis of simplex derivatives. Our framework is independent of the choice of the simplex derivative as long as it satisfies a general form. We review the definition and approximation accuracy of the generalized simplex gradient and generalized centred simplex gradient. We define and analyze the accuracy of a generalized version of the adapted centred simplex gradient. As examples, we apply our framework to the generalized simplex gradient, generalized centred simplex gradient, and generalized adapted centred simplex gradient. Based on the results, we give suggestions on the minimal choice of approximate diameter of the sample set.
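The tension between truncation error and floating point error can be seen in a small experiment. The sketch below is an illustration only: the test function, the base point, and the use of coordinate directions scaled by `delta` are all assumptions, and the sweep does not reproduce the paper's bounds or its recommended minimal diameter.

```python
import numpy as np

def gsg_coordinate(f, y0, delta):
    """Simplex gradient with directions delta * e_i (an illustrative sample set)."""
    n = y0.size
    L = delta * np.eye(n)
    diffs = np.array([f(y0 + L[:, i]) - f(y0) for i in range(n)])
    return np.linalg.pinv(L.T) @ diffs

def f(x):
    # Hypothetical smooth test function with a known gradient.
    return np.exp(x[0]) + np.sin(x[1])

def grad_f(x):
    return np.array([np.exp(x[0]), np.cos(x[1])])

y0 = np.array([1.0, 0.5])
errors = {}
for delta in [1e-1, 1e-4, 1e-8, 1e-12, 1e-15]:
    errors[delta] = np.linalg.norm(gsg_coordinate(f, y0, delta) - grad_f(y0))
    print(f"delta = {delta:.0e}   error = {errors[delta]:.3e}")
```

In double precision the error first shrinks as `delta` decreases and then grows again once rounding in the function differences dominates. This is why a lower limit on the sample diameter is of practical interest.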

Paper Structure (9 sections, 10 theorems, 56 equations)

Introduction
Notation
Simplex derivatives
Error bounds without floating point errors
Floating point errors
Floating point error in matrix inversion
Floating point error in function evaluations
Floating point errors in matrix inversion and function evaluations
Examples

Key Result

theorem 1

Suppose that $\mathbb{Y}=\{y_0,y_0+d_1,\ldots,y_0+d_p\}\subseteq\mathbb{R}^n$. Let $\bar{\Delta}>0$. Suppose that $f\in\mathcal{C}^{1+}$ on $B_{\bar{\Delta}}(y_0)$ with constant $\nu$ and $L\in\mathbb{R}^{n\times p}$ has full rank. Suppose that $\Delta=\overline{\mathrm{diam}}(\mathbb{Y})\le\bar{\Delta}$. [...] where $\widehat{L}=L\slash\Delta$. In particular, [...]
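The $\mathcal{O}(\Delta)$ behaviour of the GSG error can be checked numerically, as a hedged sketch. The test function, base point, and direction matrix `U` are hypothetical, and the approximate diameter is assumed here to be the largest direction norm $\max_i \|d_i\|$. The code does not evaluate the constant in the theorem, only the rate.

```python
import numpy as np

def gsg(f, y0, L):
    """Generalized simplex gradient: (L^T)^+ applied to forward differences."""
    diffs = np.array([f(y0 + d) - f(y0) for d in L.T])  # rows of L.T are the d_i
    return np.linalg.pinv(L.T) @ diffs

def f(x):
    # Hypothetical smooth test function.
    return np.exp(x[0]) + np.sin(x[1]) + x[0] * x[1]

def grad_f(x):
    return np.array([np.exp(x[0]) + x[1], np.cos(x[1]) + x[0]])

y0 = np.array([0.3, -0.2])

# Three directions in R^2 (full row rank), scaled so the longest has norm 1.
U = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
U = U / np.linalg.norm(U, axis=0).max()

def gsg_error(delta):
    """Gradient error when the sample set has (assumed) approximate diameter delta."""
    L = delta * U
    return np.linalg.norm(gsg(f, y0, L) - grad_f(y0))

err_coarse = gsg_error(1e-2)
err_fine = gsg_error(1e-3)
ratio = err_coarse / err_fine
print(f"error(1e-2) = {err_coarse:.3e}, error(1e-3) = {err_fine:.3e}, ratio = {ratio:.2f}")
```

Shrinking `delta` by a factor of ten should reduce the error by roughly the same factor, which is consistent with a first-order bound. At these diameters rounding error is negligible, so the experiment isolates the truncation term.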

Theorems & Definitions (27)

definition: Moore–Penrose pseudo-inverse
definition: Generalized Simplex Gradient, GSG
definition: Generalized Centred Simplex Gradient, GCSG
remark
definition: Stretching parameter
definition: Rotation angle and rotation matrix
definition: Generalized Adapted Centred Simplex Gradient, GACSG
theorem 1: Error bound of the GSG
theorem 2: Error bound of the GCSG
lemma
...and 17 more
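Since every simplex derivative in the list above is built on the Moore–Penrose pseudo-inverse, a quick numerical check of its four defining (Penrose) conditions may be a useful reference. This sketch uses NumPy's `np.linalg.pinv` on a random matrix chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))   # arbitrary illustrative matrix
A_pinv = np.linalg.pinv(A)        # shape (3, 5)

# The four Penrose conditions that characterize the pseudo-inverse.
checks = {
    "A A+ A = A": np.allclose(A @ A_pinv @ A, A),
    "A+ A A+ = A+": np.allclose(A_pinv @ A @ A_pinv, A_pinv),
    "(A A+)^T = A A+": np.allclose((A @ A_pinv).T, A @ A_pinv),
    "(A+ A)^T = A+ A": np.allclose((A_pinv @ A).T, A_pinv @ A),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")

# Transposition commutes with the pseudo-inverse, so (A^T)^+ equals (A^+)^T.
transpose_commutes = np.allclose(np.linalg.pinv(A.T), A_pinv.T)
```

The last identity means that the factor $(A^\top)^{\dagger}$ in the general form can be computed from either `A` or its transpose.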
