A general framework for floating point error analysis of simplex derivatives
Yiwen Chen, Warren Hare, Amy Wiebe
TL;DR
This work develops a general framework for floating point error analysis of simplex derivatives used in derivative-free optimization, showing that any simplex derivative fitting the form $\nabla_X f(\mathbb{Y})=(A^\top)^{\dagger} B f(\mathbb{Y})$ can be analyzed for FP errors. It derives non-FP gradient error bounds (e.g., $\mathcal{O}(\Delta)$ for GSG and $\mathcal{O}(\Delta^2)$ for GCSG) and couples them with FP error bounds arising from pseudo-inverse computation and function evaluations, resulting in concrete bounds and guidance on selecting the sample-diameter $\Delta$. The analysis applies to the generalized simplex gradient (GSG), generalized centred simplex gradient (GCSG), and generalized adapted centred simplex gradient (GACSG), including cases with misalignment or distortion between sample sets. A key contribution is the explicit derivation of minimal $\Delta$ values that balance discretization accuracy against FP error, enabling more reliable gradient approximations in practice. The framework also highlights that tighter bounds may be achievable when the underlying structure of a specific simplex derivative is exploited.
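To make the general form concrete, the following is a minimal NumPy sketch of the GSG and GCSG, assuming the commonly used definitions in which the columns of a matrix $S$ are the sample directions from a base point $x^0$. Under that reading, $A = S$ and $B f(\mathbb{Y})$ is the vector of (forward or centred) function-value differences. The function names, the test function, and the sample set are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def gsg(f, x0, S):
    """Sketch of a generalized simplex gradient: (S^T)^dagger @ delta.

    S has shape (n, m); each column is one direction s^i.
    delta_i = f(x0 + s^i) - f(x0)  (forward differences).
    """
    delta = np.array([f(x0 + s) - f(x0) for s in S.T])
    return np.linalg.pinv(S.T) @ delta

def gcsg(f, x0, S):
    """Sketch of a generalized centred simplex gradient.

    delta_i = (f(x0 + s^i) - f(x0 - s^i)) / 2  (centred differences).
    """
    delta = 0.5 * np.array([f(x0 + s) - f(x0 - s) for s in S.T])
    return np.linalg.pinv(S.T) @ delta

# Hypothetical quadratic test function with a known gradient.
f = lambda x: x[0] ** 2 + 3.0 * x[1] + x[0] * x[1]
x0 = np.array([1.0, 2.0])
g_true = np.array([2.0 * x0[0] + x0[1], 3.0 + x0[0]])

# Overdetermined sample set: three directions in R^2, scaled by Delta.
Delta = 1e-3
S = Delta * np.array([[1.0, 0.0, 1.0],
                      [0.0, 1.0, 1.0]])

err_gsg = np.linalg.norm(gsg(f, x0, S) - g_true)
err_gcsg = np.linalg.norm(gcsg(f, x0, S) - g_true)
print(f"GSG error:  {err_gsg:.3e}")
print(f"GCSG error: {err_gcsg:.3e}")
```

For this quadratic, the centred differences cancel the second-order term, so the GCSG error should be dominated by rounding, while the GSG error scales with $\Delta$.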
Abstract
Gradient approximations are a class of numerical approximation techniques that are of central importance in numerical optimization. In derivative-free optimization, most of the gradient approximations, including the simplex gradient, centred simplex gradient, and adapted centred simplex gradient, are in the form of simplex derivatives. Owing to machine precision, the approximation accuracy of any numerical approximation technique is subject to the influence of floating point errors. In this paper, we provide a general framework for floating point error analysis of simplex derivatives. Our framework is independent of the choice of the simplex derivative as long as it satisfies a general form. We review the definition and approximation accuracy of the generalized simplex gradient and generalized centred simplex gradient. We define and analyze the accuracy of a generalized version of the adapted centred simplex gradient. As examples, we apply our framework to the generalized simplex gradient, generalized centred simplex gradient, and generalized adapted centred simplex gradient. Based on the results, we give suggestions on the minimal choice of approximate diameter of the sample set.
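The trade-off behind the choice of sample diameter can be observed numerically. The sketch below sweeps $\Delta$ for a forward-difference simplex gradient in double precision. The test function, base point, and grid of $\Delta$ values are assumptions chosen for illustration, and the sketch does not reproduce the paper's bounds. Truncation error shrinks with $\Delta$, while rounding error in the function differences is amplified by the pseudo-inverse, whose norm grows like $1/\Delta$.

```python
import numpy as np

def gsg(f, x0, S):
    """Forward-difference simplex gradient sketch: (S^T)^dagger @ delta."""
    delta = np.array([f(x0 + s) - f(x0) for s in S.T])
    return np.linalg.pinv(S.T) @ delta

# Hypothetical smooth test function and its exact gradient.
f = lambda x: np.exp(x[0]) + np.sin(x[1])
grad = lambda x: np.array([np.exp(x[0]), np.cos(x[1])])

x0 = np.array([1.0, 0.5])
g_true = grad(x0)

# Sweep the sample diameter; directions are the scaled coordinate vectors.
deltas = [1e-1, 1e-4, 1e-8, 1e-12, 1e-15]
errors = {}
for d in deltas:
    S = d * np.eye(2)
    errors[d] = np.linalg.norm(gsg(f, x0, S) - g_true)
    print(f"Delta = {d:.0e}   error = {errors[d]:.3e}")
```

In runs of this kind the error typically decreases as $\Delta$ shrinks, reaches a minimum at an intermediate value, and then grows again once rounding dominates, which is why a lower limit on $\Delta$ is useful in practice.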
