Using generalized simplex methods to approximate derivatives

Gabriel Jarry-Bolduc, Chayne Planiden

TL;DR

The paper addresses derivative-free estimation of Hessian information by introducing the generalized simplex Hessian (GSH) and the generalized centered simplex Hessian (GCSH), which compute Hessian entries using only function evaluations. It establishes rigorous error bounds, proving that GSH yields order-1 accuracy for the targeted Hessian entries and GCSH yields order-2 accuracy, with a projection operator $\operatorname{P}_{S,T_{1:m}}$ governing the approximation quality. The authors provide practical guidance for selecting direction matrices $S$ and $T_{1:m}$ to approximate diagonal, off-diagonal, and row/column blocks, and they extend the framework to Hessian-vector products and higher-order derivatives via the Tressian $\nabla^3 f$ and a recursive formula for order-$P$ derivatives. The work includes explicit function-evaluation counts, discusses relationships to existing diagonal/centered approaches, and offers an implementable MATLAB-oriented approach for accurate, scalable curvature information in derivative-free optimization.

Abstract

This paper presents two methods for approximating a proper subset of the entries of a Hessian using only function evaluations. These approximations are obtained using the techniques called generalized simplex Hessian and generalized centered simplex Hessian. We show how to choose the matrices of directions involved in the computation of these two techniques depending on the entries of the Hessian of interest. We discuss the number of function evaluations required in each case and develop a general formula to approximate all order-$P$ partial derivatives. Since only function evaluations are required to compute the methods discussed in this paper, they are suitable for use in derivative-free optimization methods.
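To make the order-1 versus order-2 accuracy claim concrete, here is a minimal Python sketch of a classical finite-difference analogue. It assumes the simplest possible setting, in which every direction is a coordinate direction scaled by a single step size `h`, so no pseudoinverse is needed. It is not the paper's general GSH/GCSH formula, and the names `forward_hessian`, `centered_hessian`, `_shift`, and `cubic` are hypothetical choices made for this illustration.

```python
from typing import Callable, List, Sequence, Tuple

Func = Callable[[Sequence[float]], float]


def _shift(x: Sequence[float], steps: Sequence[Tuple[int, float]]) -> List[float]:
    """Return a copy of x with each (index, step) pair added in turn."""
    y = list(x)
    for i, s in steps:
        y[i] += s
    return y


def forward_hessian(f: Func, x: Sequence[float], h: float) -> List[List[float]]:
    """Forward-difference Hessian estimate using only function values.

    Assumption: all directions are h times a coordinate vector.
    The error is O(h) for a sufficiently smooth f.
    This sketch uses 1 + n + n(n+1)/2 evaluations of f.
    """
    n = len(x)
    f0 = f(x)
    f1 = [f(_shift(x, [(i, h)])) for i in range(n)]
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            fij = f(_shift(x, [(i, h), (j, h)]))  # i == j gives x + 2h e_i
            H[i][j] = H[j][i] = (fij - f1[i] - f1[j] + f0) / (h * h)
    return H


def centered_hessian(f: Func, x: Sequence[float], h: float) -> List[List[float]]:
    """Average of the forward estimates built with +h and with -h.

    The odd-order error terms cancel, leaving an O(h^2) error.
    """
    plus = forward_hessian(f, x, h)
    minus = forward_hessian(f, x, -h)
    n = len(x)
    return [[0.5 * (plus[i][j] + minus[i][j]) for j in range(n)] for i in range(n)]


def cubic(v: Sequence[float]) -> float:
    """Test function with exact Hessian [[6x + 2y, 2x], [2x, 6y]]."""
    return v[0] ** 3 + v[0] ** 2 * v[1] + v[1] ** 3


if __name__ == "__main__":
    point = [1.0, 2.0]
    print("forward :", forward_hessian(cubic, point, 1e-2))
    print("centered:", centered_hessian(cubic, point, 1e-2))
```

For the cubic test function the fourth derivatives vanish, so the centered estimate should agree with the exact Hessian up to rounding, while the forward estimate carries an error proportional to `h`. Averaging the `+h` and `-h` estimates mirrors the idea of centering described in the abstract, at the cost of roughly twice as many function evaluations in this sketch.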

Paper Structure

This paper contains 4 sections, 4 theorems, 26 equations.

Key Result

theorem 1

[hare2023hessianpublished] Let $f:\operatorname{dom} f \subseteq \mathbb{R}^n\to\mathbb{R}$ be $\mathcal{C}^{3}$ on $B_n(x^0;\overline{\Delta})$ where $x^0 \in \operatorname{dom} f$ is the point of interest and $\overline{\Delta}>0$. Denote by $L_{\nabla^2 f}\geq 0$ the Lipschitz constant of $\nabla^2 f$ ...

Theorems & Definitions (14)

  • definition: Moore-Penrose pseudoinverse
  • definition: Generalized simplex gradient
  • definition: Generalized simplex Hessian
  • definition: Generalized centered simplex Hessian
  • definition
  • theorem 1: Error bounds for the GSH
  • theorem 2: Error bounds for the GCSH
  • definition
  • definition: Centered simplex Hessian diagonal
  • definition: Partial diagonal matrix
  • ...and 4 more