A Taxonomy of Numerical Differentiation Methods

Pavel Komarov, Floris van Breugel, J. Nathan Kutz

TL;DR

The paper presents a taxonomy of numerical differentiation methods organized around five problem regimes, including analytic/static relationships, noiseless simulations, noisy data with prior structure, and noisy data without priors, with irregular sampling also covered. It combines theory, practical recommendations, and a unifying performance framework, indicating when automatic differentiation (AutoDiff), spectral methods, finite element (FE), finite difference (FD), and Kalman-based approaches are most effective. A key contribution is the detailed comparative guidance together with the PyNumDiff toolkit, which lets practitioners implement and tune methods within the described framework. The work emphasizes that constraining assumptions (periodicity, known dynamics, or smoothness) yield substantial gains, while settings with unknown structure and noisy data benefit from robust, flexible strategies such as Rauch-Tung-Striebel (RTS) smoothing and Pareto-guided hyperparameter tuning. Overall, it provides actionable guidance for selecting and configuring derivative estimators in diverse scientific and engineering contexts, supported by a public software ecosystem.

Abstract

Differentiation is a cornerstone of computing and data analysis in every discipline of science and engineering. Indeed, most fundamental physics laws are expressed as relationships between derivatives in space and time. However, derivatives are rarely directly measurable and must instead be computed, often from noisy, potentially corrupt data streams. There is a rich and broad literature of computational differentiation algorithms, but many impose extra constraints to work correctly, e.g. periodic boundary conditions, or are compromised in the presence of noise and corruption. It can therefore be challenging to select the method best-suited to any particular problem. Here, we review a broad range of numerical methods for calculating derivatives, present important contextual considerations and choice points, compare relative advantages, and provide basic theory for each algorithm in order to assist users with the mathematical underpinnings. This serves as a practical guide to help scientists and engineers match methods to application domains. We also provide an open-source Python package, PyNumDiff, which contains a broad suite of methods for differentiating noisy data.
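To make the noise sensitivity mentioned in the abstract concrete, the following is a minimal sketch, not taken from the paper or from PyNumDiff. It applies NumPy's finite difference routine `np.gradient` to a clean and a noisy sine wave and compares the root-mean-square derivative error. The test signal, noise level, sample count, and the helper name `finite_difference_errors` are all assumptions chosen for illustration.

```python
import numpy as np


def finite_difference_errors(n=1000, noise_std=0.01, seed=0):
    """Return RMS derivative errors of np.gradient on clean and noisy sin(t).

    Hypothetical helper for illustration; the signal and noise level are assumed.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 2.0 * np.pi, n)
    dt = t[1] - t[0]

    x_clean = np.sin(t)
    x_noisy = x_clean + noise_std * rng.standard_normal(n)
    dx_true = np.cos(t)  # analytic derivative used as ground truth

    # np.gradient uses second-order central differences in the interior
    dx_clean = np.gradient(x_clean, dt)
    dx_noisy = np.gradient(x_noisy, dt)

    def rms(err):
        return float(np.sqrt(np.mean(err ** 2)))

    return rms(dx_clean - dx_true), rms(dx_noisy - dx_true)


err_clean, err_noisy = finite_difference_errors()
print(f"RMS error, clean data: {err_clean:.2e}")
print(f"RMS error, noisy data: {err_noisy:.2e}")
```

Because the difference quotient divides by the sample spacing, small measurement noise is amplified in the derivative estimate, which is why the noisy-data regimes call for smoothing or model-based methods rather than plain finite differences.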

Paper Structure

This paper contains 50 sections, 104 equations, 57 figures, 5 tables, 4 algorithms.

Figures (57)

  • Figure 1: Preferred differentiation algorithms at a glance, for the five major situations identified in this review. Names are shorthand, with details given in later sections, hyperlinked for convenience. Further details and rationale behind these distinctions and selections are mapped in \ref{fig:flowchart}. $\Delta x$ and $\Delta t$ are spacing between samples, $m$ is FD scheme order, $h$ is element side-length, and $\alpha$ is a constant, making both the FEM and FD error bounds similarly algebraic, while error of spectral methods decreases "super-algebraically" as the number of samples and basis functions increases.
  • Figure 1: Example of a simple computational graph for $\nabla_{\boldsymbol\theta} L$ with compositional $f(\mathbf{x}; \boldsymbol\theta) = f_3(f_2(f_1(\mathbf{x}; \boldsymbol\theta_1);\boldsymbol\theta_2);\boldsymbol\theta_3)$, which results in $\frac{\partial f}{\partial \boldsymbol\theta} = \frac{\partial f_3}{\partial f_2} \frac{\partial f_2}{\partial f_1} \frac{\partial f_1}{\partial \boldsymbol\theta_1} + \frac{\partial f_3}{\partial f_2} \frac{\partial f_2}{\partial \boldsymbol\theta_2} + \frac{\partial f_3}{\partial \boldsymbol\theta_3}$ by chain rule.
  • Figure 1: A vector change of coordinates, expressing vector $\vec{y}$ in terms of the spanning basis vectors $(\vec{\xi}_0, \vec{\xi}_1)$ instead of axis-aligned unit vectors $(\vec{e}_0,\vec{e}_1)$, can be accomplished by $\vec{y} = \alpha \vec{\xi}_0 + \beta \vec{\xi}_1$. If $\vec{\xi}_0$ and $\vec{\xi}_1$ are orthogonal, then $\alpha = \frac{\langle \vec{\xi}_0,\vec{y}\rangle}{\|\vec{\xi}_0\|_2^2},\ \beta = \frac{\langle \vec{\xi}_1,\vec{y}\rangle}{\|\vec{\xi}_1\|_2^2}$.
  • Figure 1: We can stack red, blue, and green channels of an image, perform SVD on the 2D matrix, and reshape $U[:,:r] \cdot \Sigma[:r] \cdot V^T[:r]$ to produce reconstructions with $r$ modes. In this case, the first 10 modes alone contain more than half the signal energy. Source image from Smithsonian.
  • Figure 1: Visualization of one standard deviation of a 2D Gaussian distribution with mean $\boldsymbol{\mu} = 0$ and covariance $\boldsymbol{\Sigma} = \mathbf{V}\boldsymbol{\Lambda} \mathbf{V}^T$.
  • ...and 52 more figures
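
The rank-$r$ reconstruction $U[:,:r] \cdot \Sigma[:r] \cdot V^T[:r]$ described in the SVD figure caption above can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' code: a random array stands in for the image, the three channels are assumed to be stacked side by side, "signal energy" is assumed to mean the sum of squared singular values, and the function name `rank_r_reconstruction` is hypothetical.

```python
import numpy as np


def rank_r_reconstruction(A, r):
    """Return the rank-r SVD reconstruction of A and its energy fraction.

    Energy is assumed here to be the sum of squared singular values.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Scale the first r columns of U by the singular values, then project back
    A_r = (U[:, :r] * s[:r]) @ Vt[:r]
    energy = float(np.sum(s[:r] ** 2) / np.sum(s ** 2))
    return A_r, energy


# Random stand-in for an RGB image with shape (channels, height, width)
rng = np.random.default_rng(0)
channels = rng.random((3, 32, 48))

# Assumed stacking: place the three channels side by side in one 2D matrix
A = np.hstack(list(channels))  # shape (32, 144)

A_10, energy_10 = rank_r_reconstruction(A, 10)

# Reshape the reconstruction back to (channels, height, width)
image_10 = np.stack(np.hsplit(A_10, 3))
print(image_10.shape, f"energy captured by 10 modes: {energy_10:.3f}")
```

With this energy definition, the relative squared Frobenius error of the reconstruction equals one minus the captured energy fraction, which gives a quick check on how many modes are worth keeping.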