Coherence-based Approximate Derivatives via Web of Affine Spaces Optimization
Daniel Rakita, Chen Liang, Qian Wang
TL;DR
This work reduces the cost of computing sequences of derivatives of differentiable functions by exploiting coherence across consecutive inputs. It introduces Web of Affine Spaces (WASP) Optimization, which reframes derivative computation as a constrained least-squares problem: a web of affine spaces anchored by past Jacobian-vector product (JVP) observations is used to locate an approximate derivative $\mathbf{D}^{*}$. The solution is obtained in closed form via a KKT system and can be updated with a minimal number of forward passes. An error-detection and correction mechanism maintains accuracy over long sequences by substituting ground-truth JVPs when needed and updating the web of affine spaces to re-intersect at the new estimate. Empirical results show that WASP computes derivatives faster than finite differencing and certain automatic-differentiation baselines on small-to-medium problems and offers practical benefits in robot optimization. The authors acknowledge limitations in precision, step-size sensitivity, and scalability to large networks. The approach offers a pragmatic path to real-time derivative estimation in robotics and related domains, enabling more responsive optimization with modest computational overhead and modest code changes.
Abstract
Computing derivatives is a crucial subroutine in computer science and related fields as it provides a local characterization of a function's steepest directions of ascent or descent. In this work, we recognize that derivatives are often not computed in isolation; rather, it is quite common to compute a \textit{sequence} of derivatives, each one somewhat related to the last. Thus, we propose accelerating derivative computation by reusing information from previous, related calculations, a general strategy known as \textit{coherence}. We introduce the first instantiation of this strategy through a novel approach called the Web of Affine Spaces (WASP) Optimization. This approach provides an accurate approximation of a function's derivative object (i.e., gradient, Jacobian matrix, etc.) at the current input within a sequence. Each derivative within the sequence requires only a small number of forward passes through the function (typically two), regardless of the number of function inputs and outputs. We demonstrate the efficacy of our approach through several numerical experiments, comparing it with alternative derivative computation methods on benchmark functions. We show that our method significantly improves the performance of derivative computation on small to medium-sized functions, i.e., functions with approximately fewer than 500 combined inputs and outputs. Furthermore, we show that this method can be effectively applied in a robotics optimization context. We conclude with a discussion of the limitations and implications of our work. Open-source code, visual explanations, and videos are located at the paper website: \href{https://apollo-lab-yale.github.io/25-RSS-WASP-website/}{https://apollo-lab-yale.github.io/25-RSS-WASP-website/}.
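To make the idea of reusing past JVP information concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes one plausible form of the constrained least-squares problem: find the matrix $\mathbf{D}$ that best agrees with stored JVP estimates while exactly matching one freshly measured JVP, solved in closed form through a KKT system. The function names (`jvp_forward_difference`, `solve_constrained_ls`), the forward-difference JVP, the cyclic choice of direction, and the linear toy function are all assumptions made for illustration. Each step uses two forward passes, `f(x)` and `f(x + eps * v)`.

```python
import numpy as np


def jvp_forward_difference(f, x, fx, v, eps=1e-6):
    """Approximate J(x) @ v with one extra forward pass; fx = f(x) is reused."""
    return (f(x + eps * v) - fx) / eps


def solve_constrained_ls(dX, dF, v, w):
    """Minimize ||D @ dX - dF||_F^2 subject to D @ v = w via a KKT system.

    dX: (n, r) stored tangent directions, dF: (m, r) stored JVP estimates,
    v: (n,) direction with a fresh JVP w: (m,).
    Assumes dX @ dX.T is invertible (hypothetical setup, r >= n).
    """
    n = dX.shape[0]
    K = np.zeros((n + 1, n + 1))
    K[:n, :n] = 2.0 * dX @ dX.T        # Hessian block of the objective
    K[:n, n] = v                       # constraint column
    K[n, :n] = v                       # constraint row
    rhs = np.vstack([2.0 * dX @ dF.T, w[None, :]])
    sol = np.linalg.solve(K, rhs)      # first n rows: D^T, last row: multipliers
    return sol[:n].T


# Toy demo: a linear function whose true Jacobian is M everywhere.
rng = np.random.default_rng(0)
n, m = 4, 3
M = rng.standard_normal((m, n))
f = lambda x: M @ x

dX = np.eye(n) + 0.1 * rng.standard_normal((n, n))  # stored directions
dF = np.zeros((m, n))                               # deliberately wrong estimates
x = rng.standard_normal(n)

for step in range(n):
    i = step % n                                    # cycle through directions
    v = dX[:, i]
    fx = f(x)                                       # forward pass 1
    w = jvp_forward_difference(f, x, fx, v)         # forward pass 2
    D = solve_constrained_ls(dX, dF, v, w)
    dF = D @ dX                                     # re-anchor stored JVPs at D
    x = x + 0.01 * rng.standard_normal(n)           # move to the next input

print("max abs error vs. true Jacobian:", np.max(np.abs(D - M)))
```

Because the toy function is linear, the estimate recovers the true Jacobian once every stored direction has been refreshed. For a nonlinear function the stored estimates go stale as the input moves, which is the situation an error-detection and correction step is meant to handle.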
