Higher Order Automatic Differentiation of Higher Order Functions
Mathieu Huot, Sam Staton, Matthijs Vákár
TL;DR
This work develops a rigorous semantic foundation for forward-mode automatic differentiation on higher-order languages by using diffeological spaces to model smooth programs and a new triplet semantic structure for higher types. It introduces a canonical forward-AD macro $\overrightarrow{\mathcal{D}}_{(k,R)}$ that computes the $(k,R)$-Taylor representation, and proves its correctness through a categorical gluing/logical-relations argument, with a full result extended to all first-order types via jet bundles on manifolds. The paper extends the language with variants and inductive types and demonstrates how AD semantics and correctness carry over, while discussing non-uniqueness of derivatives for higher-order functions and potential canonical alternatives. Finally, it connects the theoretical framework to practical AD implementations, addressing vector representations, efficiency, and the prospects for reverse-mode and mixed-mode AD within this semantic foundation.
Abstract
We present semantic correctness proofs of automatic differentiation (AD). We consider a forward-mode AD method on a higher order language with algebraic data types, and we characterise it as the unique structure preserving macro given a choice of derivatives for basic operations. We describe a rich semantics for differentiable programming, based on diffeological spaces. We show that it interprets our language, and we phrase what it means for the AD method to be correct with respect to this semantics. We show that our characterisation of AD gives rise to an elegant semantic proof of its correctness based on a gluing construction on diffeological spaces. We explain how this is, in essence, a logical relations argument. Throughout, we show how the analysis extends to AD methods for computing higher order derivatives using a Taylor approximation.
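As a rough illustration of the idea that forward-mode AD is fixed once derivatives are chosen for the basic operations, the following Python sketch uses dual numbers, i.e. pairs of a value and a tangent, which correspond only to the simplest first-order case ($k = 1$, $R = 1$). It is not the source-code macro studied in the paper: it overloads operators at run time, and all names (`Dual`, `lift`, `derivative`, `twice`, `square`) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Dual:
    """A pair (primal, tangent): a value together with its first derivative."""
    primal: float
    tangent: float = 0.0

    def __add__(self, other):
        other = lift(other)
        # Sum rule: tangents add.
        return Dual(self.primal + other.primal, self.tangent + other.tangent)

    __radd__ = __add__

    def __mul__(self, other):
        other = lift(other)
        # Product rule for the tangent component.
        return Dual(self.primal * other.primal,
                    self.primal * other.tangent + self.tangent * other.primal)

    __rmul__ = __mul__


def lift(x):
    """Treat a plain number as a constant (zero tangent)."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def sin(x):
    """A basic operation together with its chosen derivative (chain rule)."""
    x = lift(x)
    return Dual(math.sin(x.primal), math.cos(x.primal) * x.tangent)


def derivative(f):
    """Differentiate f : R -> R by seeding the tangent with 1."""
    return lambda x: f(Dual(x, 1.0)).tangent


def twice(g):
    """A higher-order function: differentiation passes through it unchanged."""
    return lambda x: g(g(x))


def square(x):
    return x * x


if __name__ == "__main__":
    print(derivative(twice(square))(3.0))  # d/dx x^4 at x = 3
    print(derivative(sin)(0.0))            # cos(0)
```

Only the primitives (`+`, `*`, `sin`) carry derivative information here; `twice` is differentiated without any special treatment, which loosely mirrors the structure-preserving character of the macro described in the abstract.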
