The Elements of Differentiable Programming
Mathieu Blondel, Vincent Roulet
TL;DR
The Elements of Differentiable Programming presents a rigorous, math-first account of differentiable programming, unifying automatic differentiation, optimization, and probabilistic learning. It builds from foundational calculus and differential geometry to practical representations of parameterized programs as computation graphs, that is, directed acyclic graphs (DAGs), and details how to differentiate through complex constructs such as control flow and data structures. A core thread is the framework of Jacobian-vector products (JVPs) and vector-Jacobian products (VJPs), together with the role of the exponential family in probabilistic learning, which enables end-to-end differentiable models with uncertainty quantification. The book also clarifies how to design differentiable operations, including through smoothing and optimization techniques, extending differentiable programming beyond deep learning to reinforcement learning and scientific computing.
Abstract
Artificial intelligence has recently experienced remarkable advances, fueled by large models, vast datasets, accelerated hardware, and, last but not least, the transformative power of differentiable programming. This new programming paradigm enables end-to-end differentiation of complex computer programs (including those with control flows and data structures), making gradient-based optimization of program parameters possible. As an emerging paradigm, differentiable programming builds upon several areas of computer science and applied mathematics, including automatic differentiation, graphical models, optimization and statistics. This book presents a comprehensive review of the fundamental concepts useful for differentiable programming. We adopt two main perspectives, that of optimization and that of probability, with clear analogies between the two. Differentiable programming is not merely the differentiation of programs, but also the thoughtful design of programs intended for differentiation. By making programs differentiable, we inherently introduce probability distributions over their execution, providing a means to quantify the uncertainty associated with program outputs.
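As a rough illustration of what end-to-end differentiation of a program with control flow can look like, here is a minimal sketch in JAX. It is not taken from the book: the toy function `program`, its parameter `w`, its input `x`, and the `num_steps` argument are hypothetical names chosen for this example. The sketch computes the same derivative three ways: with `jax.grad`, as a Jacobian-vector product (forward mode), and as a vector-Jacobian product (reverse mode).

```python
import jax
import jax.numpy as jnp


def program(w, x, num_steps=3):
    """Toy parameterized program (hypothetical) with a loop and a branch."""
    h = x
    for _ in range(num_steps):  # Python loop, unrolled when JAX traces it
        h = jnp.tanh(w * h)
    # Data-dependent branch written with jnp.where so it remains traceable.
    return jnp.where(h > 0, h ** 2, -h)


w = jnp.array(0.5)  # program parameter
x = jnp.array(2.0)  # program input


def f(w_):
    """The program viewed as a function of its parameter only."""
    return program(w_, x)


# Gradient of the scalar output with respect to w.
grad_w = jax.grad(f)(w)

# Forward mode: Jacobian-vector product along the tangent direction 1.0.
out_jvp, jvp_out = jax.jvp(f, (w,), (jnp.array(1.0),))

# Reverse mode: vector-Jacobian product with the cotangent 1.0.
out_vjp, vjp_fn = jax.vjp(f, w)
(vjp_out,) = vjp_fn(jnp.ones_like(out_vjp))
```

Because this toy program maps a scalar to a scalar, its Jacobian is a single number, so the three quantities should agree up to floating-point error. For functions with many inputs and one output, the reverse-mode (VJP) route is typically the cheaper one, which is one reason it underlies gradient-based optimization of program parameters.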
