
A tutorial on automatic differentiation with complex numbers

Nicholas Krämer

TL;DR

This work addresses the challenge of performing automatic differentiation on complex-valued programs without assuming holomorphicity. It proposes a practical framework based on latent real AD and Wirtinger derivatives to obtain forward- and reverse-mode Jacobian interactions for complex inputs and outputs. By recasting complex numbers as latent real pairs and using a basis change to Wirtinger derivatives, the tutorial derives transparent JVPs and VJPs, including their software implications and gradient conventions. The approach retains consistency with real-valued differentiation when possible and reduces the computational burden of deriving gradients for nonholomorphic functions, enabling robust complex-valued gradient propagation in practical systems.

Abstract

Automatic differentiation is everywhere, but there exists only minimal documentation of how it works in complex arithmetic beyond stating "derivatives in $\mathbb{C}^d$" $\cong$ "derivatives in $\mathbb{R}^{2d}$" and, at best, shallow references to Wirtinger calculus. Unfortunately, the equivalence $\mathbb{C}^d \cong \mathbb{R}^{2d}$ becomes insufficient as soon as we need to derive custom gradient rules, e.g., to avoid differentiating "through" expensive linear algebra functions or differential equation simulators. To combat such a lack of documentation, this article surveys forward- and reverse-mode automatic differentiation with complex numbers, covering topics such as Wirtinger derivatives, a modified chain rule, and different gradient conventions while explicitly avoiding holomorphicity and the Cauchy--Riemann equations (which would be far too restrictive). To be precise, we will derive, explain, and implement a complex version of Jacobian-vector and vector-Jacobian products almost entirely with linear algebra without relying on complex analysis or differential geometry. This tutorial is a call to action, for users and developers alike, to take complex values seriously when implementing custom gradient propagation rules -- the manuscript explains how.
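The latent-real viewpoint mentioned in the abstract can be checked numerically. The following is a minimal sketch, not the tutorial's own code: the function `f`, the helper `f_latent`, and the evaluation point are illustrative assumptions. It compares three ways of computing a Jacobian-vector product for a nonholomorphic function: JAX's complex JVP, the JVP of the equivalent map on $\mathbb{R}^2$, and the hand-derived Wirtinger form $\frac{\partial f}{\partial z} v + \frac{\partial f}{\partial \bar z} \bar v$.

```python
import jax
import jax.numpy as jnp


def f(z):
    # Nonholomorphic example (an assumption for illustration):
    # f depends on conj(z), so the Cauchy-Riemann equations fail.
    return z * jnp.conj(z) + 2.0 * z


def f_latent(xy):
    # The same function, viewed as a map from R^2 to R^2.
    z = xy[0] + 1j * xy[1]
    w = f(z)
    return jnp.stack([jnp.real(w), jnp.imag(w)])


z = jnp.asarray(1.0 + 2.0j)   # primal input
v = jnp.asarray(0.5 - 1.0j)   # tangent

# (1) Complex JVP, computed directly by JAX.
_, t_complex = jax.jvp(f, (z,), (v,))

# (2) JVP of the latent real function, reassembled into a complex number.
xy = jnp.stack([jnp.real(z), jnp.imag(z)])
dxy = jnp.stack([jnp.real(v), jnp.imag(v)])
_, t_latent = jax.jvp(f_latent, (xy,), (dxy,))
t_latent = t_latent[0] + 1j * t_latent[1]

# (3) Wirtinger form, with both derivatives worked out by hand:
#     df/dz = conj(z) + 2 and df/dconj(z) = z.
t_wirtinger = (jnp.conj(z) + 2.0) * v + z * jnp.conj(v)

print(t_complex, t_latent, t_wirtinger)
```

All three values should agree. The term multiplying $\bar v$ is what the holomorphic setting would miss: it vanishes only when $\partial f / \partial \bar z = 0$.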

Paper Structure (34 sections, 37 equations, 5 figures, 1 table)

Figures (5)

  • Figure 1: Forward-mode differentiation via Jacobian-vector products in JAX. The inputs to the JVP are called "tangents" because, technically, $v'$ is in the tangent space of $V$, not in $V$ itself (Griewank and Walther, 2008).
  • Figure 2: Reverse-mode differentiation via vector-Jacobian products in JAX.
  • Figure 3: Complex forward-mode differentiation in JAX. Unlike the script in Figure 1, this script works with complex numbers.
  • Figure 4: Evaluate the gradient of $f(z) = \frac{1}{2} z^2$ to reveal the gradient convention.
  • Figure 5: Complex reverse-mode differentiation in JAX. Note the gradient convention switch.
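
The figure scripts themselves are not reproduced here. As a rough stand-in for the idea behind Figure 4, the sketch below evaluates the gradient of $f(z) = \frac{1}{2} z^2$ in JAX and, for contrast, the gradient of a real-valued loss. It is an illustrative assumption rather than the paper's code: the name `loss` and the evaluation point are made up for this example.

```python
import jax
import jax.numpy as jnp


def f(z):
    # Holomorphic test function; its complex derivative is f'(z) = z.
    return 0.5 * z**2


def loss(z):
    # Real-valued, nonholomorphic function: |z|^2.
    return jnp.real(z * jnp.conj(z))


z = jnp.asarray(3.0 + 4.0j)

# For a holomorphic function, JAX needs holomorphic=True and
# returns the complex derivative f'(z).
g_holo = jax.grad(f, holomorphic=True)(z)

# For a real-valued function of a complex input, JAX returns
# df/dx - i * df/dy, i.e. the conjugate of the steepest-ascent direction.
g_real = jax.grad(loss)(z)

print(g_holo)  # equals z, not conj(z)
print(g_real)  # equals 2 * conj(z)
```

If a library returned `conj(z)` for the first call instead, it would be following the opposite convention, which is why this small test function is enough to tell the conventions apart. Under JAX's convention, a gradient-descent step on `loss` would use `jnp.conj(g_real)`.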