Robust Automatic Differentiation of Square-Root Kalman Filters via Gramian Differentials

Adrien Corenflos

Abstract

Square-root Kalman filters propagate state covariances in Cholesky-factor form for numerical stability, and are a natural target for gradient-based parameter learning in state-space models. Their core operation, triangularization of a matrix $M \in \mathbb{R}^{n \times m}$, is computed in practice via a QR decomposition, but naively differentiating through it causes two problems: the semi-orthogonal factor is non-unique when $m > n$, yielding undefined gradients; and the standard Jacobian formula involves matrix inverses, so it diverges when $M$ is rank-deficient. Both problems are resolved by the observation that all filter outputs relevant to learning depend on the input matrix only through the Gramian $MM^\top$, so the composite loss is smooth in $M$ even where the triangularization is not. We derive a closed-form chain rule directly from the differential of this Gramian identity, prove it exact for the Kalman log-marginal likelihood and filtered moments, and extend it to rank-deficient inputs via a two-component decomposition: a column-space term based on the Moore--Penrose pseudoinverse, and a null-space correction for perturbations outside the column space of $M$.
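
As a hedged illustration of what "the differential of this Gramian identity" looks like, consider only the simplest case, where $M$ has full row rank and $L = \mathcal{T}(M)$ is lower triangular with positive diagonal. The operator $\Phi$ below (keep the strictly lower triangle and halve the diagonal) is notation assumed here for the sketch, and the result is the standard Cholesky differential; the pseudoinverse and null-space terms that the abstract describes for rank-deficient $M$ are not reproduced.

$$
\begin{aligned}
LL^\top &= MM^\top, \\
\mathrm{d}L\,L^\top + L\,\mathrm{d}L^\top &= \mathrm{d}M\,M^\top + M\,\mathrm{d}M^\top =: \mathrm{d}\Sigma, \\
L^{-1}\mathrm{d}L + \left(L^{-1}\mathrm{d}L\right)^\top &= L^{-1}\,\mathrm{d}\Sigma\,L^{-\top}, \\
\mathrm{d}L &= L\,\Phi\!\left(L^{-1}\,\mathrm{d}\Sigma\,L^{-\top}\right).
\end{aligned}
$$

The last step uses the fact that $L^{-1}\mathrm{d}L$ is lower triangular, so it is determined by the lower triangle of the symmetric right-hand side. No orthogonal factor appears anywhere in the expression, which is why its non-uniqueness for $m > n$ does not matter in this case.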

Paper Structure (12 sections, 5 theorems, 15 equations, 1 figure)

Introduction
Square-Root Kalman Filtering
Differentiation via Gramian Differentials
The Smooth-Factorization Argument
Gramian Sufficiency
Deriving the Surrogate Tangent
Column-space component
Null-space component
Application to a Single Predict-Update Step
Linearity and Reverse-Mode Differentiation
Empirical Validation
Conclusion

Key Result

Lemma 1

Let $\ell: \mathbb{R}^{n \times n}_{\mathrm{sym}} \to \mathbb{R}$ be smooth and suppose $\ell$ depends on $L = \mathcal{T}(M)$ only through the Gramian $\Sigma = LL^\top$. Then $M \mapsto \ell(\mathcal{T}(M))$ factors as the composition of the polynomial map $M \mapsto MM^\top$ with a smooth function…
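
The factorization in the lemma can be checked numerically on a toy case. The sketch below is an illustration under stated assumptions, not the paper's implementation: `tria`, `gram`, `loss_via_tria`, `grad_via_gramian`, and `grad_finite_diff` are hypothetical helper names, Gram-Schmidt on the rows of $M$ stands in for a library QR routine, the loss is $\log\det(LL^\top)$, and the Gramian-based gradient is hard-coded for $n = 2$ with full row rank. It compares the gradient $2\,\Sigma^{-1}M$, obtained from $\mathrm{d}\Sigma = \mathrm{d}M\,M^\top + M\,\mathrm{d}M^\top$, against central finite differences taken through the triangularization.

```python
import math

def gram(M):
    """Gramian M M^T of a matrix stored as a list of rows."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in M] for ri in M]

def tria(M):
    """Lower-triangular L with L L^T = M M^T.

    Classical Gram-Schmidt on the rows of M (a thin QR of M^T in disguise).
    Assumes M has full row rank, so every pivot L[i][i] is positive.
    """
    n = len(M)
    Q = []                                   # orthonormal rows found so far
    L = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(M):
        v = list(row)
        for j, q in enumerate(Q):
            L[i][j] = sum(a * b for a, b in zip(row, q))
            v = [a - L[i][j] * b for a, b in zip(v, q)]
        L[i][i] = math.sqrt(sum(a * a for a in v))
        Q.append([a / L[i][i] for a in v])
    return L

def loss_via_tria(M):
    """log det(L L^T) = 2 * sum_i log L_ii, evaluated through the factor L."""
    L = tria(M)
    return 2.0 * sum(math.log(L[i][i]) for i in range(len(L)))

def grad_via_gramian(M):
    """Gradient of log det(M M^T) w.r.t. M, i.e. 2 * Sigma^{-1} M (n = 2 only)."""
    (a, b), (c, d) = gram(M)
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    m = len(M[0])
    return [[2.0 * sum(inv[i][k] * M[k][j] for k in range(2)) for j in range(m)]
            for i in range(2)]

def grad_finite_diff(M, eps=1e-6):
    """Central finite differences of the loss through the triangularization."""
    g = [[0.0] * len(M[0]) for _ in M]
    for i in range(len(M)):
        for j in range(len(M[0])):
            Mp = [list(r) for r in M]
            Mm = [list(r) for r in M]
            Mp[i][j] += eps
            Mm[i][j] -= eps
            g[i][j] = (loss_via_tria(Mp) - loss_via_tria(Mm)) / (2 * eps)
    return g

M = [[1.0, 2.0, 0.5], [0.3, -1.0, 2.0]]      # arbitrary full-row-rank 2 x 3 input
g_gram = grad_via_gramian(M)
g_fd = grad_finite_diff(M)
print(max(abs(g_gram[i][j] - g_fd[i][j]) for i in range(2) for j in range(3)))
```

The printed value is the largest entrywise gap between the two gradients; it should be at the level of finite-difference error. The Gramian-based gradient never touches the orthogonal factor, which is the point of the lemma.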

Figures (1)

Figure 1: Log-marginal likelihood for the model in Section \ref{sec:experiments} as a function of $\alpha$, with AD-computed tangents shown as arrows on the curve. The curve and tangents agree.

Theorems & Definitions (12)

Definition 1: Triangularization Operator
Lemma 1: Smooth factorization
proof
Proposition 1: Gramian sufficiency
proof
Definition 2: Surrogate JVP
Proposition 2: Verification
proof
Remark 1
Proposition 3: Linearity
...and 2 more
