Derivatives of the full QR factorisation and of the factored-form and compact WY representations

Stefanos-Aldo Papanicolopulos

TL;DR

This paper addresses a gap in differentiating the QR factorisation of tall matrices, where existing results cover only the thin case. It first derives the derivative of the compact WY representation of Q, and from it obtains the derivative of the factored-form representation of Q and of the full QR factorisation. The three algorithm-aware results are: (a) the derivative of the compact WY representation, (b) the derivative of the factored-form representation of Q, and (c) the derivative of the Q_{mp} block when Q_{mm} is expressed as a product of Householder reflections. These results can be used in automatic differentiation and in symbolic calculations for QR-based computations, with direct applications to variable projection for separable non-linear least squares problems. The paper also points out that assumptions made in earlier work (such as Ω_{pp}=0) are not appropriate for the full factorisation under Householder implementations, and it offers explicit formulas that can be integrated into optimised linear algebra libraries.

Abstract

QR factorisation plays an important role in matrix computations. Within the context of optimisation and of automatic differentiation of such computations, we need to compute the derivative of this factorisation. For tall matrices, however, existing results only cover the so-called thin case. We provide for the first time expressions for the derivative of the full QR factorisation of a tall matrix, in the usual case where the Q factor is a product of Householder reflections. These expressions are obtained based on novel results for the derivative of the compact WY representation of Q, which also yield the derivative of the factored-form representation of Q, both of which are useful on their own. These three results can be used directly in applications such as variable projection for solving separable non-linear least squares problems, and can also extend the current linear algebra capabilities of automatic differentiation frameworks.
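
To make the two representations named in the abstract concrete, the following is a minimal NumPy sketch, not the paper's derivative formulas. It computes a Householder QR factorisation of a tall matrix, keeps Q in factored form (reflector vectors and scalars), builds the compact WY form Q = I - V T V^T, and checks that the two agree. The function names, the normalisation v[0] = 1 (a LAPACK-style convention), and the random test matrix are assumptions made for illustration only.

```python
import numpy as np


def householder_qr_factored(A):
    """Householder QR of a tall m x n matrix (assumes full column rank).

    Returns the factored form of Q as (V, tau), where the k-th reflection is
    H_k = I - tau[k] * v_k v_k^T with v_k = V[:, k], together with the
    m x n upper-triangular factor R of the full factorisation.
    """
    R = np.array(A, dtype=float)
    m, n = R.shape
    V = np.zeros((m, n))
    tau = np.zeros(n)
    for k in range(n):
        x = R[k:, k]
        # Choose the sign that avoids cancellation when forming v.
        alpha = -np.copysign(np.linalg.norm(x), x[0])
        v = x.copy()
        v[0] -= alpha
        v /= v[0]                      # assumed convention: v[0] = 1
        tau[k] = 2.0 / (v @ v)
        V[k:, k] = v
        # Apply H_k to the trailing block of R.
        R[k:, k:] -= tau[k] * np.outer(v, v @ R[k:, k:])
    return V, tau, np.triu(R)


def compact_wy_T(V, tau):
    """Upper-triangular T such that H_1 H_2 ... H_n = I - V T V^T."""
    n = V.shape[1]
    T = np.zeros((n, n))
    for k in range(n):
        T[k, k] = tau[k]
        if k > 0:
            T[:k, k] = -tau[k] * T[:k, :k] @ (V[:, :k].T @ V[:, k])
    return T


def q_from_factored(V, tau):
    """Accumulate the full m x m factor Q = H_1 H_2 ... H_n explicitly."""
    m, n = V.shape
    Q = np.eye(m)
    for k in range(n):
        Q -= tau[k] * np.outer(Q @ V[:, k], V[:, k])
    return Q


# Illustrative check on a random tall matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))
V, tau, R = householder_qr_factored(A)
T = compact_wy_T(V, tau)
Q_wy = np.eye(A.shape[0]) - V @ T @ V.T      # compact WY form of the full Q
Q_ff = q_from_factored(V, tau)               # product of reflections

print(np.allclose(Q_wy, Q_ff))               # both representations agree
print(np.allclose(Q_wy @ R, A))              # full factorisation A = Q R
```

Here Q is the full m × m factor and R is m × n with a zero bottom block, which is the "full" factorisation the abstract contrasts with the thin case. Differentiating these quantities is the subject of the paper and is not attempted in this sketch.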

Paper Structure (11 sections, 42 equations)