Optimal approximation of a large matrix by a sum of projected linear mappings on prescribed subspaces

Phil Howlett; Anatoli Torokhti

TL;DR

This work addresses the problem of optimally approximating a large matrix $A$ by a sum of projected linear mappings on prescribed subspaces, written as $BXC$ with unknown kernel $X$. The authors introduce the elementary block operations scheme (EBOS), which splits the problem into two reduced systems, $YCC^{*}=AC^{*}$ and $B^{*}BX=B^{*}Y$, and then applies a sequence of block-wise transformations so that only small, decoupled Moore–Penrose inverses need to be computed. EBOS provides explicit formulas for the reduced solutions $Y_+$ and $X_+$, yielding the optimal approximation $BX_+C$ at significantly lower computational cost than direct large-scale pseudo-inverse calculations. The theoretical justification rests on a result by Baksalary and Baksalary and on the reduction to block-diagonal Gram matrices. For large-scale data processing tasks the method can achieve speedups of up to about 40%, and it is readily implementable in MATLAB with existing numerical linear-algebra tooling.
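The two reduced systems can be checked numerically on a small example. The following is a minimal sketch, not the EBOS procedure itself: it assumes that the minimum-norm solution of each system is taken via a dense pseudo-inverse of the Gram matrices $CC^{*}$ and $B^{*}B$, and it omits the block-wise transformations entirely. The function names `two_stage_kernel` and `random_complex` and the matrix sizes are illustrative assumptions.

```python
import numpy as np

def two_stage_kernel(A, B, C):
    """Solve Y C C* = A C*, then B* B X = B* Y (minimum-norm solutions).

    Illustrative only: uses dense pseudo-inverses of the Gram matrices,
    not the block-wise reduction described in the text.
    """
    Ch = C.conj().T
    Bh = B.conj().T
    # Stage 1: Y = A C* (C C*)^+ solves Y C C* = A C*.
    Y = A @ Ch @ np.linalg.pinv(C @ Ch)
    # Stage 2: X = (B* B)^+ B* Y solves B* B X = B* Y.
    X = np.linalg.pinv(Bh @ B) @ Bh @ Y
    return Y, X

def random_complex(rng, shape):
    """Hypothetical helper: complex matrix with standard normal parts."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

rng = np.random.default_rng(0)
m, n, g, h = 12, 10, 4, 3          # small sizes chosen for the demo
A = random_complex(rng, (m, n))
B = random_complex(rng, (m, g))
C = random_complex(rng, (h, n))

Y, X = two_stage_kernel(A, B, C)

# Compare with the closed-form kernel obtained from the full pseudo-inverses.
X_direct = np.linalg.pinv(B) @ A @ np.linalg.pinv(C)
print("two-stage kernel matches direct kernel:", np.allclose(X, X_direct))
```

The agreement follows from the standard identities $C^{*}(CC^{*})^{\dagger} = C^{\dagger}$ and $(B^{*}B)^{\dagger}B^{*} = B^{\dagger}$, so the two-stage route and the direct formula describe the same kernel.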

Abstract

We propose and justify a matrix reduction method for calculating the optimal approximation of an observed matrix $A \in {\mathbb C}^{m \times n}$ by a sum $\sum_{i=1}^p \sum_{j=1}^q B_iX_{ij}C_j$ of matrix products, where each $B_i \in {\mathbb C}^{m \times g_i}$ and each $C_j \in {\mathbb C}^{h_j \times n}$ is known and where the unknown matrix kernels $X_{ij}$ are determined by minimizing the Frobenius norm of the error. The sum can be represented as a bounded linear mapping $BXC$ with unknown kernel $X$ from a prescribed subspace ${\mathcal T} \subseteq {\mathbb C}^n$ onto a prescribed subspace ${\mathcal S} \subseteq {\mathbb C}^m$, defined respectively by the collective domains and ranges of the given matrices $C_1,\ldots,C_q$ and $B_1,\ldots,B_p$. We show that the optimal kernel is $X = B^{\dagger}AC^{\dagger}$ and that the optimal approximation $BB^{\dagger}AC^{\dagger}C$ is the projection of the observed mapping $A$ onto a mapping from ${\mathcal T}$ to ${\mathcal S}$. If $A$ is large, then $B$ and $C$ may also be large, and direct calculation of $B^{\dagger}$ and $C^{\dagger}$ becomes unwieldy and inefficient. The proposed method avoids this difficulty by reducing the solution process to finding the pseudo-inverses of a collection of much smaller matrices. This significantly reduces the computational burden.
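As a small-scale illustration of the statements in the abstract, the sketch below stacks the blocks as $B = [B_1, \ldots, B_p]$ and $C = [C_1; \ldots; C_q]$, forms the kernel $X = B^{\dagger}AC^{\dagger}$, and checks that $BXC$ equals the double sum over blocks and that the approximation is a projection of $A$. It uses real random data and dense pseudo-inverses for brevity, which is exactly the direct calculation the proposed method is meant to avoid for large matrices. The block sizes and all variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 9, 8
g_sizes, h_sizes = (2, 3), (2, 2)   # assumed block widths g_i and heights h_j

B_blocks = [rng.standard_normal((m, g)) for g in g_sizes]
C_blocks = [rng.standard_normal((h, n)) for h in h_sizes]
A = rng.standard_normal((m, n))

# Stack the known blocks: B is m x sum(g_i), C is sum(h_j) x n.
B = np.hstack(B_blocks)
C = np.vstack(C_blocks)

# Direct (dense) formula for the kernel and the approximation.
B_pinv, C_pinv = np.linalg.pinv(B), np.linalg.pinv(C)
X = B_pinv @ A @ C_pinv
A_hat = B @ X @ C

# Rebuild the approximation as the double sum of B_i X_ij C_j.
row_edges = np.cumsum((0,) + g_sizes)
col_edges = np.cumsum((0,) + h_sizes)
A_sum = np.zeros((m, n))
for i, B_i in enumerate(B_blocks):
    for j, C_j in enumerate(C_blocks):
        X_ij = X[row_edges[i]:row_edges[i + 1], col_edges[j]:col_edges[j + 1]]
        A_sum += B_i @ X_ij @ C_j

# Orthogonal projectors onto the range of B and the row space of C.
P_S = B @ B_pinv
P_T = C_pinv @ C

# Perturbing the kernel should not reduce the Frobenius error.
D = 0.1 * rng.standard_normal(X.shape)
err_opt = np.linalg.norm(A - A_hat)
err_perturbed = np.linalg.norm(A - B @ (X + D) @ C)

print("block sum equals B X C:", np.allclose(A_sum, A_hat))
print("A_hat equals P_S A P_T:", np.allclose(P_S @ A @ P_T, A_hat))
print("perturbation does not help:", err_perturbed >= err_opt)
```

Slicing $X$ by the cumulative block sizes recovers the individual kernels $X_{ij}$, which is what makes the single product $BXC$ equivalent to the double sum.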

