Better than best low-rank approximation with the singular value decomposition

David F. Gleich

TL;DR

The paper argues that the optimal low-rank approximation provided by the Eckart-Young framework depends critically on how data are organized into a matrix. By reorganizing data into alternative representations such as tiles or Kronecker-like structures, one can achieve markedly better approximations for the same number of parameters, as shown by image and temporal-data case studies. A theoretical bound demonstrates that, for certain structured matrices, the gains from reorganization can grow with dimension and even become unbounded. The work connects these empirical findings to Kronecker-product SVD and tensor-approximation literature, outlining practical implications for representation design and future research directions.

Abstract

The Eckart-Young theorem states that the best low-rank approximation of a matrix can be constructed from the leading singular values and vectors of the matrix. Here, we illustrate that the practical implications of this result crucially depend on the organization of the matrix data. In particular, we show examples where a rank 2 approximation of the matrix data in a different representation more accurately represents the entire matrix than a rank 5 approximation of the original matrix data, even though both approximations have the same number of underlying parameters. Beyond images, we show examples of how flexible orientation enables better approximation of time series data, which suggests additional applicability of the findings. Finally, we conclude with a theoretical result that the effect of data organization can result in an unbounded improvement to the matrix approximation factor as the matrix dimension grows.
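
The effect described in the abstract can be illustrated with a small synthetic sketch. This is not the paper's image experiment: the test matrix is an assumed Kronecker product of two random matrices, chosen because its tile rearrangement is exactly rank 1, and the helper names `rank_k_approx`, `to_tiles`, and `from_tiles` are hypothetical. Both approximations below store 128 numbers.

```python
import numpy as np

def rank_k_approx(M, k):
    """Best rank-k approximation in the Frobenius norm (Eckart-Young), via the SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def to_tiles(X, th, tw):
    """Rearrange X so that each row holds one flattened th-by-tw tile."""
    m, n = X.shape
    return (X.reshape(m // th, th, n // tw, tw)
             .transpose(0, 2, 1, 3)
             .reshape((m // th) * (n // tw), th * tw))

def from_tiles(R, shape, th, tw):
    """Inverse of to_tiles: place each flattened tile back into the matrix."""
    m, n = shape
    return (R.reshape(m // th, n // tw, th, tw)
             .transpose(0, 2, 1, 3)
             .reshape(m, n))

def rel_error(X, Y):
    """Relative Frobenius-norm error of Y as an approximation of X."""
    return np.linalg.norm(X - Y) / np.linalg.norm(X)

# Assumed synthetic data: a 64-by-64 Kronecker product of two 8-by-8 matrices.
rng = np.random.default_rng(0)
B = rng.standard_normal((8, 8))
C = rng.standard_normal((8, 8))
X = np.kron(B, C)

# Rank-1 approximation of the matrix as given: 64 + 64 = 128 parameters.
direct = rank_k_approx(X, 1)
params_direct = X.shape[0] + X.shape[1]

# Rank-1 approximation of the tiled matrix (also 64-by-64 here): 128 parameters.
R = to_tiles(X, 8, 8)
tiled = from_tiles(rank_k_approx(R, 1), X.shape, 8, 8)
params_tiled = R.shape[0] + R.shape[1]

err_direct = rel_error(X, direct)
err_tiled = rel_error(X, tiled)
print(f"direct rank-1: {params_direct} parameters, relative error {err_direct:.3f}")
print(f"tiled  rank-1: {params_tiled} parameters, relative error {err_tiled:.2e}")
```

In this constructed case the tiled representation recovers the matrix to rounding error, while the rank-1 approximation of the original layout leaves most of the Frobenius norm unexplained. Real images are not exact Kronecker products, so the gap reported in the paper's figures is smaller than this idealized one.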

Paper Structure

This paper contains 15 sections, 5 theorems, 29 equations, and 2 figures.

Key Result

THEOREM 1

Let where $|\alpha| < 1$, $|\beta| < 1$, and $\gamma > 0$. Let the matrix size be $n \times n$. Let $\boldsymbol{X}^{(1)}$ be the best rank-1 approximation of $\boldsymbol{X}$. Then $\sum_{ij} (X_{ij} - X^{(1)}_{ij})^2 \ge (\sum_{ij} X_{ij}^2) - \omega_X^2$ for a constant $\omega_X$ independe
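
As a reading aid for the inequality in Theorem 1, the standard Eckart-Young identity expresses the squared error of the best rank-1 approximation through the largest singular value. The symbol $\sigma_1(\boldsymbol{X})$ is notation assumed here; it does not appear in the excerpt above.

$$
\sum_{ij} \bigl(X_{ij} - X^{(1)}_{ij}\bigr)^2 \;=\; \sum_{ij} X_{ij}^2 \;-\; \sigma_1(\boldsymbol{X})^2 .
$$

Under this identity, the stated bound is equivalent to $\sigma_1(\boldsymbol{X})^2 \le \omega_X^2$: the largest singular value stays bounded by $\omega_X$, so a rank-1 approximation of the original layout can capture at most $\omega_X^2$ of the total squared mass $\sum_{ij} X_{ij}^2$.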

Figures (2)

  • Figure 1: An example of how re-arranging the image data from a matrix (A) into a matrix of tiles (B) produces better approximations with fewer parameters. For instance, (C) vs (F) shows a large reduction in error with the same parameters, whereas (E) vs (H) shows a large reduction in parameters at the same error. Comparing (D) and (G) shows that reductions in both error and parameters result from approximating the matrix in (B) rather than the matrix in (A).
  • Figure 3: (A) Moving average COVID-19 positivity rates for the 50 US states for 150 days starting from May 17, 2020. (B) If we view this as a 50 $\times$ 150 matrix in terms of states-by-days and use the optimal rank-2 approximation, we get 6.7% error with 300 parameters and the approximation shown in blue. The original data is shown as shaded. If we view this as a 150 $\times$ 50 matrix by the reorganization described in the text and use the optimal rank-2 approximation, we get 2.7% error with the same number of parameters (300). Moreover, the approximation is qualitatively better -- consider Arizona (AZ), Hawaii (HI), Kentucky (KY), Montana (MT), North Dakota (ND), North Carolina (NC)

Theorems & Definitions (9)

  • THEOREM 1
  • LEMMA 2
  • Proof 1
  • LEMMA 3
  • Proof 2
  • LEMMA 4
  • Proof 3
  • LEMMA 5
  • Proof 4