Regression-aware decompositions

Mark Tygert

TL;DR

The paper develops regression-aware decompositions that integrate linear regression with classic dimensionality-reduction tools such as ID and PCA. By projecting the target matrix $B$ through $S=(A^*A)^{-1/2}A^*$ and working with $SB$, the author defines RAID (regression-aware ID) and RAPCA (regression-aware PCA), which produce low-rank, regression-consistent representations and stable interpolation of least-squares solutions. The paper also provides computationally simpler variants based on pivoted QR and demonstrates, across multiple datasets, that these methods outperform traditional CCA and standard IDs in preserving regression structure while enabling efficient subset selection. The work offers a principled framework for supervised dimensionality reduction, with practical impact for efficient model selection and interpretation in linear-regression contexts.
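
As a rough illustration of the projection described above (a sketch under stated assumptions, not the paper's reference implementation): when $A$ has full column rank with thin SVD $A = U \Sigma V^*$, the matrix $S=(A^*A)^{-1/2}A^*$ simplifies to $V U^*$. The Python below forms $S$ that way and then applies a pivoted-QR column ID to $SB$. The function names `regression_projector` and `column_id`, the random test matrices, and the use of pivoted QR as the ID routine are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np
from scipy.linalg import qr, solve_triangular


def regression_projector(A):
    """Return S = (A^* A)^{-1/2} A^*, assuming A has full column rank."""
    # Thin SVD A = U diag(s) Vh gives (A^* A)^{-1/2} A^* = V U^*,
    # so the singular values cancel and no explicit inverse is needed.
    U, _, Vh = np.linalg.svd(A, full_matrices=False)
    return Vh.conj().T @ U.conj().T


def column_id(M, k):
    """Rank-k column ID via pivoted QR: M is approximated by M[:, cols] @ P.

    Assumes k < M.shape[1], k <= M.shape[0], and a nonsingular leading k x k block of R.
    """
    _, R, piv = qr(M, mode="economic", pivoting=True)
    # M[:, piv] = Q R; express the trailing columns through the first k pivots.
    T = solve_triangular(R[:k, :k], R[:k, k:])
    P = np.zeros((k, M.shape[1]), dtype=M.dtype)
    P[:, piv[:k]] = np.eye(k)   # selected columns reproduce themselves
    P[:, piv[k:]] = T           # remaining columns are interpolated
    return piv[:k], P


# Hypothetical data, for illustration only.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5))   # "design" matrix
B = rng.standard_normal((50, 8))   # matrix being approximated

S = regression_projector(A)
SB = S @ B
cols, P = column_id(SB, k=3)
rel_err = np.linalg.norm(SB[:, cols] @ P - SB) / np.linalg.norm(SB)
print("selected column indices:", cols)
print("relative ID error on SB:", rel_err)
```

Because $S S^* = I$ under the full-column-rank assumption, $SB$ has only as many rows as $A$ has columns, so the ID operates on a much smaller matrix than $B$ whenever $A$ is tall.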

Abstract

Linear least-squares regression with a "design" matrix A approximates a given matrix B via minimization of the spectral- or Frobenius-norm discrepancy ||AX-B|| over every conformingly sized matrix X. Another popular approximation is low-rank approximation via principal component analysis (PCA) -- which is essentially singular value decomposition (SVD) -- or interpolative decomposition (ID). Classically, PCA/SVD and ID operate solely with the matrix B being approximated, not supervised by any auxiliary matrix A. However, linear least-squares regression models can inform the ID, yielding regression-aware ID. As a bonus, this provides an interpretation as regression-aware PCA for a kind of canonical correlation analysis between A and B. The regression-aware decompositions effectively enable supervision to inform classical dimensionality reduction, which classically has been totally unsupervised. The regression-aware decompositions reveal the structure inherent in B that is relevant to regression against A.
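
The two classical approximations contrasted in the abstract can be sketched in a few lines of NumPy. This is only an illustration on hypothetical random data: the matrix sizes, the rank `k`, and all variable names are assumptions. The first part solves the Frobenius-norm least-squares problem for $X$; the second forms the rank-$k$ truncated SVD of $B$, which never consults $A$.

```python
import numpy as np

# Hypothetical data, for illustration only.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 4))   # "design" matrix
B = rng.standard_normal((40, 6))   # matrix being approximated

# Supervised approximation: minimize ||A X - B|| over X (Frobenius norm).
X, *_ = np.linalg.lstsq(A, B, rcond=None)
residual = A @ X - B
# At the minimizer the residual is orthogonal to the columns of A.
print("normal-equations check:", np.linalg.norm(A.T @ residual))

# Unsupervised approximation: rank-k truncated SVD of B alone.
k = 2
U, s, Vh = np.linalg.svd(B, full_matrices=False)
B_k = (U[:, :k] * s[:k]) @ Vh[:k]
# The spectral-norm error of the best rank-k approximation is the
# (k+1)-th singular value of B.
print("spectral error:", np.linalg.norm(B - B_k, 2), "vs s[k]:", s[k])
```

The regression uses $A$ but imposes no low-rank structure, while the truncated SVD imposes low rank but ignores $A$; the regression-aware decompositions are aimed at combining the two.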

Paper Structure

This paper contains 17 sections, 1 theorem, 24 equations, 9 figures, 2 tables.

Key Result

Theorem 1

Suppose that $m$ and $n$ are positive integers, and $B$ is an $m \times n$ matrix. Then, for any positive integer $k$ with $k \le m$ and $k \le n$, there exist a $k \times n$ matrix $P$ and an $m \times k$ matrix $C$ whose columns constitute a subset of the columns of $B$, such that

Figures (9)

  • Figure 1: Example from Subsection \ref{potential}
  • Figure 2: Example from Subsection \ref{synthseries}
  • Figure 3: Example from Subsection \ref{ld} with $l = 100$
  • Figure 4: Example from Subsection \ref{ld} with $l = 200$
  • Figure 5: Example from Subsection \ref{ld} with $l = 300$
  • ...and 4 more figures

Theorems & Definitions (2)

  • Theorem 1
  • Remark 2