Interlacing Polynomial Method for the Column Subset Selection Problem

Jian-Feng Cai, Zhiqiang Xu, Zili Xu

TL;DR

The paper tackles the spectral-norm column subset selection problem by deriving a new upper bound on the minimal residual when selecting $k$ columns from $A\in\mathbb{R}^{n\times d}$. It leverages the interlacing polynomials framework to bound the largest root of a polynomial derived from the spectrum of $A^T A$, yielding an explicit bound that interpolates between $\|A\|_2^2$ and a data-dependent quantity $\alpha$ and is sharp for spectral power-law decay. A deterministic polynomial-time algorithm is provided that achieves this bound up to a computational error, with precise complexity depending on $d$ relative to $n$. The work connects to volume sampling and the restricted invertibility principle, offering tighter, situation-specific guarantees than prior multiplicative bounds and enabling principled sketching of matrices via column-subset projections.

Abstract

This paper investigates the spectral norm version of the column subset selection problem. Given a matrix $\mathbf{A}\in\mathbb{R}^{n\times d}$ and a positive integer $k\leq\text{rank}(\mathbf{A})$, the objective is to select exactly $k$ columns of $\mathbf{A}$ that minimize the spectral norm of the residual matrix after projecting $\mathbf{A}$ onto the space spanned by the selected columns. We use the method of interlacing polynomials introduced by Marcus-Spielman-Srivastava to derive a new upper bound on the minimal approximation error. This new bound is asymptotically sharp when the matrix $\mathbf{A}\in\mathbb{R}^{n\times d}$ obeys a spectral power-law decay. The relevant expected characteristic polynomials can be written as an extension of the expected polynomial for the restricted invertibility problem, incorporating two extra variable substitution operators. Finally, we propose a deterministic polynomial-time algorithm that achieves this error bound up to a computational error.
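To make the objective concrete, the sketch below evaluates the residual spectral norm $\|\mathbf{A}-\mathbf{P}_S\mathbf{A}\|_2$ for a given column subset $S$, where $\mathbf{P}_S$ is the orthogonal projector onto the span of the selected columns, and then searches all size-$k$ subsets exhaustively. This is only an illustration of the problem statement, not the paper's deterministic algorithm: the function names `residual_spectral_norm` and `best_subset_brute_force` are hypothetical, and the exhaustive search is exponential in $d$, so it is usable only on very small matrices.

```python
import itertools

import numpy as np


def residual_spectral_norm(A, S):
    """Spectral norm of A minus its projection onto span{A[:, j] : j in S}."""
    A_S = A[:, list(S)]
    # Orthogonal projector onto the column space of A_S, built from the pseudoinverse.
    P = A_S @ np.linalg.pinv(A_S)
    return np.linalg.norm(A - P @ A, ord=2)


def best_subset_brute_force(A, k):
    """Exhaustive search over all k-column subsets (illustration only)."""
    d = A.shape[1]
    best = min(
        itertools.combinations(range(d), k),
        key=lambda S: residual_spectral_norm(A, S),
    )
    return best, residual_spectral_norm(A, best)


# Small random instance (assumed sizes, chosen only for demonstration).
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 5))
S, err = best_subset_brute_force(A, k=2)

# Any projection onto k columns yields a rank-<=k approximation, so the
# residual can never fall below the (k+1)-th singular value of A.
sigma = np.linalg.svd(A, compute_uv=False)
print("selected columns:", S)
print("residual spectral norm:", err)
print("lower bound sigma_{k+1}:", sigma[2])
```

The comparison against $\sigma_{k+1}$ reflects the standard fact that the best rank-$k$ approximation error lower-bounds any column-subset residual; upper bounds such as the one derived in the paper quantify how far above that floor the best subset can sit.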

Paper Structure (20 sections, 14 theorems, 88 equations, 1 algorithm)

Key Result

Theorem 1.1

Let $\mathbf{A}=[\mathbf{a}_1,\ldots,\mathbf{a}_d]\in\mathbb{R}^{n\times d}$ be a matrix of rank $t\leq \min\{d,n\}$. For each $1\leq i\leq t$, let $\lambda_i$ be the $i$-th largest eigenvalue of $\mathbf{A}^{\rm T}\mathbf{A}$. Assume that $\lambda_t<\lambda_1$. Let $\alpha$ and $\beta$ be two Then for any positive integer $k$ satisfying $\beta\cdot t\leq k< t$, there exists a subset $S\subs

Theorems & Definitions (30)

  • Theorem 1.1
  • Theorem 1.2
  • Theorem 1.3
  • Lemma 2.1
  • proof
  • Lemma 2.2
  • Proposition 2.1
  • proof
  • Remark 2.1
  • Remark 2.2
  • ...and 20 more