Many (most?) column subset selection criteria are NP hard

Ilse C. F. Ipsen, Arvind K. Saibaba

TL;DR

The paper proves that selecting $k$ representative columns from a matrix is NP-hard for a broad family of criteria: volume, S-optimality, various matrix norms, pseudo-inverse norms, condition numbers, and stable rank. It also shows that these problems admit no PTAS under standard complexity assumptions. To convert the optimization problems into decision problems, the authors derive the optimal value of each criterion; they then give reductions from Exact Cover by 3-Sets (X3C) with explicit hardness thresholds. A central insight is that many criteria attain their best value only when the chosen columns are orthonormal, and that unit-norm constraints on the columns enable precise bounds and partitioned pseudo-inverse formulas. The paper introduces the relative volume criterion, proves that it is NP-hard and admits no PTAS, and provides lower bounds on optimal values and expressions for partitioned pseudo-inverses across the criteria. These results matter for algorithm design in numerical linear algebra, data analysis, and related fields where stable, well-conditioned, or maximally informative submatrices are sought.
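The orthonormality insight can be checked with a short worked derivation. The notation below ($C$, $\sigma_j$, $\kappa_2$) is introduced here for illustration and uses standard textbook definitions, which may differ in detail from the paper's own statements. Let $C \in \mathbb{R}^{m \times k}$ have unit-norm columns and singular values $\sigma_1 \geq \cdots \geq \sigma_k \geq 0$. Then:

$$
\sum_{j=1}^{k} \sigma_j^2 = \|C\|_F^2 = k,
\qquad
\det(C^T C) = \prod_{j=1}^{k} \sigma_j^2 \leq \Bigl(\tfrac{1}{k}\sum_{j=1}^{k} \sigma_j^2\Bigr)^{k} = 1,
$$

$$
\frac{\|C\|_F^2}{\|C\|_2^2} = \frac{k}{\sigma_1^2} \leq k,
\qquad
\kappa_2(C) = \frac{\sigma_1}{\sigma_k} \geq 1 .
$$

The first bound is the arithmetic-geometric mean inequality, and the second uses $\sigma_1^2 \geq \tfrac{1}{k}\sum_j \sigma_j^2 = 1$. Each bound holds with equality exactly when $\sigma_1 = \cdots = \sigma_k = 1$, that is, when $C^T C = I$ and the selected columns are orthonormal.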

Abstract

We consider a variety of criteria for selecting $k$ representative columns from a real matrix $A$ with $\operatorname{rank}(A) \geq k$. The criteria include the following optimization problems: absolute volume and S-optimality maximization; norm and condition minimization in the two-norm, Frobenius norm and Schatten $p$-norms for $p>2$; stable rank maximization; and the new criterion of relative volume maximization. We show that these criteria are NP hard and do not admit polynomial time approximation schemes (PTAS). To formulate the optimization problems as decision problems, we derive optimal values for the subset selection criteria, as well as expressions for partitioned pseudo-inverses.
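To make a few of these criteria concrete, here is a minimal Python sketch that evaluates volume, two-norm condition number, and stable rank on a column subset and finds the best subset by exhaustive search. The function names (`volume`, `condition_number`, `stable_rank`, `best_subset`) and the small test matrix are hypothetical, and the formulas are the standard textbook definitions, which are assumed rather than taken from the paper. The search visits all $\binom{n}{k}$ subsets, so it is only feasible for tiny inputs.

```python
from itertools import combinations

import numpy as np


def volume(C):
    # Volume of the columns of C: product of singular values = sqrt(det(C^T C)).
    return float(np.prod(np.linalg.svd(C, compute_uv=False)))


def condition_number(C):
    # Two-norm condition number sigma_max / sigma_min (inf if rank deficient).
    s = np.linalg.svd(C, compute_uv=False)
    return float(s[0] / s[-1]) if s[-1] > 0 else float("inf")


def stable_rank(C):
    # Stable rank ||C||_F^2 / ||C||_2^2; assumes C is not the zero matrix.
    s = np.linalg.svd(C, compute_uv=False)
    return float(np.sum(s**2) / s[0] ** 2)


def best_subset(A, k, score, maximize=True):
    # Exhaustive search over all k-column subsets of A (exponential cost).
    best_idx, best_val = None, None
    for idx in combinations(range(A.shape[1]), k):
        val = score(A[:, list(idx)])
        better = best_val is None or (val > best_val if maximize else val < best_val)
        if better:
            best_idx, best_val = idx, val
    return best_idx, best_val


# Three unit-norm columns in R^3; the first two are orthonormal.
r = 1.0 / np.sqrt(2.0)
A = np.array([[1.0, 0.0, r],
              [0.0, 1.0, r],
              [0.0, 0.0, 0.0]])

print(best_subset(A, 2, volume, maximize=True))
print(best_subset(A, 2, condition_number, maximize=False))
print(best_subset(A, 2, stable_rank, maximize=True))
```

In this example all three criteria select the orthonormal pair of columns, consistent with the observation that unit-norm columns are optimal when they are orthonormal.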

Paper Structure

This paper contains 42 sections, 28 theorems, 38 equations.

Key Result

Theorem 1

Relative volume maximization is NP hard.

Theorems & Definitions (57)

  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Theorem 3
  • proof
  • Theorem 4
  • proof
  • Theorem 5
  • proof
  • ...and 47 more