A Decomposition Framework for Certifiably Optimal Orthogonal Sparse PCA

Difei Cheng, Qiao Hu

TL;DR

This work introduces GS-SPCA (SPCA with Gram-Schmidt Orthogonalization), a novel Sparse Principal Component Analysis (SPCA) algorithm that simultaneously enforces sparsity, orthogonality, and optimality, and proposes a decomposition framework for efficiently computing multiple principal components.

Abstract

Sparse Principal Component Analysis (SPCA) is an important technique for high-dimensional data analysis, improving interpretability by imposing sparsity on principal components. However, existing methods often fail to simultaneously guarantee sparsity, orthogonality, and optimality of the principal components. To address this challenge, this work introduces a novel SPCA algorithm called GS-SPCA (SPCA with Gram-Schmidt Orthogonalization), which enforces all three properties at once. The original GS-SPCA algorithm is computationally expensive because of its inherent $\ell_0$-norm constraint, so we propose two acceleration strategies. First, we combine Branch-and-Bound with the GS-SPCA algorithm, which yields $\varepsilon$-optimal solutions with a trade-off between precision and efficiency and significantly improves computational speed. Second, we propose a decomposition framework for efficiently solving multiple principal components. This framework approximates the covariance matrix by a block-diagonal matrix obtained through thresholding, reducing the original SPCA problem to a set of block-wise subproblems on approximately block-diagonal matrices.
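
The thresholding step of the decomposition framework can be illustrated with a minimal sketch. It assumes that off-diagonal covariance entries with magnitude at most a threshold are treated as zero and that the blocks are the connected components of what remains; the abstract does not give the paper's exact rule, and the names `threshold_blocks`, `submatrix`, and `tau` are hypothetical.

```python
def threshold_blocks(cov, tau):
    """Group variable indices into blocks after zeroing small entries.

    Assumption: entries with |cov[i][j]| <= tau are treated as zero, and the
    blocks are the connected components of the graph of surviving entries.
    """
    n = len(cov)
    seen = [False] * n
    blocks = []
    for start in range(n):
        if seen[start]:
            continue
        # Depth-first search over indices linked by a surviving entry.
        stack, block = [start], []
        seen[start] = True
        while stack:
            i = stack.pop()
            block.append(i)
            for j in range(n):
                if not seen[j] and abs(cov[i][j]) > tau:
                    seen[j] = True
                    stack.append(j)
        blocks.append(sorted(block))
    return blocks


def submatrix(cov, block):
    """Extract the principal submatrix of cov indexed by block."""
    return [[cov[i][j] for j in block] for i in block]


# Toy 4x4 covariance matrix with two strongly coupled pairs of variables
# and only weak coupling between the pairs.
cov = [
    [4.0, 1.5, 0.05, 0.0],
    [1.5, 3.0, 0.0, 0.02],
    [0.05, 0.0, 2.0, 0.9],
    [0.0, 0.02, 0.9, 1.0],
]
blocks = threshold_blocks(cov, tau=0.1)
subproblems = [submatrix(cov, b) for b in blocks]
```

Each entry of `subproblems` is a smaller covariance matrix on which a sparse PCA solver could be run independently. How the block-wise solutions are combined into components of the full problem is the subject of the paper's decomposition theorems and is not sketched here.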

Paper Structure

This paper contains 23 sections, 7 theorems, 28 equations, 1 figure, and 4 algorithms.

Key Result

Proposition 3.2

Let $\{x_1, x_2, \dots, x_n\}$ be a solution to the SPCA problem, i.e., a set of vectors satisfying Definition 3.1. Then these components form a complete orthonormal basis of $\mathbb{R}^n$ and satisfy the equality […]. Moreover, the variance sequence $\{x_k^\top Q x_k\}_{k=1}^n$ is non-increasing.
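
The displayed equality is missing from the statement above. One identity that holds for any complete orthonormal basis, and is therefore a plausible candidate (an assumption, not the paper's confirmed formula), is that the component variances sum to the trace of $Q$. Writing $X = [x_1, \dots, x_n]$, so that $X X^\top = I_n$:

$$\sum_{k=1}^{n} x_k^\top Q x_k = \operatorname{tr}(X^\top Q X) = \operatorname{tr}(Q X X^\top) = \operatorname{tr}(Q).$$

Under this reading, the $n$ orthogonal sparse components together account for all of the variance in $Q$, just as ordinary principal components do.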

Figures (1)

  • Figure 1: (a)-(c) Maximum angle between the first $r$ sparse principal components; (d)-(f) time to compute the $r$-th sparse principal component at different sparsity levels, with one point per value of $r$; (g)-(i) time to solve for the first $r$ sparse principal components as the sparsity changes; (j)-(l) variance of the $r$-th sparse principal component at different sparsity levels, with one point per value of $r$.

Theorems & Definitions (12)

  • Definition 3.1: Orthogonal Sparse Principal Components
  • Proposition 3.2
  • Remark 3.3
  • Remark 4.1
  • Theorem 4.2
  • Definition 4.3: $\varepsilon$-optimal SPCA solution
  • Remark 4.4: Certification of $\varepsilon$-optimality
  • Theorem 5.1: Decomposition for Block-Diagonal SPCA
  • Theorem 5.2: Decomposition for $\varepsilon$-optimal Block-Diagonal SPCA
  • Theorem 6.1
  • ...and 2 more