Clustering-based Low Rank Approximation Method

Yujun Zhu, Jie Zhu, Hizba Arshad, Zhongming Wang, Ju Ming

TL;DR

This work tackles the challenge of scalable, accurate dimensionality reduction for high-dimensional matrix-structured data. It introduces CGLRAM, a clustering-enhanced generalization of GLRAM that learns cluster-specific left-right projections $(\mathbb{L}_j, \mathbb{R}_j)$ and assigns matrices to generalized clusters via a Frobenius-based distance. The method offers convergence guarantees and demonstrates superior reconstruction accuracy over GLRAM and competitive performance relative to SVD, as shown in image compression and SPDE simulations. The approach provides a practical, memory-conscious tool for reduced-order modeling in applications like image processing and stochastic PDEs.

Abstract

We propose a clustering-based generalized low rank approximation method, which takes advantage of appealing features from both the generalized low rank approximation of matrices (GLRAM) and cluster analysis. It exploits a more general form of clustering generators and similarity metrics, so it is better suited to matrix-structured data than conventional partitioning methods. In our approach, we first pre-classify the initial matrix collection into several smaller clusters and then sequentially compress the matrices within each cluster. This strategy improves the numerical precision of the low-rank approximation. In essence, we combine the ideas of GLRAM and clustering into a hybrid algorithm for dimensionality reduction. The proposed algorithm can be viewed as a generalization of both techniques. Theoretical analysis and numerical experiments are presented to validate the feasibility and effectiveness of the proposed algorithm.
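The cluster-then-compress idea described in the abstract can be sketched in NumPy. This is a minimal illustration and not the authors' implementation: the function names (`glram`, `recon_error`, `cglram`), the random initial partition, the empty-cluster handling, and the choice of the squared Frobenius reconstruction error $\|\mathbb{A}_i - \mathbb{L}_j \mathbb{L}_j^T \mathbb{A}_i \mathbb{R}_j \mathbb{R}_j^T\|_F^2$ as the assignment distance are all assumptions made for the example.

```python
import numpy as np

def glram(mats, l1, l2, n_iter=10):
    """Alternating GLRAM updates for a stack of matrices of shape (n, r, c)."""
    n, r, c = mats.shape
    R = np.eye(c)[:, :l2]  # assumed initial guess with orthonormal columns
    for _ in range(n_iter):
        # L: top-l1 eigenvectors of sum_i A_i R R^T A_i^T
        AR = mats @ R
        ML = np.einsum('nij,nkj->ik', AR, AR)
        L = np.linalg.eigh(ML)[1][:, ::-1][:, :l1]
        # R: top-l2 eigenvectors of sum_i A_i^T L L^T A_i
        AtL = np.transpose(mats, (0, 2, 1)) @ L
        MR = np.einsum('nij,nkj->ik', AtL, AtL)
        R = np.linalg.eigh(MR)[1][:, ::-1][:, :l2]
    return L, R

def recon_error(mats, L, R):
    """Squared Frobenius error ||A_i - L L^T A_i R R^T||_F^2 for each matrix."""
    core = L.T @ mats @ R        # compressed representations, shape (n, l1, l2)
    approx = L @ core @ R.T      # reconstructions, shape (n, r, c)
    return np.sum((mats - approx) ** 2, axis=(1, 2))

def cglram(mats, k, l1, l2, n_outer=10, seed=0):
    """Hypothetical clustering loop: fit one (L_j, R_j) per cluster, then reassign."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(k, size=len(mats))  # assumed random initial partition
    for _ in range(n_outer):
        projs = []
        for j in range(k):
            members = mats[labels == j]
            if len(members) == 0:
                # assumed fallback: fit an empty cluster to one random matrix
                members = mats[rng.integers(len(mats))][None]
            projs.append(glram(members, l1, l2))
        # assign each matrix to the cluster whose projections reconstruct it best
        errs = np.stack([recon_error(mats, L, R) for L, R in projs])
        new_labels = errs.argmin(axis=0)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, projs
```

The returned `labels` are the nearest-cluster assignments under the returned projections. The loop mirrors k-means: the "generators" are projection pairs rather than centroids, which is one way to read the "more general form of clustering generators" mentioned above.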

Paper Structure

This paper contains 13 sections, 11 theorems, 59 equations, 10 figures, 5 tables, 3 algorithms.

Key Result

Theorem 2.1

Let $\mathbb{A} = \mathbb{U} \Sigma \mathbb{V}^T \in \mathbf{R}^{N \times N}$ be the SVD of $\mathbb{A}$, and let $\mathbb{U}$, $\Sigma$ and $\mathbb{V}$ be partitioned as follows: $$\mathbb{U} = [\mathbb{U}_1, \mathbb{U}_2], \quad \Sigma = \begin{bmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{bmatrix}, \quad \mathbb{V} = [\mathbb{V}_1, \mathbb{V}_2],$$ where $\mathbb{U}_1, \mathbb{V}_1 \in \mathbf{R}^{N \times k}$ and $\Sigma_1 \in \mathbf{R}^{k \times k}$. Then the rank-$k$ matrix obtained from the TSVD, $\widetilde{\mathbb{A}^*} = \mathbb{U}_1 \Sigma_1 \mathbb{V}_1^T$, s
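The statement above is cut off on this page. Assuming it is the standard Eckart-Young result in the Frobenius norm, the following NumPy sketch checks it numerically on a random matrix: the truncated SVD error equals the root sum of squares of the discarded singular values, and an arbitrary rank-$k$ competitor does no better. The matrix size, the value of $k$, and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, k = 8, 3
A = rng.standard_normal((N, N))

# Full SVD, then keep the leading k singular triplets (the TSVD).
U, s, Vt = np.linalg.svd(A)
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Frobenius error of the TSVD and the norm of the discarded singular values.
err = np.linalg.norm(A - A_k, 'fro')
tail = np.sqrt(np.sum(s[k:] ** 2))

# An arbitrary matrix of rank at most k, used as a competitor.
B = rng.standard_normal((N, k)) @ rng.standard_normal((k, N))
err_B = np.linalg.norm(A - B, 'fro')

print(np.isclose(err, tail), err <= err_B)
```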

Figures (10)

  • Figure 2.1: Raw image from LFW dataset (left) and image compressed by GLRAM with reduction ratio $10\%$ (right).
  • Figure 3.1: Two centroidal Voronoi tessellations of a square. The points $\mu_1$ and $\mu_2$ are the centroids of the rectangles on the left or of the triangles on the right.
  • Figure 5.1: Handwritten figure examples in EMNIST-digits.
  • Figure 5.2: Comparison of reconstruction error (left) and error reduction rate (right) between GLRAM and CGLRAM.
  • Figure 5.3: Comparison of initial and final CGLRAM objective values (left) and its enhancement rate (right).
  • ...and 5 more figures

Theorems & Definitions (18)

  • Theorem 2.1: Eckart-Young Theorem [eckart1936approximation]
  • Theorem 2.2 [ye2004generalized]
  • Theorem 2.3 [ye2004generalized]
  • Theorem 3.1 [burkardt2009k]
  • Remark 3.1
  • Theorem 3.2 [ingrassia2020cluster]
  • Remark 3.2
  • Theorem 4.1
  • Proof
  • Theorem 4.2
  • ...and 8 more