Group Sparse-based Tensor CP Decomposition: Model, Algorithms, and Applications in Chemometrics

Zihao Wang, Minru Bai, Liang Chen, Xueying Zhao

TL;DR

To address the difficulty of estimating the CP rank in tensor CP decomposition, the paper proves that the CP rank can be estimated exactly by minimizing the group sparsity $\|\mathbf{A}^{(N)}\|_{2,0}$ of one factor matrix under unit-norm constraints on the columns of the other factor matrices. It introduces the CPD_GSU model, which couples rank estimation with decomposition, and solves it with a double-loop block-coordinate proximal gradient descent algorithm with extrapolation; every accumulation point of the generated sequence is proven to be a stationary point of the model. A rank-reduction strategy further lowers the computational cost by discarding zero columns once the nonzero columns stabilize. Applied to component separation in chemometrics on real data, the method is robust to an overestimated rank and recovers the component profiles accurately.
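The rank-reduction idea mentioned above can be sketched in a few lines. This is a minimal illustration, not the paper's algorithm: the helper names (`nonzero_columns`, `reduce_rank`, `should_reduce`), the `patience` rule for deciding that the nonzero columns have stabilized, and the tolerance `tol` are all assumptions introduced here. Matrices are stored as plain Python lists of rows to keep the sketch dependency-free.

```python
import math

def column_norms(A):
    """Euclidean norm of each column of a matrix stored as a list of rows."""
    n_cols = len(A[0])
    return [math.sqrt(sum(row[r] ** 2 for row in A)) for r in range(n_cols)]

def nonzero_columns(A, tol=0.0):
    """Indices of columns whose norm exceeds tol (tol is an assumed threshold)."""
    return [r for r, nrm in enumerate(column_norms(A)) if nrm > tol]

def reduce_rank(factors, tol=0.0):
    """Keep, in every factor matrix, only the columns that are nonzero in the last one.

    A zero column in the last factor makes the whole rank-one term vanish,
    so dropping it everywhere leaves the reconstructed tensor unchanged.
    """
    keep = nonzero_columns(factors[-1], tol)
    return [[[row[r] for r in keep] for row in A] for A in factors]

def should_reduce(history, patience=3):
    """True once the nonzero-column count was unchanged for `patience` iterations."""
    return len(history) >= patience and len(set(history[-patience:])) == 1

# Toy factors with an overestimated rank R = 3; column 1 of A3 is zero.
A1 = [[1.0, 0.5, 0.2], [0.0, 0.5, 0.1]]
A2 = [[0.6, 0.3, 0.9], [0.8, 0.4, 0.7]]
A3 = [[2.0, 0.0, 0.0], [1.0, 0.0, 3.0]]
factors = [A1, A2, A3]

counts = [3, 2, 2, 2]  # hypothetical nonzero-column counts per outer iteration
if should_reduce(counts):
    factors = reduce_rank(factors)
print(len(factors[0][0]))  # 2 columns remain
```

Shrinking all factor matrices together means later iterations work with fewer columns, which is where the computational saving would come from.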

Abstract

The CANDECOMP/PARAFAC (or Canonical polyadic, CP) decomposition of tensors has numerous applications in various fields, such as chemometrics, signal processing, machine learning, etc. Tensor CP decomposition assumes the knowledge of the exact CP rank, i.e., the total number of rank-one components of a tensor. However, accurately estimating the CP rank is very challenging. In this work, to address this issue, we prove that the CP rank can be exactly estimated by minimizing the group sparsity of any one of the factor matrices under the unit length constraints on the columns of the other factor matrices. Based on this result, we propose a CP decomposition model with group sparse regularization, which integrates the rank estimation and the tensor decomposition as an optimization problem, whose set of optimal solutions is proved to be nonempty. To solve the proposed model, we propose a double-loop block-coordinate proximal gradient descent algorithm with extrapolation and prove that each accumulation point of the sequence generated by the algorithm is a stationary point of the proposed model. Furthermore, we incorporate a rank reduction strategy into the algorithm to reduce the computational complexity. Finally, we apply the proposed model and algorithms to the component separation problem in chemometrics using real data. Numerical experiments demonstrate the robustness and effectiveness of the proposed methods.
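To make the "proximal gradient with extrapolation" ingredients concrete, the sketch below shows the two building blocks on a single factor matrix. It assumes the regularizer is $\lambda\|\mathbf{A}\|_{2,0}$ (the number of nonzero columns); under that assumption the proximal map is the standard column-wise hard thresholding. The exact updates, step sizes, and extrapolation weights used in the paper are not given in this summary, so the function names and the parameters `lam`, `step`, and `beta` are hypothetical.

```python
import math

def prox_group_l20(V, lam, step):
    """Prox of step * lam * ||.||_{2,0} at V (list of rows).

    Solves min_A lam * ||A||_{2,0} + ||A - V||_F^2 / (2 * step) column by column:
    a column is kept if its norm exceeds sqrt(2 * lam * step), else set to zero.
    """
    n_cols = len(V[0])
    thresh = math.sqrt(2.0 * lam * step)
    norms = [math.sqrt(sum(row[r] ** 2 for row in V)) for r in range(n_cols)]
    return [[row[r] if norms[r] > thresh else 0.0 for r in range(n_cols)]
            for row in V]

def extrapolate(A_curr, A_prev, beta):
    """Extrapolated point A_curr + beta * (A_curr - A_prev)."""
    return [[c + beta * (c - p) for c, p in zip(row_c, row_p)]
            for row_c, row_p in zip(A_curr, A_prev)]

# Column 0 has norm 5 and survives; column 1 has norm about 0.14 and is zeroed.
V = [[3.0, 0.1], [4.0, 0.1]]
P = prox_group_l20(V, lam=0.5, step=1.0)

# Extrapolation moves further along the most recent update direction.
E = extrapolate([[1.0, 2.0]], [[0.0, 2.0]], beta=0.5)
```

In a full block-coordinate scheme, a gradient step on the data-fitting term would be taken at the extrapolated point before applying the proximal map; that part is omitted here because it depends on details not present in this summary.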

Paper Structure

This paper contains 13 sections, 5 theorems, 37 equations, 6 figures, 2 tables, 3 algorithms.

Key Result

Theorem 3.1

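For an integer $R \ge \operatorname{rank}_{\mathrm{cp}}(\mathcal{X})$, $\operatorname{rank}_{\mathrm{cp}}(\mathcal{X})$ equals the minimum of the group sparsity $\|\mathbf{A}^{(N)}\|_{2,0}$ over all factorizations $\mathcal{X} = [\![{\mathbf{A}}^{(1)},\cdots,{\mathbf{A}}^{(N)}]\!]$ with $R$ columns in which the columns of the other factor matrices $\mathbf{A}^{(1)},\cdots,\mathbf{A}^{(N-1)}$ have unit length, where $[\![{\mathbf{A}}^{(1)},\cdots,{\mathbf{A}}^{(N)}]\!]=\sum\limits_{r=1}^R \mathbf{a}^{(1)}_r \circ \cdots \circ \mathbf{a}^{(N)}_r$.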
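The quantities in Theorem 3.1 can be illustrated for a third-order tensor ($N=3$). The sketch below rescales the columns of the first two factor matrices to unit length, absorbs the scales into the third, and counts the nonzero columns of the third factor, which is $\|\mathbf{A}^{(3)}\|_{2,0}$. The function names and the toy data are assumptions made for illustration. The count obtained from any one factorization is only an upper bound on the CP rank; the theorem concerns the minimum over all admissible factorizations.

```python
import math

def cp_reconstruct(A, B, C):
    """Tensor X[i][j][k] = sum_r A[i][r] * B[j][r] * C[k][r]."""
    R = len(A[0])
    return [[[sum(A[i][r] * B[j][r] * C[k][r] for r in range(R))
              for k in range(len(C))]
             for j in range(len(B))]
            for i in range(len(A))]

def col_norm(M, r):
    """Euclidean norm of column r of a matrix stored as a list of rows."""
    return math.sqrt(sum(row[r] ** 2 for row in M))

def normalize_into_last(A, B, C):
    """Give A and B unit-length columns, moving the scales into C."""
    A, B, C = [row[:] for row in A], [row[:] for row in B], [row[:] for row in C]
    for r in range(len(A[0])):
        na, nb = col_norm(A, r), col_norm(B, r)
        if na == 0.0 or nb == 0.0:
            # The rank-one term is zero: use a unit vector in A and B
            # and a zero column in C so the constraint still holds.
            for M in (A, B):
                for i, row in enumerate(M):
                    row[r] = 1.0 if i == 0 else 0.0
            for row in C:
                row[r] = 0.0
        else:
            for row in A:
                row[r] /= na
            for row in B:
                row[r] /= nb
            for row in C:
                row[r] *= na * nb
    return A, B, C

def group_l20(M):
    """Number of nonzero columns of M, i.e. ||M||_{2,0}."""
    return sum(1 for r in range(len(M[0])) if col_norm(M, r) > 0.0)

# Overestimated rank R = 3; the third column of C is zero.
A = [[3.0, 1.0, 1.0], [4.0, 0.0, 1.0]]
B = [[1.0, 0.0, 2.0], [0.0, 2.0, 0.0]]
C = [[1.0, 1.0, 0.0], [2.0, 0.0, 0.0]]

An, Bn, Cn = normalize_into_last(A, B, C)
X, Xn = cp_reconstruct(A, B, C), cp_reconstruct(An, Bn, Cn)
max_diff = max(abs(X[i][j][k] - Xn[i][j][k])
               for i in range(2) for j in range(2) for k in range(2))
print(group_l20(Cn), max_diff < 1e-12)
```

The rescaling leaves the tensor unchanged, which is why the unit-length constraint removes the scaling ambiguity without restricting which tensors can be represented.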

Figures (6)

  • Figure 1: Computation time and number of outer iterations of the algorithm labeled APGBCDRR for the two-component data with different numbers of inner iterations ($m$).
  • Figure 2: Analytical results with $R=5$ for the two-component data. Real concentration profiles (a) and relative concentration profiles resolved by (b) AIBCD, (c) CP_ALS, (d) ATLD, (e) eDLBCPGD, and (f) eDLBCPGD_RR, respectively.
  • Figure 3: The number of nonzero columns of the factor matrix $\mathbf{A}^{(3)}$ versus the iteration number.
  • Figure 4: Analytical results with $R=7$ for the macrocephalae rhizoma data. Real concentration profiles (a) and relative concentration profiles resolved by (b) AIBCD, (c) CP_ALS, (d) ATLD, (e) eDLBCPGD, and (f) eDLBCPGD_RR, respectively.
  • Figure 5: Analytical results with $R=7$ for the macrocephalae rhizoma data. Real chromatographic profiles (a) and normalized chromatographic profiles resolved by (b) AIBCD, (c) CP_ALS, (d) ATLD, (e) eDLBCPGD, and (f) eDLBCPGD_RR, respectively.
  • ...and 1 more figure

Theorems & Definitions (15)

  • Theorem 3.1
  • proof
  • Proposition 1
  • proof
  • Remark 4.1
  • Remark 4.2
  • Lemma 4.1
  • proof
  • Theorem 4.1
  • proof
  • ...and 5 more