Efficient QR-Based CP Decomposition Acceleration via Restructured Dimension Tree and Customized Extrapolation

Wenchao Xie, Jiawei Xu, Zheng Peng, Qingsong Wang

TL;DR

This work targets efficient CP decomposition by addressing numerical stability and computational cost. It combines QR-based CP-ALS with a restructured dimension tree to boost intermediate-tensor reuse and a customized extrapolation for the special $\boldsymbol{Q}_0$ structure, yielding the ALS-QR-BRE algorithm. Theoretical analysis shows a 33% reduction in major TTM costs, and empirical results across synthetic and five real-world datasets demonstrate up to ~2x faster iterations and improved fitting accuracy. The results advance scalable CP decomposition for high-order and structured tensors, offering practical gains in both speed and precision.

Abstract

The canonical polyadic (CP) decomposition is one of the most widely used tensor decomposition techniques. The conventional CP decomposition algorithm combines alternating least squares (ALS) with the normal equation. However, the normal equation is susceptible to numerical ill-conditioning, which can adversely affect the decomposition results. To mitigate this issue, ALS combined with QR decomposition has been proposed as a more numerically stable alternative. Although this method enhances stability, its iterative process involves tensor-times-matrix (TTM) operations, which typically result in higher computational costs. To reduce this cost, we propose a restructured dimension tree, which increases the reuse of intermediate tensors and reduces the number of TTM operations. Compared with the standard dimension tree structure, the restructured dimension tree can reduce the computational complexity of TTM operations for tensors of any order by 33%. Additionally, we introduce a customized extrapolation strategy in the CP-ALS-QR algorithm, leveraging the unique structure of the matrix $\mathbf{Q}_0$ to further accelerate convergence. By integrating these two techniques, we propose a novel CP decomposition algorithm that significantly improves iteration efficiency, achieving up to twofold acceleration on datasets with certain specific structures. Numerical experiments on five real-world datasets show that, compared with the baseline algorithm, our proposed algorithm improves iteration efficiency while simultaneously enhancing fitting accuracy.
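
To make the contrast between the two least-squares solvers concrete, here is a minimal sketch of a single factor update for a dense third-order tensor, written with NumPy. It assumes the commonly described form of QR-based ALS, in which each fixed factor matrix is QR-factorized and the Khatri-Rao product of the triangular factors is factorized once more as $\mathbf{Q}_0 \mathbf{R}_0$; whether this matches the paper's exact formulation is an assumption. The function names (`khatri_rao`, `update_mode0_qr`, `update_mode0_normal`) are illustrative and not taken from the authors' code, and the sketch includes neither the restructured dimension tree nor the extrapolation step.

```python
import numpy as np


def khatri_rao(B, C):
    """Column-wise Kronecker product; row j*K + k holds B[j, :] * C[k, :]."""
    J, R = B.shape
    K, _ = C.shape
    return np.einsum("jr,kr->jkr", B, C).reshape(J * K, R)


def update_mode0_normal(X, B, C):
    """Mode-0 update via the normal equation (can be ill-conditioned)."""
    I = X.shape[0]
    Z = khatri_rao(B, C)                  # (J*K) x R design matrix
    G = (B.T @ B) * (C.T @ C)             # Gram matrix Z^T Z as a Hadamard product
    M = X.reshape(I, -1) @ Z              # unfolded tensor times Khatri-Rao product
    return np.linalg.solve(G, M.T).T      # solve A G = M (G is symmetric)


def update_mode0_qr(X, B, C):
    """Mode-0 update via QR factorizations (illustrative sketch).

    Assumes B and C each have at least as many rows as columns, so the
    reduced QR factors R_B and R_C are square (R x R).
    """
    I = X.shape[0]
    R = B.shape[1]
    Qb, Rb = np.linalg.qr(B)              # reduced QR of each fixed factor
    Qc, Rc = np.linalg.qr(C)
    # Khatri-Rao product of the triangular factors, factorized as Q_0 R_0.
    Q0, R0 = np.linalg.qr(khatri_rao(Rb, Rc))
    # Two TTM operations: multiply modes 1 and 2 of X by Qb^T and Qc^T.
    Y = np.einsum("ijk,jb,kc->ibc", X, Qb, Qc).reshape(I, R * R)
    W = Y @ Q0                            # apply Q_0
    return np.linalg.solve(R0, W.T).T     # solve A R_0^T = W


# Usage: recover the mode-0 factor of an exactly rank-3 tensor.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))
B = rng.standard_normal((5, 3))
C = rng.standard_normal((4, 3))
X = np.einsum("ir,jr,kr->ijk", A, B, C)
A_qr = update_mode0_qr(X, B, C)
A_ne = update_mode0_normal(X, B, C)
```

On a well-conditioned problem like this one the two updates agree. The difference appears when the factor columns are nearly collinear: the Gram matrix used by the normal equation has the squared condition number of the Khatri-Rao product, whereas the QR route works with $\mathbf{R}_0$ directly.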

Paper Structure

This paper contains 18 sections, 1 theorem, 26 equations, 12 figures, 8 tables, 3 algorithms.

Key Result

Corollary 1

Computational complexity of three scenarios for an $N$th-order tensor $(N>4)$.

Figures (12)

  • Figure 1: The canonical dimension tree architecture of third-order and fourth-order tensors.
  • Figure 2: Underutilized tensors are highlighted by the red dashed boxes. (a) shows the underutilized tensor $\mathcal{Y}^{(1,3)}$. (b) shows the underutilized tensors $\mathcal{Y}^{(3,4)}$ and $\mathcal{Y}^{(2,3,4)}$.
  • Figure 3: The iterative structure of the restructured dimension tree for a third-order tensor.
  • Figure 4: The iterative structure of the restructured dimension tree for a fourth-order tensor.
  • Figure 5: Average runtime (in seconds) of a single iteration of ALS, PINV, QR, QR-SVD, QR-DT, and QR-BRE as the rank increases, for a third-order tensor of size 700 (top left), a fourth-order tensor of size 150 (top right), a fifth-order tensor of size 50 (bottom left), and a sixth-order tensor of size 25 (bottom right).
  • ...and 7 more figures

Theorems & Definitions (3)

  • Remark 1
  • Remark 2
  • Corollary 1