
A Muon-Accelerated Algorithm for Low Separation Rank Tensor Generalized Linear Models

Xiao Liang, Shuang Li

Abstract

Tensor-valued data arise naturally in multidimensional signal and imaging problems, such as biomedical imaging. When incorporated into generalized linear models (GLMs), naive vectorization can destroy their multi-way structure and lead to high-dimensional, ill-posed estimation. To address this challenge, Low Separation Rank (LSR) decompositions reduce model complexity by imposing low-rank multilinear structure on the coefficient tensor. A representative approach for estimating LSR-based tensor GLMs (LSR-TGLMs) is the Low Separation Rank Tensor Regression (LSRTR) algorithm, which adopts block coordinate descent and enforces orthogonality of the factor matrices through repeated QR-based projections. However, the repeated projection steps can be computationally demanding and can slow convergence. Motivated by the need for scalable estimation and classification from such data, we propose LSRTR-M, which incorporates Muon (MomentUm Orthogonalized by Newton-Schulz) updates into the LSRTR framework. Specifically, LSRTR-M preserves the original block coordinate scheme while replacing the projection-based factor updates with Muon steps. Across synthetic linear, logistic, and Poisson LSR-TGLMs, LSRTR-M converges faster in both iteration count and wall-clock time, while achieving lower normalized estimation and prediction errors. On the Vessel MNIST 3D task, it further improves computational efficiency while maintaining competitive classification performance.
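The core idea behind Muon, as referenced in the abstract, is to replace an exact orthogonalization (e.g., via QR or SVD) with a few matrix-multiply-only Newton-Schulz iterations applied to the momentum-accumulated gradient. A minimal sketch of that orthogonalization step is given below; the quintic coefficients follow the widely used Muon reference implementation, and the function name and tolerances are illustrative assumptions, not the authors' code.

```python
import numpy as np

def newton_schulz_orthogonalize(G, steps=5):
    """Approximately orthogonalize G with the quintic Newton-Schulz iteration
    used in Muon-style updates. For G = U S V^T, the result approaches U V^T.
    Coefficients below are the commonly cited Muon values (an assumption here,
    not taken from the paper)."""
    a, b, c = 3.4445, -4.7750, 2.0315
    # Normalize so all singular values lie in (0, 1], a prerequisite
    # for the iteration to converge toward orthogonality.
    X = G / (np.linalg.norm(G) + 1e-7)
    transposed = X.shape[0] > X.shape[1]
    if transposed:
        X = X.T  # work with the short-and-wide orientation
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if transposed else X
```

Because each step is a handful of matrix products, it maps well onto hardware-accelerated linear algebra, which is the efficiency argument for swapping it in for repeated QR-based projections.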


Paper Structure

This paper contains 13 sections, 15 equations, 9 figures, 2 tables, and 1 algorithm.

Figures (9)

  • Figure 1: A third-order tensor under the LSR decomposition.
  • Figure 2: Performance comparison in linear regression. Top row: results versus iterations. Bottom row: results versus running time. Columns correspond to training loss, normalized estimation error, and normalized prediction error, respectively.
  • Figure 3: Performance comparison across training sample sizes in linear regression. (a) Normalized estimation error and (b) normalized prediction error.
  • Figure 4: Performance comparison in logistic regression. Top row: results versus iterations. Bottom row: results versus running time. Columns correspond to training loss, normalized estimation error, and normalized prediction error, respectively.
  • Figure 5: Performance comparison across training sample sizes in logistic regression. (a) Normalized estimation error and (b) normalized prediction error.
  • ...and 4 more figures

Theorems & Definitions (3)

  • Definition 2.1: Vector-structured GLMs [taki2023structured]
  • Definition 2.2: Matrix Separation Rank [tsiligkaridis2013covariance]
  • Definition 2.3: LSR Tensor Decomposition