Efficient Sketching-Based Summation of Tucker Tensors

Rudi Smith, Mirjeta Pasha, Andrés Galindo-Olarte, Hussam Al Daas, Grey Ballard, Joseph Nakao, Jing-Mei Qiu, William Taitano

Abstract

We present efficient, sketching-based methods for the summation of tensors in Tucker format. Leveraging the algebraic structure of Khatri-Rao and Kronecker products, our approach enables compressed arithmetic on Tucker tensors while controlling rank growth and computational cost. The proposed sketching framework avoids the explicit formation of large intermediate tensors, instead operating directly on the factor matrices and core tensors to produce accurate low-rank approximations of tensor sums. Furthermore, we analyze the computational complexity and the theoretical approximation properties of the proposed methodology. Numerical experiments demonstrate the effectiveness of our approach on four problems: two synthetic test cases, a parameter-dependent elliptic equation (commonly referred to as the cookie problem) solved via GMRES, and a one-dimensional linear transport problem discretized via high-order discontinuous Galerkin methods, where repeated tensor summation arises as a core computational bottleneck. Across these examples, the sketching-based summation achieves substantial computational savings while preserving high accuracy relative to direct summation and re-compression.

Paper Structure (18 sections, 1 theorem, 42 equations, 6 figures, 6 algorithms)

Key Result

Proposition 1

Let $\boldsymbol{\mathcal{X}} \in \mathbb{R}^{n_1 \times \dots \times n_N}$ be represented by the exact Tucker decomposition $\boldsymbol{\mathcal{X}} = \llbracket \boldsymbol{\mathcal{G}}; \boldsymbol{U}_{1}, \dots, \boldsymbol{U}_{N} \rrbracket$, then the following identities hold

Figures (6)

  • Figure 1: Performance comparison for the synthetic low-rank summation example: The runtime of the TuckerAxby+TuckerRounding method grows rapidly with the number of summands, whereas the randomized sketching approaches scale highly efficiently. At $d=100$, the randomized KRP method is nearly two orders of magnitude faster than the deterministic baseline. Crucially, this computational speedup does not compromise accuracy; both the Kronecker and KRP sketching methods maintain errors relative to the deterministic baseline near machine precision ($\sim 4 \cdot 10^{-15}$) across all tested summation sizes.
  • Figure 2: Performance comparison of sequential truncation versus holistic summation: Eager deterministic rounding suffers from severe intermediate rank swelling and accuracy loss (relative error of $6.72 \cdot 10^{0}$) due to delayed mathematical cancellation. By contrast, the randomized sketching approaches inherently bypass this intermediate growth with a low error relative to the true exact tensor ($\sim 2 \cdot 10^{-11}$) while running an order of magnitude faster than the eager baseline and maintaining a distinct computational advantage over the lazy baseline.
  • Figure 3: Schematic of the spatial domain $\Omega = [0,L]^2$ for the 4-parameter cookie problem. The domain features a background region where the diffusion coefficient is constant ($\sigma = 1$), and $P=4$ mutually disjoint circular subdomains arranged in a grid. Inside each subdomain $\mathcal{D}_\mu$, the diffusion coefficient is modeled as a parameter-dependent constant $\sigma(x, \boldsymbol{\xi}) = 1 + \xi_\mu$. This geometry gives rise to the spatial stiffness matrices $\mathbf{A}_0, \dots, \mathbf{A}_3$ utilized in the tensor-structured linear system.
  • Figure 4: Performance evaluation of the randomized Tucker-GMRES solver for the parameter-dependent cookie problem: KRPSum-Tucker demonstrates superior efficiency and robustness compared to both KronSum-Tucker and TuckerAxby+TuckerRounding. Across varying parameter sample sizes ($N$), KRPSum-Tucker achieves up to an $11\times$ computational speedup while preserving high accuracy, with relative errors near $10^{-5}$. In contrast, KronSum-Tucker struggles in this specific setting, yielding higher relative errors ($\sim 10^{-2}$) and minimal runtime improvements over the standard rounding baseline due to a cap of 30 iterations. \ref{fig:GMRES_Ranks} demonstrates that the highly skewed mode in the Kronecker case has inflated rank compared to KRPSum-Tucker and TuckerAxby+TuckerRounding.
  • Figure 5: Convergence of the Nodal Discontinuous Galerkin (NDG) scheme: The empirical order of accuracy closely aligns with the theoretical expectations for varying polynomial bases (NDG1, NDG2, and NDG3) as the spatial grid resolution ($N_x$) increases. This optimal convergence validates the accuracy of the baseline solver implementation prior to introducing low-rank tensor approximations.
  • ...and 1 more figure

Theorems & Definitions (3)

  • Definition 1: Kronecker and Khatri-Rao products
  • Proposition 1: Key identities for Tucker Tensors [BalG15]
  • Remark 1
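
As background for Definition 1, the following minimal NumPy sketch contrasts the Kronecker product with the Khatri-Rao product (KRP), taken here as the column-wise Kronecker product, and checks the mixed-product identity $(\boldsymbol{A} \otimes \boldsymbol{B})^\top (\boldsymbol{C} \odot \boldsymbol{D}) = (\boldsymbol{A}^\top \boldsymbol{C}) \odot (\boldsymbol{B}^\top \boldsymbol{D})$. The helper `khatri_rao`, the matrix sizes, and the row-ordering convention (that of `np.kron`) are assumptions made for this example; the paper's definition may order the factors differently.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product of A (m x k) and B (n x k), giving (m*n x k)."""
    m, k = A.shape
    n, k_b = B.shape
    if k != k_b:
        raise ValueError("A and B must have the same number of columns")
    # Entry (i*n + j, c) equals A[i, c] * B[j, c], matching np.kron row ordering.
    return np.einsum("ik,jk->ijk", A, B).reshape(m * n, k)

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((5, 4))

kron = np.kron(A, B)    # shape (15, 16): every pair of columns
krp = khatri_rao(A, B)  # shape (15, 4): matching column pairs only

# Column c of the KRP is the Kronecker product of column c of A and column c of B.
columns_match = all(
    np.allclose(krp[:, c], np.kron(A[:, c], B[:, c])) for c in range(4)
)

# Mixed-product identity: a KRP-structured matrix can be applied to a
# Kronecker-structured matrix through small products on the factors alone.
C = rng.standard_normal((3, 6))  # hypothetical random sketch factor, 6 columns
D = rng.standard_normal((5, 6))
lhs = kron.T @ khatri_rao(C, D)        # forms the large (15 x 16) matrix
rhs = khatri_rao(A.T @ C, B.T @ D)     # never forms it
identity_error = np.linalg.norm(lhs - rhs)
```

The right-hand side only multiplies the small factor matrices, which is the kind of structure that lets a sketch be applied without forming a large intermediate matrix.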