Incremental Hierarchical Tucker Decomposition

Doruk Aksoy, Alex A. Gorodetsky

TL;DR

The paper tackles online, batch-aware tensor decomposition for streaming data by introducing Batch Hierarchical Tucker - leaf to root (BHT-l2r) and HT-RISE, the first incremental algorithm for the hierarchical Tucker (HT) format. BHT-l2r compresses entire batches by absorbing the batch mode into the HT root, with provable error bounds; HT-RISE incrementally updates the HT representation as new batches arrive through projected residual expansions, preserving past reconstructions. Across scientific (PDE and gel simulations) and image (Minecraft frames and multispectral imagery) datasets, HT-RISE delivers competitive or superior relative test error and faster updates than a state-of-the-art incremental tensor train algorithm (up to $3.1\times$ compression and $3.2\times$ reduction in time), while BHT-l2r achieves up to $6.2\times$ compression and $3.7\times$ reduction in time compared to the standard HT format. The work demonstrates that a batch-aware HT framework can yield expressive latent representations with practical online applicability and scalability for high-dimensional data.

Abstract

We present two new algorithms for approximating and updating the hierarchical Tucker decomposition of tensor streams. The first algorithm, Batch Hierarchical Tucker - leaf to root (BHT-l2r), proposes an alternative and more efficient way of approximating a batch of similar tensors in hierarchical Tucker format. The second algorithm, Hierarchical Tucker - Rapid Incremental Subspace Expansion (HT-RISE), updates the batch hierarchical Tucker representation of an accumulated tensor as new batches of tensors become available. The HT-RISE algorithm is suitable for the online setting and never requires full storage or reconstruction of all data while providing a solution to the incremental Tucker decomposition problem. We provide theoretical guarantees for both algorithms and demonstrate their effectiveness on physical and cyber-physical data. The proposed BHT-l2r algorithm and the batch hierarchical Tucker format offer up to $6.2\times$ compression and $3.7\times$ reduction in time over the hierarchical Tucker format. The proposed HT-RISE algorithm also offers up to $3.1\times$ compression and $3.2\times$ reduction in time over a state-of-the-art incremental tensor train decomposition algorithm.
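The idea of expanding a subspace with the projected residual of a new batch can be illustrated on a single orthonormal basis. The following is a minimal NumPy sketch under stated assumptions, not the authors' HT-RISE implementation: the function name `expand_basis`, the matrix-valued data, and the tolerance `eps > 0` are all hypothetical, and the real algorithm applies this kind of update node by node in the dimension tree.

```python
import numpy as np

def expand_basis(U, Y_new, eps):
    """Expand an orthonormal basis U (n x r) so that the new batch Y_new (n x m)
    is captured up to a Frobenius-norm error of eps (assumed > 0).

    Hypothetical helper for illustration only."""
    # Part of the new batch that the current subspace cannot represent.
    R = Y_new - U @ (U.T @ Y_new)
    # Left singular vectors of the residual are orthogonal to range(U).
    W, s, _ = np.linalg.svd(R, full_matrices=False)
    # tail[k] = norm of the singular values that are discarded if k vectors are kept.
    tail = np.sqrt(np.cumsum(s[::-1] ** 2))[::-1]
    # tail is nonincreasing, so this is the smallest k with tail[k] <= eps.
    k = int(np.count_nonzero(tail > eps))
    return np.hstack([U, W[:, :k]]), k

def pad_old_coefficients(C_old, k):
    """Zero-pad coefficients of past data so their reconstruction is unchanged."""
    return np.vstack([C_old, np.zeros((k, C_old.shape[1]))])
```

Because the old basis vectors are kept and the coefficients of past data are only padded with zeros, the product of the expanded basis and the padded coefficients equals the previous reconstruction, which mirrors the "preserving past reconstructions" property described above.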

Paper Structure

This paper contains 38 sections, 6 theorems, 45 equations, 32 figures, 8 tables, 2 algorithms.

Key Result

Theorem 1

For a $d$-dimensional tensor $\mathcal{Y}\in\mathbb{R}^{n_{1}\times\cdots\times n_{d}}$, the best HT approximation $\tilde{\mathcal{Y}}$ with an absolute approximation error $\|\mathcal{Y}-\tilde{\mathcal{Y}}\|_{F} \leq \varepsilon_{abs}$ can be obtained by prescribing a node-wise truncation error of $\varepsilon_{abs}/\sqrt{2d-3}$ for all truncated SVD computations.
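To make the node-wise budget concrete, the sketch below splits an absolute tolerance across the nodes and picks a truncation rank from a list of singular values. It assumes the standard choice $\varepsilon_{abs}/\sqrt{2d-3}$ for a binary dimension tree and singular values sorted in descending order; the function names are hypothetical and not taken from the paper.

```python
import math

def nodewise_tolerance(eps_abs, d):
    """Per-node truncation tolerance, assuming the eps_abs / sqrt(2d - 3) split."""
    if d < 2:
        raise ValueError("the sqrt(2d - 3) split needs at least 2 dimensions")
    return eps_abs / math.sqrt(2 * d - 3)

def truncation_rank(singular_values, eps_nw):
    """Smallest rank whose discarded singular values have 2-norm <= eps_nw.

    Assumes singular_values is sorted in descending order."""
    budget = eps_nw ** 2
    discarded = 0.0
    rank = len(singular_values)
    # Drop singular values from the smallest upward while the budget allows.
    for s in reversed(singular_values):
        if discarded + s * s > budget:
            break
        discarded += s * s
        rank -= 1
    return rank

# Example: a 6-dimensional tensor has 2 * 6 - 3 = 9 truncations sharing the budget.
eps_nw = nodewise_tolerance(0.3, 6)
rank = truncation_rank([5.0, 1.0, 0.05, 0.05], eps_nw)
```

The square-root factor reflects that the squared node-wise errors add up, so each of the $2d-3$ truncations may spend an equal share of $\varepsilon_{abs}^2$.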

Figures (32)

  • Figure 1: Flow of the proposed HT-RISE algorithm. The algorithm updates the hierarchical Tucker representation of an accumulated tensor in batch hierarchical Tucker form as new batches of tensors become available. Green represents the input data and red represents the output data.
  • Figure 2: Three possible configurations of the dimensions for a five-dimensional tensor with shape $n_{1}\times n_{2}\times n_{3}\times n_{4}\times n_{5}$. Each dimension tree describes a different interaction between dimensions and therefore yields different compression performance. For a more detailed study of the effect of axis reordering, please refer to the paper's appendix on axis ordering. Root, transfer, and leaf nodes are colored in red, blue, and green, respectively.
  • Figure 3: Tensor network diagrams of an 8D tensor and its representations in Tucker format, hierarchical Tucker format, and batch hierarchical Tucker format. The streaming/batch dimension is labeled $N$.
  • Figure 4: Step-by-step decomposition of an $N$-batch of 5-D tensors with the BHT-l2r algorithm. The decomposition starts with the leaves of the last layer. Since the dimension tree $\mathbf{T}$ is constructed beforehand, the algorithm knows from $\mathbf{T}$ which dimensions' leaves lie on which layer. Note that the batch dimension ($N$) remains intact throughout the entire decomposition process.
  • Figure 5: Example snapshots from scientific datasets.
  • ...and 27 more figures
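The caption of Figure 4 notes that the batch dimension stays intact while the leaves are processed. A minimal NumPy sketch of one such leaf step is given below, under the assumptions that the batch axis is stored last and that a plain truncated SVD of the mode unfolding is used; `leaf_step` is a hypothetical name and this is not the authors' BHT-l2r code.

```python
import numpy as np

def leaf_step(batch, mode, eps):
    """Compress one non-batch mode of `batch` (shape n_1 x ... x n_d x N).

    Returns the leaf basis (n_mode x r) and the projected tensor, whose
    axis `mode` has size r while all other axes, batch included, are unchanged."""
    # Mode unfolding: rows index the chosen mode, columns everything else,
    # so all N samples of the batch contribute to the same leaf basis.
    moved = np.moveaxis(batch, mode, 0)
    unfolded = moved.reshape(batch.shape[mode], -1)
    U, s, _ = np.linalg.svd(unfolded, full_matrices=False)
    # tail[k] = norm of the singular values discarded when k vectors are kept.
    tail = np.sqrt(np.cumsum(s[::-1] ** 2))[::-1]
    r = max(1, int(np.count_nonzero(tail > eps)))
    leaf = U[:, :r]
    # Contract the chosen mode with the leaf basis and restore the axis order.
    projected = np.moveaxis(np.tensordot(leaf.T, moved, axes=1), 0, mode)
    return leaf, projected
```

Repeating this step for every leaf on a layer, and then treating pairs of compressed modes as a single mode on the next layer, gives the leaf-to-root flavor of the procedure; how the paper merges nodes and distributes the error budget is not reproduced here.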

Theorems & Definitions (7)

  • Theorem 1: Adapted from [kressner2014algorithm], Lemma B.2
  • Lemma 2: Layerwise approximation error
  • Theorem 3: Total approximation error [grasedyck2010hierarchical]
  • Corollary 4: BHT-l2r approximation error
  • Claim 1: Orthogonal reconstructions
  • Theorem 5: HT-RISE approximation error
  • Theorem 6: Error guarantees for the past stream