Streaming data recovery via Bayesian tensor train decomposition

Yunyu Huang, Yani Feng, Qifeng Liao

TL;DR

We address streaming data recovery for high-order tensors by learning a Bayesian tensor train (TT) decomposition, in which each tensor element is approximated by a product of TT-core slices, $x_{\boldsymbol{j}} \approx \prod_{d=1}^D \mathscr{G}^{(d)}_{j_d}$, whose sizes are set by the TT-ranks. The proposed method, SPTT, uses streaming variational Bayes to update the posterior over the TT-cores and the noise precision online from incoming batches $B_t$, without revisiting past data. A Gaussian prior on the TT-cores, a Gamma prior on the noise precision, and closed-form variational updates yield per-batch updates with time complexity $\mathcal{O}(S D L^4)$ and space complexity $\mathcal{O}(N D L^2)$, and the resulting posteriors provide uncertainty quantification in the streaming setting. On synthetic and real-world datasets, SPTT outperforms existing Bayesian streaming methods and static TT/Tucker/CP baselines in reconstruction accuracy and predictive performance.
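
To make the TT-format formula above concrete, here is a minimal sketch of evaluating a single element as a chain of matrix products over core slices. The helper names (`matmul`, `tt_element`) and the toy cores are hypothetical and chosen only for illustration; the sketch assumes the standard TT convention that the first and last ranks equal 1, so the product collapses to a scalar.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python product of a (p x q) matrix and a (q x r) matrix."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def tt_element(cores: List[List[Matrix]], index: List[int]) -> float:
    """Evaluate x_j = G^(1)_{j_1} G^(2)_{j_2} ... G^(D)_{j_D}.

    cores[d][j] is the matrix slice of core d at mode index j.
    Boundary ranks are assumed to be 1, so the result is a 1 x 1 matrix.
    """
    result = cores[0][index[0]]
    for d in range(1, len(cores)):
        result = matmul(result, cores[d][index[d]])
    return result[0][0]


# Toy order-3 tensor of shape 2 x 2 x 2 with TT-ranks (1, 2, 2, 1).
cores = [
    [[[1.0, 0.0]], [[0.0, 1.0]]],                            # core 1: 1 x 2 slices
    [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [1.0, 0.0]]],    # core 2: 2 x 2 slices
    [[[1.0], [1.0]], [[2.0], [-1.0]]],                       # core 3: 2 x 1 slices
]

value = tt_element(cores, [1, 0, 1])  # [0, 1] @ [[1, 2], [3, 4]] @ [2, -1]^T = 2.0
```

Only the slices selected by the index are touched, so evaluating one element costs a few small matrix products and does not require forming the full tensor.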

Abstract

In this paper, we study a Bayesian tensor train (TT) decomposition method that recovers streaming data by approximating the latent structure of high-order streaming data. Drawing on the streaming variational Bayes method, we introduce the TT format into Bayesian tensor decomposition methods for streaming data and formulate the posteriors of the TT cores. Thanks to the Bayesian framework of the TT format, the proposed algorithm (SPTT) excels at recovering streaming data that are high-order, incomplete, and noisy. Experiments on synthetic and real-world datasets demonstrate the accuracy of our method compared with state-of-the-art Bayesian tensor decomposition methods for streaming data.
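
The streaming idea mentioned above, where the posterior from one batch serves as the prior for the next, can be illustrated with a deliberately simplified toy. The sketch below is not the SPTT update: it assumes the residuals are observed directly and zero-mean Gaussian, so that a Gamma belief over the noise precision has an exact conjugate update. The function name `update_gamma_posterior` and the data are hypothetical.

```python
from typing import Iterable, Tuple


def update_gamma_posterior(
    a: float, b: float, residuals: Iterable[float]
) -> Tuple[float, float]:
    """Conjugate update of a Gamma(shape=a, rate=b) belief over the precision
    of zero-mean Gaussian residuals (toy assumption, not the SPTT update)."""
    residuals = list(residuals)
    a_new = a + 0.5 * len(residuals)
    b_new = b + 0.5 * sum(r * r for r in residuals)
    return a_new, b_new


# Batches arrive one at a time; each is used once and then discarded.
batches = [[0.5, -0.5], [1.0], [0.0, 1.0, -1.0]]

a, b = 1.0, 1.0  # prior shape and rate (illustrative values)
for batch in batches:
    # The current posterior acts as the prior for the incoming batch.
    a, b = update_gamma_posterior(a, b, batch)

posterior_mean_precision = a / b
```

In this conjugate toy, processing the batches sequentially gives the same posterior as processing all residuals at once, which is the property that lets a streaming method avoid revisiting past data. In a variational setting the per-batch posterior is generally an approximation, so that equivalence need not hold exactly.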

Paper Structure (20 sections, 2 theorems, 41 equations, 5 figures, 1 algorithm)

Key Result

Proposition 3.1

The expectation of the Kronecker product $\pmb{\mathscr{G}}^{(i)}_{j_{i}}\otimes \pmb{\mathscr{G}}^{(i)}_{j_{i}}$ in (eq:bd) can be calculated by

Figures (5)

  • Figure 1: TT decomposition for an element $x_{\mathbf{j}}$ (TT-format).
  • Figure 2: Predictive performance of synthetic data under different conditions.
  • Figure 3: The running predictive performance of synthetic data.
  • Figure 4: Predictive performance with different ranks (top row) and streaming batch sizes (bottom row).
  • Figure 5: The running predictive performance in real-world applications.

Theorems & Definitions (4)

  • Proposition 3.1
  • proof
  • Proposition 3.2
  • proof