Efficient Nonparametric Tensor Decomposition for Binary and Count Data

Zerui Tao, Toshihisa Tanaka, Qibin Zhao

TL;DR

ENTED is a scalable, nonparametric tensor decomposition for binary and count data that replaces fixed multilinear structures, such as CP and Tucker, with Gaussian process mappings. Pólya-Gamma augmentation treats binary and count likelihoods in a single framework and makes the model conjugate, which enables natural-gradient variational inference, while sparse orthogonal variational inference (SOLVE) of inducing points yields an efficient covariance approximation. On several real-world binary and count tensor completion tasks, ENTED improves predictive performance and distribution estimation over baselines, with favorable computational cost. The work extends tensor decompositions to discrete data.

Abstract

In numerous applications, binary reactions or event counts are observed and stored within high-order tensors. Tensor decompositions (TDs) serve as a powerful tool to handle such high-dimensional and sparse data. However, many traditional TDs are explicitly or implicitly designed based on the Gaussian distribution, which is unsuitable for discrete data. Moreover, most TDs rely on predefined multi-linear structures, such as CP and Tucker formats, and may therefore not be effective enough to handle complex real-world datasets. To address these issues, we propose ENTED, an Efficient Nonparametric TEnsor Decomposition for binary and count tensors. Specifically, we first employ a nonparametric Gaussian process (GP) to replace traditional multi-linear structures. Next, we utilize the Pólya-Gamma augmentation, which provides a unified framework to establish conjugate models for binary and count distributions. Finally, to address the computational issue of GPs, we enhance the model by incorporating sparse orthogonal variational inference of inducing points, which offers a more effective covariance approximation within GPs and stochastic natural gradient updates for nonparametric models. We evaluate our model on several real-world tensor completion tasks, considering binary and count datasets. The results demonstrate both better performance and computational advantages of the proposed model.
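
To make the contrast between a predefined multi-linear structure and a GP mapping concrete, here is a minimal sketch. It assumes, as in earlier GP-based tensor factorizations, that the GP input for an entry is the concatenation of the latent factor rows selected by its index; the abstract does not specify this construction. The tensor sizes, the RBF kernel, and the helper names (`cp_entry`, `gp_inputs`, `rbf_kernel`) are illustrative choices, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 3-way tensor of shape (4, 5, 6) with rank-2 latent factors.
shape, rank = (4, 5, 6), 2
factors = [rng.normal(size=(dim, rank)) for dim in shape]

def cp_entry(factors, index):
    """CP model: sum over components of the elementwise product of factor rows."""
    rows = [U[i] for U, i in zip(factors, index)]
    return float(np.sum(np.prod(rows, axis=0)))

def gp_inputs(factors, indices):
    """Assumed construction: concatenate the factor rows of each index tuple."""
    return np.stack(
        [np.concatenate([U[i] for U, i in zip(factors, idx)]) for idx in indices]
    )

def rbf_kernel(X, Y, lengthscale=1.0):
    """Squared-exponential kernel between the rows of X and the rows of Y."""
    sq_dist = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)

indices = [(0, 1, 2), (3, 4, 5), (1, 0, 3)]

# Multi-linear baseline: each entry is a fixed polynomial of the factors.
cp_values = [cp_entry(factors, idx) for idx in indices]

# Nonparametric alternative: each entry is f(x) with f drawn from a GP prior.
X = gp_inputs(factors, indices)                      # shape (3, 6)
K = rbf_kernel(X, X) + 1e-6 * np.eye(len(indices))   # small jitter for stability
f = rng.multivariate_normal(np.zeros(len(indices)), K)

# For binary tensors, a logistic link turns f into Bernoulli probabilities.
probs = 1.0 / (1.0 + np.exp(-f))
```

The exact GP over all observed entries costs cubic time in the number of entries, which is the computational issue that the inducing-point approximation in the abstract is meant to address.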

Paper Structure

This paper contains 38 sections, 60 equations, 2 figures, 3 tables, 1 algorithm.

Figures (2)

  • Figure 1: Convergence results. The x-axes are epochs and y-axes are AUC values.
  • Figure 2: Results on different numbers of inducing points. The x-axes denote the number of inducing points. In subfigure (a), the y-axis is the RMSE and in subfigure (b), the y-axis is the computing time (in seconds) for each epoch.

Theorems & Definitions (1)

  • Definition 1: Pólya-Gamma distribution
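
For reference, the standard form of the Pólya-Gamma distribution from the data-augmentation literature is sketched below; the paper's Definition 1 may use different notation. The symbols $b$, $c$, $\psi$, $a$, $\kappa$, $y$, and $r$ are introduced here for illustration and are not taken from this outline.

```latex
% A random variable \omega follows PG(b, c), with b > 0 and c real, if
\omega \overset{d}{=} \frac{1}{2\pi^{2}} \sum_{k=1}^{\infty}
  \frac{g_k}{(k - 1/2)^{2} + c^{2}/(4\pi^{2})},
\qquad g_k \overset{\text{iid}}{\sim} \mathrm{Gamma}(b, 1).

% Key identity: with \kappa = a - b/2 and p(\omega) the density of PG(b, 0),
\frac{(e^{\psi})^{a}}{(1 + e^{\psi})^{b}}
  = 2^{-b} e^{\kappa \psi} \int_{0}^{\infty} e^{-\omega \psi^{2}/2}\, p(\omega)\, d\omega .

% Bernoulli with logit \psi:  a = y,\ b = 1.
% Negative binomial with dispersion r and logit \psi:  a = y,\ b = y + r.
```

Conditioned on $\omega$, the right-hand side is Gaussian in $\psi$, which is what makes a Gaussian prior on $\psi$ conjugate for both binary and count likelihoods of this form. Whether the paper uses the negative binomial for counts is an assumption here; it is the usual count likelihood that fits this identity.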