
The AL$\ell_0$CORE Tensor Decomposition for Sparse Count Data

John Hood, Aaron Schein

TL;DR

AL$\ell_0$CORE introduces a Tucker-like tensor decomposition with a sparsity budget on the core, enabling a rich latent structure while avoiding Tucker's exponential parameter growth. By allocating at most $Q$ non-zero core entries and sampling their locations during inference, the method scales with $\|\boldsymbol{\Lambda}\|_0$ and is well-suited for sparse count data under a Poisson likelihood. In experiments on large dynamic multilayer networks (e.g., TERRIER and ICEWS), AL$\ell_0$CORE achieves predictive performance on par with or better than full Tucker at a fraction of the cost and reveals interpretable latent patterns. The work provides a complete Bayesian inference scheme with complete-conditionals and open-source code, demonstrating practical viability for exploring vast latent spaces without exponential blow-up.

Abstract

This paper introduces AL$\ell_0$CORE, a new form of probabilistic non-negative tensor decomposition. AL$\ell_0$CORE is a Tucker decomposition where the number of non-zero elements (i.e., the $\ell_0$-norm) of the core tensor is constrained to a preset value $Q$ much smaller than the size of the core. While the user dictates the total budget $Q$, the locations and values of the non-zero elements are latent variables and allocated across the core tensor during inference. AL$\ell_0$CORE (i.e., $allo$cated $\ell_0$-$co$nstrained $core$) thus enjoys both the computational tractability of CP decomposition and the qualitatively appealing latent structure of Tucker. In a suite of real-data experiments, we demonstrate that AL$\ell_0$CORE typically requires only tiny fractions (e.g., 1%) of the full core to achieve the same results as full Tucker decomposition at only a correspondingly tiny fraction of the cost.
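The computational claim in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the coordinate-list storage of the core, and the toy sizes are all assumptions made for illustration. It computes the Poisson rate at a single tensor index by summing over only the $Q$ non-zero core entries, then checks the result against a contraction with the equivalent dense core.

```python
import numpy as np

def sparse_core_rate(factors, coords, values, index):
    """Rate at one tensor index under a Tucker model with a sparse core.

    factors: list of M arrays; factors[m] has shape (D_m, K_m)
    coords:  (Q, M) integer array of non-zero core locations (assumed storage)
    values:  (Q,) array of the non-zero core values
    index:   length-M tuple (i_1, ..., i_M)
    """
    rate = values.copy()
    for m, i in enumerate(index):
        # Multiply each non-zero core value by the matching factor entry.
        rate = rate * factors[m][i, coords[:, m]]
    return rate.sum()  # work is O(Q * M), independent of the full core size

def dense_core_rate(factors, core, index):
    """Same quantity, contracting the full core one mode at a time."""
    out = core
    for m, i in enumerate(index):
        out = np.tensordot(factors[m][i], out, axes=(0, 0))
    return float(out)

# Toy sizes, chosen only for illustration.
rng = np.random.default_rng(0)
dims, ranks, Q = (6, 5, 4), (3, 4, 2), 5
factors = [rng.gamma(1.0, 1.0, size=(d, k)) for d, k in zip(dims, ranks)]

# Pick Q distinct core locations and give them positive values.
flat = rng.choice(int(np.prod(ranks)), size=Q, replace=False)
coords = np.stack(np.unravel_index(flat, ranks), axis=1)
values = rng.gamma(1.0, 1.0, size=Q)

# Equivalent dense core, used here only to check the sparse computation.
core = np.zeros(ranks)
core[tuple(coords.T)] = values

index = (2, 1, 3)
sparse = sparse_core_rate(factors, coords, values, index)
dense = dense_core_rate(factors, core, index)
```

The sparse routine never touches the zero entries, so its cost grows with $Q$ rather than with the product of the latent dimensions. The sketch fixes the non-zero locations up front, whereas the method described above treats them as latent variables inferred from data.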

Paper Structure (18 sections, 25 equations, 11 figures)

Figures (11)

  • Figure 1: The core tensor $\boldsymbol{\Lambda}$ in three related tensor decompositions. Transparent and red entries denote zero and non-zero values, respectively. AL$\ell_0$CORE relies on sparsity to achieve the representational richness of Tucker without suffering its "exponential blowup" in parameters.
  • Figure 2: PPD by wall-clock time. Each point is an AL$\ell_0$CORE model run for 5,000 iterations, with a certain $Q$, denoted by color, and either the large or small core shape, denoted by $\diamond$ or $+$, respectively. Performance plateaus early, suggesting AL$\ell_0$CORE can match full Tucker at a small fraction of the cost.
  • Figure 3: PPD for each model on the full held-out set (top) and the positive-only held-out set (bottom), as a function of $Q$. Error bars span the interquartile range across masks.
  • Figure 4: Example of the latent class structure inferred by AL$\ell_0$CORE with $Q=400$ on the ICEWS dataset. Left: The inferred non-zero locations in the core tensor, with rows sorted according to the inferred values $\lambda_{c \xrightarrow{k} d}^{r}$. Right: Two sets of three inferred classes, each corresponding to a major war, and each following the same cadence (i.e., $c \xrightarrow{19} d'$, $c \xrightarrow{9} d'$, $c' \xrightarrow{14} d$) described in the paper's qualitative analysis section. The blue stem plots depict all elements in a given time-step factor, while red, green, and purple stem plots depict only the largest elements in a given sender, receiver, and action factor, respectively; in these, stems are greyed out when normalized values fall below 0.02.
  • Figure 5: Posterior estimates of the effective dimensionalities $K_1^*$, $K_2^*$, and $K_3^*$, and the number of classes $Q^*$, in a synthetic setting. Left: Trace plot from the Gibbs sampler. Right: Histograms of posterior samples. The red line denotes the ground-truth value.
  • ...and 6 more figures
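
The "exponential blowup" mentioned in the Figure 1 caption can be made concrete with a small counting sketch. The helper name and the core shape below (4 modes with 20 latent dimensions each) are hypothetical and not taken from the paper; only the budget $Q=400$ echoes the value quoted in the Figure 4 caption. The sketch counts non-zero core entries for a CP-style superdiagonal core, a full Tucker core, and a core limited to a fixed budget.

```python
def core_nonzeros(num_modes, latent_dim, budget):
    """Count non-zero core entries for three decompositions.

    Assumes a cubic core with `latent_dim` latent dimensions in each of
    `num_modes` modes (a simplification made for illustration).
    """
    full = latent_dim ** num_modes  # full Tucker core grows exponentially in modes
    return {
        "cp": latent_dim,                    # superdiagonal core
        "tucker": full,                      # every entry may be non-zero
        "sparse_budget": min(budget, full),  # at most `budget` non-zeros
    }

# Hypothetical shape: 4 modes, 20 latent dimensions each, budget of 400.
counts = core_nonzeros(num_modes=4, latent_dim=20, budget=400)
fraction = counts["sparse_budget"] / counts["tucker"]
print(counts, fraction)
```

Under these assumed sizes the budgeted core uses a fraction of a percent of the full Tucker core, which is the regime the figure captions describe.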