Tensor Decompositions for Count Data that Leverage Stochastic and Deterministic Optimization

Jeremy M. Myers; Daniel M. Dunlavy

TL;DR

The paper tackles Poisson CPD for count data by introducing two complementary strategies that blend stochastic and deterministic optimization to increase the probability of converging to the maximum likelihood estimator (MLE). Hybrid GCP-CPAPR (HybridGC) uses a stochastic GCP-Adam stage to rapidly approach a good basin of attraction, then refines with CPAPR to reach high accuracy. Restarted CPAPR with SVDrop detects rank-deficient solution paths from the singular values of the mode unfoldings and restarts within the feasible domain to curb wasted computation. Empirical results on synthetic tensors show a higher probability of converging to the empirical MLE and better agreement of algebraic structure with the MLE, as measured by the factor match score (FMS), at a moderate increase in computational cost. Overall, the work provides practical, singular-value-informed techniques to improve the reliability and efficiency of Poisson CPD in large-scale, sparse count-data applications, and offers guidance on parameter choices and diagnostic metrics.
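To make the FMS comparison concrete, here is a minimal sketch of a factor match score between two CP models with the same rank. It assumes a common simplified definition: the product over modes of absolute cosine similarities between matched components, averaged over components after choosing the best column permutation. The function names are hypothetical, the brute-force permutation search is only practical for small ranks, and the paper's exact definition (for example, whether it also penalizes differences in component weights) may differ.

```python
import itertools
import numpy as np

def normalize_columns(F):
    """Scale each column of a factor matrix to unit Euclidean norm."""
    return F / np.linalg.norm(F, axis=0)

def factor_match_score(factors_a, factors_b):
    """Simplified FMS between two CP models given as lists of factor matrices.

    Assumption: weights are ignored and only the directions of the
    factor columns are compared.
    """
    R = factors_a[0].shape[1]
    # sim[r, s]: product over modes of |cosine| between column r of
    # model A and column s of model B.
    sim = np.ones((R, R))
    for Fa, Fb in zip(factors_a, factors_b):
        sim *= np.abs(normalize_columns(Fa).T @ normalize_columns(Fb))
    # Components of a CP model have no inherent order, so search all
    # permutations for the best matching (fine for small R only).
    best = max(
        sum(sim[r, perm[r]] for r in range(R))
        for perm in itertools.permutations(range(R))
    )
    return best / R
```

A score of 1 means the two models have identical components up to ordering and scaling; values near 0 mean the components are nearly orthogonal in at least one mode.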

Abstract

There is growing interest in extending low-rank matrix decompositions to multi-way arrays, or tensors. One fundamental low-rank tensor decomposition is the canonical polyadic decomposition (CPD). The challenge of fitting a low-rank, nonnegative CPD model to Poisson-distributed count data is of particular interest. Several popular algorithms use local search methods to approximate the maximum likelihood estimator (MLE) of the Poisson CPD model. This work presents two new algorithms that extend state-of-the-art local methods for Poisson CPD. Hybrid GCP-CPAPR combines Generalized Canonical Polyadic decomposition (GCP) with stochastic optimization and CP Alternating Poisson Regression (CPAPR), a deterministic algorithm, to increase the probability of converging to the MLE over either method used alone. Restarted CPAPR with SVDrop uses a heuristic based on the singular values of the CPD model unfoldings to identify convergence toward optimizers that are not the MLE and restarts within the feasible domain of the optimization problem, thus reducing overall computational cost when using a multi-start strategy. We provide empirical evidence that indicates our approaches outperform existing methods with respect to converging to the Poisson CPD MLE.
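The quantities named in the abstract can be illustrated with a short NumPy sketch: a dense 3-way tensor built from a CP model, the Poisson negative log-likelihood that the MLE minimizes (with the constant log(x!) term dropped), and the singular values of each mode unfolding. The function names, the small `eps` guard inside the logarithm, and the relative tolerance used for the numerical rank are assumptions made for illustration. This is not the SVDrop heuristic itself, only the ingredients it inspects.

```python
import numpy as np

def cp_full(weights, factors):
    """Dense 3-way tensor from a CP model: sum_r w_r * a_r o b_r o c_r."""
    A, B, C = factors
    return np.einsum("r,ir,jr,kr->ijk", weights, A, B, C)

def poisson_loss(X, M, eps=1e-10):
    """Poisson negative log-likelihood of data X under model M,
    omitting the log(x!) term, which does not depend on M."""
    return float(np.sum(M - X * np.log(M + eps)))

def unfold(T, mode):
    """Mode-n matricization: rows are indexed by the chosen mode.
    The column ordering does not affect the singular values."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def unfolding_singular_values(T):
    """Singular values of every mode unfolding of T."""
    return [np.linalg.svd(unfold(T, n), compute_uv=False)
            for n in range(T.ndim)]

def numerical_rank(s, tol=1e-8):
    """Number of singular values above a relative tolerance (assumed)."""
    return int(np.sum(s > tol * s[0]))

# Example: a nonnegative rank-2 model and Poisson counts drawn from it.
rng = np.random.default_rng(0)
factors = [rng.random((4, 2)), rng.random((5, 2)), rng.random((6, 2))]
weights = np.array([1.0, 2.0])
M = cp_full(weights, factors)
X = rng.poisson(M)
loss = poisson_loss(X, M)
ranks = [numerical_rank(s) for s in unfolding_singular_values(M)]
```

A model whose unfoldings have fewer significant singular values than the target rank has effectively lost a component, which is the kind of rank deficiency that a singular-value-based restart rule could monitor.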

Paper Structure (50 sections, 13 equations, 11 figures, 3 tables, 4 algorithms)

Introduction
Hybrid GCP-CPAPR (HybridGC)
Restarted CPAPR with SVDrop
Organization
Background and related work
Notation and conventions
Matricization: transforming a tensor into a matrix
Canonical polyadic decomposition
Low-rank CP tensor model
Computing the Poisson CPD for count data
Error in computing the CPD using multi-start
An error estimator on the loss function
An error estimator on the algebraic structures
Factor match score (FMS)
Probability of similarity
...and 35 more sections

Figures (11)

Figure 1: Examples of two types of behaviors of traces of loss function values for GCP, CPAPR, and HybridGC on LowRankSmall. The MLE and the second local minimum are shown in both plots for direct comparison.
Figure 1: Traces of objective function values for the exemplar trial: two decompositions computed by CPAPR starting from the same initial guess but with different numbers of maximum allowable inner iterations per mode. The $x$-axis is given in terms of the number of outer iterations, so optimizations by mode are differentiated with vertical blocks of color. Only the first 8 outer iterations are shown: CPAPR with $l_{max}=4$ (top) converges to the MLE; CPAPR with $l_{max}=5$ (bottom) has settled in the basin of attraction of a different minimizer. The first Poisson loss value in each mode is emphasized with a marker.
Figure 1: Factor match scores between CP models computed with HybridGC, CPAPR-MU, and GCP-Adam and the approximate global optimizer, $\boldsymbol{\mathscr{\widehat{M}}}_{\mathcal{S}}^*$. The dash-dot gray vertical lines and dotted black vertical lines denote the levels of "similar" and "equal" described in Lorenzo-Seva and ten Berge (2006). Colormaps scaled for clarity.
Figure 2: Performance of HybridGC versus CPAPR as a standalone solver where both methods converge to the MLE. The $x$-axis indicates how many iterations were run in the GCP stage before starting the CPAPR stage of HybridGC.
Figure 2: The contour plot illustrates how convergence depends on the number of inner iterations in the search direction for a 2D problem. Blue represents minima and brown represents maxima; darker shades are more extreme values than lighter shades. From the same starting initialization, CPAPR is run with three different values for inner iterations $l_{max}$: 1) the "Goldilocks" amount that leads to the MLE; 2) too few or 3) too many inner iterations, which both lead to different minimizers.
...and 6 more figures
