A Latent-Variable Formulation of the Poisson Canonical Polyadic Tensor Model: Maximum Likelihood Estimation and Fisher Information
Carlos Llosa-Vite, Daniel M. Dunlavy, Richard B. Lehoucq, Oscar López, Arvind Prasadan
TL;DR
This work reframes the Poisson canonical polyadic (PCP) tensor model for count data as a latent-variable model, enabling classical likelihood-based inference via EM-type algorithms. By introducing a complete loglikelihood on a higher-dimensional latent tensor, the authors show that many existing factorization methods (e.g., NMF and CP-APR) arise as instances of EM/ECM/MCECM, and they derive both the observed and expected Fisher information via Oakes’ theorem and the missing-information principle. They provide a closed-form solution for the rank-one case with provable identifiability properties and furnish general results for the Fisher information in the multi-way setting, including a conjectured rank relation that elucidates identifiability and redundancy in CP-like decompositions. Numerical experiments validate the Fisher-information expressions and support the rank conjecture across various tensor orders, sizes, and ranks. The latent-variable approach thus yields concrete tools for parameter inference, rank selection, and model diagnostics in count-tensor applications such as networks, text mining, and geospatial data analysis.
Abstract
We establish parameter inference for the Poisson canonical polyadic (PCP) model of tensor count data through a latent-variable formulation. Our approach exploits the property that any random tensor that follows the PCP model can be derived by marginalizing an unobservable random tensor with one additional dimension. The loglikelihood of this higher-dimensional tensor, referred to as the "complete" loglikelihood, is composed of multiple loglikelihoods corresponding to rank-one PCP models. Using this methodology, we first demonstrate that several existing algorithms for fitting non-negative matrix and tensor factorizations are Expectation-Maximization algorithms. Next, we derive the observed and expected Fisher information matrices for the PCP model by leveraging its latent-variable formulation. The Fisher information provides crucial insights into the well-posedness of the tensor model, such as the role that the rank of the parameter tensor plays in identifiability and indeterminacy. For the special case of PCP models with rank-one parameter tensors, we show that these results simplify greatly.
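To make the latent-variable construction concrete, the following is a minimal sketch for the matrix (order-two) case; it is not the authors' implementation. It assumes a count matrix X with Poisson mean A Bᵀ and a latent tensor Z with independent entries Z[i, j, r] ~ Poisson(A[i, r] B[j, r]), so that summing Z over r recovers X. The sweep shown is the familiar multiplicative update for Poisson (KL) non-negative matrix factorization, written so that the conditional expectation of Z given X is visible. All function names, array sizes, and the `eps` guard are illustrative assumptions.

```python
import numpy as np

def rank_one_components(A, B):
    # components[i, j, r] = A[i, r] * B[j, r]: one rank-one mean per slice r.
    return np.einsum("ir,jr->ijr", A, B)

def sample_pcp(A, B, rng):
    # Draw the latent tensor Z, then marginalize over the extra (rank) mode.
    Z = rng.poisson(rank_one_components(A, B))
    return Z.sum(axis=-1), Z

def loglik(X, A, B, eps=1e-12):
    # Observed Poisson loglikelihood, dropping the constant -log(X!) term.
    M = A @ B.T
    return float(np.sum(np.where(X > 0, X * np.log(np.maximum(M, eps)), 0.0) - M))

def em_sweep(X, A, B, eps=1e-12):
    # E-step: E[Z_ijr | X] = X_ij * A_ir * B_jr / M_ij, with M = A @ B.T.
    # Conditional M-step for A with B fixed: A_ir = sum_j E[Z_ijr | X] / sum_j B_jr.
    ratio = X / np.maximum(A @ B.T, eps)
    A = A * (ratio @ B) / np.maximum(B.sum(axis=0), eps)
    # Refresh the E-step, then apply the analogous conditional M-step for B.
    ratio = X / np.maximum(A @ B.T, eps)
    B = B * (ratio.T @ A) / np.maximum(A.sum(axis=0), eps)
    return A, B

rng = np.random.default_rng(0)
A_true = rng.uniform(0.5, 2.0, size=(6, 2))
B_true = rng.uniform(0.5, 2.0, size=(5, 2))
X, Z = sample_pcp(A_true, B_true, rng)

# Fit from a random positive start and record the observed loglikelihood.
A = rng.uniform(0.5, 1.5, size=(6, 2))
B = rng.uniform(0.5, 1.5, size=(5, 2))
history = [loglik(X, A, B)]
for _ in range(50):
    A, B = em_sweep(X, A, B)
    history.append(loglik(X, A, B))
```

In this sketch `history` should be non-decreasing up to floating-point error, which is the usual behavior of EM-type iterations. The same pattern would extend to higher-order tensors by adding one factor matrix per mode, at the cost of more index bookkeeping.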
