Noisy Nonnegative Tucker Decomposition with Sparse Factors and Missing Data
Xiongjun Zhang, Michael K. Ng
TL;DR
This work tackles recovering nonnegative tensors from incomplete, noisy measurements by proposing a sparse nonnegative Tucker decomposition with a maximum-likelihood loss augmented by $\ell_0$ sparsity on factor matrices. The authors derive general error bounds under broad noise models and specialize them to Gaussian, Laplace, and Poisson observations, establishing near-optimal minimax rates. An ADMM-based algorithm is developed to solve the nonconvex, discretization-amenable optimization problem, and the method is validated on synthetic and real data, where it consistently outperforms matrix-based and tensor-tensor product baselines. The results highlight the effectiveness of enforcing sparsity in the factor matrices while maintaining nonnegativity, yielding accurate tensor completion and meaningful latent factors in practical settings.
Abstract
Tensor decomposition is a powerful tool for extracting physically meaningful latent factors from multi-dimensional nonnegative data, and has attracted increasing interest in fields such as image processing, machine learning, and computer vision. In this paper, we propose a sparse nonnegative Tucker decomposition and completion method for recovering underlying nonnegative data from noisy observations. The underlying nonnegative data tensor is decomposed into a core tensor and several factor matrices, where all entries are nonnegative and the factor matrices are sparse. The loss function is derived from the maximum likelihood estimate of the noisy observations, and the $\ell_0$ norm is employed to promote sparsity of the factor matrices. We establish an error bound for the estimator of the proposed model under generic noise scenarios, and then specialize it to observations with additive Gaussian noise, additive Laplace noise, and Poisson observations, respectively. Our error bounds improve on those of existing tensor-based and matrix-based methods. Moreover, the minimax lower bounds are shown to match the derived upper bounds up to logarithmic factors. Numerical examples on both synthetic and real-world data sets demonstrate the superiority of the proposed method for nonnegative tensor data completion.
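To make the model concrete, the following is a minimal NumPy sketch, not the authors' implementation. It builds a third-order tensor from a nonnegative core and sparse nonnegative factor matrices, then evaluates a penalized objective on the observed entries. Several parts are assumptions made for illustration: the function names `tucker_reconstruct` and `penalized_gaussian_loss`, the weight `lam`, the sizes and sampling rate, and the use of squared error as the data-fit term (which corresponds to the Gaussian negative log-likelihood only up to scaling and constants). The paper's estimator may impose further constraints that are not shown here.

```python
import numpy as np

def tucker_reconstruct(core, factors):
    """Multiply the core tensor by one factor matrix along each mode."""
    x = core
    for mode, a in enumerate(factors):
        # Contract axis `mode` of x with the columns of a; tensordot puts the
        # new axis first, so move it back to position `mode`.
        x = np.moveaxis(np.tensordot(a, x, axes=(1, mode)), 0, mode)
    return x

def penalized_gaussian_loss(y, mask, core, factors, lam):
    """Squared error on observed entries plus lam times the l0 count of the factors."""
    x = tucker_reconstruct(core, factors)
    residual = mask * (y - x)          # unobserved entries contribute nothing
    fit = np.sum(residual ** 2)        # assumed Gaussian data-fit term
    l0 = sum(int(np.count_nonzero(a)) for a in factors)
    return fit + lam * l0

# Hypothetical toy problem: a 4 x 5 x 6 tensor with Tucker rank (2, 2, 2).
rng = np.random.default_rng(0)
core = rng.random((2, 2, 2))                 # nonnegative core tensor
factors = []
for n in (4, 5, 6):
    a = rng.random((n, 2))                   # nonnegative factor matrix
    a[rng.random((n, 2)) < 0.5] = 0.0        # zero out about half the entries
    factors.append(a)

x_true = tucker_reconstruct(core, factors)
mask = rng.random(x_true.shape) < 0.7        # observe roughly 70% of entries
y = x_true + 0.01 * rng.standard_normal(x_true.shape)  # additive Gaussian noise
loss = penalized_gaussian_loss(y, mask, core, factors, lam=0.1)
```

Because the core and factors are entrywise nonnegative, the reconstructed tensor is nonnegative as well. For Laplace or Poisson observations, the squared-error term would be replaced by the corresponding negative log-likelihood, while the $\ell_0$ penalty on the factor matrices stays the same.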
