Negative Binomial Matrix Completion
Yu Lu, Kevin Bui, Roummel F. Marcia
TL;DR
This work addresses matrix completion for overdispersed count data by modeling the observation noise with a negative binomial (NB) distribution and recovering the matrix from a nuclear-norm regularized MAP objective. The resulting NB matrix completion problem is solved by proximal gradient descent: each iteration takes a gradient step on the NB-specific data term and then applies singular-value thresholding, the proximal operator of the nuclear norm. Experiments on bike-sharing, vehicle traffic, and microscopy datasets show NB matrix completion outperforming Poisson matrix completion under NB noise, while remaining competitive when the data are Poisson-like (large dispersion parameter). The approach is intended for recovering low-rank count matrices with noisy and missing entries, with applications in imaging, traffic analysis, and related domains.
Abstract
Matrix completion focuses on recovering missing or incomplete information in matrices. This problem arises in various applications, including image processing and network analysis. Previous research proposed Poisson matrix completion for count data with noise that follows a Poisson distribution, which assumes that the mean and variance are equal. Since overdispersed count data, whose variance is greater than the mean, are more likely to occur in realistic settings, we assume that the noise follows the negative binomial (NB) distribution, which is more general than the Poisson distribution because its variance can exceed its mean. In this paper, we introduce NB matrix completion by proposing a nuclear-norm regularized model that can be solved by proximal gradient descent. In our experiments on real data, we demonstrate that the NB model outperforms Poisson matrix completion across various noise levels and missing-data settings.
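To make the proximal gradient idea concrete, here is a minimal NumPy sketch of one way such an iteration could look. It is an illustration under stated assumptions, not the authors' implementation. It assumes the mean parameterization of the NB distribution with dispersion `r` (variance `x + x**2 / r`), for which the negative log-likelihood of an observed count `y` with mean `x` is, up to constants, `(r + y) * log(r + x) - y * log(x)`. The function names, the fixed step size, the initialization, and the clipping to a small positive floor `eps` are all hypothetical choices made for this sketch; the clipping in particular is a simple heuristic to keep the estimate positive, not an exact projection.

```python
import numpy as np

def nb_grad(X, Y, mask, r):
    """Gradient of the assumed NB data term, restricted to observed entries.

    Data term: sum over observed (i, j) of
        (r + Y_ij) * log(r + X_ij) - Y_ij * log(X_ij).
    """
    return mask * ((r + Y) / (r + X) - Y / X)

def svt(Z, tau):
    """Singular-value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # soft-threshold the singular values
    return (U * s_shrunk) @ Vt

def nb_complete(Y, mask, r, lam=1.0, step=0.5, n_iter=200, eps=1e-3):
    """Proximal gradient sketch for nuclear-norm regularized NB completion.

    Y    : observed count matrix (values at unobserved entries are ignored)
    mask : 1.0 where Y is observed, 0.0 elsewhere
    r    : NB dispersion parameter (assumed known here)
    """
    # Start from a constant matrix equal to the mean of the observed counts.
    X = np.full(Y.shape, max(Y[mask > 0].mean(), eps))
    for _ in range(n_iter):
        Z = X - step * nb_grad(X, Y, mask, r)       # gradient step on the data term
        X = np.maximum(svt(Z, step * lam), eps)     # prox step, then positivity heuristic
    return X
```

In practice `lam`, `step`, and `r` would need tuning (or a line search for the step size), and a larger `r` makes the assumed data term behave more like the Poisson one.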
