Asymptotic Bayes Optimality for Sparse Count Data
Sayantan Paul, Arijit Chakrabarti
TL;DR
The paper addresses simultaneous testing of whether Poisson means $\theta_i$ are near zero or large in high-dimensional, quasi-sparse count data, where $Y_i\sim\mathrm{Poi}(\theta_i)$. It compares a Bayes Oracle under a two-group Gamma mixture prior with decision rules based on a broad class of one-group global-local shrinkage priors, characterized by local scales $\lambda_i$ and a global parameter $\tau$; these rules are built on the posterior shrinkage weight $1-\mathbb{E}(\kappa_i|Y_i,\tau)$. The authors prove that, as sparsity increases ($p\to0$) and under mild regularity conditions, the one-group rule attains the Bayes risk of the two-group oracle up to a multiplicative constant. This holds both when the sparsity level is known and for an empirical Bayes variant in which $\tau$ is estimated from the data. They also establish posterior concentration inequalities and derive finite-sample bounds on Type I and Type II errors, supported by simulation studies and a real-data analysis of terrorism counts. The results support using global-local priors as practical, near-optimal alternatives to two-group models for sparse count problems and offer a path toward robust, scalable decision-theoretic inference in non-Gaussian settings.
Abstract
Consider the problem of analyzing high-dimensional count data containing an excess of near-zero counts together with a small number of moderate or large counts. Assuming that the observations are modeled by a Poisson distribution, we are interested in simultaneously testing whether the mean of the $i^{\text{th}}$ observation is small or large. In this work, we study optimality properties (in terms of Bayes risk) of multiple-testing rules when the mean parameters are modeled either by a two-group prior or by a general class of one-group shrinkage priors proposed by Polson and Scott (2010). First, we model each mean by a two-group prior and, under an additive $0$-$1$ loss function, obtain an expression for the optimal Bayes risk under assumptions similar in spirit to those of Bogdan et al. (2011). Next, assuming that the observations are truly generated from a two-group mixture model and modeling each mean parameter by the broad class of one-group priors, we study the Bayes risk induced by our chosen class of priors. We show that, when the underlying level of sparsity is known, under some proposed assumptions, the Bayes risk corresponding to our broad class of priors attains the optimal Bayes risk up to a multiplicative constant. When the sparsity level is unknown, motivated by Yano et al. (2021), we use an empirical Bayes estimate of the global shrinkage parameter. In this case too, we show that the modified decision rule attains the optimal Bayes risk up to a multiplicative constant. In this way, as an alternative to the two-group prior, we propose a broad class of global-local priors having similar optimality properties in terms of Bayes risk for quasi-sparse count data. Finally, the theoretical results are verified using simulation studies, followed by a real data analysis.
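To make the two-group testing problem concrete, the following is a minimal Python sketch of an oracle-style rule under additive $0$-$1$ loss, which rejects the null for observation $i$ when the posterior probability of the signal component exceeds $1/2$. The parametrization is an assumption made for illustration only and is not taken from the paper: both mixture components are Gamma with a common shape and different rates (`rate_null`, `rate_alt`), and the function names are hypothetical. The sketch relies on the standard fact that a Poisson count with a Gamma(shape, rate) mean has a negative binomial marginal.

```python
import math


def nb_log_pmf(y, shape, rate):
    """Log marginal pmf of Y when Y | theta ~ Poisson(theta) and
    theta ~ Gamma(shape, rate): a negative binomial distribution."""
    return (
        math.lgamma(y + shape) - math.lgamma(shape) - math.lgamma(y + 1)
        + shape * math.log(rate / (1.0 + rate))
        - y * math.log(1.0 + rate)
    )


def signal_posterior(y, p, shape, rate_null, rate_alt):
    """Posterior probability that count y came from the signal component.

    Assumed (hypothetical) two-group prior:
        theta ~ (1 - p) * Gamma(shape, rate_null) + p * Gamma(shape, rate_alt)
    """
    log_w0 = math.log(1.0 - p) + nb_log_pmf(y, shape, rate_null)
    log_w1 = math.log(p) + nb_log_pmf(y, shape, rate_alt)
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_w0, log_w1)
    w0 = math.exp(log_w0 - m)
    w1 = math.exp(log_w1 - m)
    return w1 / (w0 + w1)


def oracle_reject(y, p, shape, rate_null, rate_alt):
    """Under additive 0-1 loss, flag y as a signal when the posterior
    probability of the signal component exceeds one half."""
    return signal_posterior(y, p, shape, rate_null, rate_alt) > 0.5


if __name__ == "__main__":
    # Illustrative values only: sparse signals (p = 0.05), a null component
    # concentrated near zero (large rate) and a diffuse signal component.
    for y in (0, 1, 2, 5, 10):
        print(y, oracle_reject(y, p=0.05, shape=1.0, rate_null=10.0, rate_alt=0.1))
```

With a smaller rate for the signal component (that is, a larger prior mean), the posterior signal probability is increasing in `y`, so the rule reduces to thresholding the observed count. The one-group rules studied in the paper replace this posterior probability with the shrinkage weight $1-\mathbb{E}(\kappa_i|Y_i,\tau)$, which this sketch does not attempt to compute.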
