Asymptotic Bayes Optimality for Sparse Count Data

Sayantan Paul; Arijit Chakrabarti

TL;DR

The paper addresses simultaneous testing of whether Poisson means $\theta_i$ are near zero or large in high-dimensional, quasi-sparse count data, where $Y_i\sim\mathrm{Poi}(\theta_i)$. It compares a Bayes Oracle under a two-group Gamma mixture prior with decision rules based on a broad class of one-group global-local shrinkage priors, characterized by local scales $\lambda_i$ and a global parameter $\tau$, using the posterior shrinkage weight $1-\mathbb{E}(\kappa_i|Y_i,\tau)$. The authors prove that, as sparsity increases ($p\to0$) and under mild regularity conditions, the one-group rule attains the Bayes risk of the two-group oracle up to a multiplicative constant. This holds both when the sparsity level is known and for an empirical Bayes variant in which $\tau$ is estimated from the data. They also establish posterior concentration inequalities and derive finite-sample bounds on Type I and Type II errors, supported by simulation studies and a real-data analysis of terrorism counts. The results support using global-local priors as practical, near-optimal alternatives to two-group models for sparse count problems, and they offer a path toward robust, scalable decision-theoretic inference in non-Gaussian settings.
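The rule based on the shrinkage weight $1-\mathbb{E}(\kappa_i|Y_i,\tau)$ can be illustrated numerically. The sketch below is not the authors' implementation. It assumes a Gamma prior $\theta_i\mid\lambda_i,\tau\sim\mathrm{Ga}(\alpha,\ \text{scale}=\lambda_i^2\tau^2)$, so that $\kappa_i=1/(1+\lambda_i^2\tau^2)$ and $Y_i\mid\kappa_i$ has likelihood proportional to $\kappa_i^{\alpha}(1-\kappa_i)^{Y_i}$; a half-Cauchy prior on $\lambda_i$ as one member of the global-local family; and a threshold of $1/2$ on the weight. The function names `shrinkage_weight` and `one_group_reject` and the grid size are hypothetical.

```python
import math

def shrinkage_weight(y, tau, alpha=1.0, n_grid=20000):
    """Approximate 1 - E(kappa | y, tau) by a midpoint rule.

    Assumed model (illustrative, not taken from the paper):
    kappa = 1 / (1 + lambda^2 tau^2), lambda ~ half-Cauchy(0, 1),
    likelihood in kappa proportional to kappa^alpha * (1 - kappa)^y.
    Substituting lambda = tan(u) makes the half-Cauchy prior uniform
    in u on (0, pi/2), so plain averaging over u suffices.
    """
    h = (math.pi / 2) / n_grid
    num = 0.0
    den = 0.0
    for k in range(n_grid):
        u = (k + 0.5) * h                      # midpoint of the k-th cell
        s = (tau * math.tan(u)) ** 2           # lambda^2 * tau^2
        kappa = 1.0 / (1.0 + s)
        like = kappa ** alpha * (1.0 - kappa) ** y
        num += (1.0 - kappa) * like
        den += like
    return num / den

def one_group_reject(y, tau, alpha=1.0):
    """Flag y as a signal when the shrinkage weight exceeds 1/2 (assumed threshold)."""
    return shrinkage_weight(y, tau, alpha) > 0.5
```

With a small global parameter such as `tau = 0.1`, a zero count receives a weight well below $1/2$ while a count of 20 receives a weight close to 1, which is the qualitative behaviour a sparsity-adapted rule should show.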

Abstract

Consider the problem of analyzing high-dimensional count data containing an excess of near-zero counts together with a small number of moderate or large counts. Assuming that the observations are modeled by a Poisson distribution, we are interested in simultaneously testing whether the mean of the $i^{\text{th}}$ observation is small or large. In this work, we study optimality properties (in terms of Bayes risk) of multiple-testing rules when the mean parameter is modeled either by a two-group prior or by a general class of one-group shrinkage priors proposed by Polson and Scott (2010). First, we model each mean by a two-group prior and, under an additive $0$-$1$ loss function, obtain an expression for the optimal Bayes risk under an assumption similar in spirit to that of Bogdan et al. (2011). Next, assuming that the observations are truly generated from a two-group mixture model and modelling each mean parameter by the broad class of one-group priors, we study the Bayes risk induced by our chosen class of priors. We show that, when the underlying level of sparsity is known, under some proposed assumptions, the Bayes risk corresponding to our broad class of priors attains the optimal Bayes risk up to a multiplicative constant. When the sparsity level is unknown, motivated by Yano et al. (2021), we use an empirical Bayes estimate of the global shrinkage parameter. In this case too, we show that the modified decision rule attains the optimal Bayes risk up to a multiplicative constant. In this way, as an alternative to the two-group prior, we propose a broad class of global-local priors with similar optimality properties in terms of Bayes risk for quasi-sparse count data. Finally, the theoretical results are verified using simulation studies, followed by a real data analysis.
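To make the two-group benchmark concrete, here is a minimal sketch of a Bayes rule under additive $0$-$1$ loss. It assumes, for illustration only, that both mixture components are Gamma priors with a common shape and different scales, so each component's marginal for a count is negative binomial; the rule flags a count when the posterior probability of the non-null component exceeds $1/2$. The function names and all parameter values are hypothetical and are not taken from the paper.

```python
import math

def nb_logpmf(y, shape, scale):
    """Log pmf of Y when theta ~ Gamma(shape, scale) and Y | theta ~ Poisson(theta)."""
    q = 1.0 / (1.0 + scale)  # negative binomial "success" probability
    return (math.lgamma(shape + y) - math.lgamma(shape) - math.lgamma(y + 1)
            + shape * math.log(q) + y * math.log1p(-q))

def oracle_reject(y, p, shape, null_scale, alt_scale):
    """Bayes rule under additive 0-1 loss for a two-component mixture.

    p is the prior probability of the non-null (large-mean) component.
    Returns True when p * f1(y) > (1 - p) * f0(y).
    """
    log_null = math.log1p(-p) + nb_logpmf(y, shape, null_scale)
    log_alt = math.log(p) + nb_logpmf(y, shape, alt_scale)
    return log_alt > log_null

# Illustrative values: 5% signals, tight null component, diffuse alternative.
decisions = [oracle_reject(y, p=0.05, shape=1.0, null_scale=0.1, alt_scale=20.0)
             for y in range(6)]
```

With these illustrative values the rule reduces to a threshold on the count: `decisions` is `False` for counts 0, 1, 2 and `True` from 3 onward.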

Paper Structure

This paper contains 15 sections, 14 theorems, 95 equations, 1 figure, and 5 tables.

Introduction
Notation
The Two-group prior and the Bayes Oracle
Multiple testing rules using one-group priors
Theoretical Results
Results on Asymptotic Bayes Risk under sparsity using one-group priors
Posterior Concentration inequalities
Type I and Type II error bounds for the testing rule (4.3.6)
Type I and Type II error bounds corresponding to an empirical Bayes procedure
Simulation Results
Real data analysis
Concluding remarks and scope for future work
Proofs
Proofs of Theorems
Distributions of different priors of the form (4.1.4)

Key Result

Theorem 1

Let $Y_i \sim \mathrm{Poi}(\theta_i)$ independently for $i=1,2,\ldots,n$, and suppose each $\theta_i$ is generated from (4.1.1). Suppose we want to test $H_{0i}:\nu_i=0$ against $H_{1i}:\nu_i=1$ for $i=1,2,\ldots,n$, using the decision rule (4.3.6) induced by the class of priors (4.1.3) satisfying (4.1.4), [...] where $Y \sim \mathrm{NB}(\alpha,\frac{1}{\delta+1})$.
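A negative binomial law of this form is what a Gamma-Poisson mixture produces. As a worked check, and assuming (since (4.1.1) is not reproduced here) that $\theta\sim\mathrm{Ga}(\alpha,\ \text{scale}=\delta)$ and $Y\mid\theta\sim\mathrm{Poi}(\theta)$, integrating out $\theta$ gives:

```latex
\begin{aligned}
P(Y=y)
 &= \int_0^\infty \frac{e^{-\theta}\theta^{y}}{y!}\,
    \frac{\theta^{\alpha-1}e^{-\theta/\delta}}{\Gamma(\alpha)\,\delta^{\alpha}}\,d\theta \\
 &= \frac{1}{\Gamma(\alpha)\,y!\,\delta^{\alpha}}\,
    \Gamma(y+\alpha)\left(\frac{\delta}{\delta+1}\right)^{y+\alpha} \\
 &= \frac{\Gamma(y+\alpha)}{\Gamma(\alpha)\,y!}
    \left(\frac{1}{\delta+1}\right)^{\alpha}
    \left(\frac{\delta}{\delta+1}\right)^{y},
 \qquad y=0,1,2,\ldots
\end{aligned}
```

Under this assumed parametrization the result is the $\mathrm{NB}(\alpha,\frac{1}{\delta+1})$ pmf, with $\frac{1}{\delta+1}$ playing the role of the success probability.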

Figures (1)

Figure 1: Performance of our method on real dataset

Theorems & Definitions (31)

Theorem 1
Remark 2
Theorem 3
Remark 4
Theorem 5
Theorem 6
Remark 7
Theorem 8
Corollary 9
Corollary 10
...and 21 more
