Consistent Group selection using Global-local prior in High dimensional setup

Sayantan Paul, Prasenjit Ghosh, Arijit Chakrabarti

TL;DR

This work addresses high-dimensional regression with inherent group structure by proposing a Bayesian framework based on one-group global-local shrinkage priors, including a modified polynomial-tail class. It introduces the half-thresholding (HT) rule, which declares a group active when the ratio of the $L_2$ norm of the posterior mean of its coefficient vector to that of the corresponding ordinary least squares estimate exceeds one half, and proves oracle properties under both known and data-driven treatments of the global shrinkage parameter. Theoretical results show selection consistency and optimal estimation rates for the HT rule under block-orthogonal designs and extend to empirical Bayes and full Bayes implementations, even as the number of active groups grows with the sample size. Extensive simulations and real-data analyses demonstrate competitive performance against spike-and-slab and group-LASSO methods, highlighting practical robustness in non-orthogonal designs and sparse regimes. The framework offers a scalable, theoretically sound alternative to spike-and-slab priors for sparse grouped variable selection in complex data settings.

Abstract

We consider the problem of model selection when grouping structure is inherent within the regressors. Using a Bayesian approach, we model the mean vector by a one-group global-local shrinkage prior belonging to a broad class of such priors that includes the horseshoe prior. In the context of variable selection, this class of priors was studied by Tang et al. (2018). A modified form of the usual class of global-local shrinkage priors with polynomial tail on the group regression coefficients is proposed. The resulting threshold rule declares a group active if the ratio of the $L_2$ norm of the posterior mean of its group coefficient to that of the corresponding ordinary least squares group estimate is greater than a half. In the theoretical part of this article, we use the global shrinkage parameter $\tau$ either as a tuning parameter or through an empirical Bayes estimate, depending on what is known about the underlying sparsity of the model. When the proportion of active groups is known, using $\tau$ as a tuning parameter, we prove that our method is oracle. When this proportion is unknown, we propose an empirical Bayes estimate of $\tau$. Even with this empirical Bayes estimate, our half-thresholding rule captures the truly important groups and simultaneously attains the optimal estimation rate for the group coefficients. Though our theoretical results rely on a special form of the design matrix, our simulation results show that, for general design matrices as well, the half-thresholding rule yields results similar to those of Yang and Narisetty (2020). As a consequence, in a high-dimensional sparse group selection problem, instead of using the so-called 'gold standard' spike and slab prior, one can use the one-group global-local shrinkage priors with polynomial tail to obtain similar results.
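
The half-thresholding rule described in the abstract can be illustrated with a minimal sketch. It assumes the per-group posterior means and ordinary least squares estimates have already been computed by some other routine; the function name `half_threshold_select`, its arguments, and the toy numbers are hypothetical and are not taken from the paper.

```python
import numpy as np


def half_threshold_select(post_mean, ols_est, groups, threshold=0.5):
    """Return the indices of groups declared active.

    post_mean : posterior mean of the full coefficient vector (assumed given)
    ols_est   : ordinary least squares estimate of the same vector
    groups    : list of index lists, one per group (hypothetical layout)
    threshold : cut-off on the norm ratio; one half in the rule described above
    """
    post_mean = np.asarray(post_mean, dtype=float)
    ols_est = np.asarray(ols_est, dtype=float)
    active = []
    for g, idx in enumerate(groups):
        ols_norm = np.linalg.norm(ols_est[idx])
        if ols_norm == 0.0:
            # Ratio is undefined; this sketch treats the group as inactive.
            continue
        ratio = np.linalg.norm(post_mean[idx]) / ols_norm
        if ratio > threshold:
            active.append(g)
    return active


# Toy illustration with made-up numbers: three groups of two coefficients.
groups = [[0, 1], [2, 3], [4, 5]]
ols_est = np.array([3.0, 4.0, 0.2, -0.1, 1.0, 1.0])
post_mean = np.array([2.7, 3.6, 0.01, -0.005, 0.4, 0.4])
selected = half_threshold_select(post_mean, ols_est, groups)
print(selected)
```

In this toy input the norm ratios are 0.9, 0.05 and 0.4, so only the first group clears the one-half cut-off. How the posterior mean is obtained (tuning, empirical Bayes, or full Bayes treatment of $\tau$) is outside the scope of the sketch.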

Paper Structure

This paper contains 19 sections, 15 theorems, 185 equations, 12 tables.

Key Result

Proposition 1

Suppose that the $g^{th}$ group is inactive, that is, $\boldsymbol{\beta}_{g}^{0}=\mathbf{0}$. If $\tau_n \to 0$ as $n \to \infty$, then $E(1-\kappa_g\mid\tau_n,\sigma^2,\mathcal{D}) \xrightarrow{P} 0$ as $n \to \infty$.
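
The link between Proposition 1 and the half-thresholding rule can be sketched as follows. This is an illustration under an assumption not stated in this excerpt: that, under the block-orthogonal design used for the theory, the posterior mean of a group coefficient equals the ordinary least squares group estimate scaled by the scalar $E(1-\kappa_g\mid\tau_n,\sigma^2,\mathcal{D})$, as in the usual global-local representation. The symbol $\widehat{\boldsymbol{\beta}}_g^{\mathrm{OLS}}$ is notation introduced here for the ordinary least squares group estimate.

```latex
% Assumed representation (block-orthogonal design, global-local prior):
E(\boldsymbol{\beta}_g \mid \tau_n, \sigma^2, \mathcal{D})
  = E(1-\kappa_g \mid \tau_n, \sigma^2, \mathcal{D})\,
    \widehat{\boldsymbol{\beta}}_g^{\mathrm{OLS}} .

% Since the scaling factor is a scalar in [0, 1], the norm ratio reduces to it:
\frac{\bigl\| E(\boldsymbol{\beta}_g \mid \tau_n, \sigma^2, \mathcal{D}) \bigr\|_2}
     {\bigl\| \widehat{\boldsymbol{\beta}}_g^{\mathrm{OLS}} \bigr\|_2}
  = E(1-\kappa_g \mid \tau_n, \sigma^2, \mathcal{D}) .

% Hence the rule "ratio > 1/2" is equivalent to
E(1-\kappa_g \mid \tau_n, \sigma^2, \mathcal{D}) > \tfrac{1}{2} .
```

Under that assumption, Proposition 1 says the ratio for an inactive group converges to $0$ in probability, so it eventually falls below $1/2$ and the group is not selected.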

Theorems & Definitions (34)

  • Proposition 1
  • Proposition 2
  • Remark 1
  • Remark 2
  • Theorem 1: Variable Selection Consistency
  • Remark 3
  • Remark 4
  • Theorem 2
  • Remark 5
  • Remark 6
  • ...and 24 more