Bayesian $L_{\frac{1}{2}}$ regression

Xiongwen Ke, Yanan Fan

Abstract

It is well known that Bridge regression enjoys superior theoretical properties when compared to traditional LASSO. However, the current latent variable representation of its Bayesian counterpart, based on the exponential power prior, is computationally expensive in higher dimensions. In this paper, we show that the exponential power prior has a closed-form scale mixture of normals decomposition for $\alpha=(\frac{1}{2})^{\gamma}, \gamma\in \{1, 2,\ldots\}$. We call these types of priors the $L_{\frac{1}{2}}$ prior for short. We develop an efficient partially collapsed Gibbs sampling scheme for computation using the $L_{\frac{1}{2}}$ prior and study its theoretical properties when $p>n$. In addition, we introduce a non-separable Bridge penalty function inspired by the fully Bayesian formulation and a novel, efficient coordinate descent algorithm. We prove the algorithm's convergence and show that the local minimizer from our optimisation algorithm has an oracle property. Finally, simulation studies were carried out to illustrate the performance of the new algorithms. Supplementary materials for this article are available online.

Paper Structure

This paper contains 14 sections, 10 theorems, 38 equations, 1 figure, 6 tables, 2 algorithms.

Key Result

Lemma

The exponential power distribution $\pi(\beta)= \frac{\lambda^{2^{\gamma}}}{2\,(2^{\gamma})!} \exp \left(-\lambda\left|\beta\right|^{\frac{1}{2^{\gamma}}}\right)$, with exponent $\alpha=(\frac{1}{2})^{\gamma}$ and $\gamma \in \{1,2,3,\ldots\}$, can be decomposed as a closed-form scale mixture of normals.
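
As a sanity check on the density in the lemma, the following minimal sketch verifies numerically that the normalising constant $\lambda^{2^{\gamma}}/(2\,(2^{\gamma})!)$ makes $\pi(\beta)$ integrate to one. It also draws samples using a standard change of variables: if $G \sim \mathrm{Gamma}(2^{\gamma}, \text{rate } \lambda)$, then $\pm G^{2^{\gamma}}$ with a random sign has density $\pi(\beta)$. This is not the paper's scale mixture decomposition or its Gibbs sampler. The function names `ep_density`, `total_mass` and `sample_ep` are illustrative assumptions and do not appear in the paper.

```python
import math
import random


def ep_density(beta, lam, gamma):
    """Exponential power density with exponent alpha = (1/2)**gamma."""
    k = 2 ** gamma  # k = 1 / alpha
    norm = lam ** k / (2.0 * math.factorial(k))
    return norm * math.exp(-lam * abs(beta) ** (1.0 / k))


def total_mass(lam, gamma, upper=80.0, steps=100_000):
    """Trapezoid-rule integral of the density over the real line.

    The substitution u = |beta|**(1/k) turns the heavy-tailed integrand
    into a smooth one on [0, upper / lam]; the tail beyond is negligible.
    """
    k = 2 ** gamma
    b = upper / lam
    h = b / steps

    def g(u):
        # density at beta = u**k, times the Jacobian k * u**(k - 1),
        # times 2 because the density is symmetric about zero
        return 2.0 * ep_density(u ** k, lam, gamma) * k * u ** (k - 1)

    total = 0.5 * (g(0.0) + g(b))
    total += sum(g(i * h) for i in range(1, steps))
    return total * h


def sample_ep(lam, gamma, rng):
    """Draw one sample via the Gamma change of variables (illustrative)."""
    k = 2 ** gamma
    # random.gammavariate takes (shape, scale), so rate lam means scale 1/lam
    g = rng.gammavariate(k, 1.0 / lam)
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return sign * g ** k


if __name__ == "__main__":
    for gam in (1, 2, 3):
        print(gam, total_mass(1.5, gam))
    rng = random.Random(0)
    draws = [sample_ep(1.5, 1, rng) for _ in range(5)]
    print(draws)
```

Under this change of variables, $|\beta|^{1/2^{\gamma}}$ has mean $2^{\gamma}/\lambda$, which gives a quick empirical check on the sampler.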

Figures (1)

  • Figure 1: Solution paths according to two initialisation strategies, $\boldsymbol{\beta}_{initial}=0$ (top row) and random starting values $\boldsymbol{\beta}_{initial} \sim N_{p}(0,I_{p})$ (bottom row). The left panel shows the solution paths for the 10 nonzero elements of $\boldsymbol{\beta}$, the middle panel shows the solution paths for the (990) zero elements of $\boldsymbol{\beta}$ and the right panel shows the path for the loss function $L(\boldsymbol{\beta})$.

Theorems & Definitions (10)

  • Lemma
  • Theorem
  • Theorem
  • Theorem
  • Theorem
  • Corollary
  • Lemma
  • Theorem
  • Theorem
  • Theorem