Heavy-tailed and Horseshoe priors for regression and sparse Besov rates
Sergios Agapiou, Ismaël Castillo, Paul Egels
TL;DR
This work studies Oversmoothed heavy-Tailed (OT) priors and Horseshoe priors on wavelet coefficients, and shows that they achieve adaptive posterior contraction in nonparametric regression across Sobolev and Besov classes under a range of $L_p$ losses. For Sobolev balls and $L_2$ loss, the authors establish upper bounds showing that OT priors attain near-minimax contraction rates, together with lower bounds showing that the decay of the OT scalings is necessary for full adaptation and that heavy-tailed (HT) priors with polynomially decaying scalings are not adaptive in the under-smoothing case. The upper bounds are then extended to Besov spaces, including the sparse regime. These are the first posterior contraction results for Horseshoe priors in this nonparametric context. Simulations indicate that OT priors perform competitively with, and sometimes better than, traditional wavelet-thresholding methods across various signals and losses. The findings present heavy-tailed priors as a flexible, computationally tractable approach to adaptive, sparsity-aware function estimation in Bayesian nonparametric regression.
Abstract
The large variety of functions encountered in nonparametric statistics calls for methods that are flexible enough to achieve optimal or near-optimal performance over a wide range of function classes, such as Besov balls, as well as over a large array of loss functions. In this work, we show that a class of heavy-tailed prior distributions on basis function coefficients, introduced in \cite{AC} and called Oversmoothed heavy-Tailed (OT) priors, leads to Bayesian posterior distributions that satisfy these requirements. The case of horseshoe distributions is also investigated, for the first time in the context of nonparametrics, and we show that they fit into this framework. Posterior contraction rates are derived in two settings. The case of Sobolev-smooth signals and $L_2$-risk is considered first, along with a lower bound result showing that the form of the scalings that the OT prior imposes on the coefficients is necessary to get full adaptation to smoothness. Second, the broader case of Besov-smooth signals with $L_{p'}$-risks, $p' \geq 1$, is considered, and minimax posterior contraction rates, adaptive to the underlying smoothness and including rates in the so-called {\em sparse} zone, are derived. We provide an implementation of the proposed method and illustrate our results through a simulation study.
