Skew-symmetric approximations of posterior distributions

Francesco Pozza; Daniele Durante; Botond Szabo

TL;DR

The paper addresses the bias introduced by symmetric posterior approximations in Bayesian inference by introducing a general, computationally light skew-symmetric perturbation that can be applied to any existing symmetric approximation (e.g., Laplace, VB, EP). It shows that the proposed skewing factor is optimal within the skew-symmetric class, yielding provable accuracy improvements in TV, KL, reverse-KL, and alpha-divergences and, under suitable assumptions, a convergence rate to the exact posterior that improves on the symmetric counterpart by at least a $\sqrt{n}$ factor. The approach preserves tractability through a closed-form skewing factor and a simple sampling scheme. Empirical results on simulated and real data show substantial finite-sample and high-dimensional gains, including large effective sample size (ESS) improvements in importance sampling. The method thus offers a broadly applicable, theory-grounded way to capture posterior skewness without heavy computational cost. It also opens avenues to integrate skewness into existing VB/EP optimization schemes and to extend the ideas to higher-order or alternative symmetric families.

Abstract

Popular deterministic approximations of posterior distributions from, e.g., the Laplace method, variational Bayes and expectation-propagation, generally rely on symmetric approximating families, often taken to be Gaussian. This choice facilitates optimization and inference, but typically affects the quality of the overall approximation. In fact, even in basic parametric models, the posterior distribution often displays asymmetries that yield bias and reduced accuracy when considering symmetric approximations. Recent research has moved towards more flexible approximating families which incorporate skewness. However, current solutions are often model-specific, lack a general supporting theory, increase the computational complexity of the optimization problem, and do not provide a broadly applicable solution to incorporate skewness in any symmetric approximation. This article addresses this gap by introducing a general and provably optimal strategy to perturb any off-the-shelf symmetric approximation of a generic posterior distribution. This novel perturbation scheme is derived without additional optimization steps, and yields a similarly tractable approximation within the class of skew-symmetric densities that provably enhances the finite-sample accuracy of the original symmetric counterpart. Furthermore, under suitable assumptions, it improves the convergence rate to the exact posterior by at least a $\sqrt{n}$ factor in asymptotic regimes. These advancements are illustrated in numerical studies focusing on skewed perturbations of state-of-the-art Gaussian approximations.
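To make the idea of perturbing a symmetric approximation concrete, the following is a minimal sketch and not the authors' implementation. It assumes the standard skew-symmetric form $2\,\bar{q}(\theta)\,w(\theta)$, where $\bar{q}$ is symmetric about a point `center` and $w(\theta) + w(2\,\text{center} - \theta) = 1$. The specific skewing factor used here, $w(\theta) = \pi(\theta) / \{\pi(\theta) + \pi(2\,\text{center} - \theta)\}$ with $\pi$ the unnormalized posterior, is an assumption that is not stated in this summary. The function names, the Gamma-shaped toy posterior and its parameters are all hypothetical and chosen only for illustration.

```python
import numpy as np

def skewing_factor(log_post, theta, center):
    """Assumed skewing factor w = pi(theta) / (pi(theta) + pi(reflected theta)).

    log_post maps an (n, d) array to n unnormalized log-posterior values, so the
    normalizing constant cancels in the ratio.
    """
    diff = log_post(theta) - log_post(2.0 * center - theta)
    # Numerically stable logistic function: sigmoid(x) = 0.5 * (1 + tanh(x / 2)).
    return 0.5 * (1.0 + np.tanh(0.5 * diff))

def sample_skew_symmetric(log_post, center, draw_symmetric, n, rng):
    """Draw from 2 * q_bar * w by sign-flipping draws from the symmetric q_bar."""
    z = draw_symmetric(n, rng)                 # (n, d) draws from q_bar
    w = skewing_factor(log_post, z, center)    # (n,) keep probabilities
    keep = rng.uniform(size=n) < w
    # Keep z with probability w(z); otherwise reflect it about the center.
    return np.where(keep[:, None], z, 2.0 * center - z)

# Toy example (hypothetical): posterior proportional to a Gamma(shape, rate) density.
shape, rate = 3.0, 1.0

def log_post(theta):
    t = theta[:, 0]
    out = np.full(t.shape, -np.inf)            # zero posterior mass outside t > 0
    pos = t > 0
    out[pos] = (shape - 1.0) * np.log(t[pos]) - rate * t[pos]
    return out

center = (shape - 1.0) / rate                  # posterior mode
sd = np.sqrt(shape - 1.0) / rate               # Laplace (Gaussian) standard deviation

def draw_gaussian(n, rng):
    return center + sd * rng.standard_normal((n, 1))

rng = np.random.default_rng(0)
samples = sample_skew_symmetric(log_post, center, draw_gaussian, 200_000, rng)
```

In this toy case the Gaussian approximation is centered at the mode, while the exact posterior mean of the Gamma target is `shape / rate`; comparing `samples.mean()` with both values gives a quick check of how much of the skewness the perturbation recovers. No extra optimization is run: the only additional cost is two evaluations of the unnormalized log-posterior per draw.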

Paper Structure (25 sections, 2 theorems, 84 equations, 1 figure, 3 tables, 2 algorithms)

1. Introduction
2. Skew-symmetric approximations of posterior distributions
2.1 An overview of symmetric approximations of posterior distributions
2.2 An overview of skew-symmetric distributions
2.3 Skew-symmetric perturbation of symmetric approximations
2.4 Efficient evaluation of the skewing factor
3. Theoretical properties of skew-symmetric approximations
3.1 Finite sample properties and optimality
3.2 Asymptotic properties
4. Simulation studies
4.1 One-dimensional Poisson model
4.2 High-dimensional Poisson regression
5. Real-data applications
5.1 Zero-inflated negative binomial regression
5.2 Hierarchical semi-parametric logistic regression
...and 10 more sections

Key Result

Lemma B.1

Let $K_n = \{ {\boldsymbol \theta} \in \Theta \, : \, \| {\boldsymbol \theta} - {\boldsymbol \theta} _0\| < M_n \sqrt{d/n}\}$. Then, under Assumptions cond:4, cond:m1 and cond:m3, we have, for $c_0>0$ sufficiently large (not depending on $n$ and $d$) and $M_n = \sqrt{c_0 \log n}$, that where $K_n^c$ denotes the complement of $K_n$.

Figures (1)

Figure 1: Empirical comparison of the accuracy achieved by three state-of-the-art Gaussian approximations from the Laplace method, black-box VB and EP, versus the corresponding skew-symmetric perturbations. For three routinely-employed divergences $\mathcal{D}$ (TV, KL, reverse-KL) the first three panels display the boxplots of $\mathcal{D}(\pi_{j,n} \,\|\, q_{j,n})$, $j=1, \ldots, 62$, where $q_{j,n}$ is the $j$th marginal of either $\bar{q}_{n,\boldsymbol{\theta}^*}$ (Gaussian) or $q_{n,\boldsymbol{\theta}^*}$ (Skew-symmetric). The fourth panel shows instead the boxplot of the absolute differences between the approximated and actual posterior means of the $d=62$ standardized parameters (standardization proceeds as discussed in the caption of Table \ref{tab_high_marg_0}).

Theorems & Definitions (8)

proof
proof
proof
proof
Lemma B.1: Posterior contraction
proof
Lemma B.2
proof
