Geometric ergodicity of Gibbs samplers for linear latent models with GIG variance mixtures

Elsiddig Awadelkarim; David Bolin; Xiaotian Jin; Alexandre B. Simas; Jonas Wallin

TL;DR

This work establishes geometric ergodicity of Gibbs samplers for linear latent non-Gaussian models (LLnGMs) with generalized inverse Gaussian (GIG) variance mixtures. In interior parameter regimes it proves that the associated Markov operator is trace-class; in boundary regimes it uses a drift and minorization argument under an explicit null-smallness condition. The results cover the full GIG parameter space, including generalized hyperbolic (GH) special cases such as the normal-inverse Gaussian (NIG) and generalized asymmetric Laplace (GAL) distributions, and they support the validity of Gibbs-based stochastic-gradient estimators for maximum likelihood. A non-centered parameterization is shown to improve integrability of score functions, enabling Rao–Blackwellized SGD with ergodic guarantees. Numerical experiments corroborate the theory, illustrating how mixing varies across parameter regimes and how null-smallness affects convergence along null directions. The framework gives a theoretically grounded route to inference in a broad class of latent non-Gaussian models, together with practical guidance for algorithm design.

Abstract

We study geometric ergodicity of the Gibbs sampler for linear latent non-Gaussian models (LLnGMs), a class of hierarchical models in which conditional Gaussian structure is preserved through generalized inverse Gaussian (GIG) variance-mixture augmentation. Two complementary routes to geometric ergodicity are developed for the marginal chain on the mixing variables. First, we show that the associated Markov operator is trace-class, and hence admits a spectral gap, over a large portion of the GIG parameter space. Second, for the remaining boundary and heavy-tail regimes, we establish geometric ergodicity via drift and minorization, subject to an explicit null-smallness condition that quantifies how the drift interacts with the null space of the observation operator. Together, these results cover the full GIG parameter space, including the normal-inverse Gaussian, generalized asymmetric Laplace, and Student-$t$ special cases. The geometric ergodicity of this chain underpins the consistency of Gibbs-based stochastic-gradient estimators for maximum likelihood estimation, and we provide conditions that make the required integrability checks transparent. Numerical experiments illustrate the theoretical findings, contrasting mixing efficiency across parameter regimes and probing the role of the null-smallness constant.

Paper Structure

This paper contains 35 sections, 33 theorems, 269 equations, 1 figure, 4 tables, 1 algorithm.

Introduction
Geometric Ergodicity of Markov Chains
Latent Gaussian models and the non-Gaussian extensions
Stochastic Gradient Descent
Main contributions and outline
The Gibbs sampler and main results of geometric ergodicity
The Gibbs sampler
Gibbs sampling from linear operator's viewpoint
Main results on trace-class and geometric ergodicity
Stochastic gradient estimation
Likelihood gradient via Fisher's identity
Rao–Blackwellized gradient estimator and Markov chain ergodicity
Integrability of score functions and the role of parametrization
Summary
Proof of the main result
...and 20 more sections

Key Result

Proposition 2.1

Under the centered parameterization (eq:centered-combined), the conditional distribution of $W \mid V, Y$ is Gaussian with precision matrix $Q = \sigma^{-2} K^{\top} D_{V}^{-1} K + \sigma_{\epsilon}^{-2} A^{\top} A$, where $D_{V} = \operatorname{diag}(V)$. Under the non-centered parameterization (eq:non-centered-combined), the conditional distribution of $M \mid V, Y$ is Gaussian with precision matrix $Q = \sigma_{\epsilon}^{-2} B^{\top} B + \sigma^{-2} D_{V}^{-1}$, where $B = A \ldots$
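
As a rough illustration of the centered-parameterization matrix $Q$ in Proposition 2.1, the sketch below assembles $Q = \sigma^{-2} K^{\top} D_{V}^{-1} K + \sigma_{\epsilon}^{-2} A^{\top} A$ for small dense matrices using only the Python standard library. The function names (`matmul`, `transpose`, `centered_precision`) and the toy values of $K$, $A$ and $V$ are assumptions made for this example and are not taken from the paper. The remaining parameters of the conditional distribution are not shown in this excerpt, so only $Q$ is built.

```python
def transpose(X):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


def matmul(X, Y):
    """Dense matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def centered_precision(K, A, V, sigma, sigma_eps):
    """Assemble Q = sigma^{-2} K^T D_V^{-1} K + sigma_eps^{-2} A^T A.

    D_V = diag(V), so D_V^{-1} K scales row i of K by 1 / V[i].
    """
    dinv_k = [[k / v for k in row] for row, v in zip(K, V)]
    prior_part = matmul(transpose(K), dinv_k)   # K^T D_V^{-1} K
    data_part = matmul(transpose(A), A)         # A^T A
    n = len(prior_part)
    return [[prior_part[i][j] / sigma ** 2 + data_part[i][j] / sigma_eps ** 2
             for j in range(n)]
            for i in range(n)]


# Hypothetical toy inputs: a 2x2 operator K, one observation row A,
# and positive mixing variables V.
K = [[1.0, 0.0], [-1.0, 1.0]]
A = [[1.0, 1.0]]
V = [2.0, 0.5]
Q = centered_precision(K, A, V, sigma=1.0, sigma_eps=1.0)
```

In a realistic setting $K$ and $A$ would typically be sparse, and a sparse linear algebra library would be a more natural choice than nested lists; the dense version is used here only to keep the example self-contained.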

Figures (1)

Figure 1: Simulation S2: mixing efficiency along the null-smallness scan induced by varying $\mu$ while keeping $A$ fixed. The figure reports IACT for four monitored summaries: $S_+$ (right-tail), $S_-$ and $S_{\log}$ (boundary-sensitive), and the null-direction statistic $T_{\mathrm{null}}$.
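
The IACT (integrated autocorrelation time) reported in Figure 1 can be estimated from a scalar chain of monitored values. The sketch below is one simple estimator, $\hat\tau = 1 + 2\sum_{k \ge 1} \hat\rho_k$, with the sum truncated at the first non-positive sample autocorrelation. That truncation rule is an assumption of this example; the excerpt does not say which estimator the authors use. The helper `ar1_chain` is a hypothetical stand-in for sampler output, not one of the paper's summaries.

```python
import random


def iact(chain, max_lag=None):
    """Estimate the integrated autocorrelation time of a scalar chain.

    Uses tau = 1 + 2 * sum_k rho_k, stopping at the first lag whose
    sample autocorrelation is non-positive (a simple truncation rule).
    """
    n = len(chain)
    mean = sum(chain) / n
    centered = [x - mean for x in chain]
    var = sum(c * c for c in centered) / n
    if var == 0.0:
        return 1.0  # a constant chain carries no autocorrelation signal
    if max_lag is None:
        max_lag = n // 2
    tau = 1.0
    for k in range(1, max_lag + 1):
        rho = sum(centered[i] * centered[i + k] for i in range(n - k)) / (n * var)
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return tau


def ar1_chain(phi, n, seed):
    """Hypothetical test chain: x_t = phi * x_{t-1} + standard normal noise."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


slow = iact(ar1_chain(0.9, 5000, seed=0))   # strongly autocorrelated chain
fast = iact(ar1_chain(0.0, 5000, seed=0))   # independent draws
```

For a stationary AR(1) chain the theoretical value is $(1+\phi)/(1-\phi)$, so the strongly autocorrelated chain should give a much larger estimate than the independent one; larger IACT means fewer effective samples per iteration.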

Theorems & Definitions (62)

Proposition 2.1
Proposition 2.2: Properties of the $V$-marginal Markov operator
proof
Proposition 2.3: Aperiodicity and Harris recurrence
proof
Proposition 2.4: Trace-class operators have an $L_2(\pi)$ spectral gap
proof
Proposition 2.5: Equivalence of transition kernels
proof
Theorem 2.1: Trace-class property
...and 52 more
