Robust Bayesian Inference on Riemannian Submanifold

Rong Tang; Anirban Bhattacharya; Debdeep Pati; Yun Yang

TL;DR

This work addresses uncertainty quantification for parameters constrained to Riemannian submanifolds by embedding manifold structure into Bayesian analysis through manifold-supported priors and a robust RPETEL posterior. It introduces a manifold Bernstein–von Mises framework for projected posteriors and demonstrates three robustness benefits: calibrated uncertainty without a fully specified likelihood, meaningful inference when the unconstrained minimizer lies off the manifold, and preserved efficiency under correct specification. Computationally, the authors develop the Riemannian Random-Walk Metropolis (RRWM) sampler and prove that its mixing time scales almost linearly with the intrinsic dimension $d$, independent of the ambient dimension $D$. Numerical experiments on multiple quantile regression, spectral projector estimation, and diffusion-tensor mean inference validate the approach, showing improved uncertainty calibration and sample efficiency when the manifold structure is exploited. Collectively, the paper provides a rigorous, robust, and computationally efficient framework for Bayesian inference on non-Euclidean parameter spaces with broad applicability in statistics and data science.
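The role of the manifold when the unconstrained minimizer lies off it can be illustrated with a small sketch. For an embedded submanifold with the induced metric, the Riemannian gradient is the orthogonal projection of the Euclidean gradient onto the tangent space. The example below is not taken from the paper: the line-shaped manifold, the quadratic risk $\frac{1}{2}\|\theta-c\|^2$, and all function names are assumptions chosen for illustration. It shows that at the constrained minimizer the Riemannian gradient vanishes while the Euclidean gradient does not.

```python
import numpy as np

# Hypothetical setup (not from the paper): a line-shaped manifold
# M = {theta in R^2 : theta_2 = SLOPE * theta_1 + INTERCEPT}.
SLOPE, INTERCEPT = 0.9, 0.1
TANGENT = np.array([1.0, SLOPE]) / np.hypot(1.0, SLOPE)  # unit tangent vector of M

# Hypothetical quadratic risk R(theta) = 0.5 * ||theta - C||^2.
# Its unconstrained minimizer C lies off M because INTERCEPT != 0.
C = np.array([0.0, 0.0])


def euclidean_grad(theta):
    """Gradient of the quadratic risk in the ambient space R^2."""
    return theta - C


def riemannian_grad(theta):
    """Orthogonal projection of the Euclidean gradient onto the tangent line of M."""
    g = euclidean_grad(theta)
    return TANGENT * (TANGENT @ g)


def constrained_minimizer():
    """Minimizer of the quadratic risk over M, i.e. the projection of C onto the line."""
    base_point = np.array([0.0, INTERCEPT])  # a point on M
    return base_point + TANGENT * (TANGENT @ (C - base_point))


theta_star = constrained_minimizer()
print("constrained minimizer:", theta_star)
print("Euclidean gradient norm:", np.linalg.norm(euclidean_grad(theta_star)))
print("Riemannian gradient norm:", np.linalg.norm(riemannian_grad(theta_star)))
```

Under these assumptions, a first-order condition built from the Euclidean gradient is not satisfied anywhere on the manifold, whereas the projected gradient identifies the constrained minimizer.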

Abstract

Manifold-valued parameters routinely arise in modern statistical applications such as medical imaging, robotics, and computer vision, to name a few. While traditional Bayesian approaches are applicable to such settings by treating an ambient Euclidean space as the parameter space, we demonstrate the benefits of integrating manifold structure into the Bayesian framework, both theoretically and computationally. Moreover, existing Bayesian approaches designed specifically for manifold-valued parameters are primarily model-based and are therefore typically subject to inaccurate uncertainty quantification under model misspecification. In this article, we propose a robust, model-free Bayesian inference method for parameters defined on a Riemannian submanifold, which is shown to provide valid uncertainty quantification from a frequentist perspective. Computationally, we propose a Markov chain Monte Carlo algorithm to sample from the posterior on the Riemannian submanifold, where the mixing time, in the large sample regime, is shown to depend only on the intrinsic dimension of the parameter space instead of the potentially much larger ambient dimension. Our numerical results demonstrate the effectiveness of our approach on a variety of problems, such as multiple quantile regression, reduced-rank regression, and Fréchet mean estimation.
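The general idea of a random-walk Metropolis sampler that moves along a manifold can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' RRWM algorithm, whose details are not given in this summary: the manifold is taken to be the unit sphere, the proposal is a Gaussian step in the tangent space followed by the exponential map, and the target is a von Mises-Fisher-type density with hypothetical parameters `mu` and `kappa`. On the sphere this proposal is symmetric by rotational symmetry, so the plain Metropolis acceptance ratio is used; other manifolds may need a proposal-density correction.

```python
import numpy as np


def exp_map_sphere(x, v):
    """Exponential map on the unit sphere: follow the geodesic from x along tangent v."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return x.copy()
    return np.cos(norm_v) * x + np.sin(norm_v) * (v / norm_v)


def random_walk_metropolis_sphere(log_density, x0, n_steps, step, rng):
    """Random-walk Metropolis on the unit sphere (illustrative sketch only)."""
    x = np.asarray(x0, dtype=float)
    x = x / np.linalg.norm(x)
    samples = np.empty((n_steps, x.size))
    accepted = 0
    for t in range(n_steps):
        z = rng.standard_normal(x.size)
        v = step * (z - (z @ x) * x)      # Gaussian step projected onto the tangent space at x
        y = exp_map_sphere(x, v)
        y = y / np.linalg.norm(y)         # guard against numerical drift off the sphere
        # Symmetric proposal on the sphere: accept with the plain Metropolis ratio.
        if np.log(rng.uniform()) < log_density(y) - log_density(x):
            x = y
            accepted += 1
        samples[t] = x
    return samples, accepted / n_steps


# Hypothetical target: density proportional to exp(kappa * mu . x) on the sphere S^2.
mu = np.array([0.0, 0.0, 1.0])
kappa = 10.0
rng = np.random.default_rng(0)
samples, acc = random_walk_metropolis_sphere(
    lambda x: kappa * (mu @ x), x0=[1.0, 0.0, 0.0], n_steps=5000, step=0.3, rng=rng
)

# Mean direction after discarding a short burn-in.
mean_dir = samples[500:].mean(axis=0)
mean_dir = mean_dir / np.linalg.norm(mean_dir)
print("acceptance rate:", acc)
print("estimated mean direction:", mean_dir)
```

Every iterate stays on the sphere by construction, and the random step is drawn in the two-dimensional tangent space rather than in the three-dimensional ambient space.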

Paper Structure (47 sections, 24 theorems, 427 equations, 5 figures, 4 tables, 6 algorithms)

Introduction
Notation
Preliminary
Bayesian Inference with Manifold-supported Priors
Robust Bayesian Inference on Manifold
Posterior Sampling on Riemannian Submanifold
Riemannian random-walk Metropolis (RRWM) algorithm
Mixing time analysis of RRWM algorithm for Bayesian RPETEL sampling
Numerical Illustration
Multiple quantile modeling with common slopes
Spectral projectors of covariance matrices
Mean parameter inference for diffusion tensors
Conclusion and Discussion
Notions in Riemannian Submanifold
Mixing time analysis
...and 32 more sections

Key Result

Theorem 1

Suppose Assumptions 1-4 hold. Let $\widehat{\theta}:\mathcal{X}^n\to S_{\Pi}$ denote the empirical risk minimizer, that is, $\widehat{\theta}(X^{(n)})\in {\arg\min}_{\theta\in S_{\Pi}}\frac{1}{n}\sum_{i=1}^n\ell(X_i,\theta)$. Then, there exists a set $\mathcal{A}\subset \mathcal{X}^{n}$ with $\mathb

Figures (5)

Figure 2: The figures compare Bayesian RPETEL and Bayesian PETEL on a toy example with loss function $\ell(x,\theta)=\sum_{j=1}^2 \exp(-(\theta_j-0.5)^2)\cdot x_j-\exp(-\theta_j^2)$ for $x=(x_1,x_2)^T$ and $\theta=(\theta_1,\theta_2)^T$. The parameter is constrained to the line-shaped manifold $\mathcal{M}=\{(\theta_1,\theta_2)\in \mathbb R^2\,:\,\theta_2=0.9\theta_1+0.1\}$. We draw $n=1000$ i.i.d. samples from $\mathcal{N}(0_2,I_2)$ and use a uniform prior on $\mathcal{M}\cap[-2,2]^2$. The Bayesian RPETEL posterior (BRPETEL) builds the ETEL function from the Riemannian gradient on $\mathcal{M}$, while the Bayesian PETEL posterior (BPETEL) uses the Euclidean gradient in $\mathbb R^2$ to construct the ETEL. Figure (a) shows the population risk $\mathcal{R}(\theta)=\mathbb{E}[\ell(X,\theta)]$ for $\theta\in[-2,2]^2$. The blue dot marks the Euclidean risk minimizer on $[-2,2]^2$, which lies very close to but not exactly on $\mathcal{M}$. The green and orange dots indicate the posterior modes of BRPETEL and BPETEL, respectively, when $\alpha_n=\log n$; the BRPETEL mode sits close to the true risk minimizer, whereas the BPETEL mode is noticeably farther away. Figure (b) plots the marginal densities of $\theta_1$ for both posteriors under $\alpha_n=\log n$ and $\alpha_n=0$. The gray line marks $\theta_1^*$, the $\theta_1$-coordinate of the risk minimizer on $\mathcal{M}$. When $\alpha_n=0$, both posteriors are multimodal because the risk function is nonconvex. In particular, ${\rm grad}_{\theta}\mathcal{R}(\theta)=0$ has two solutions on $\mathcal{M}$: one near $(-0.545,0.391)$ corresponding to the risk minimizer on $\mathcal{M}$, and another near $(0.975,0.977)$ corresponding to a local risk maximizer. With $\alpha_n=0$, the BRPETEL exhibits two comparable modes around these points, while the BPETEL with $\alpha_n=0$ shows a poorly interpretable shape and allocates little mass near $\theta_1^*$. When $\alpha_n=\log n$, the BRPETEL concentrates around $\theta_1^*$, whereas the BPETEL still fails to place meaningful mass there.
Figure 3: Density plots of the fractional anisotropy (FA) computed from posterior samples obtained using Bayesian RPETEL and Wishart modeling in the extrinsic mean example. The plot overlays ten experimental runs. The red vertical line indicates the FA of the empirical risk minimizer $\widehat{\theta}$ computed from the original dataset.
Figure : (a) Density plot: Correctly specified likelihood
Figure : (b) Density plot: Misspecified likelihood
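
Figure 3 summarizes posterior draws of a diffusion tensor through their fractional anisotropy (FA). As a reference for that quantity, the sketch below computes the standard FA of a symmetric positive semidefinite $3\times 3$ tensor from its eigenvalues $\lambda_1,\lambda_2,\lambda_3$, namely $\sqrt{3/2}\,\sqrt{\sum_i(\lambda_i-\bar\lambda)^2}\,/\sqrt{\sum_i\lambda_i^2}$. The function name and the randomly generated stand-in "draws" are assumptions for illustration; they are not the paper's code or data.

```python
import numpy as np


def fractional_anisotropy(tensor):
    """Standard FA of a symmetric 3x3 tensor; assumes the tensor is not identically zero."""
    sym = 0.5 * (tensor + tensor.T)              # symmetrize to absorb rounding error
    eigvals = np.linalg.eigvalsh(sym)
    deviation = np.sqrt(np.sum((eigvals - eigvals.mean()) ** 2))
    magnitude = np.sqrt(np.sum(eigvals ** 2))
    return np.sqrt(1.5) * deviation / magnitude


# Two sanity checks: an isotropic tensor and a fully anisotropic (rank-one) tensor.
fa_isotropic = fractional_anisotropy(np.eye(3))
fa_rank_one = fractional_anisotropy(np.diag([1.0, 0.0, 0.0]))

# Hypothetical stand-in for posterior draws: random symmetric positive definite matrices.
rng = np.random.default_rng(1)
draws = [a @ a.T + 0.1 * np.eye(3) for a in rng.standard_normal((200, 3, 3))]
fa_draws = np.array([fractional_anisotropy(d) for d in draws])
print("FA range over stand-in draws:", fa_draws.min(), fa_draws.max())
```

Applying the function to each posterior draw yields a one-dimensional sample whose density can be plotted in the manner Figure 3 describes.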

Theorems & Definitions (36)

Definition 1: Submanifold
Definition 2: Local $C^3$-Smoothness
Theorem 1: Manifold BvM Theorem for Gibbs Posterior
Corollary 1: BvM Result under Correct Model Specification
Theorem 2: Manifold BvM theorem for Bayesian RPETEL
Corollary 2: Validity of Wald-type Credible Region
Corollary 3: Bayesian RPETEL under Correct Model Specification
Remark 1
Corollary 4
Remark 2
...and 26 more
