Nonparametric Automatic Differentiation Variational Inference with Spline Approximation

Yuda Shao; Shan Yu; Tianshu Feng

TL;DR

This paper introduces S-ADVI, a spline-based nonparametric variational inference framework that replaces parametric posteriors with spline mixtures to capture complex posterior shapes, including skewness, multimodality, and bounded support. It derives a spline representation for the posterior, provides a theoretical analysis establishing a lower bound for the importance weighted autoencoder (IWAE) and bounds on the KL divergence between the spline approximation and the true posterior, and discusses adaptive boundary handling and regularization via a roughness penalty. The authors implement a practical training procedure that uses a concrete distribution with annealing to sample from spline mixtures and applies the reparameterization trick for backpropagation. Empirical results on simulated cases and real data (e.g., FMNIST, MNIST, CIFAR-10) show that S-ADVI can outperform Gaussian-ADVI and GM-ADVI in posterior recovery and, in many settings, achieves competitive or superior reconstruction and classification performance, with interpretable spline coefficients offering insight into latent-variable shapes. Overall, S-ADVI combines flexibility, interpretability, and theoretical guarantees, supporting Bayesian modeling of complex posteriors and generative tasks with incomplete data.
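The concrete-distribution sampling step mentioned above can be illustrated with a minimal sketch. It draws a relaxed one-hot selection over mixture components using Gumbel noise and a temperature, and shows how lowering the temperature pushes samples toward a single component. The mixture weights, the temperature values, and the function name `sample_concrete` are illustrative assumptions, not the authors' implementation.

```python
import math
import random

def sample_concrete(log_weights, temperature, rng):
    """Draw one relaxed one-hot vector from a concrete (Gumbel-softmax) distribution."""
    # Gumbel(0, 1) noise via the inverse CDF: g = -log(-log(u)), u ~ Uniform(0, 1).
    # The noise does not depend on log_weights, which is what makes the
    # sample a deterministic, differentiable function of the weights.
    gumbels = [
        -math.log(-math.log(rng.uniform(1e-12, 1.0 - 1e-12)))
        for _ in log_weights
    ]
    logits = [(lw + g) / temperature for lw, g in zip(log_weights, gumbels)]
    # Numerically stable softmax over the perturbed, tempered logits.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
# Hypothetical mixture weights over three components (assumption).
log_weights = [math.log(w) for w in (0.5, 0.3, 0.2)]

# Hypothetical annealing schedule: high temperature first, then lower.
for temperature in (5.0, 1.0, 0.1):
    sample = sample_concrete(log_weights, temperature, rng)
    print(temperature, [round(p, 3) for p in sample])
```

This plain-Python version only shows the sampling arithmetic. In an automatic differentiation framework the same expression would let gradients flow from the sample back to the weights, since the randomness enters only through the parameter-free Gumbel noise.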

Abstract

Automatic Differentiation Variational Inference (ADVI) is efficient in learning probabilistic models. Classic ADVI relies on the parametric approach to approximate the posterior. In this paper, we develop a spline-based nonparametric approximation approach that enables flexible posterior approximation for distributions with complicated structures, such as skewness, multimodality, and bounded support. Compared with widely-used nonparametric variational inference methods, the proposed method is easy to implement and adaptive to various data structures. By adopting the spline approximation, we derive a lower bound of the importance weighted autoencoder and establish the asymptotic consistency. Experiments demonstrate the efficiency of the proposed method in approximating complex posterior distributions and improving the performance of generative models with incomplete data.

Paper Structure (28 sections, 6 theorems, 18 equations, 18 figures, 2 tables)

INTRODUCTION
BACKGROUND
Variational Inferences
Spline Approximation
NONPARAMETRIC POSTERIOR APPROXIMATION WITH SPLINE
Spline Automatic Differentiation Variational Inference (S-ADVI)
Model Estimation
PROPERTIES OF S-ADVI
RELATED WORKS
RESULTS
Posterior Approximation
Real Data Applications
DISCUSSION
Supplementary Materials for "Nonparametric Automatic Differentiation Variational Inference with Spline Approximation"
ADDITIONAL LEMMAS AND PROOF DETAILS
...and 13 more sections

Key Result

Lemma 2.1

For any function $\psi \in \mathcal{H}^{(\varrho)}(\mathcal{T})$, there exists a spline $\psi^{\ast} \in \mathcal{U}$, such that $\sup_{z \in \mathcal{T}}|\psi^{\ast}(z)-\psi(z)|\leq C H^{-(\varrho+1)}$ for some positive constant $C$.
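The rate in the lemma can be checked numerically in a simple special case. The sketch below measures the sup-norm error of a piecewise-linear spline interpolant with $H$ equally spaced interior knots on $[0, 1]$. For a twice continuously differentiable function, classical interpolation theory gives an error of order $H^{-2}$, which has the same form as the lemma's bound when $\varrho + 1 = 2$. The test function, the knot counts, and the helper name are assumptions chosen for illustration, and interpolation is used here in place of the best approximating spline $\psi^{\ast}$.

```python
import math

def linear_spline_sup_error(psi, num_interior_knots, grid_size=20001):
    """Sup-norm error (on a fine grid) of the piecewise-linear interpolant
    of psi on [0, 1] with equally spaced knots."""
    n_intervals = num_interior_knots + 1
    knots = [i / n_intervals for i in range(n_intervals + 1)]
    values = [psi(t) for t in knots]
    worst = 0.0
    for g in range(grid_size):
        z = g / (grid_size - 1)
        # Index of the knot interval containing z (clamped at the right end).
        i = min(int(z * n_intervals), n_intervals - 1)
        w = (z - knots[i]) / (knots[i + 1] - knots[i])
        approx = (1.0 - w) * values[i] + w * values[i + 1]
        worst = max(worst, abs(approx - psi(z)))
    return worst

def psi(z):
    # Smooth test function (assumption for illustration).
    return math.sin(2.0 * math.pi * z)

# Each step doubles the number of knot intervals (4, 8, 16, 32).
errors = {H: linear_spline_sup_error(psi, H) for H in (3, 7, 15, 31)}
for H, err in errors.items():
    print(H, err)
```

Doubling the number of knot intervals should shrink the printed error by a factor approaching 4 as $H$ grows, consistent with a second-order rate.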

Figures (18)

Figure 1: Illustration of the density functions based on different linear combinations of spline basis functions
Figure 2: Posterior approximation results with S-ADVI, Gaussian-ADVI, and GM-ADVI for Cases 1–5
Figure 3: Performance comparison for single-column classification on the FMNIST dataset with a range of latent variables $J$ and interior knots $H$ (error bars denote the standard error)
Figure 4: Computational budget comparison for single-column classification on the FMNIST dataset with a range of latent variables $J$ and interior knots $H$ (shaded regions represent the standard error)
Figure 5: Analysis of the approximated shape of the posterior from S-VAE, where brighter pixels correspond to higher absolute attributes and regions containing highly attributed pixels are highlighted with red circles
...and 13 more figures

Theorems & Definitions (15)

Lemma 2.1: schumaker2007spline
Remark 2.2
Remark 3.1
Remark 4.1
Lemma 4.1
Theorem 4.2
Theorem 4.3
Remark 4.4
Remark 4.5
Proof: Proof of Lemma \ref{THE:ELBO}
...and 5 more
