Adaptive Heterogeneous Mixtures of Normalising Flows for Robust Variational Inference

Benjamin Wiriyapong; Oktay Karakuş; Kirill Sidorov

TL;DR

The paper addresses the brittleness of variational inference on multimodal posteriors by proposing Adaptive Mixture Flow Variational Inference (AMF-VI), a two-stage framework that combines heterogeneous normalising flows (MAF, RealNVP, RBIG) with mixture weights estimated by a likelihood-driven moving average on fresh data. AMF-VI trains diverse experts independently, then adapts their mixture weights without per-sample gating, effectively performing data-driven Bayesian model averaging over architectural priors. Across six canonical 2D posterior families, AMF-VI achieves consistently lower negative log-likelihood (NLL) and robust transport and discrepancy metrics (e.g., $W_2$, MMD, and $\mathrm{KL}(p\,\|\,q)$) while maintaining non-collapsed, interpretable weight allocations ($N_{\mathrm{eff}} \in [2.1, 2.99]$). This approach provides a practical, architecture-agnostic path to robust multimodal variational inference that preserves each expert's inductive bias with minimal training overhead.

Abstract

Normalising-flow variational inference (VI) can approximate complex posteriors, yet single-flow models often behave inconsistently across qualitatively different distributions. We propose Adaptive Mixture Flow Variational Inference (AMF-VI), a heterogeneous mixture of complementary flows (MAF, RealNVP, RBIG) trained in two stages: (i) sequential expert training of individual flows, and (ii) adaptive global weight estimation via likelihood-driven updates, without per-sample gating or architectural changes. Evaluated on six canonical posterior families (banana, X-shape, two-moons, rings, a bimodal mixture, and a five-mode mixture), AMF-VI achieves consistently lower negative log-likelihood than each single-flow baseline and delivers stable gains in transport metrics (Wasserstein-2) and maximum mean discrepancy (MMD), indicating improved robustness across shapes and modalities. The procedure is efficient and architecture-agnostic, incurring minimal overhead relative to standard flow training, and demonstrates that adaptive mixtures of diverse flows provide a reliable route to robust VI across diverse posterior families whilst preserving each expert's inductive bias.
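To make the second stage concrete, the following is a minimal sketch of one way global mixture weights could be adapted from expert likelihoods on a fresh batch. The abstract does not give the exact update rule, so the softmax target, the `momentum` parameter, and the function names `update_weights`, `mixture_log_density`, and `effective_experts` are all assumptions made for illustration. The inverse-sum-of-squares definition of the effective number of experts is likewise an assumption, chosen because it reaches 3 for three equally weighted experts.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def update_weights(weights, expert_mean_log_liks, momentum=0.9):
    """One moving-average update of global (not per-sample) mixture weights.

    expert_mean_log_liks[k] is the mean log-likelihood that frozen expert k
    assigns to a fresh batch. The softmax target is an assumed choice.
    """
    log_z = log_sum_exp(expert_mean_log_liks)
    target = [math.exp(ll - log_z) for ll in expert_mean_log_liks]
    # Exponential moving average towards the batch-level target allocation.
    mixed = [momentum * w + (1.0 - momentum) * t
             for w, t in zip(weights, target)]
    total = sum(mixed)
    return [w / total for w in mixed]


def mixture_log_density(weights, expert_log_densities):
    """log q(x) = log sum_k w_k q_k(x) at a single point x."""
    terms = [math.log(w) + lp
             for w, lp in zip(weights, expert_log_densities) if w > 0.0]
    return log_sum_exp(terms)


def effective_experts(weights):
    """Assumed definition: inverse of the sum of squared weights."""
    return 1.0 / sum(w * w for w in weights)


# Usage: three frozen experts, starting from uniform weights.
weights = [1.0 / 3.0] * 3
weights = update_weights(weights, [-1.0, -2.0, -3.0])
```

Because the experts stay frozen in this sketch, only the weight vector changes, which is why each expert keeps its own inductive bias and the extra cost is limited to evaluating expert log-likelihoods on the fresh batch.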
