Amortized Bayesian Mixture Models
Šimon Kucharský, Paul Christian Bürkner
TL;DR
This work addresses fast, joint Bayesian inference for finite mixture models in settings where likelihoods are intractable. It extends Amortized Bayesian Inference by factorizing the posterior into $p(\theta \mid y)$ and $p(z \mid y, \theta)$ and training neural surrogates to estimate both continuous parameters and discrete mixture indicators. Using normalizing flows for parameter posteriors and classification networks for mixture memberships, the approach supports both independent and dependent mixtures with filtering and smoothing, trained end-to-end on simulated data. Case studies across synthetic Gaussian mixtures, Gaussian HMMs, and cognitive-switch data demonstrate posterior and classification results that closely match Stan/MCMC while offering substantial speedups, with the BayesFlow implementation publicly available.
Abstract
Finite mixtures are a broad class of models useful in scenarios where observed data is generated by multiple distinct processes but without explicit information about the responsible process for each data point. Estimating Bayesian mixture models is computationally challenging due to issues such as high-dimensional posterior inference and label switching. Furthermore, traditional methods such as MCMC are applicable only if the likelihoods for each mixture component are analytically tractable. Amortized Bayesian Inference (ABI) is a simulation-based framework for estimating Bayesian models using generative neural networks. This allows the fitting of models without explicit likelihoods, and provides fast inference. ABI is therefore an attractive framework for estimating mixture models. This paper introduces a novel extension of ABI tailored to mixture models. We factorize the posterior into a distribution of the parameters and a distribution of (categorical) mixture indicators, which allows us to use a combination of generative neural networks for parameter inference, and classification networks for mixture membership identification. The proposed framework accommodates both independent and dependent mixture models, enabling filtering and smoothing. We validate and demonstrate our approach through synthetic and real-world datasets.
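To make the factorization concrete: by the chain rule, the joint posterior splits as $p(\theta, z \mid y) = p(\theta \mid y)\, p(z \mid y, \theta)$. For an independent Gaussian mixture with tractable likelihoods, the second factor is available in closed form through Bayes' rule, and this is the quantity a classification network would have to approximate when no closed form exists. The sketch below computes that exact reference in NumPy. The function name `membership_posterior` and its arguments are illustrative assumptions, not part of BayesFlow or of the authors' implementation.

```python
import numpy as np


def membership_posterior(y, weights, means, sds):
    """Exact p(z_i = k | y_i, theta) for an independent Gaussian mixture.

    theta is represented here by (weights, means, sds); this parameterization
    is an assumption made for illustration only.
    """
    y = np.asarray(y, dtype=float)[:, None]        # shape (n, 1)
    weights = np.asarray(weights, dtype=float)     # shape (K,)
    means = np.asarray(means, dtype=float)         # shape (K,)
    sds = np.asarray(sds, dtype=float)             # shape (K,)

    # log( weight_k * Normal(y_i | mean_k, sd_k) ), shape (n, K)
    log_joint = (
        np.log(weights)
        - np.log(sds)
        - 0.5 * np.log(2.0 * np.pi)
        - 0.5 * ((y - means) / sds) ** 2
    )

    # Normalize over components; subtract the row maximum for numerical stability.
    log_joint -= log_joint.max(axis=1, keepdims=True)
    probs = np.exp(log_joint)
    return probs / probs.sum(axis=1, keepdims=True)


# Two equally weighted components centered at -2 and +2.
probs = membership_posterior(
    y=[-2.0, 0.0, 2.0],
    weights=[0.5, 0.5],
    means=[-2.0, 2.0],
    sds=[1.0, 1.0],
)
```

Each row of `probs` is a categorical distribution over components for one observation. In a fully Bayesian treatment these conditional probabilities would additionally be averaged over draws of $\theta$ from $p(\theta \mid y)$ to obtain marginal membership probabilities.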
