Brain-like Variational Inference
Hadi Vafaii, Dekel Galor, Jacob L. Yates
TL;DR
This work presents FOND, a framework that derives brain-like inference dynamics by performing online natural-gradient descent on variational free energy, unifying the neuroscience and machine learning perspectives on inference. It applies FOND to derive iterative VAEs, including iP-VAE, a spiking model that uses Poisson latents and membrane-potential dynamics to perform online inference with lateral competition. Empirically, iterative VAEs achieve better reconstruction-sparsity trade-offs, learn cortex-like features, and generalize robustly to out-of-distribution data, while scaling to high-dimensional color images. The results show both theoretical consistency with the free energy principle and practical gains in efficiency and generalization, with potential implications for neuromorphic hardware.
Abstract
Inference in both brains and machines can be formalized by optimizing a shared objective: maximizing the evidence lower bound (ELBO) in machine learning, or minimizing variational free energy (F) in neuroscience (ELBO = -F). While this equivalence suggests a unifying framework, it leaves open how inference is implemented in neural systems. Here, we introduce FOND (Free energy Online Natural-gradient Dynamics), a framework that derives neural inference dynamics from three principles: (1) natural gradients on F, (2) online belief updating, and (3) iterative refinement. We apply FOND to derive iP-VAE (iterative Poisson variational autoencoder), a recurrent spiking neural network that performs variational inference through membrane potential dynamics, replacing amortized encoders with iterative inference updates. Theoretically, iP-VAE yields several desirable features, such as emergent normalization via lateral competition and hardware-efficient integer spike-count representations. Empirically, iP-VAE outperforms both standard VAEs and Gaussian-based predictive coding models in sparsity, reconstruction, and biological plausibility, and scales to complex color image datasets such as CelebA. iP-VAE also exhibits strong generalization to out-of-distribution inputs, exceeding hybrid iterative-amortized VAEs. These results demonstrate how deriving inference algorithms from first principles can yield concrete architectures that are simultaneously biologically plausible and empirically effective.
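The identity ELBO = -F stated above can be checked numerically on a small discrete model. The sketch below is illustrative only and is not the paper's model: the toy joint probabilities, the approximate posterior q, and the helper names free_energy and kl are assumptions introduced here. It uses the standard definition F = E_q[log q(z) - log p(x, z)] and confirms that F equals KL(q || p(z|x)) - log p(x), so that -F lower-bounds the log evidence.

```python
import math

# Hypothetical toy model (an assumption, not from the paper): one fixed
# observation x and a latent z with three states. joint[z] holds p(x, z).
joint = [0.10, 0.25, 0.05]
# An arbitrary approximate posterior q(z | x); any valid distribution works.
q = [0.2, 0.5, 0.3]


def free_energy(q, joint):
    """Variational free energy F = E_q[log q(z) - log p(x, z)]."""
    return sum(qz * (math.log(qz) - math.log(pz))
               for qz, pz in zip(q, joint) if qz > 0)


def kl(q, p):
    """KL divergence KL(q || p) between two discrete distributions."""
    return sum(qz * (math.log(qz) - math.log(pz))
               for qz, pz in zip(q, p) if qz > 0)


evidence = sum(joint)                          # p(x), marginalizing over z
posterior = [pz / evidence for pz in joint]    # exact posterior p(z | x)

F = free_energy(q, joint)
elbo = -F                                      # ELBO = -F by definition
gap = kl(q, posterior)                         # slack between ELBO and log p(x)

print(f"F = {F:.6f}, ELBO = {elbo:.6f}, log p(x) = {math.log(evidence):.6f}")
print(f"KL(q || posterior) = {gap:.6f}")
```

Setting q equal to the exact posterior drives the KL term to zero, so the ELBO then equals log p(x). Under this reading, iterative inference amounts to repeatedly adjusting q to shrink that gap rather than predicting q in a single amortized pass.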
