Poisson Variational Autoencoder

Hadi Vafaii; Dekel Galor; Jacob L. Yates

TL;DR

The Poisson Variational Autoencoder (P-VAE) addresses the gap between traditional continuous-latent VAEs and the discrete, spike-based coding observed in biology by encoding inputs as discrete Poisson spike counts. It combines predictive coding with a Poisson latent space, introduces a Poisson reparameterization trick, and derives an ELBO in which the KL term becomes a sparsity-promoting penalty, which links the model to amortized sparse coding when the decoder is linear. Empirically, the P-VAE avoids posterior collapse, learns sparse, Gabor-like basis vectors comparable to those of sparse coding, and yields higher-dimensional input representations that improve downstream linear separability and sample efficiency (e.g., about 5x in MNIST classification with limited labels). These results suggest a brain-inspired, interpretable framework for sensory processing that enhances robustness and data efficiency, while highlighting areas for future exploration such as hierarchical architectures and refined inference methods. Mathematically, the model uses posterior rates $\bm{r} \odot \bm{\delta r}(\bm{x})$ and prior rates $\bm{r}$, with a loss component involving $f(y) = 1 - y + y \log y$. This yields a KL term $\bm{r} \cdot f(\bm{\delta r}(\bm{x}))$ that enforces sparsity and, under a linear decoder, reduces to an amortized sparse coding objective that pairs this penalty with a reconstruction error.
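
The form of the sparsity penalty can be checked from the standard KL divergence between two Poisson distributions. The worked derivation below is a sketch for a single latent unit, assuming a prior rate $r$ and a posterior rate written as a multiplicative rescaling $r\,y$ of it; the symbols $\lambda_q$ and $\lambda_p$ are introduced here only for illustration and are not taken from the paper.

```latex
\begin{align}
D_{\mathrm{KL}}\big(\mathrm{Pois}(\lambda_q)\,\|\,\mathrm{Pois}(\lambda_p)\big)
  &= \lambda_q \log\frac{\lambda_q}{\lambda_p} - \lambda_q + \lambda_p \\
  &= r\,y \log y - r\,y + r
     \qquad (\lambda_q = r\,y,\ \lambda_p = r) \\
  &= r\,\big(1 - y + y \log y\big) = r\, f(y).
\end{align}
```

Because $f(y) \ge 0$ with equality only at $y = 1$, the penalty vanishes only when the posterior rate equals the prior rate, and for a fixed ratio $y$ it grows in proportion to $r$. With independent latent units the per-unit terms add up, which gives the dot-product form of the KL term.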

Abstract

Variational autoencoders (VAEs) employ Bayesian inference to interpret sensory inputs, mirroring processes that occur in primate vision across both ventral (Higgins et al., 2021) and dorsal (Vafaii et al., 2023) pathways. Despite their success, traditional VAEs rely on continuous latent variables, which deviates sharply from the discrete nature of biological neurons. Here, we developed the Poisson VAE (P-VAE), a novel architecture that combines principles of predictive coding with a VAE that encodes inputs into discrete spike counts. Combining Poisson-distributed latent variables with predictive coding introduces a metabolic cost term in the model loss function, suggesting a relationship with sparse coding which we verify empirically. Additionally, we analyze the geometry of learned representations, contrasting the P-VAE to alternative VAE models. We find that the P-VAE encodes its inputs in relatively higher dimensions, facilitating linear separability of categories in a downstream classification task with a much better (5x) sample efficiency. Our work provides an interpretable computational framework to study brain-like sensory processing and paves the way for a deeper understanding of perception as an inferential process.

Paper Structure (52 sections, 24 equations, 12 figures, 6 tables, 1 algorithm)

Introduction
Contributions.
Background & Related work
Perception as inference: connections to neuroscience and machine learning.
Efficient, predictive, and sparse coding.
VAE objective.
Amortized inference in VAEs.
VAEs connection to biology.
Discrete VAEs.
VAEs connection to sparse coding.
Introducing the Poisson Variational Autoencoder (P-VAE)
Poisson reparameterization trick.
$\mathcal{P}$-VAE architecture and residual parameterization.
$\mathcal{P}$-VAE loss function.
$\mathcal{P}$-VAE relationship to sparse coding.
...and 37 more sections

Figures (12)

Figure 1: Graphical abstract. Introducing the Poisson Variational Autoencoder ($\mathcal{P}$-VAE), which draws on key concepts in neuroscience. When trained on natural image patches, a $\mathcal{P}$-VAE with a linear decoder develops Gabor-like feature selectivity, reminiscent of sparse coding (Olshausen & Field, 1996). In sharp contrast, the standard Gaussian VAE learns the principal components (Tipping & Bishop, 1999). Our code, data, and model checkpoints are available at this repository: https://github.com/hadivafaii/PoissonVAE
Figure 2: (a) Model architecture. Colored shapes indicate learnable model parameters, including the prior firing rates, $\bm{r}$. We color code the model's inference and generative components using red and blue, respectively. The $\mathcal{P}$-VAE encodes its inputs in discrete spike counts, $\bm{z}$, significantly enhancing its biological realism. (b) "Amortized Sparse Coding" is a special case within the $\mathcal{P}$-VAE model family: it is a $\mathcal{P}$-VAE with a linear decoder and an overcomplete latent space.
Figure 3: Relaxed Poisson distribution. Samples are drawn using \ref{algo:rsample} for $\lambda = 1$. At non-zero temperatures, samples are non-integer, but approach the true Poisson distribution as $T \rightarrow 0$.
Figure 4: Learned basis elements for various $\left< \mathtt{lin} \vert \mathtt{lin} \right>$ VAEs (first two columns) and standard sparse coding models (last column). There are a total of $K = 512$ elements, each made of $16 \times 16 = 256$ pixels (i.e., $\bm{\Phi} \in \mathbb{R}^{256 \times 512}$). Features are ordered from top-left to bottom-right, in ascending order of their associated $\mathtt{KL}$ divergence ($\mathcal{P}$-VAE, $\mathcal{G}$-VAE, $\mathcal{L}$-VAE), or the magnitude of posterior $\mathrm{logits}$ ($\mathcal{C}$-VAE). The sparse coding results (LCA and ISTA) are ordered randomly.
Figure 5: Reconstruction performance vs. sparsity of representations. (a) Results for the VAE model family. The curves are sigmoid fits to the $\left< \mathtt{lin} \vert \mathtt{lin} \right>$ and $\left< \mathtt{conv} \vert \mathtt{lin} \right>$ $\mathcal{P}$-VAE results across varying $\beta$ values ($\beta$ from \ref{eq:sc-pvae-nelbo}). Empty circles correspond to $\left< \mathtt{conv} \vert \mathtt{lin} \right>$ architectures. (b) Amortization gap for the $\mathcal{P}$-VAE (blue open circle) compared to sparse coding (LCA/ISTA). Solid points show results from applying the LCA inference algorithm to $\mathcal{P}$-VAE basis vectors at different sparsity levels ($\beta_\text{LCA}$ from \ref{eq:sparse-coding}). The purple curve is a sigmoid fit, and curves from part (a) are also included for comparison.
...and 7 more figures
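
The relaxed Poisson behavior described in the Figure 3 caption can be illustrated with a small standard-library sketch. It is not the paper's sampling algorithm. It assumes one common construction: event gaps of a Poisson process are exponential, the number of events inside a unit time window is Poisson-distributed, and the hard "event arrived before time 1" test is softened by a sigmoid whose width is the temperature. The function names, the `n_exp` truncation, and the choice of sigmoid are all hypothetical.

```python
import math
import random


def sigmoid(u):
    """Numerically stable logistic function."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def relaxed_poisson_sample(rate, temperature, n_exp=50, rng=random):
    """Draw one (relaxed) Poisson sample with the given rate.

    Assumption: at most `n_exp` events are simulated, which is only
    adequate when `rate` is far below `n_exp`.
    """
    t = 0.0       # running arrival time of the current event
    count = 0.0   # soft (or hard) number of events inside [0, 1]
    for _ in range(n_exp):
        t += rng.expovariate(rate)  # exponential gap to the next event
        if temperature == 0.0:
            # Hard indicator: recovers an integer-valued event count.
            count += 1.0 if t <= 1.0 else 0.0
        else:
            # Soft indicator: close to 1 well before t = 1, close to 0 after.
            count += sigmoid((1.0 - t) / temperature)
    return count


if __name__ == "__main__":
    rng = random.Random(0)
    for temp in (0.0, 0.1, 1.0):
        draws = [relaxed_poisson_sample(1.0, temp, rng=rng) for _ in range(5)]
        print(temp, [round(d, 3) for d in draws])
```

At zero temperature the sketch returns integer counts. At positive temperatures each event contributes a fractional amount, so the samples are non-integer, in line with the caption's description.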
