Density-Informed VAE (DiVAE): Reliable Log-Prior Probability via Density Alignment Regularization
Michele Alessi, Alessio Ansuini, Alex Rodriguez
TL;DR
DiVAE addresses the mismatch between latent prior density and data-space density in VAEs by adding a lightweight, data-driven density-alignment regularizer. A data-derived log-density proxy $\rho$, projected in a PCA subspace, steers the encoder’s posterior and, when the prior is learnable, nudges the prior toward high-density regions; the alignment is applied either directly or through a flow-corrected aligner. On synthetic data, the method improves latent-density calibration, prior coverage, and OOD uncertainty, and on MNIST it gives stable improvements when the prior is learnable. Flow alignment offers the strongest density separation but may over-correct. Overall, DiVAE provides a practical and interpretable way to integrate data-density structure into latent priors with negligible computational overhead, enhancing anomaly detection and uncertainty estimation.
Abstract
We introduce Density-Informed VAE (DiVAE), a lightweight, data-driven regularizer that aligns the VAE log-prior probability $\log p_Z(z)$ with a log-density estimated from data. Standard VAEs match latents to a simple prior, overlooking density structure in data space. DiVAE encourages the encoder to allocate posterior mass in proportion to data-space density and, when the prior is learnable, nudges the prior toward high-density regions. This is realized by adding a robust, precision-weighted penalty to the ELBO, incurring negligible computational overhead. On synthetic datasets, DiVAE (i) improves distributional alignment of latent log-densities with their ground-truth counterparts, (ii) improves prior coverage, and (iii) yields better OOD uncertainty calibration. On MNIST, DiVAE improves alignment of the prior with external estimates of the density, providing better interpretability, and improves OOD detection for learnable priors.
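To make the idea of a robust, precision-weighted penalty concrete, the following is a minimal NumPy sketch and not the authors' implementation. The abstract does not specify the functional form, so everything here is an assumption chosen for illustration: the Huber loss as the robust loss, inverse-variance weights as the precision weighting, mean-centering of both log-densities to remove an unknown additive offset, a fixed standard normal prior, and the hypothetical names `density_alignment_penalty`, `rho_std`, and `delta`.

```python
import numpy as np

def standard_normal_log_prob(z):
    """Log-density of an isotropic standard normal, one value per row of z."""
    d = z.shape[1]
    return -0.5 * np.sum(z ** 2, axis=1) - 0.5 * d * np.log(2.0 * np.pi)

def huber(r, delta=1.0):
    """Huber loss: quadratic for |r| <= delta, linear beyond (robust to outliers)."""
    abs_r = np.abs(r)
    return np.where(abs_r <= delta, 0.5 * r ** 2, delta * (abs_r - 0.5 * delta))

def density_alignment_penalty(z, rho, rho_std, delta=1.0):
    """Hypothetical penalty comparing log p_Z(z) with a data log-density proxy rho.

    z        : (n, d) latent codes
    rho      : (n,) log-density proxy estimated from data (assumed given)
    rho_std  : (n,) uncertainty of each proxy value (assumed given)
    """
    log_pz = standard_normal_log_prob(z)
    # Assumption: compare centered values, since the two log-densities live in
    # different spaces and can differ by an additive constant.
    residual = (log_pz - log_pz.mean()) - (rho - rho.mean())
    # Assumption: precision = inverse variance of the proxy, normalized to sum to 1.
    precision = 1.0 / (rho_std ** 2)
    weights = precision / precision.sum()
    return float(np.sum(weights * huber(residual, delta)))
```

In a training loop, a term of this kind would be added to the negative ELBO with some weight (for example `loss = -elbo + lam * penalty`, where `lam` is a hypothetical hyperparameter). The penalty is zero when the proxy matches the log-prior up to a constant and grows as the two disagree.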
