Neural Autoregressive Distribution Estimation
Benigno Uria, Marc-Alexandre Côté, Karol Gregor, Iain Murray, Hugo Larochelle
TL;DR
This work introduces Neural Autoregressive Distribution Estimation (NADE), a tractable neural density model that uses the product rule to factorize p(x) into one-dimensional conditionals whose parameters are shared for efficient learning. It extends NADE to real-valued data (RNADE), develops order-agnostic deep variants (DeepNADE) together with ensemble techniques, and adapts the approach to 2D data via ConvNADE to exploit image structure. Empirical results on binary vectors, binarized images, and real-valued data show performance competitive with or superior to traditional directed and undirected models, with ensembles and convolutional architectures providing substantial gains. The framework combines tractable likelihoods, exact sampling, and flexible conditioning for unsupervised density estimation across these data types.
Abstract
We present Neural Autoregressive Distribution Estimation (NADE) models, which are neural network architectures applied to the problem of unsupervised distribution and density estimation. They leverage the probability product rule and a weight-sharing scheme inspired by restricted Boltzmann machines to yield an estimator that is both tractable and generalizes well. We discuss how they achieve competitive performance in modeling both binary and real-valued observations. We also present how deep NADE models can be trained to be agnostic to the ordering of input dimensions used by the autoregressive product rule decomposition. Finally, we show how to exploit the topological structure of pixels in images using a deep convolutional architecture for NADE.
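
To make the product rule decomposition and the weight sharing concrete, the following is a minimal sketch of the log-likelihood computation for a binary NADE with one hidden layer. It assumes the commonly stated form p(x_d = 1 | x_<d) = sigmoid(b_d + V_d h_d) with h_d = sigmoid(c + W_{:,<d} x_<d). The function name and the parameter names W, V, b, c are illustrative choices, not necessarily the notation used in the paper. Because every conditional reuses the same matrix W, the hidden pre-activation can be updated incrementally, so one evaluation of log p(x) costs O(DH) rather than O(D^2 H).

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def nade_log_likelihood(x, W, V, b, c):
    """Return log p(x) for a binary vector x under a one-hidden-layer NADE.

    Shapes (illustrative, assumed): x (D,), W (H, D), V (D, H), b (D,), c (H,).
    """
    D = x.shape[0]
    a = np.array(c, dtype=float)  # pre-activation given no inputs yet
    log_p = 0.0
    for d in range(D):
        h = sigmoid(a)                   # hidden units computed from x_<d only
        p_d = sigmoid(b[d] + V[d] @ h)   # p(x_d = 1 | x_<d)
        log_p += x[d] * np.log(p_d) + (1 - x[d]) * np.log(1.0 - p_d)
        a = a + W[:, d] * x[d]           # shared weights: O(H) update per step
    return log_p


# Usage with small random parameters (values are arbitrary).
rng = np.random.default_rng(0)
D, H = 4, 3
W = rng.normal(size=(H, D))
V = rng.normal(size=(D, H))
b = rng.normal(size=D)
c = rng.normal(size=H)
x = np.array([1, 0, 1, 1])
print(nade_log_likelihood(x, W, V, b, c))
```

Since each conditional is a properly normalized Bernoulli distribution, the probabilities of all 2^D binary vectors sum to one without computing any partition function, which is what makes the likelihood tractable. Exact sampling follows the same loop, drawing each x_d from p_d before updating a.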
