Discrete Distribution Networks
Lei Yang
TL;DR
Discrete Distribution Networks (DDN) model complex data distributions by generating multiple discrete samples per layer and stacking $L$ layers, with $K$ outputs per layer, to form a latent space of size $K^L$. A Split-and-Prune optimization algorithm mitigates dead nodes and density shift while guiding the hierarchical discrete outputs toward the ground truth. This design enables zero-shot conditional generation in both pixel and non-pixel domains, using black-box discriminators that provide no gradient information. The latent is compact, occupying $L \times \log_2 K$ bits, and can also serve for data compression. Conditioning is supported through Guided Samplers (e.g., CLIP, classifiers), which also allows image-to-image tasks. DDN offers two training paradigms (Single Shot and Recurrence) and uses techniques such as Chain Dropout and Learning Residual to improve performance. Empirical results on CIFAR-10, FFHQ, and CelebA-HQ show competitive generation quality and compelling zero-shot conditioning, suggesting a new direction for discrete, hierarchical generative modeling with compact, semantically meaningful latents.
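The latent-size arithmetic above can be checked with a short sketch. The function names and the values $K = 512$, $L = 128$ are illustrative assumptions, not figures taken from the text.

```python
import math

def latent_space_size(K, L):
    """Number of distinct index sequences when each of L layers picks 1 of K outputs."""
    return K ** L

def latent_bits(K, L):
    """Bits needed to store one index sequence: L * log2(K)."""
    return L * math.log2(K)

# Illustrative values only (assumed, not from the text).
K, L = 512, 128
print(latent_bits(K, L))  # 1152.0
# The space size is 2 raised to the number of bits.
print(latent_space_size(K, L) == 2 ** 1152)  # True
```

Under these assumed values a sample is identified by 1152 bits, or 144 bytes, which illustrates why the latent can also act as a compressed representation.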
Abstract
We introduce a novel generative model, Discrete Distribution Networks (DDN), that approximates the data distribution using hierarchical discrete distributions. We posit that, since the features within a network inherently capture distributional information, having the network generate multiple samples simultaneously, rather than a single output, may offer an effective way to represent distributions. DDN therefore fits the target distribution, including continuous ones, by generating multiple discrete sample points. To capture finer details of the target data, DDN selects, from the coarse results generated in the first layer, the output that is closest to the Ground Truth (GT). This selected output is then fed back into the network as the condition for the second layer, which generates new outputs that are more similar to the GT. As the number of DDN layers increases, the representational space of the outputs expands exponentially, and the generated samples become increasingly similar to the GT. This hierarchical output pattern of discrete distributions endows DDN with two unique properties: more general zero-shot conditional generation and a 1D latent representation. We demonstrate the efficacy of DDN and its intriguing properties through experiments on CIFAR-10 and FFHQ. The code is available at https://discrete-distribution-networks.github.io/
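The layer-by-layer selection described above can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_layer` is a hypothetical stand-in for a trained network layer (it only perturbs its condition with fixed pseudo-random noise), and `K`, `L`, `DIM`, and the squared-distance criterion are assumptions chosen for illustration.

```python
import random

K = 8    # outputs per layer (assumed value)
L = 6    # number of layers (assumed value)
DIM = 4  # dimensionality of the toy data (assumed value)

def toy_layer(layer_idx, condition):
    """Hypothetical stand-in for a DDN layer: K candidates near `condition`."""
    rng = random.Random(layer_idx)  # fixed seed per layer keeps the layer deterministic
    scale = 0.5 ** layer_idx        # later layers make finer adjustments
    return [[c + scale * rng.gauss(0.0, 1.0) for c in condition]
            for _ in range(K)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def encode(target):
    """At each layer, keep the candidate closest to the target and feed it forward."""
    condition = [0.0] * DIM
    latent = []
    for layer_idx in range(L):
        candidates = toy_layer(layer_idx, condition)
        best = min(range(K), key=lambda k: sq_dist(candidates[k], target))
        latent.append(best)           # one index per layer forms the 1D latent
        condition = candidates[best]  # selected output conditions the next layer
    return latent, condition

def decode(latent):
    """Replay a sequence of indices to recover the corresponding output."""
    condition = [0.0] * DIM
    for layer_idx, k in enumerate(latent):
        condition = toy_layer(layer_idx, condition)[k]
    return condition

target = [0.3, -1.2, 0.8, 0.1]
latent, reconstruction = encode(target)
print(latent)                            # L indices, each in range(K)
print(decode(latent) == reconstruction)  # True
```

The list of `L` indices plays the role of the 1D latent: replaying it through the same layers reproduces the selected output, and there are `K ** L` such sequences in total.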
