A Non-negative VAE: the Generalized Gamma Belief Network
Zhibin Duan, Tiansheng Wen, Muyao Wang, Bo Chen, Mingyuan Zhou
TL;DR
Gamma belief networks (GBNs) are limited in expressivity by their linear decoder. This paper introduces the Generalized Gamma Belief Network, which pairs a hierarchical non-linear generative model with sparse, non-negative gamma latent variables. Because the gamma conditionals are intractable, it develops an upward-downward variational inference framework that approximates them with a Weibull posterior, and it jointly optimizes the generative model and the inference network. Empirical results show expressivity on par with state-of-the-art Gaussian VAEs and strong disentanglement without extra regularizers, which highlights the interpretability benefits of sparse gamma latents. The approach performs robustly across text and image benchmarks, suggesting it can be applied to complex data while preserving the interpretability advantages of gamma latent variables.
Abstract
The gamma belief network (GBN), often regarded as a deep topic model, has demonstrated its potential for uncovering multi-layer interpretable latent representations in text data. Its notable capability to acquire interpretable latent factors is partially attributed to sparse and non-negative gamma-distributed latent variables. However, the existing GBN and its variations are constrained by the linear generative model, thereby limiting their expressiveness and applicability. To address this limitation, we introduce the generalized gamma belief network (Generalized GBN) in this paper, which extends the original linear generative model to a more expressive non-linear generative model. Since the parameters of the Generalized GBN no longer possess an analytic conditional posterior, we further propose an upward-downward Weibull inference network to approximate the posterior distribution of the latent variables. The parameters of both the generative model and the inference network are jointly trained within the variational inference framework. Finally, we conduct comprehensive experiments on both expressivity and disentangled representation learning tasks to evaluate the performance of the Generalized GBN against state-of-the-art Gaussian variational autoencoders serving as baselines.
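To make the idea of a Weibull approximation to a gamma conditional concrete, the following is a minimal sketch and not the paper's inference network. It assumes the standard inverse-CDF reparameterization of a Weibull draw and the closed-form KL divergence between a Weibull distribution and a gamma distribution (rate parameterization). The function names `sample_weibull` and `kl_weibull_gamma` are hypothetical and chosen only for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def sample_weibull(k, lam, rng=random):
    """Reparameterized draw from Weibull(shape=k, scale=lam).

    Uses x = lam * (-log(1 - u))**(1/k) with u ~ Uniform[0, 1), so the
    sample is a deterministic, non-negative function of (k, lam) and noise u.
    """
    u = rng.random()
    return lam * (-math.log(1.0 - u)) ** (1.0 / k)


def kl_weibull_gamma(k, lam, alpha, beta):
    """Closed-form KL( Weibull(k, lam) || Gamma(shape=alpha, rate=beta) )."""
    return (
        EULER_GAMMA * alpha / k
        - alpha * math.log(lam)
        + math.log(k)
        + beta * lam * math.gamma(1.0 + 1.0 / k)
        - EULER_GAMMA
        - 1.0
        - alpha * math.log(beta)
        + math.lgamma(alpha)
    )


if __name__ == "__main__":
    # Weibull(k=1, lam) is an exponential distribution, i.e. Gamma(1, rate=1/lam),
    # so the divergence between the two should vanish up to rounding error.
    print(kl_weibull_gamma(1.0, 2.0, 1.0, 0.5))
    print(sample_weibull(2.0, 1.0))
```

In a variational objective of the kind the abstract describes, a term like `kl_weibull_gamma` would play the role of the regularizer between the approximate posterior and the gamma prior, while `sample_weibull` would supply non-negative latent samples that remain differentiable with respect to the Weibull parameters when written in an autodiff framework.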
