The Deep Generative Decoder: MAP estimation of representations improves modeling of single-cell RNA data
Viktoria Schuster, Anders Krogh
TL;DR
The paper presents the Deep Generative Decoder (DGD), an encoder-free, MAP-based framework that learns latent representations and decoder parameters jointly by maximizing the joint probability $P(X,Z,\phi,\theta)$, which is equivalent to maximizing the posterior over $Z$, $\phi$ and $\theta$ given the data $X$. By modeling the latent space with a parameterized Gaussian Mixture Model (GMM) and using priors such as a softball prior for the component means, the DGD achieves flexible, interpretable latent structure with smaller dimensionality than typical VAEs. Experiments on Fashion-MNIST and a broad collection of single-cell RNA-seq datasets show that the DGD yields meaningful sub-clustering, competitive reconstruction, and clustering performance that is comparable or superior to the baselines with substantially fewer latent dimensions. Inference for new data points is slower than with encoder-based methods, because each new representation has to be optimized rather than computed in a single forward pass. In return, the approach offers simplicity, scalability, and easy extension to more complex latent distributions, which makes it well suited for biological data analysis and potential multi-omics integration. The provided code bases enable replication and further development of encoder-free generative modeling in biomedical contexts.
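Written out, one factorization consistent with this description is sketched below. It rests on assumptions that are not stated in the summary itself: $\theta$ is taken to denote the decoder parameters, $\phi$ the parameters of the latent GMM, $x_i$ and $z_i$ the $i$-th sample and its representation, and the samples are taken to be conditionally independent given their representations.

$$
P(X,Z,\phi,\theta) \;=\; P(\theta)\,P(\phi)\,\prod_{i=1}^{N} P(x_i \mid z_i,\theta)\,P(z_i \mid \phi),
\qquad
(\hat{Z},\hat{\phi},\hat{\theta}) \;=\; \arg\max_{Z,\phi,\theta}\; \log P(X,Z,\phi,\theta)
$$

Under this reading, the representations $Z$ are free parameters that are optimized together with $\phi$ and $\theta$, so no encoder network is required.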
Abstract
Learning low-dimensional representations of single-cell transcriptomics has become instrumental to its downstream analysis. The state of the art is currently represented by neural network models such as variational autoencoders (VAEs), which use a variational approximation of the likelihood for inference. We here present the Deep Generative Decoder (DGD), a simple generative model that computes model parameters and representations directly via maximum a posteriori (MAP) estimation. Unlike VAEs, which typically use a fixed Gaussian distribution because of the complexity of adding other types, the DGD naturally handles complex parameterized latent distributions. First, we show its general functionality on a commonly used benchmark set, Fashion-MNIST. Second, we apply the model to multiple single-cell data sets, where the DGD learns low-dimensional, meaningful and well-structured latent representations with sub-clustering beyond the provided labels. The advantages of this approach are its simplicity and its capability to provide representations of much smaller dimensionality than a comparable VAE.
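To make the idea of estimating parameters and representations directly by MAP more concrete, the following is a minimal NumPy sketch, not the authors' implementation. Everything in it is an illustrative assumption: a linear decoder `W` stands in for the neural network, the likelihood is a unit-variance Gaussian, the GMM has equal weights and a fixed isotropic `sigma`, the decoder weights get a Gaussian prior of strength `lam`, the component means get a flat prior instead of the softball prior, and all names (`toy_data`, `gmm_terms`, `neg_log_joint`, `fit`, `infer`) are hypothetical.

```python
import numpy as np

def toy_data(n=60, dim=5, seed=0):
    """Two well-separated Gaussian blobs (hypothetical stand-in for real data)."""
    rng = np.random.default_rng(seed)
    shift = np.where(np.arange(n)[:, None] < n // 2, 2.0, -2.0)
    return rng.standard_normal((n, dim)) + shift

def gmm_terms(Z, means, sigma):
    """Log density of each row of Z under an equal-weight isotropic GMM,
    plus responsibilities and differences needed for the gradients."""
    d = Z.shape[1]
    diff = Z[:, None, :] - means[None, :, :]               # (N, K, d)
    log_comp = -0.5 * (diff ** 2).sum(-1) / sigma ** 2     # (N, K)
    log_comp = log_comp - 0.5 * d * np.log(2 * np.pi * sigma ** 2)
    log_comp = log_comp - np.log(means.shape[0])           # equal mixture weights
    m = log_comp.max(axis=1, keepdims=True)                # log-sum-exp trick
    log_p = m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))
    resp = np.exp(log_comp - log_p[:, None])               # (N, K)
    return log_p, resp, diff

def neg_log_joint(X, Z, W, means, sigma, lam):
    """-log P(X, Z, phi, theta) up to additive constants."""
    recon = 0.5 * ((X - Z @ W) ** 2).sum()                 # -log P(X | Z, theta)
    log_p, _, _ = gmm_terms(Z, means, sigma)               # log P(Z | phi)
    return recon - log_p.sum() + 0.5 * lam * (W ** 2).sum()

def fit(X, d=2, K=2, sigma=1.0, lam=1.0, lr=0.01, steps=1500, seed=0):
    """Gradient descent on representations, decoder and GMM means together."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    Z = np.zeros((N, d))                     # one free representation per sample
    W = 0.1 * rng.standard_normal((d, D))    # linear "decoder" (assumption)
    means = rng.standard_normal((K, d))
    history = []
    for _ in range(steps):
        history.append(neg_log_joint(X, Z, W, means, sigma, lam))
        R = Z @ W - X                                      # reconstruction residual
        _, resp, diff = gmm_terms(Z, means, sigma)
        pull = resp[:, :, None] * diff / sigma ** 2        # (N, K, d)
        grad_Z = R @ W.T + pull.sum(axis=1)
        grad_W = Z.T @ R + lam * W
        grad_means = -pull.sum(axis=0)
        Z -= lr * grad_Z
        # Shared parameters use sample-averaged gradients to keep steps stable.
        W -= lr * grad_W / N
        means -= lr * grad_means / N
    history.append(neg_log_joint(X, Z, W, means, sigma, lam))
    return Z, W, means, history

def infer(X_new, W, means, sigma=1.0, lr=0.01, steps=500):
    """Representations for unseen samples: optimize z only, decoder and GMM fixed."""
    Z = np.zeros((X_new.shape[0], W.shape[0]))
    for _ in range(steps):
        _, resp, diff = gmm_terms(Z, means, sigma)
        Z -= lr * ((Z @ W - X_new) @ W.T
                   + (resp[:, :, None] * diff).sum(axis=1) / sigma ** 2)
    return Z

X = toy_data()
Z, W, means, history = fit(X)
print("negative log joint: start", history[0], "end", history[-1])
```

The `infer` function illustrates the trade-off of dropping the encoder: a new sample needs its own short optimization run, whereas an encoder would map it to a representation in a single forward pass.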
