Generative Topological Networks
Alona Levy-Jurgenson, Zohar Yakhini
TL;DR
This paper introduces Generative Topological Networks (GTNs), a topology-grounded approach to generative modeling that learns a continuous bijection $h$ from a latent source $Y$ to a data representation $X$ using standard supervised training. By proving that $h$ can be a homeomorphism under suitable conditions and by learning an approximation $\hat{h}$ via empirical CDF-based labeling, GTNs enable stable generation in the latent space without mode collapse and offer a theoretical lens on the benefits of latent representations. The authors demonstrate 1D and higher-dimensional constructions, including a Swiss-Roll and a 2D uniform example, and validate GTNs on MNIST, CelebA, CIFAR-10, and the Hands and Palm Images dataset, showing improvements over VAEs and fast convergence. A key insight is the link between the intrinsic dimension of the data and the dimension of the latent space used for generation, which helps explain out-of-distribution samples produced by diffusion models operating in high-dimensional ambient spaces and suggests practical guidelines for designing generative models.
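To make "empirical CDF-based labeling" concrete, here is a minimal 1D sketch. It assumes the labeling amounts to quantile matching, i.e. $h = F_X^{-1} \circ F_Y$ with $Y$ a standard normal source, which is one reading of the summary above rather than the authors' exact procedure. The function names `empirical_cdf_labels` and `hat_h`, the exponential stand-in data, and the use of linear interpolation in place of a trained neural network are all illustrative assumptions.

```python
import numpy as np


def empirical_cdf_labels(x, y):
    """Give each source sample y_i the data sample with the same empirical quantile.

    Assumption: this rank matching is the empirical analogue of F_X^{-1}(F_Y(y_i)).
    """
    x_sorted = np.sort(x)
    # Double argsort yields the rank (0..n-1) of each y_i within y.
    ranks = np.argsort(np.argsort(y))
    return x_sorted[ranks]


rng = np.random.default_rng(0)
n = 2000
x = rng.exponential(scale=2.0, size=n)  # stand-in for 1D data (hypothetical)
y = rng.standard_normal(n)              # latent source samples

# (y_i, labels_i) form an ordinary supervised regression dataset.
labels = empirical_cdf_labels(x, y)


def hat_h(y_new):
    """Stand-in for the learned map: linear interpolation over the labeled pairs.

    A GTN would fit a neural network to (y, labels) instead; interpolation is
    used here only to keep the sketch dependency-free.
    """
    order = np.argsort(y)
    return np.interp(y_new, y[order], labels[order])


# Generation: draw fresh source samples and push them through the map.
x_generated = hat_h(rng.standard_normal(1000))
```

Because the labels are monotone in $y$, the regression target is a continuous increasing function, which is the 1D case of the homeomorphism the summary refers to.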
Abstract
Generative methods have recently seen significant improvements by generating in a lower-dimensional latent representation of the data. However, many of the generative methods applied in the latent space remain complex and difficult to train. Further, it is not entirely clear why transitioning to a lower-dimensional latent space can improve generative quality. In this work, we introduce a new and simple generative method grounded in topology theory -- Generative Topological Networks (GTNs) -- which also provides insights into why lower-dimensional latent-space representations might be better suited for data generation. GTNs are simple to train -- they employ a standard supervised learning approach and do not suffer from common generative pitfalls such as mode collapse, posterior collapse or the need to impose constraints on the neural network architecture. We demonstrate the use of GTNs on several datasets, including MNIST, CelebA, CIFAR-10 and the Hands and Palm Images dataset, by training GTNs on a lower-dimensional latent representation of the data. We show that GTNs can improve upon VAEs and that they are quick to converge, generating realistic samples in early epochs. Further, we use the topological considerations behind the development of GTNs to offer insights into why generative models may benefit from operating on a lower-dimensional latent space, highlighting the important link between the intrinsic dimension of the data and the dimension in which the data is generated. In particular, we demonstrate that generating in high-dimensional ambient spaces may be a contributing factor to out-of-distribution samples generated by diffusion models. We also highlight other topological properties that are important to consider when using and designing generative models. Our code is available at: https://github.com/alonalj/GTN
