A Non-Adversarial Approach to Idempotent Generative Modelling

Mohammed Al-Jaff, Giovanni Luca Marchetti, Michael C Welle, Jens Lundell, Mats G. Gustafsson, Gustav Eje Henter, Hossein Azizpour, Danica Kragic

TL;DR

NAIGN replaces the adversarial components of the IGN objective with a non-adversarial one that combines reconstruction with Implicit Maximum Likelihood Estimation (IMLE), yielding stable training and improved mode coverage. By enforcing that data points are fixed points and that arbitrary inputs are mapped onto the data manifold, NAIGN acts simultaneously as a manifold projector and a generator, while implicitly learning a local distance field to the manifold that can underpin an energy-based density surrogate. Empirical results on synthetic 2D data, MNIST, and FFHQ-100 show that NAIGN improves on IGN in both generation and restoration, reduces mode collapse, and remains stable throughout training, including in few-shot regimes. The approach offers a compute-efficient alternative to diffusion and GAN-based methods, with practical implications for robust data restoration, out-of-distribution detection, and density estimation through an unnormalized energy model.

Abstract

Idempotent Generative Networks (IGNs) are deep generative models that also function as local data manifold projectors, mapping arbitrary inputs back onto the manifold. They are trained to act as identity operators on the data and as idempotent operators off the data manifold. However, IGNs suffer from mode collapse, mode dropping, and training instability due to their objectives, which contain adversarial components and can cause the model to cover the data manifold only partially -- an issue shared with generative adversarial networks. We introduce Non-Adversarial Idempotent Generative Networks (NAIGNs) to address these issues. Our loss function combines reconstruction with the non-adversarial generative objective of Implicit Maximum Likelihood Estimation (IMLE). This improves on IGN's ability to restore corrupted data and generate new samples that closely match the data distribution. We moreover demonstrate that NAIGNs implicitly learn the distance field to the data manifold, as well as an energy-based model.
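
To make the structure of such a combined objective concrete, the following is a minimal NumPy sketch, not the paper's exact loss. It assumes squared Euclidean distances, an IMLE term that matches each data point to its nearest generated sample, and a hypothetical weighting `imle_weight`; the function name `naign_style_loss` and the two toy maps are illustrative only.

```python
import numpy as np

def naign_style_loss(f, data, latents, imle_weight=1.0):
    """Reconstruction term plus an IMLE-style term (illustrative sketch).

    f           : callable mapping an (n, D) array to an (n, D) array
    data        : (n, D) array of training points x_i
    latents     : (m, D) array of random inputs z_j in the ambient space
    imle_weight : hypothetical weighting between the two terms (an assumption)
    """
    # Reconstruction: f should act as the identity on the data.
    recon = np.mean(np.sum((f(data) - data) ** 2, axis=1))

    # IMLE-style term: every data point should have a generated sample nearby.
    generated = f(latents)                              # (m, D)
    diffs = data[:, None, :] - generated[None, :, :]    # (n, m, D)
    sq_dists = np.sum(diffs ** 2, axis=2)               # (n, m)
    imle = np.mean(np.min(sq_dists, axis=1))            # nearest sample per data point

    return recon + imle_weight * imle, recon, imle

def two_mode_projector(x):
    """Toy map that sends every input to the nearer of the two modes (-1, 0) and (1, 0)."""
    out = np.zeros_like(x)
    out[:, 0] = np.where(x[:, 0] >= 0, 1.0, -1.0)
    return out

def collapsed(x):
    """Toy map that sends every input to the single mode (1, 0)."""
    out = np.zeros_like(x)
    out[:, 0] = 1.0
    return out

# Toy data manifold: two isolated modes in the plane.
data = np.array([[-1.0, 0.0], [1.0, 0.0]])
latents = np.array([[-0.5, 0.3], [2.0, -1.0], [0.1, 0.1]])

good = naign_style_loss(two_mode_projector, data, latents)
bad = naign_style_loss(collapsed, data, latents)
print("covers both modes:", good)
print("collapsed to one mode:", bad)
```

On this toy problem the map that covers both modes incurs zero loss, while the collapsed map is penalized by both terms (a reconstruction term of 2 and an IMLE-style term of 2). Because the minimum is taken over generated samples for each data point, a mode with no nearby sample always contributes to the loss, which is the mechanism by which an IMLE-style term discourages mode dropping.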

Paper Structure

This paper contains 21 sections, 1 theorem, 13 equations, 13 figures, 1 table, and 1 algorithm.

Key Result

Lemma 1

If $f_\theta(\mathbf{x}) = \mathbf{x}$ for all $\mathbf{x} \in \mathcal{M}$ and $f_\theta(\mathbb{R}^D) = \mathcal{M}$, then $f_\theta$ is idempotent, i.e., $f_\theta(f_\theta(\mathbf{z})) = f_\theta(\mathbf{z})$ for all $\mathbf{z} \in \mathbb{R}^D$.
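
The reasoning behind the lemma is short: for any $\mathbf{z}$, the output $f_\theta(\mathbf{z})$ lies in $\mathcal{M}$ because the image of $f_\theta$ is $\mathcal{M}$, and $f_\theta$ leaves every point of $\mathcal{M}$ unchanged, so applying it a second time changes nothing. The sketch below checks this numerically on a toy case. It assumes $\mathcal{M}$ is the unit circle in $\mathbb{R}^2$ and uses a hand-written radial projection (`project_to_circle`, a hypothetical stand-in for a trained $f_\theta$, with the origin sent to $(1, 0)$ by convention).

```python
import numpy as np

def project_to_circle(x):
    """Radial projection onto the unit circle; the origin is sent to (1, 0) by convention."""
    x = np.asarray(x, dtype=float)
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    safe_norm = np.where(norm > 0, norm, 1.0)   # avoid division by zero at the origin
    fallback = np.zeros_like(x)
    fallback[..., 0] = 1.0
    return np.where(norm > 0, x / safe_norm, fallback)

rng = np.random.default_rng(0)

# Arbitrary inputs z in R^2, including the origin as an edge case.
z = rng.normal(size=(1000, 2)) * 3.0
z[0] = 0.0

# Points x on the manifold M (the unit circle).
angles = rng.uniform(0.0, 2.0 * np.pi, size=200)
on_manifold = np.stack([np.cos(angles), np.sin(angles)], axis=1)

fz = project_to_circle(z)

# Hypothesis 1: f(x) = x for all x in M.
fixes_manifold = np.allclose(project_to_circle(on_manifold), on_manifold)
# Hypothesis 2: f maps every input onto M (unit norm).
lands_on_manifold = np.allclose(np.linalg.norm(fz, axis=1), 1.0)
# Conclusion: f(f(z)) = f(z) for all z.
is_idempotent = np.allclose(project_to_circle(fz), fz)

print(fixes_manifold, lands_on_manifold, is_idempotent)
```

All three checks hold for this toy projector. Note that the argument only uses the inclusion $f_\theta(\mathbb{R}^D) \subseteq \mathcal{M}$; equality additionally ensures that every point of the manifold can be produced as an output.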

Figures (13)

  • Figure 1: NAIGNs $f$ are trained to fix points on the data manifold $\mathcal{M}$ ($f(\mathbf{x}) = \mathbf{x}$) and to map arbitrary points to $\mathcal{M}$ ($f(\mathbf{x}) \in \mathcal{M}$) via IMLE; together, these two properties imply idempotency ($f(f(\mathbf{x})) = f(\mathbf{x})$).
  • Figure 2: Comparison between NAIGN (top row) and IGN (bottom row) trained on a simple tri-modal one-dimensional distribution. Our proposed method, NAIGN, mitigates the mode collapse and mode dropping to which IGN is susceptible. For reference, the light gray histograms in the three rightmost columns show the target distribution from the first column.
  • Figure 3: Mode coverage for IGN (left) and NAIGN (right).
  • Figure 4: FLD scores for methods during training. Shaded region contains min and max.
  • Figure 5: Reconstructed and generated samples from NAIGN (left) and IGN (right), both trained on the FFHQ-100 dataset. While NAIGN demonstrates faithful reconstruction and generative diversity, IGN consistently exhibits mode collapse.
  • ...and 8 more figures

Theorems & Definitions (2)

  • Lemma 1
  • proof