AdvNF: Reducing Mode Collapse in Conditional Normalising Flows using Adversarial Learning

Vikas Kanaujia, Mathias S. Scheurer, Vipul Arora

TL;DR

AdvNF introduces adversarially trained conditional normalising flows (CNFs) to combat mode collapse in parameter-conditioned sampling tasks. It combines an adversarial loss with forward and/or reverse KL objectives and applies an independent Metropolis-Hastings (IMH) correction. On XY models and synthetic multimodal distributions, AdvNF achieves better mode coverage and more faithful observables than existing CNF approaches and several GAN/VAE baselines. The adversarial term helps the CNF explore all modes, while the IMH step mitigates residual bias, enabling efficient, unbiased sampling across parameter settings. The resulting improvements in density fidelity and thermodynamic observables suggest potential for scalable, accurate sampling in lattice models and related physics applications.

Abstract

Deep generative models complement Markov-chain-Monte-Carlo methods for efficiently sampling from high-dimensional distributions. Among these methods, explicit generators, such as Normalising Flows (NFs), in combination with the Metropolis Hastings algorithm have been extensively applied to get unbiased samples from target distributions. We systematically study central problems in conditional NFs, such as high variance, mode collapse and data efficiency. We propose adversarial training for NFs to ameliorate these problems. Experiments are conducted with low-dimensional synthetic datasets and XY spin models in two spatial dimensions.
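The combination of an explicit generator with Metropolis-Hastings mentioned in the abstract can be illustrated with a minimal sketch of an independence sampler. This is not the authors' implementation: the function name `imh_chain`, the toy bimodal target and the wide Gaussian proposal (standing in for a trained flow with a tractable density) are all assumptions made for illustration.

```python
import numpy as np

def imh_chain(log_p, log_q, proposals, rng):
    """Independent Metropolis-Hastings over pre-drawn proposals.

    log_p: unnormalised target log-density (hypothetical callable)
    log_q: proposal log-density, e.g. from a flow (hypothetical callable)
    proposals: 1-D array of i.i.d. samples drawn from q
    """
    log_w = log_p(proposals) - log_q(proposals)  # log importance weights
    chain = np.empty_like(proposals)
    current, current_log_w = proposals[0], log_w[0]
    chain[0] = current
    accepted = 0
    for i in range(1, len(proposals)):
        # Accept the new proposal with probability min(1, w(x') / w(x)).
        if np.log(rng.uniform()) < log_w[i] - current_log_w:
            current, current_log_w = proposals[i], log_w[i]
            accepted += 1
        chain[i] = current
    return chain, accepted / (len(proposals) - 1)

# Toy setup (assumed): bimodal target with modes at -2 and +2,
# and a wide Gaussian proposal N(0, 2^2) that covers both modes.
rng = np.random.default_rng(0)
log_p = lambda x: np.logaddexp(-0.5 * ((x - 2.0) / 0.5) ** 2,
                               -0.5 * ((x + 2.0) / 0.5) ** 2)
log_q = lambda x: -0.5 * (x / 2.0) ** 2 - np.log(2.0 * np.sqrt(2.0 * np.pi))
proposals = rng.normal(0.0, 2.0, size=20000)
chain, acc_rate = imh_chain(log_p, log_q, proposals, rng)
print("acceptance rate:", acc_rate)
print("fraction of samples in the right-hand mode:", np.mean(chain > 0))
```

Because the acceptance test only uses the weight ratio, the target needs to be known only up to a constant. The correction can reweight modes the proposal covers, but it cannot recover modes the proposal never visits, which is why mode collapse in the generator matters.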

Paper Structure (21 sections, 20 equations, 9 figures, 9 tables, 1 algorithm)

Figures (9)

  • Figure 1: For the illustration of (a) mode-covering (FKL) and (b) mode-seeking behaviour (RKL), we show a comparison of toy density plots. Here $p(x)$ represents the univariate multi-modal target distribution and $q(x)$ represents the modeled distribution.
  • Figure 2: Schematics of AdvNF (Conditional Flow-Adversarial Model).
  • Figure 3: Sample plots for the MOG-4, MOG-8 and Rings-4 distributions, drawn by generating samples from the AdvNF and CNF-MH variants. Mode collapse can be observed on all synthetic datasets for the CNF-MH (RKL) variants.
  • Figure 4: Sample plots for the Rings-4 distribution highlight the effect of the adversarial loss as training progresses and illustrate how the model escapes mode collapse and converges to the desired target distribution. (A) shows the model distribution (trained through CNF-MH (RKL)) collapsed to a few modes; (B) shows the effect of adding the adversarial loss with a high adversarial loss weight $\lambda_{1}$. (C)-(G) show the model distribution gradually converging to the target distribution as $\lambda_{1}$ is decreased with epochs. (H) shows the sample plot once the model AdvNF (RKL) has been fully trained, i.e. has converged to the target distribution.
  • Figure 5: Comparison plot of observables (mean energy and mean magnetization) for AdvNF and the various other baseline models referred to in Sec. 5.2, with MCMC acting as ground truth. The line represents the mean value, and the shaded area represents the standard deviation. 10000 samples are generated at each temperature to compute observables for all the models. (A) XY model dataset ($16\times 16$ lattice size) at setting $J=1$. (B) Extended XY model dataset ($16\times 16$ lattice size) at setting $K/J=1$.
  • ...and 4 more figures
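
The mode-covering versus mode-seeking contrast described in the Figure 1 caption can be reproduced numerically with a small sketch. It is an illustration under stated assumptions, not code from the paper: the target $p(x)$ is an assumed equal mixture of two Gaussians, the model $q(x)$ is a single Gaussian, and the helper names (`forward_kl`, `reverse_kl`, `grid_argmin`) are hypothetical. Both divergences are approximated by a Riemann sum and minimised by brute-force grid search.

```python
import numpy as np

x = np.linspace(-8.0, 8.0, 2001)  # integration grid
dx = x[1] - x[0]

def log_normal(x, mu, sigma):
    # Log-density of a univariate Gaussian N(mu, sigma^2).
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))

# Assumed toy target p(x): equal mixture of N(-2, 0.5^2) and N(+2, 0.5^2).
log_p = np.logaddexp(log_normal(x, -2.0, 0.5), log_normal(x, 2.0, 0.5)) + np.log(0.5)
p = np.exp(log_p)

def forward_kl(mu, sigma):
    # FKL: KL(p || q), an expectation under the target p.
    return np.sum(p * (log_p - log_normal(x, mu, sigma))) * dx

def reverse_kl(mu, sigma):
    # RKL: KL(q || p), an expectation under the model q.
    log_q = log_normal(x, mu, sigma)
    return np.sum(np.exp(log_q) * (log_q - log_p)) * dx

def grid_argmin(objective):
    # Brute-force search over (mu, sigma); enough for a 2-parameter toy model.
    mus = np.linspace(-3.0, 3.0, 61)
    sigmas = np.linspace(0.3, 3.0, 28)
    scores = np.array([[objective(m, s) for s in sigmas] for m in mus])
    i, j = np.unravel_index(np.argmin(scores), scores.shape)
    return mus[i], sigmas[j]

mu_f, sigma_f = grid_argmin(forward_kl)
mu_r, sigma_r = grid_argmin(reverse_kl)
print("FKL fit (mu, sigma):", mu_f, sigma_f)
print("RKL fit (mu, sigma):", mu_r, sigma_r)
```

In this toy setting the FKL fit should sit between the two modes with a large width (mode-covering), while the RKL fit should lock onto one of the two modes with a width close to that of a single component (mode-seeking). Which mode the RKL search reports is arbitrary, since the two minima are symmetric.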