Energy-based generator matching: A neural sampler for general state space

Dongyeop Woo, Minsu Kim, Minkyu Kim, Kiyoung Seong, Sungsoo Ahn

TL;DR

Energy-based generator matching (EGM) tackles sampling from energy-based targets $p_{target}(x) \propto \exp(-\mathcal{E}(x))$ without equilibrium samples by learning neural samplers for general continuous-time Markov processes (CTMPs). It extends generator matching to energy-driven training, using self-normalized importance sampling (SNIS) to estimate the marginal generator and a bootstrapping scheme with intermediate energies to reduce variance. The framework supports diffusion, flow, and discrete jumps across continuous, discrete, and mixed state spaces, and is validated on Ising models and multimodal discrete-continuous tasks, showing robust mode coverage and competitive energy-based Wasserstein metrics. This approach broadens neural-sampler applicability beyond diffusion models, enabling efficient, simulation-free training for complex energy landscapes with practical impact in physics-inspired learning and multimodal generation.

Abstract

We propose Energy-based generator matching (EGM), a modality-agnostic approach to train generative models from energy functions in the absence of data. Extending the recently proposed generator matching, EGM enables training of arbitrary continuous-time Markov processes, e.g., diffusion, flow, and jump, and can generate data from continuous, discrete, and a mixture of two modalities. To this end, we propose estimating the generator matching loss using self-normalized importance sampling with an additional bootstrapping trick to reduce variance in the importance weight. We validate EGM on both discrete and multimodal tasks up to 100 and 20 dimensions, respectively.
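
To make the self-normalized importance sampling step concrete, the sketch below estimates a marginal flow velocity at a single point $(x, t)$ from energy evaluations alone. It is a toy illustration under stated assumptions, not the authors' implementation: the one-dimensional Gaussian target, the linear path $x_t = (1-t)x_0 + t x_1$ with a standard normal prior, the wide Gaussian proposal, and all function names are hypothetical choices made here so that the estimate can be checked against a closed-form answer.

```python
import numpy as np


def energy(x1, mean=1.0, std=0.5):
    """Energy of a toy Gaussian target, known only up to an additive constant (assumption)."""
    return 0.5 * ((x1 - mean) / std) ** 2


def log_gauss(x, mean, std):
    """Log-density of a univariate Gaussian."""
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std) - 0.5 * np.log(2.0 * np.pi)


def snis_marginal_velocity(x, t, n_samples=20000, proposal_std=3.0, seed=0):
    """SNIS estimate of the marginal velocity at (x, t) for the assumed linear path."""
    rng = np.random.default_rng(seed)
    # Proposal q(x1 | x): a wide Gaussian that ignores x (hypothetical choice).
    x1 = rng.normal(0.0, proposal_std, size=n_samples)
    # Unnormalized log-weight: log p_{t|1}(x | x1) - E(x1) - log q(x1),
    # with p_{t|1}(x | x1) = N(x; t * x1, (1 - t)^2) under the linear path.
    log_w = log_gauss(x, t * x1, 1.0 - t) - energy(x1) - log_gauss(x1, 0.0, proposal_std)
    # Self-normalize; subtracting the max keeps the exponentials stable.
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    # Conditional velocity of the linear path toward the endpoint x1.
    cond_velocity = (x1 - x) / (1.0 - t)
    return float(np.sum(w * cond_velocity))


def exact_marginal_velocity(x, t, mean=1.0, std=0.5):
    """Closed-form marginal velocity, available here because the target is Gaussian."""
    precision = 1.0 / std**2 + t**2 / (1.0 - t) ** 2
    posterior_mean = (mean / std**2 + t * x / (1.0 - t) ** 2) / precision
    return (posterior_mean - x) / (1.0 - t)


print(snis_marginal_velocity(0.3, 0.5), exact_marginal_velocity(0.3, 0.5))
```

Because the weights are normalized by their own sum, the unknown normalizing constant of the target cancels, which is why only the energy is needed. The estimator is biased for finite sample sizes and its variance grows as the proposal drifts away from the posterior, which is the issue the bootstrapping trick is meant to reduce.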

Paper Structure

This paper contains 38 sections, 4 theorems, 126 equations, 8 figures, 9 tables, 1 algorithm.

Key Result

Theorem 1

Let $\mathcal{L}_{t|r}^{x_r}$ denote the conditional generator for the conditional probability path $p_{t|r}(\cdot | x_r)$, for $0 \leq t < r \leq 1$. If the backward transition kernels $p_{t|r}$ satisfy the Chapman-Kolmogorov equation (eq:chapman_kolmogorov) and the conditional generators satisfy the generator consistency condition (eq:consistency_of_generator), then the m… where $p_{r|t}(dx_r|x)$ is the posterior distribution (i.e., the conditional distribution over inte…

Figures (8)

  • Figure 1: Overview of energy-based generator matching (EGM). (a) The target probability path that interpolates between the prior and the target distribution; we aim to estimate $F_{t}(x)$ as a (weighted) average of conditional generators. (b) GM draws $x_1 \sim p_{1|t}(\cdot|x)$ and averages $F_{t|1}^{x_{1}}(x)$ with uniform weights. (c) EGM draws $x_1 \sim q_{1|t}(\cdot|x)$ and averages $F_{t|1}^{x_{1}}(x)$ with importance weights. (d) EGM with bootstrapping draws $x_r \sim q_{r|t}(\cdot|x)$ and averages $F_{t|r}^{x_r}(x)$ with importance weights.
  • Figure 2: Comparison of energy (top) and magnetization (bottom) histograms for ground-truth samples and various sampling methods.
  • Figure 3: Sample plots of GB-RBM (a-c) and JointMoG (d-f). Samples are projected onto the first two continuous dimensions. BS stands for bootstrapping. Contour lines represent the target distribution, and colored points indicate samples from each method.
  • Figure 4: Energy histograms of samples from multiple samplers vs. ground truth on JointMoG.
  • Figure 5: Ground truth sample plot of GB-RBM (left) and JointMoG (right). Samples are projected onto the first two continuous dimensions.
  • ...and 3 more figures

Theorems & Definitions (7)

  • Theorem 1
  • Theorem 2: Restatement of the bootstrapping theorem (thm:bootstrap)
  • proof
  • Proposition 1
  • proof
  • Proposition 2
  • proof