Neural Approximate Mirror Maps for Constrained Diffusion Models

Berthy T. Feng, Ricardo Baptista, Katherine L. Bouman

TL;DR

This work tackles the challenge of enforcing subtle, possibly non-convex constraints in diffusion models by learning neural approximate mirror maps (NAMMs) that transport constrained data from a manifold $\mathcal{M}$ to an unconstrained mirror space via a forward map $\bm{g}_\phi$ and an approximate inverse $\bm{f}_\psi$. By optimizing a cycle-consistency objective, a differentiable constraint-distance loss, and a regularizer, NAMMs enable training a diffusion model in the mirror space (an MDM) and restoring samples to the constrained set through the inverse map $\bm{f}_\psi$, which is trained to be robust to mirror-space noise with standard deviation up to $\sigma_{\max}$. The approach is demonstrated on physics-based, geometric, and semantic constraints, showing significant improvements in constraint satisfaction over unconstrained diffusion models and enabling constrained inverse-problem solvers (mirror DPS) in the learned mirror space. Overall, NAMMs extend constrained generation to general, possibly non-convex constraints while retaining the benefits of diffusion models and enabling efficient, constraint-aware data synthesis and inference.

Abstract

Diffusion models excel at creating visually convincing images, but they often struggle to meet subtle constraints inherent in the training data. Such constraints could be physics-based (e.g., satisfying a PDE), geometric (e.g., respecting symmetry), or semantic (e.g., including a particular number of objects). When the training data all satisfy a certain constraint, enforcing this constraint on a diffusion model makes it more reliable for generating valid synthetic data and solving constrained inverse problems. However, existing methods for constrained diffusion models are restricted in the constraints they can handle. For instance, recent work proposed to learn mirror diffusion models (MDMs), but analytical mirror maps only exist for convex constraints and can be challenging to derive. We propose neural approximate mirror maps (NAMMs) for general, possibly non-convex constraints. Our approach only requires a differentiable function measuring the distance to the constraint set. We learn an approximate mirror map that transforms data into an unconstrained space and a corresponding approximate inverse that maps data back to the constraint set. A generative model, such as an MDM, can then be trained in the learned mirror space and its samples restored to the constraint set by the inverse map. We validate our approach on a variety of constraints, showing that compared to an unconstrained diffusion model, a NAMM-based MDM substantially improves constraint satisfaction. We also demonstrate how existing diffusion-based inverse-problem solvers can be easily applied in the learned mirror space to solve constrained inverse problems.
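
To make the notion of a differentiable constraint distance concrete, here is a minimal sketch in Python/NumPy. It assumes a simple illustrative constraint (every image has the same total brightness); the function names, the target value, and the choice of absolute difference as the distance-like measure are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def brightness_distance(x, target_total):
    """Distance-like measure of how far x is from {x : sum(x) == target_total}.

    Hypothetical example constraint. The absolute difference is
    differentiable in x everywhere except exactly on the constraint set.
    """
    return np.abs(x.sum() - target_total)

def project_to_brightness(x, target_total):
    """Euclidean projection onto the hyperplane sum(x) == target_total."""
    return x + (target_total - x.sum()) / x.size

rng = np.random.default_rng(0)
image = rng.random((8, 8))      # toy "image" with values in [0, 1)
target = 32.0                   # assumed total-brightness target

projected = project_to_brightness(image, target)
d_before = brightness_distance(image, target)
d_after = brightness_distance(projected, target)
print(f"distance before: {d_before:.4f}, after projection: {d_after:.2e}")
```

For this convex toy constraint a closed-form projection exists, so a learned map is unnecessary; the point of the sketch is only that the method needs nothing more than a function like `brightness_distance` that can be differentiated with respect to the sample.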

Paper Structure (45 sections, 17 equations, 11 figures, 5 tables)

Figures (11)

  • Figure 1: Conceptual illustration. (a) Despite being trained on a data distribution constrained to $\mathcal{M}$, a regular diffusion model (DM) may generate samples that violate the constraint. (b) We propose to learn a neural approximate mirror map (NAMM) that entails a forward map $\bm{g}_\phi$ and inverse map $\bm{f}_\psi$. The forward map transforms the constrained space into an unconstrained ("mirror") space. Once $\bm{g}_\phi$ and $\bm{f}_\psi$ are learned, a mirror diffusion model (MDM) can be trained on the pushforward of the data distribution through $\bm{g}_\phi$ and its samples mapped back to the constrained space through $\bm{f}_\psi$.
  • Figure 2: NAMM training illustration. Given data that lie on a constraint manifold $\mathcal{M}$ (e.g., the hyperplane of images with the same total brightness), we jointly train an approximate mirror map $\bm{g}_\phi$ and its approximate inverse $\bm{f}_\psi$. After mapping data $\bm{x}\sim p_\text{data}$ to the mirror space as $\bm{g}_\phi(\bm{x})$, we perturb them with additive Gaussian noise whose standard deviation can be anywhere between $0$ and $\sigma_\text{max}$. The inverse map $\bm{f}_\psi$ is trained to map these perturbed samples back onto $\mathcal{M}$.
  • Figure 3: Improved constraint satisfaction. Samples from our approach are nearly indistinguishable from baseline samples, yet there is a significant difference in their distances from the constraint set. The baseline is a DM trained on the original constrained dataset. Our approach is to train a NAMM and then an MDM in the mirror space induced by $\bm{g}_\phi$. Samples are obtained by sampling from the MDM and then passing samples through $\bm{f}_\psi$. The histograms show normalized constraint distances $\bar{\ell}$ of $128$ samples (normalized so that each constraint has a maximum of $1$ across the samples from both methods). Our results are from the finetuned NAMM. For each constraint, we made sure that the DM was trained for at least as long as the NAMM, MDM, and finetuned NAMM combined.
  • Figure 4: Training efficiency. For each method, we clocked the total compute time during training (ignoring validation and I/O operations) and here plot the mean $\pm$ std. dev. of the constraint distances $\ell$ of $128$ generated samples at each checkpoint. The MDM training curve ("Ours w/o FT") is offset by the time it took to train the NAMM. The finetuning curve ("Ours") is offset by the time it took to train the NAMM and MDM and generate finetuning data. For most constraints, the DM has consistently higher constraint distance without any sign of converging to the same performance as that of the MDM. For the count constraint, the MDM performs on par with the DM, but finetuning noticeably accelerates constraint satisfaction. Each run was done on the same hardware ($4\times$ A100 GPUs).
  • Figure 5: Data assimilation. We used the same finetuned NAMM, MDM, and DM checkpoints as in Fig. 3. (a) Given noisy observations of the first eight states, we sampled possible full trajectories of a 1D Burgers' system. Our solutions have smaller deviation from the PDE than samples obtained with DPS, even those of constraint-guided DPS (CG-DPS). (b) The task is to infer the full Kolmogorov flow from noisy observations of the first and last states. Our solution has significantly less divergence.
  • ...and 6 more figures
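
The training procedure described in the Figure 2 caption (map data to the mirror space, add Gaussian noise with standard deviation between $0$ and $\sigma_\text{max}$, and train the inverse to land back on $\mathcal{M}$) can be sketched as a single loss evaluation. The following Python/NumPy code is a toy sketch under stated assumptions: the function `namm_loss`, the loss weights, the L1 norms, the identity-based regularizer, and the fixed (non-neural) maps are all hypothetical stand-ins chosen for illustration, not the paper's exact objective or architecture.

```python
import numpy as np

DIM, TARGET = 16, 4.0  # assumed toy dimension and constraint value

def distance(x):
    """Per-sample distance-like measure to the set {x : sum(x) == TARGET}."""
    return np.abs(x.sum(axis=1) - TARGET)

def project(z):
    """Stand-in inverse map: projection onto the constraint hyperplane."""
    return z + (TARGET - z.sum(axis=1, keepdims=True)) / DIM

def identity(z):
    """Stand-in map that leaves its input unchanged."""
    return z

def namm_loss(x, g, f, dist, sigma_max, rng, lam_constr=1.0, lam_reg=1e-3):
    """Toy NAMM-style objective for a batch x of shape (batch, dim).

    g is the forward map, f the inverse map. Weights and norms are
    illustrative assumptions.
    """
    z = g(x)
    # Cycle consistency: f should undo g on clean data.
    cycle = np.mean(np.abs(f(z) - x))
    # Perturb mirror-space points with Gaussian noise, std drawn from [0, sigma_max].
    sigma = rng.uniform(0.0, sigma_max, size=(x.shape[0], 1))
    z_noisy = z + sigma * rng.standard_normal(z.shape)
    # Constraint loss: restored noisy points should lie near the constraint set.
    constr = np.mean(dist(f(z_noisy)))
    # Regularizer (assumed form): keep the forward map close to the identity.
    reg = np.mean(np.abs(z - x))
    return cycle + lam_constr * constr + lam_reg * reg

# Toy data lying exactly on the constraint set.
data = project(np.random.default_rng(0).random((32, DIM)))

loss_projection = namm_loss(data, identity, project, distance, 0.5,
                            np.random.default_rng(1))
loss_identity = namm_loss(data, identity, identity, distance, 0.5,
                          np.random.default_rng(1))
print(f"inverse = projection: {loss_projection:.2e}")
print(f"inverse = identity:   {loss_identity:.4f}")
```

In this toy setting, an inverse map that projects onto the constraint set drives all three terms to (numerically) zero, whereas an identity inverse is penalized by the constraint term because mirror-space noise pushes samples off the set. In the approach described above, both maps would instead be neural networks trained jointly, and sampling would amount to drawing a mirror-space sample from the MDM and passing it through the inverse map.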