Generative Lines Matching Models

Ori Matityahu, Raanan Fattal

TL;DR

The paper addresses sampling inefficiencies in denoising diffusion and flow-matching models caused by ambiguous pairings between latent noise and data. It introduces the Lines Matching Model (LMM), which uses deterministic ODE-based samplers to transport probability mass along globally straight lines between the source and target distributions, with an endpoint-based training objective $L_{\text{lines}}$ and a deterministic pairing $x_1 = N^*\text{Sampler}(x_0)$. LMM delivers state-of-the-art FID with minimal NFEs across CIFAR-10, ImageNet 64×64, and AFHQ 64×64, and can leverage domain-specific reconstruction and adversarial losses to boost sample fidelity; the work also shows that OT-based pairing suffers exponential scaling with dimension, motivating the line-based approach. The results highlight both practical gains in sampling efficiency and fidelity and a theoretical understanding of pairing limitations, with future work aimed at ab initio pairing methods and broader applicability.

Abstract

In this paper, we identify the source of a singularity in the training loss of key denoising models that causes the denoiser's predictions to collapse towards the mean of the source or target distributions. This degeneracy creates false basins of attraction, distorting the denoising trajectories and ultimately increasing the number of steps required to sample these models. We circumvent this artifact by leveraging the deterministic ODE-based samplers offered by certain denoising diffusion and score-matching models, which establish a well-defined change-of-variables between the source and target distributions. Given this correspondence, we propose a new probability flow model, the Lines Matching Model (LMM), which matches globally straight lines interpolating the two distributions. We demonstrate that the flow fields produced by the LMM exhibit notable temporal consistency, resulting in trajectories with excellent straightness scores. Beyond its sampling efficiency, the LMM formulation allows us to enhance the fidelity of the generated samples by integrating domain-specific reconstruction and adversarial losses, and by optimizing its training for the sampling procedure used. Overall, the LMM achieves state-of-the-art FID scores with minimal NFEs on established benchmark datasets: 1.57/1.39 (NFE=1/2) on CIFAR-10, 1.47/1.17 on ImageNet 64×64, and 2.68/1.54 on AFHQ 64×64. Finally, we provide a theoretical analysis showing that the use of optimal transport to relate the two distributions suffers from a curse of dimensionality, where the pairing set size (mini-batch) must scale exponentially with the signal dimension.
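To make the link between straight trajectories and low NFE concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a hypothetical deterministic pairing `toy_sampler` (a fixed affine map standing in for an ODE-based sampler) and a hand-derived velocity field `toy_velocity` for that map; neither name comes from the paper, and the LMM training loss is not reproduced here. The sketch only illustrates that when every source point moves to its paired target along a straight line, Euler integration reaches the same endpoint with one step as with many.

```python
def toy_sampler(x0):
    """Hypothetical deterministic pairing x0 -> x1 (assumption: a fixed affine map)."""
    return [2.0 * v + 1.0 for v in x0]


def point_on_line(x0, x1, t):
    """Point at time t on the straight line from x0 (t=0) to x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def toy_velocity(x, t):
    """Velocity field whose trajectories are the straight lines x0 -> toy_sampler(x0).

    For the affine toy map, x_t = (1 + t) * x0 + t, so the constant line
    velocity x1 - x0 = x0 + 1 can be written as (x_t + 1) / (1 + t).
    """
    return [(v + 1.0) / (1.0 + t) for v in x]


def euler_sample(x0, velocity_fn, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with uniform Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    x0 = [0.5, -1.0, 2.0]
    x1 = toy_sampler(x0)
    for steps in (1, 2, 8):
        # Each step count corresponds to that many function evaluations (NFE).
        print(steps, euler_sample(x0, toy_velocity, steps), "target:", x1)
```

Because the velocity is constant along each trajectory in this toy setting, the number of Euler steps does not change the endpoint (up to floating-point rounding). A learned flow field is only approximately straight, which is why the abstract reports separate scores for NFE=1 and NFE=2.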

Paper Structure

This paper contains 10 sections, 20 equations, 3 figures, 10 tables.

Figures (3)

  • Figure 1: LMM Generated CIFAR-10 Samples. Class-unconditional on the left, and class-conditional on the right. Rows correspond to different classes.
  • Figure 2: LMM Generated Conditional ImageNet 64×64 Samples. Rows correspond to different classes.
  • Figure 3: LMM Generated AFHQ 64×64 Samples.