TFTF: Training-Free Targeted Flow for Conditional Sampling

Qianqian Qu, Jun S. Liu

TL;DR

A training-free conditional sampling method for flow matching models, based on importance sampling, that significantly outperforms existing approaches on conditional sampling tasks for MNIST and CIFAR-10 and is shown to apply in higher-dimensional, multimodal settings.

Abstract

We propose a training-free conditional sampling method for flow matching models based on importance sampling. Because a naïve application of importance sampling suffers from weight degeneracy in high-dimensional settings, we modify and incorporate a resampling technique in sequential Monte Carlo (SMC) during intermediate stages of the generation process. To encourage generated samples to diverge along distinct trajectories, we derive a stochastic flow with adjustable noise strength to replace the deterministic flow at the intermediate stage. Our framework requires no additional training, while providing theoretical guarantees of asymptotic accuracy. Experimentally, our method significantly outperforms existing approaches on conditional sampling tasks for MNIST and CIFAR-10. We further demonstrate the applicability of our approach in higher-dimensional, multimodal settings through text-to-image generation experiments on CelebA-HQ.
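
The weight degeneracy mentioned in the abstract, and the resampling step that addresses it, can be illustrated with a minimal sketch. This is not the paper's algorithm: the Gaussian target/proposal pair, the helper names, and the use of plain multinomial resampling are all assumptions made for illustration (the abstract states that the authors modify the SMC resampling technique, and that modification is not reproduced here).

```python
import math
import random


def normalized_weights(log_w):
    """Convert unnormalized log-weights to weights that sum to 1 (max-shifted for stability)."""
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return [x / total for x in w]


def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2): equals K for uniform weights, approaches 1 when one particle dominates."""
    return 1.0 / sum(x * x for x in weights)


def multinomial_resample(particles, weights, rng):
    """Draw K particles with replacement in proportion to their weights (plain SMC resampling)."""
    return rng.choices(particles, weights=weights, k=len(particles))


def log_gaussian_ratio(x, shift):
    """Toy log importance ratio log[N(x; shift, I) / N(x; 0, I)] (hypothetical target and proposal)."""
    return sum(shift * xi - 0.5 * shift * shift for xi in x)


rng = random.Random(0)
K = 256  # number of particles
ess_by_dim = {}
for dim in (1, 10, 100):
    # Sample from the proposal N(0, I) and weight toward the shifted target.
    particles = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(K)]
    w = normalized_weights([log_gaussian_ratio(p, 1.0) for p in particles])
    ess_by_dim[dim] = effective_sample_size(w)
    print(f"dim={dim:3d}  ESS={ess_by_dim[dim]:.1f} of {K}")

# Resampling replaces the weighted particle set with an equally weighted one,
# duplicating high-weight particles and dropping low-weight ones.
resampled = multinomial_resample(particles, w, rng)
```

In this toy setting the effective sample size collapses as the dimension grows, which is the failure mode that motivates resampling at intermediate stages. After resampling, many particles are duplicates, so some source of randomness is needed for the copies to move apart afterwards; the abstract's stochastic flow with adjustable noise strength plays that role in the paper.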

Paper Structure

This paper contains 39 sections, 4 theorems, 96 equations, 9 figures, 5 tables, 2 algorithms.

Key Result

Proposition 1

For $t \in (0,1]$, the optimal conditional velocity field $v^*_c(x(t),t)$ can be decomposed as [equation missing in source]. At the boundary, $v^*_c(x,0) = \lim\limits_{t \to 0^+} v^*_c(x,t)$.

Figures (9)

  • Figure 1: Effect of incorporating resampling at intermediate stages of the generation process.
  • Figure 2: Samples generated by different methods on the toy example.
  • Figure 3: Samples generated by different methods on MNIST for class-conditional sampling targeting class 3.
  • Figure 4: Ablation studies on the stochasticity parameter $\alpha(t)$ and particle count $K$.
  • Figure 5: Wasserstein-2 distance versus number of nodes $M$ with $K=4$ fixed.
  • ...and 4 more figures

Theorems & Definitions (13)

  • Remark 1
  • Proposition 1: Proof in \ref{appendix:proof_prop1}
  • Proposition 2: Proof in \ref{appendix:proof_prop2}
  • Proposition 3: Proof in \ref{appendix:proof_prop3}
  • Remark 2
  • proof
  • proof
  • Remark 3
  • Remark 4
  • proof
  • ...and 3 more