Temporal Pair Consistency for Variance-Reduced Flow Matching

Chika Maduabuchi, Jindong Wang

TL;DR

This work targets the high gradient variance that arises when continuous-time generative models such as flow matching (FM) and rectified flow (RF) are trained with objectives that treat timesteps independently. It introduces Temporal Pair Consistency (TPC), a lightweight variance-reduction principle that couples velocity predictions at paired timesteps along the same probability path without altering the model, path, or solver. The authors prove that TPC induces a quadratic regularization that reduces gradient variance while preserving the FM objective, and show theoretically that TPC increases gradient correlation, which yields a strict variance reduction. They provide two pairing strategies, fixed antithetic and learned monotone, along with stochastic gating. Empirically, TPC improves FID and sampling efficiency on CIFAR-10 and ImageNet, and extends to SOTA-style FM and RF pipelines that use noise augmentation and score-based denoising. Across unconditional generation and modern training regimes, TPC consistently shifts the quality–efficiency frontier at no extra inference cost. Limitations: the experiments cover only unconditional generation at resolutions up to $128\times128$; conditional tasks and higher resolutions are left as future extensions.

Abstract

Continuous-time generative models, such as diffusion models, flow matching, and rectified flow, learn time-dependent vector fields but are typically trained with objectives that treat timesteps independently, leading to high estimator variance and inefficient sampling. Prior approaches mitigate this via explicit smoothness penalties, trajectory regularization, or modified probability paths and solvers. We introduce Temporal Pair Consistency (TPC), a lightweight variance-reduction principle that couples velocity predictions at paired timesteps along the same probability path, operating entirely at the estimator level without modifying the model architecture, probability path, or solver. We provide a theoretical analysis showing that TPC induces a quadratic, trajectory-coupled regularization that provably reduces gradient variance while preserving the underlying flow-matching objective. Instantiated within flow matching, TPC improves sample quality and efficiency across CIFAR-10 and ImageNet at multiple resolutions, achieving lower FID at identical or lower computational cost than prior methods, and extends seamlessly to modern SOTA-style pipelines with noise-augmented training, score-based denoising, and rectified flow.
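To make the idea of coupling velocity predictions at paired timesteps concrete, here is a minimal sketch in plain Python. It is one plausible reading of the description, not the authors' implementation. It assumes a straight-line path $x_t = (1-t)x_0 + t x_1$ with target velocity $x_1 - x_0$, the antithetic partner $1 - t$, a squared-difference coupling term, and a Bernoulli gate. The names `tpc_fm_loss`, `lam`, and `gate_prob` are hypothetical.

```python
import random

def linear_path(x0, x1, t):
    """Return x_t = (1 - t) * x0 + t * x1 and its constant target velocity x1 - x0."""
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    u_t = [b - a for a, b in zip(x0, x1)]
    return x_t, u_t

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def tpc_fm_loss(model, x0, x1, t, lam=0.1, gate_prob=1.0, rng=random):
    """Per-sample FM loss at an antithetic timestep pair plus a gated coupling term.

    Assumptions (not taken from the paper): the 1 - t pairing, the
    squared-difference penalty, and the names/defaults lam and gate_prob.
    """
    t_pair = 1.0 - t                        # assumed fixed antithetic partner
    x_t, u_t = linear_path(x0, x1, t)       # both points lie on the same path
    x_s, u_s = linear_path(x0, x1, t_pair)
    v_t = model(x_t, t)
    v_s = model(x_s, t_pair)
    # Standard flow-matching regression, averaged over the two timesteps.
    fm = 0.5 * (sq_dist(v_t, u_t) + sq_dist(v_s, u_s))
    # Stochastic gate: apply the coupling term with probability gate_prob.
    gate = 1.0 if rng.random() < gate_prob else 0.0
    return fm + lam * gate * sq_dist(v_t, v_s)

def toy_model(x, t):
    """Hypothetical stand-in for a velocity network: predicts t in every coordinate."""
    return [t for _ in x]

loss = tpc_fm_loss(toy_model, x0=[0.0, 0.0], x1=[1.0, 2.0], t=0.25)
```

On this straight-line path the target velocity is the same at both timesteps, so a model that matches it exactly incurs no coupling penalty. The extra term only penalizes predictions that disagree along one trajectory, and it is active only during training.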

Paper Structure

This paper contains 38 sections, 7 theorems, 38 equations, 3 figures, 8 tables, 1 algorithm.

Key Result

Lemma 1.1

Let $\mathcal{G}_t := \sigma(x_t,t)$ and assume $\mathbb{E}\|u_t\|_2^2<\infty$. Then $\mathbb{E}[u_t \mid \mathcal{G}_t]$ is the (a.s.) unique minimizer of $\mathcal{R}(v)$ over all $\mathcal{G}_t$-measurable $v$, and …
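
For reference, the standard $L^2$ projection argument behind a lemma of this form can be sketched as follows. The sketch assumes the risk is $\mathcal{R}(v) := \mathbb{E}\|v(x_t,t) - u_t\|_2^2$ and writes $v^\star := \mathbb{E}[u_t \mid \mathcal{G}_t]$. Both are notational assumptions that this page does not confirm, and the paper's own statement may differ in form.

```latex
% Assumed: R(v) := E || v(x_t, t) - u_t ||_2^2  and  v^* := E[ u_t | G_t ].
\begin{align*}
\mathcal{R}(v)
  &= \mathbb{E}\|v - v^\star\|_2^2
   + 2\,\mathbb{E}\langle v - v^\star,\; v^\star - u_t\rangle
   + \mathbb{E}\|v^\star - u_t\|_2^2 \\
% v - v^* is G_t-measurable, so the tower property removes the cross term:
\mathbb{E}\langle v - v^\star,\; v^\star - u_t\rangle
  &= \mathbb{E}\big\langle v - v^\star,\;
     \mathbb{E}[\,v^\star - u_t \mid \mathcal{G}_t\,]\big\rangle = 0 \\
\mathcal{R}(v)
  &= \mathcal{R}(v^\star) + \mathbb{E}\|v - v^\star\|_2^2
\end{align*}
```

The last line shows that any square-integrable $\mathcal{G}_t$-measurable $v$ with $\mathcal{R}(v) = \mathcal{R}(v^\star)$ must equal $v^\star$ almost surely, which is the uniqueness claim.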

Figures (3)

  • Figure 1: Sample quality vs. sampling efficiency. Fréchet Inception Distance (FID ↓) versus number of function evaluations (NFE, log scale) on CIFAR-10. Temporal Pair Consistency (TPC) consistently shifts the quality–efficiency frontier by suppressing temporal oscillations in the learned vector field, achieving lower FID at identical or lower computational cost without modifying the underlying model or solver.
  • Figure 2: ImageNet Qualitative Samples.
  • Figure 3: Qualitative samples and training variance behavior. CIFAR-10 and ImageNet-32 qualitative generations (left, middle), and training variance advantage of TPC-FM (right), where TPC-FM exhibits early variance collapse and sustained stability throughout training.

Theorems & Definitions (13)

  • Lemma 1.1: $L^2$ projection / regression form
  • proof
  • Lemma 1.2: Tikhonov selection inequality
  • proof
  • Lemma 1.3: Optimal scalar control variate
  • proof
  • Lemma 1.4: Correlation lower bound from Lipschitz continuity
  • proof
  • Theorem 1.5
  • proof
  • ...and 3 more
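
Lemma 1.3 is listed here only by title, and its statement is not reproduced on this page. As background, the classical scalar control-variate result says that for an estimator $X$ and a correlated variable $Y$, the variance of $X - c\,(Y - \mathbb{E}Y)$ is minimized at $c^\star = \mathrm{Cov}(X,Y)/\mathrm{Var}(Y)$, leaving variance $(1-\rho^2)\,\mathrm{Var}(X)$. The sketch below checks this numerically on synthetic data. The data-generating process and all function names are illustrative assumptions, not taken from the paper.

```python
import random

def mean(vals):
    """Sample mean."""
    return sum(vals) / len(vals)

def covariance(xs, ys):
    """Sample covariance with a 1/n normalizer; covariance(xs, xs) is the variance."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def control_variate(xs, ys):
    """Return (c_star, adjusted samples) using c_star = Cov(X, Y) / Var(Y)."""
    c_star = covariance(xs, ys) / covariance(ys, ys)
    my = mean(ys)
    zs = [x - c_star * (y - my) for x, y in zip(xs, ys)]
    return c_star, zs

# Synthetic, strongly correlated pair (illustrative only): X = 2 * Y + noise.
rng = random.Random(0)
ys = [rng.gauss(0.0, 1.0) for _ in range(2000)]
xs = [2.0 * y + rng.gauss(0.0, 0.5) for y in ys]

c_star, zs = control_variate(xs, ys)
var_x = covariance(xs, xs)
var_z = covariance(zs, zs)
# Squared sample correlation between X and Y.
rho_sq = covariance(xs, ys) ** 2 / (var_x * covariance(ys, ys))
```

With the coefficient estimated in-sample, `var_z` equals `(1 - rho_sq) * var_x` up to floating-point error and the sample mean is unchanged. The stronger the correlation between the two quantities, the larger the variance reduction.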