ReflexFlow: Rethinking Learning Objective for Exposure Bias Alleviation in Flow Matching

Guanbo Huang, Jingjia Mao, Fanding Huang, Fengkai Liu, Xiangyang Luo, Yaoyuan Liang, Jiasheng Lu, Xiaoe Wang, Pei Liu, Ruiliu Fu, Shao-Lun Huang

TL;DR

The paper traces exposure bias in Flow Matching to two root causes: training leaves the model unable to generalize to biased inputs, and early denoising steps capture too little low-frequency content, so bias accumulates. It proposes ReflexFlow, a simple and effective reflexive refinement of the Flow Matching learning objective that dynamically corrects exposure bias.

Abstract

Despite tremendous recent progress, Flow Matching methods still suffer from exposure bias due to discrepancies between training and inference. This paper investigates the root causes of exposure bias in Flow Matching, including: (1) the model lacks generalization to biased inputs during training, and (2) insufficient low-frequency content is captured during early denoising, leading to accumulated bias. Based on these insights, we propose ReflexFlow, a simple and effective reflexive refinement of the Flow Matching learning objective that dynamically corrects exposure bias. ReflexFlow consists of two components: (1) Anti-Drift Rectification (ADR), which reflexively adjusts prediction targets for biased inputs using a redesigned loss under training-time scheduled sampling; and (2) Frequency Compensation (FC), which reflects on missing low-frequency components and compensates for them by reweighting the loss using exposure bias. ReflexFlow is model-agnostic, compatible with all Flow Matching frameworks, and improves generation quality across datasets. Experiments on CIFAR-10, CelebA-64, and ImageNet-256 show that ReflexFlow outperforms prior approaches in mitigating exposure bias, achieving a 35.65% reduction in FID on CelebA-64.
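
To make the idea of "adjusting prediction targets for biased inputs" concrete, the following is a minimal sketch and not the paper's ADR loss, which is not reproduced in this overview. It assumes the common linear Flow Matching path with t = 0 as noise and t = 1 as data, and the helper `rectified_target` is a hypothetical name introduced only for illustration. It shows that a target pointing from a drifted input toward the data sample reduces to the usual Flow Matching target when the input is clean.

```python
import random

def fm_pair(x0, x1, t):
    """Linear-path Flow Matching pair: input x_t and velocity target x1 - x0."""
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    u = [b - a for a, b in zip(x0, x1)]
    return x_t, u

def rectified_target(x_hat_t, x1, t):
    """Hypothetical drift-aware target (an assumption, not the paper's formula):
    the constant velocity that carries x_hat_t to the data sample x1 by t = 1."""
    return [(b - a) / (1.0 - t) for a, b in zip(x_hat_t, x1)]

random.seed(0)
x0 = [random.gauss(0.0, 1.0) for _ in range(4)]  # noise sample
x1 = [0.5, -1.0, 2.0, 0.0]                       # stand-in "data" sample
t = 0.3

x_t, u = fm_pair(x0, x1, t)

# On a clean input the rectified target coincides with the usual FM target.
clean_gap = max(abs(a - b) for a, b in zip(rectified_target(x_t, x1, t), u))

# A drifted input, standing in for what the model's own earlier steps produce.
drift = [0.2, -0.1, 0.0, 0.3]
x_hat_t = [a + d for a, d in zip(x_t, drift)]
u_hat = rectified_target(x_hat_t, x1, t)

# Following u_hat for the remaining time 1 - t lands on x1 despite the drift.
landing = [a + (1.0 - t) * v for a, v in zip(x_hat_t, u_hat)]
landing_gap = max(abs(a - b) for a, b in zip(landing, x1))
```

Both `clean_gap` and `landing_gap` come out at floating-point rounding level. The drift vector here is hand-picked; under scheduled sampling it would instead come from the model's own predictions.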

Paper Structure

This paper contains 15 sections, 6 theorems, 67 equations, 13 figures, 10 tables, 1 algorithm.

Key Result

Proposition 4.1

Under the FM objective, the expected final sampling error $\mathbb{E}\|\mathbf{e}_{\tau_k}\|$ is bounded by the sum of the integrated ground-truth velocity residual and the integrated exposure bias (see the supplementary materials for details).
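
The shape of such a bound can be illustrated with a toy Euler sampler. This is a sketch under stated assumptions and not the paper's Proposition 4.1: the 1D velocity fields `v_true` and `v_model` are invented for illustration, "residual" here means the model's error on clean (reference) inputs, and "exposure bias" means the change in model output caused by feeding it its own drifted state. With those definitions, the triangle inequality bounds the final error by the two accumulated sums.

```python
import math

def v_true(x, t):
    """Toy ground-truth velocity field (assumption for illustration)."""
    return 1.0 - x

def v_model(x, t):
    """Toy imperfect model: ground truth plus a smooth perturbation."""
    return v_true(x, t) + 0.1 * math.sin(3.0 * x)

K = 50          # number of Euler steps
h = 1.0 / K     # step size
x_ref = x_hat = -1.0   # both trajectories start from the same point

residual_sum = 0.0   # accumulated model error on clean inputs
bias_sum = 0.0       # accumulated output shift caused by drifted inputs
for k in range(K):
    t = k * h
    residual_sum += h * abs(v_model(x_ref, t) - v_true(x_ref, t))
    bias_sum += h * abs(v_model(x_hat, t) - v_model(x_ref, t))
    # The right-hand side is evaluated with the old states before assignment.
    x_ref, x_hat = x_ref + h * v_true(x_ref, t), x_hat + h * v_model(x_hat, t)

final_error = abs(x_hat - x_ref)
bound = residual_sum + bias_sum
print(final_error, bound)
```

The inequality `final_error <= bound` holds because each Euler step adds `h * (v_model(x_hat) - v_true(x_ref))` to the error, and that difference splits into the residual term and the exposure-bias term.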

Figures (13)

  • Figure 1: Qualitative Comparison of Generated Samples. All models are trained from scratch for 500K iterations using the SiT-B/4 backbone on ImageNet-256. Red boxes highlight visual artifacts produced by competing methods (SiT [SiT2024], IP [IP2023], SDSS [multistep2024], and MDSS [multistep2024]). ReflexFlow produces the most faithful and visually realistic results.
  • Figure 2: Quantitative Analysis of Exposure Bias in Flow Matching. Left: Mean squared error (MSE) of model outputs under clean inputs (forward-perturbed samples) versus biased inputs (reverse-predicted samples), showing that exposure bias significantly increases the MSE. Right: Frequency distribution of model outputs compared to ground truth velocity under clean inputs. Predictions at early denoising timesteps (purple background) lack low-frequency components (marked by the red box), while predictions at later timesteps (green background) are deficient in high-frequency components.
  • Figure 3: Exposure Bias and Frequency Compensation in Flow Matching. Left: Evolution of the frequency distribution of exposure bias across timesteps, showing that early timesteps are dominated by low-frequency components (marked by the red box), while later timesteps are dominated by high-frequency components. Right: Compared to the original learning objective, the exposure-bias-reweighted loss emphasizes low-frequency components more at early timesteps (marked by the red box), demonstrating its ability to compensate for frequency discrepancy.
  • Figure 4: Overview of ReflexFlow. (a) Anti-Drift Rectification: We construct a new learning objective that guides the model from the drifted intermediate distribution toward the true data distribution, enabling effective correction of prediction drift. (b) Frequency Compensation: To mitigate the low-frequency deficiency at HFT, we use exposure bias as a negative-feedback signal to reweight the original objective. The final objective jointly optimizes the weighted original objective and rectification objective.
  • Figure 5: Visualization of dominant low- and high-frequency regions of the original images (left) and the corresponding exposure bias heatmaps (right) at HFT. The exposure bias heatmaps are compared with the low-pass (blue mask) and high-pass (red mask) filtered versions of the ground-truth images, revealing that the regions emphasized by exposure bias correspond closely to the low-frequency components. Additional examples are provided in supplementary materials.
  • ...and 8 more figures
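
The frequency analyses in Figures 2 and 3 compare how much of a signal's energy sits in low versus high frequencies. As a rough sketch of that kind of measurement, and not the paper's procedure, the snippet below computes the low-frequency energy share of a 1D signal with a plain discrete Fourier transform. The paper works with image spectra; the 1D signals, the `cutoff` value, and the function names are all assumptions chosen for illustration.

```python
import cmath
import math

def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * m / n) for m, x in enumerate(signal))
        for k in range(n)
    ]

def low_freq_fraction(signal, cutoff):
    """Share of spectral energy at frequency magnitudes <= cutoff.

    Index k and index n - k describe the same frequency magnitude,
    so the magnitude of bin k is min(k, n - k).
    """
    n = len(signal)
    energy = [abs(c) ** 2 for c in dft(signal)]
    total = sum(energy)
    if total == 0.0:
        return 0.0
    low = sum(e for k, e in enumerate(energy) if min(k, n - k) <= cutoff)
    return low / total

n = 16
slow = [math.cos(2 * math.pi * 1 * m / n) for m in range(n)]  # frequency 1
fast = [math.cos(2 * math.pi * 7 * m / n) for m in range(n)]  # frequency 7

# A stand-in "error signal" dominated by its slow component.
mixed = [0.9 * s + 0.1 * f for s, f in zip(slow, fast)]

frac_slow = low_freq_fraction(slow, cutoff=2)
frac_fast = low_freq_fraction(fast, cutoff=2)
frac_mixed = low_freq_fraction(mixed, cutoff=2)
```

For the mixed signal the share equals 0.81 / 0.82, since the energy of each cosine scales with the square of its amplitude. A per-timestep version of this statistic is one way to tabulate the kind of trend the figures describe.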

Theorems & Definitions (14)

  • Proposition 4.1: Error Bounded by FM Target
  • Remark 4.2
  • Proposition 4.3: Error Bounded by ADR
  • Remark 4.4
  • Lemma 4.5: Coupling between FM and ADR residuals
  • Corollary 4.6: Order-equivalence of FM and ADR bounds
  • Remark 4.7
  • Proof of Proposition 4.1
  • Proof of Proposition 4.3
  • Lemma B.1: Coupling between FM and ADR residuals
  • ...and 4 more