Flow Matching with Uncertainty Quantification and Guidance

Juyeop Han; Lukas Lao Beyer; Sertac Karaman

TL;DR

This paper proposes uncertainty-aware flow matching (UA-Flow), a lightweight extension of flow matching that predicts the velocity field together with heteroscedastic uncertainty. Its uncertainty signals correlate more strongly with sample fidelity than those of baseline methods.

Abstract

Despite the remarkable success of sampling-based generative models such as flow matching, they can still produce samples of inconsistent or degraded quality. To assess sample reliability and generate higher-quality outputs, we propose uncertainty-aware flow matching (UA-Flow), a lightweight extension of flow matching that predicts the velocity field together with heteroscedastic uncertainty. UA-Flow estimates per-sample uncertainty by propagating velocity uncertainty through the flow dynamics. These uncertainty estimates act as a reliability signal for individual samples, and we further use them to steer generation via uncertainty-aware classifier guidance and classifier-free guidance. Experiments on image generation show that UA-Flow produces uncertainty signals more highly correlated with sample fidelity than baseline methods, and that uncertainty-guided sampling further improves generation quality.
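The two ingredients named in the abstract, a velocity prediction with heteroscedastic uncertainty and the propagation of that uncertainty through the flow, can be illustrated with a minimal sketch. It is not the paper's implementation. The Gaussian negative log-likelihood form, the `model(x, t)` interface returning a mean and a log-variance, and the independence assumption across Euler steps are all assumptions made here for illustration; the paper's own loss and covariance approximations are derived in its appendix sections and may differ.

```python
import math

def gaussian_nll(target, mean, log_var):
    """Heteroscedastic Gaussian NLL, averaged over elements (constant dropped).

    ASSUMPTION: a standard Gaussian likelihood stands in for the paper's loss.
    """
    total = 0.0
    for u, m, s in zip(target, mean, log_var):
        total += 0.5 * (s + (u - m) ** 2 / math.exp(s))
    return total / len(target)

def euler_sample_with_variance(x0, model, num_steps):
    """Euler integration of dx/dt = v(x, t) from t = 0 to t = 1.

    `model(x, t)` is a hypothetical callable returning (mean, log_var) lists.
    Variance accumulates as var += dt^2 * exp(log_var), which ASSUMES the
    velocity errors at different steps are independent and ignores how the
    velocity mean depends on an uncertain state.
    """
    dt = 1.0 / num_steps
    x = list(x0)
    var = [0.0] * len(x0)
    for k in range(num_steps):
        t = k * dt
        mean, log_var = model(x, t)
        x = [xi + dt * mi for xi, mi in zip(x, mean)]
        var = [vi + dt * dt * math.exp(si) for vi, si in zip(var, log_var)]
    return x, var

def toy_model(x, t):
    """Toy stand-in: constant unit velocity with unit predicted variance."""
    return [1.0] * len(x), [0.0] * len(x)

sample, variance = euler_sample_with_variance([0.0, 2.0], toy_model, num_steps=4)
loss = gaussian_nll([1.0, 1.0], [1.0, 0.5], [0.0, 0.0])
```

Under these assumptions the per-element `variance` gives a simple reliability score for each generated sample: larger values mark elements whose trajectory passed through regions where the predicted velocity was less certain.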

Paper Structure (48 sections, 39 equations, 18 figures, 5 tables, 3 algorithms)

Introduction
Background
Uncertainty-Aware Flow Matching
Probabilistic Velocity Field Modeling
Uncertainty Propagation through Flow Dynamics
Uncertainty-Aware Guidance for Flow Matching
Experiments
Experimental Setup
Filtering Images with High-Uncertainty
Uncertainty-Aware Classifier Guidance
Uncertainty-Aware Classifier-Free Guidance
Conclusion
Derivation of the Uncertainty-Aware Flow Matching Loss
Remark on Jensen bias in approximating $(u_t(\mathbf{x}_t))^2$ and why we still keep $U_t(\mathbf{x}_t,\mathbf{x}_1)$
Details on Variance Propagation and Covariance Approximations
...and 33 more sections

Figures (18)

Figure 1: Uncertainty-aware guidance sweep on ImageNet-256: generated samples and predicted latent pixel-wise uncertainties under uncertainty-aware classifier guidance and classifier-free guidance (U-CG and U-CFG). We visualize ImageNet-256 samples for the class guinea pig, Cavia cobaya across combinations of U-CG and U-CFG. Rows sweep the U-CG scale $w \in \{0, 10, 30, 50\}$ and columns sweep the maximum U-CFG scale $\lambda_{\max} \in \{0, 1, 2, 5, 10, 20\}$. The left panel shows generated images, and the right panel shows the corresponding predicted uncertainty maps (brighter indicates higher uncertainty). Increasing guidance typically yields more class-consistent samples while reducing predicted uncertainty.
Figure 2: (a) Generative quality metrics as a function of the filtering ratio on ImageNet-256. For each filtering level, high-uncertainty generated images are removed and 50k samples are randomly selected from the remaining set for evaluation. Compared to AU [DeVita25Aleatoric] and BayesDiff [Kou23Bayesdiff], which estimate element-wise uncertainty in the latent space, UA-Flow achieves lower FID and higher precision after filtering. GenUnc [Jazbec25Genunc] is shown as a reference baseline, as it uses domain-specific scalar uncertainty estimated in the CLIP embedding space. (b) Example latent pixel-wise uncertainty maps produced by UA-Flow, AU, and BayesDiff for the same generated image. Brighter regions indicate higher uncertainty. For visualization, uncertainty values are normalized independently for each image.
Figure 3: ImageNet-256 samples under different U-CG scales $w$ at a CFG scale $\lambda=0.5$. Each column corresponds to a fixed class label and random seed, while rows sweep the U-CG scale $w \in \{0, 10, 30, 50\}$. As $w$ increases, samples become more class-consistent and visually simpler, reflecting the fidelity–diversity trade-off induced by steering generation toward low-uncertainty regions.
Figure 4: (a) FID, precision, and recall as a function of the fixed CFG scale $\lambda$ or the maximum scale $\lambda_{\max}$ of U-CFG on ImageNet-256. CFG degrades sharply at large $\lambda$, while U-CFG remains more stable as $\lambda_{\max}$ increases. (b) Violin plots of the adaptive U-CFG scale $\lambda^*$ across sampling steps, computed over 1,000 samples. $\lambda^*$ tends to be smaller in early steps and larger in later steps.
Figure 5: ImageNet-256 samples under increasing CFG scale $\lambda$ and maximum U-CFG scale $\lambda_{\max}$. Samples are generated by standard CFG (left) and U-CFG (right) while sweeping the scale $\lambda$ (CFG) or the cap $\lambda_{\max}$ (U-CFG) over $\{1.0, 2.0, 5.0, 10, 20\}$. Large $\lambda$ in CFG can lead to oversaturation and mode collapse, whereas U-CFG better preserves global structure under large $\lambda_{\max}$ via adaptive step-wise scaling.
...and 13 more figures
