SFBD-OMNI: Bridge models for lossy measurement restoration with limited clean samples

Haoye Lu, Yaoliang Yu, Darren Ho

TL;DR

SFBD-OMNI tackles distribution restoration under general, potentially non-identifiable corruption by reformulating the KL ambient projection as a one-sided entropic OT problem and solving it with bridge-diffusion models. The framework alternates learning a posterior under the corruption kernel and updating a ground-truth prior, enabling recovery with very few clean samples and without adversarial training. Theoretical identifiability results, augmented KLAP with priors, and an online training variant yield stable, end-to-end optimization. Empirically, SFBD-OMNI improves recovery performance on CIFAR-10 and CelebA across diverse corruptions, including non-identifiable cases, demonstrating practical robustness and flexibility.

Abstract

In many real-world scenarios, obtaining fully observed samples is prohibitively expensive or even infeasible, while partial and noisy observations are comparatively easy to collect. In this work, we study distribution restoration with abundant noisy samples, assuming the corruption process is available as a black-box generator. We show that this task can be framed as a one-sided entropic optimal transport problem and solved via an EM-like algorithm. We further provide a test criterion to determine whether the true underlying distribution is recoverable under per-sample information loss, and show that in otherwise unrecoverable cases, a small number of clean samples can render the distribution largely recoverable. Building on these insights, we introduce SFBD-OMNI, a bridge model-based framework that maps corrupted sample distributions to the ground-truth distribution. Our method generalizes Stochastic Forward-Backward Deconvolution (SFBD; Lu et al., 2025) to handle arbitrary measurement models beyond Gaussian corruption. Experiments across benchmark datasets and diverse measurement settings demonstrate significant improvements in both qualitative and quantitative performance.
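The EM-like alternation mentioned in the abstract can be illustrated on a finite state space. The sketch below is a toy discrete analogue, not the bridge-model algorithm of the paper: it assumes the corruption kernel is a known column-stochastic matrix `R` (hypothetical name, `R[y, x] = r(y | x)`) and runs the classical EM update for minimizing $\mathrm{KL}(q \,\|\, \mathcal{T}_r p)$ over the prior $p$. The names `em_restore`, `p_true`, and `p_hat` are illustrative assumptions.

```python
import numpy as np

def em_restore(q, R, n_iters=500):
    """Toy EM for min_p KL(q || R p) on a finite state space.

    q : observed (corrupted) distribution over y, shape (n_y,)
    R : corruption kernel with R[y, x] = r(y | x); each column sums to 1
    """
    n_x = R.shape[1]
    p = np.full(n_x, 1.0 / n_x)            # start from a uniform prior
    for _ in range(n_iters):
        mix = np.maximum(R @ p, 1e-12)     # (T_r p)(y), clamped to avoid 0/0
        # E-step: posterior over clean states given each observation,
        # post[y, x] = r(y | x) p(x) / (T_r p)(y)
        post = R * p / mix[:, None]
        # M-step: new prior is the q-average of the posteriors
        p = q @ post
    return p

# Assumed toy setup: a 3-state noisy channel that keeps the state w.p. 0.8.
p_true = np.array([0.6, 0.3, 0.1])
R = np.full((3, 3), 0.1) + 0.7 * np.eye(3)
q = R @ p_true                             # distribution of corrupted samples
p_hat = em_restore(q, R)
print(np.round(p_hat, 4))
```

Because this kernel is invertible, the corrupted distribution determines the prior uniquely and the iteration recovers `p_true`; with a non-injective kernel the same update would still converge, but only to one of several priors consistent with `q`.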

Paper Structure

This paper contains 26 sections, 10 theorems, 72 equations, 11 figures, 5 tables, 2 algorithms.

Key Result

Proposition 1

Let $\mathcal{P}(X)$ denote the set of clean sample distributions. When the corruption kernel $r(\cdot \mid \mathbf{x})$ depends continuously on $\mathbf{x}$, the convex objective in \ref{eq:org_problem_setting} admits a unique minimizer $p^\ast = p_{\text{data}}$ whenever $\mathcal{T}_r$ is injective on
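For intuition about the injectivity requirement, consider a finite state space, where $\mathcal{T}_r$ reduces to multiplication by a column-stochastic matrix. In that setting (an assumption of this sketch, not the general setting of the proposition), $\mathcal{T}_r$ is injective on probability vectors exactly when the matrix has full column rank. The helper `is_injective` and both example kernels below are hypothetical illustrations.

```python
import numpy as np

def is_injective(R, tol=1e-10):
    """Check whether p -> R @ p is injective, with R[y, x] = r(y | x).

    For a column-stochastic R, every null-space vector sums to zero, so a
    rank deficiency always yields two distinct priors with the same image.
    """
    return np.linalg.matrix_rank(R, tol=tol) == R.shape[1]

# Noisy channel: each state is kept with probability 0.8 (identifiable).
R_noise = np.full((3, 3), 0.1) + 0.7 * np.eye(3)

# Lossy channel: states 0 and 1 collapse onto the same observation, loosely
# analogous to several colors sharing one grayscale value (non-identifiable).
R_collapse = np.array([[1.0, 1.0, 0.0],
                       [0.0, 0.0, 1.0]])

print(is_injective(R_noise), is_injective(R_collapse))
```

In the collapsed case, any split of mass between states 0 and 1 produces the same corrupted distribution, which is the kind of ambiguity a small set of clean samples is meant to resolve.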

Figures (11)

  • Figure 1: Effect of $\lambda$ on $p_\lambda^\ast$. As $\lambda \to 0$, the first term in \ref{eq:aug_problem_setting} ensures that $p$ remains within $\mathcal{S}(q)$, while the second term selects the element $h^\dagger \in \mathcal{S}(q)$ closest to $h$. Consequently, $p_\lambda^\ast$ converges to $h^\dagger$, which represents the projection of $h$ onto the feasible set $\mathcal{S}(q)$.
  • Figure 2: FID scores of Online SFBD-OMNI under different clean sample weights $p = \tfrac{\lambda}{1+\lambda}$ across various corruption processes. Processes marked with ✓ satisfy the identifiability condition, while those marked with ✗ do not.
  • Figure 3: FID scores of SFBD-OMNI under different settings. (a) Online SFBD-OMNI FIDs under grayscale corruption for varying numbers of clean samples. (b) FID trajectories of the online version under additive Gaussian corruption ($\sigma = 0.5$) with 2k clean samples, for both the running reconstructed set $\mathcal{E}$ and a newly generated sample set. (c) FIDs of the classical SFBD-OMNI under additive Gaussian ($\sigma = 0.2$) without clean samples; iteration 0 represents the untrained model.
  • Figure 4: Pixel Masking
  • Figure 5: Additive Gauss.
  • ...and 6 more figures

Theorems & Definitions (10)

  • Proposition 1: Identifiability Condition
  • Proposition 2
  • Proposition 3
  • Proposition 4: Convergence to the optimum
  • Proposition 4: Identifiability Condition
  • Proposition 4
  • Lemma 1
  • Proposition 4
  • Proposition 4: Convergence to the optimum
  • Lemma 2