A Mixture-Based Framework for Guiding Diffusion Models

Yazid Janati, Badr Moufad, Mehdi Abou El Qassime, Alain Durmus, Eric Moulines, Jimmy Olsson

TL;DR

This work addresses inverse problems by leveraging diffusion priors within a Bayesian framework and introduces Mixture-Guided Diffusion Model (MGDM), a training-free, inference-time sampler. MGDM replaces intractable intermediate posteriors with a mixture of tractable likelihood approximations and uses data augmentation with Gibbs sampling to draw approximate posterior samples, enabling scalable improvements via Gibbs iterations. The method demonstrates strong results across multiple image restoration tasks (pixel- and latent-space priors) and audio source separation, often matching or surpassing state-of-the-art training-free approaches while offering tunable compute-time trade-offs. MGDM advances practical deployment of diffusion priors for diverse, unseen inverse problems, with potential extensions to more complex sampling schemes and latent-variable settings.

Abstract

Denoising diffusion models have driven significant progress in the field of Bayesian inverse problems. Recent approaches use pre-trained diffusion models as priors to solve a wide range of such problems, only leveraging inference-time compute and thereby eliminating the need to retrain task-specific models on the same dataset. To approximate the posterior of a Bayesian inverse problem, a diffusion model samples from a sequence of intermediate posterior distributions, each with an intractable likelihood function. This work proposes a novel mixture approximation of these intermediate distributions. Since direct gradient-based sampling of these mixtures is infeasible due to intractable terms, we propose a practical method based on Gibbs sampling. We validate our approach through extensive experiments on image inverse problems, utilizing both pixel- and latent-space diffusion priors, as well as on source separation with an audio diffusion model. The code is available at https://www.github.com/badr-moufad/mgdm
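The abstract pairs a mixture approximation with Gibbs sampling. As a loose illustration of that general idea only, and not the MGDM algorithm itself, the sketch below runs a data-augmentation Gibbs sampler on a toy one-dimensional Gaussian mixture: it introduces a latent component index and alternates between sampling the index given the state and the state given the index. The function name `gibbs_mixture`, the mixture parameters, and the step count are all hypothetical choices made for this example.

```python
import math
import random


def gibbs_mixture(weights, means, stds, n_steps, seed=0):
    """Toy data-augmentation Gibbs sampler for a 1-D Gaussian mixture.

    Illustrative only: this is not the sampler proposed in the paper.
    """
    rng = random.Random(seed)
    x = means[0]  # arbitrary starting point
    samples = []
    for _ in range(n_steps):
        # Step 1: sample the latent component index z given the current x.
        # Unnormalized log-probabilities: log w_k + log N(x; m_k, s_k^2).
        logp = [
            math.log(w) - math.log(s) - 0.5 * ((x - m) / s) ** 2
            for w, m, s in zip(weights, means, stds)
        ]
        top = max(logp)  # subtract the max for numerical stability
        probs = [math.exp(lp - top) for lp in logp]
        z = rng.choices(range(len(weights)), weights=probs)[0]
        # Step 2: sample x given z, which is a plain Gaussian draw.
        x = rng.gauss(means[z], stds[z])
        samples.append(x)
    return samples


# Hypothetical two-component mixture with overlapping components.
samples = gibbs_mixture([0.3, 0.7], [-1.0, 1.0], [1.0, 1.0], n_steps=20000)
mean = sum(samples) / len(samples)
print(f"empirical mean: {mean:.3f}")
```

In this toy case both conditionals are available in closed form, so the chain leaves the mixture invariant and its empirical mean should approach the mixture mean of 0.3·(−1) + 0.7·1 = 0.4. In the setting described in the abstract the mixture terms are intractable, which is why the paper proposes its own Gibbs-based procedure.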

Paper Structure

This paper contains 31 sections, 49 equations, 27 figures, 8 tables, 4 algorithms.

Figures (27)

  • Figure 1: Evolution of $\hat{X}^*_0$ throughout the iterations for MGDM and DAPS [zhang2024daps].
  • Figure 2: MGDM sample images for various tasks on ImageNet (left) and FFHQ (right) datasets.
  • Figure 3: Performance of MGDM as a function of the number of Gibbs steps $R$. The setup $R=1, G\gg1$ represents MGDM with $R=1$ and a number of gradient steps chosen so that the runtime matches that of $R=6$. Left: Mean SI-SDRI for the multi-source audio separation task on the Slakh2100 test dataset. Right: Mean LPIPS for the phase retrieval task on FFHQ.
  • Figure 4: Evolution of the running state $\hat{X}^*_0$ in \ref{algo:midpoint-gibbs} for the two time-sampling distributions given in \ref{eq:sampling-dist-mix} and \ref{eq:sampling-dist-zero}.
  • Figure 5: Reconstructions for half mask inpainting on FFHQ dataset.
  • ...and 22 more figures

Theorems & Definitions (3)

  • Remark 1.1
  • Remark 1.2
  • Remark 1.3