Conditional Mutual Information Based Diffusion Posterior Sampling for Solving Inverse Problems
Shayan Mohajer Hamidi, En-Hui Yang
TL;DR
This work tackles noisy linear inverse problems with diffusion-based priors by introducing a principled objective that maximizes the conditional mutual information $\mathrm{I}(\boldsymbol{x}_0; \boldsymbol{y} | \boldsymbol{x}_t)$ during reverse diffusion, so that intermediate states retain as much information as possible about the measurements. It derives a closed-form expression for this quantity under a Gaussian model of $\boldsymbol{x}_0 | \boldsymbol{x}_t$, provides its gradient, and presents a Hutchinson-based approximation that avoids computing third-order derivatives, enabling scalable integration with existing solvers. The approach works as a plug-in enhancement to methods such as DPS, PiGDM, MCG, and DSG, and is reported to deliver consistent improvements on FFHQ and ImageNet across inpainting, deblurring, and super-resolution. The results indicate that an information-theoretic perspective can improve the fidelity of diffusion-based posterior sampling for inverse problems, and that the resulting regularizer can be adopted with modest code changes.
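The Hutchinson-based approximation mentioned above builds on a standard randomized trace estimator: for a random probe vector $v$ with $\mathbb{E}[v v^\top] = I$, the quantity $v^\top M v$ is an unbiased estimate of $\mathrm{tr}(M)$, so a trace can be approximated from matrix-vector products alone. The following is a minimal generic sketch of that estimator, not the paper's implementation; the function name `hutchinson_trace`, its arguments, and the toy matrix are assumptions made for illustration.

```python
import numpy as np


def hutchinson_trace(matvec, dim, num_probes=64, seed=0):
    """Estimate tr(M) from matrix-vector products only.

    `matvec(v)` must return M @ v for a vector v of length `dim`.
    (Function and argument names are hypothetical, chosen for this sketch.)
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(num_probes):
        # Rademacher probe: entries are +1 or -1 with equal probability,
        # so E[v v^T] = I and E[v^T M v] = tr(M).
        v = rng.choice([-1.0, 1.0], size=dim)
        total += v @ matvec(v)
    return total / num_probes


# Toy check on an explicit symmetric matrix (illustrative data only).
rng = np.random.default_rng(1)
A = rng.standard_normal((20, 20))
M = A @ A.T
estimate = hutchinson_trace(lambda v: M @ v, dim=20, num_probes=2000)
exact = np.trace(M)
```

In a diffusion sampler, `matvec` would typically be a Jacobian-vector or Hessian-vector product obtained by automatic differentiation rather than an explicit matrix, which is how this kind of estimator can sidestep forming higher-order derivative tensors.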
Abstract
Inverse problems are prevalent across various disciplines in science and engineering. In computer vision, tasks such as inpainting, deblurring, and super-resolution are commonly formulated as inverse problems. Recently, diffusion models (DMs) have emerged as a promising approach for addressing noisy linear inverse problems, offering effective solutions without requiring additional task-specific training. Specifically, with the prior provided by DMs, one can sample from the posterior by combining this prior with the measurement likelihood. Since the likelihood is intractable, it is often approximated in the literature. However, this approximation compromises the quality of the generated images. To overcome this limitation and improve the effectiveness of DMs in solving inverse problems, we propose an information-theoretic approach. Specifically, we maximize the conditional mutual information $\mathrm{I}(\boldsymbol{x}_0; \boldsymbol{y} | \boldsymbol{x}_t)$, where $\boldsymbol{x}_0$ represents the reconstructed signal, $\boldsymbol{y}$ is the measurement, and $\boldsymbol{x}_t$ is the intermediate signal at stage $t$. This ensures that the intermediate signals $\boldsymbol{x}_t$ are generated in such a way that the final reconstructed signal $\boldsymbol{x}_0$ retains as much information as possible about the measurement $\boldsymbol{y}$. We demonstrate that this method can be seamlessly integrated with recent approaches and, once incorporated, enhances their performance both qualitatively and quantitatively.
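To make the objective concrete, the standard linear Gaussian case can serve as an illustration. The following is a sketch under assumptions not stated in the abstract: a measurement model $\boldsymbol{y} = \boldsymbol{A}\boldsymbol{x}_0 + \boldsymbol{n}$ with $\boldsymbol{n} \sim \mathcal{N}(\boldsymbol{0}, \sigma^2 \boldsymbol{I})$, and a Gaussian conditional $\boldsymbol{x}_0 | \boldsymbol{x}_t \sim \mathcal{N}(\boldsymbol{\mu}_t, \boldsymbol{\Sigma}_t)$. The symbols $\boldsymbol{A}$, $\sigma$, $\boldsymbol{\mu}_t$, and $\boldsymbol{\Sigma}_t$ are introduced here for illustration, and the expression in the paper itself may be stated differently. Because $\boldsymbol{y}$ depends on $\boldsymbol{x}_t$ only through $\boldsymbol{x}_0$, the mutual information for a fixed $\boldsymbol{x}_t$ is a difference of Gaussian differential entropies:

$$
\begin{aligned}
\mathrm{I}(\boldsymbol{x}_0; \boldsymbol{y} | \boldsymbol{x}_t)
&= h(\boldsymbol{y} | \boldsymbol{x}_t) - h(\boldsymbol{y} | \boldsymbol{x}_0, \boldsymbol{x}_t) \\
&= \tfrac{1}{2} \log\det\!\left(\boldsymbol{A}\boldsymbol{\Sigma}_t\boldsymbol{A}^\top + \sigma^2 \boldsymbol{I}\right) - \tfrac{1}{2} \log\det\!\left(\sigma^2 \boldsymbol{I}\right) \\
&= \tfrac{1}{2} \log\det\!\left(\boldsymbol{I} + \sigma^{-2} \boldsymbol{A}\boldsymbol{\Sigma}_t\boldsymbol{A}^\top\right).
\end{aligned}
$$

Under these assumptions the quantity depends on $\boldsymbol{x}_t$ through the conditional covariance $\boldsymbol{\Sigma}_t$, and the full conditional mutual information is its expectation over $\boldsymbol{x}_t$.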
