An Expectation-Maximization Algorithm for Training Clean Diffusion Models from Corrupted Observations
Weimin Bai, Yifei Wang, Wenzheng Chen, He Sun
TL;DR
EMDiffusion is an EM framework for training diffusion models from corrupted observations. It alternates between reconstructing clean images (E-step) and updating the diffusion prior on those reconstructions (M-step). The prior is initialized from a small set of clean data; the E-step uses diffusion posterior sampling with adaptive scaling to prevent mode collapse, and the score-based prior is refined progressively across iterations. On CIFAR-10 and CelebA, EMDiffusion achieves state-of-the-art or competitive results in inpainting, denoising, and deblurring, often matching or approaching methods that rely on priors trained on clean data, while training almost entirely on corrupted data. This makes learned diffusion priors practical in settings where large clean datasets are unavailable, such as real-world computational imaging.
Abstract
Diffusion models excel in solving imaging inverse problems due to their ability to model complex image priors. However, their reliance on large, clean datasets for training limits their practical use where clean data is scarce. In this paper, we propose EMDiffusion, an expectation-maximization (EM) approach to train diffusion models from corrupted observations. Our method alternates between reconstructing clean images from corrupted data using a known diffusion model (E-step) and refining diffusion model weights based on these reconstructions (M-step). This iterative process leads the learned diffusion model to gradually converge to the true clean data distribution. We validate our method through extensive experiments on diverse computational imaging tasks, including random inpainting, denoising, and deblurring, achieving new state-of-the-art performance.
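To make the alternation concrete, the following is a minimal toy sketch of the E-step/M-step loop, not the authors' implementation. It makes strong simplifying assumptions: the diffusion prior is replaced by a one-dimensional Gaussian prior, the corruption is additive Gaussian noise with known variance, and posterior sampling is done in closed form rather than with diffusion posterior sampling or adaptive scaling. All function and variable names (e_step, m_step, em_fit, and so on) are hypothetical.

```python
import numpy as np


def e_step(y, sigma2, mu, var, rng):
    """Sample x ~ p(x | y) under the current Gaussian prior N(mu, var).

    Toy stand-in for posterior sampling with the current prior:
    with y = x + n and n ~ N(0, sigma2), the posterior is Gaussian.
    """
    gain = var / (var + sigma2)
    post_mean = mu + gain * (y - mu)
    post_var = var * sigma2 / (var + sigma2)
    return post_mean + np.sqrt(post_var) * rng.standard_normal(y.shape)


def m_step(x_hat):
    """Refit the prior to the reconstructions (maximum-likelihood Gaussian)."""
    return x_hat.mean(), x_hat.var()


def em_fit(y, sigma2, mu, var, n_iters, rng):
    """Alternate reconstruction (E-step) and prior update (M-step)."""
    for _ in range(n_iters):
        x_hat = e_step(y, sigma2, mu, var, rng)
        mu, var = m_step(x_hat)
    return mu, var


rng = np.random.default_rng(0)

# Assumed ground truth for the toy problem (not from the paper).
true_mu, true_var, sigma2 = 2.0, 2.25, 1.0

# Large corrupted dataset: clean samples plus known-variance noise.
x_clean = true_mu + np.sqrt(true_var) * rng.standard_normal(20000)
y = x_clean + np.sqrt(sigma2) * rng.standard_normal(20000)

# Initialize the prior from a small set of clean samples.
x_init = true_mu + np.sqrt(true_var) * rng.standard_normal(50)

mu_hat, var_hat = em_fit(y, sigma2, x_init.mean(), x_init.var(), 50, rng)
print(f"estimated mean={mu_hat:.3f}, variance={var_hat:.3f}")
```

In this toy setting the fixed point of the loop is the prior whose mean matches the mean of the observations and whose variance equals the observation variance minus the noise variance, so the estimates approach the clean-data statistics even though the loop only ever sees noisy samples after initialization. The diffusion setting in the paper replaces the closed-form posterior and the Gaussian refit with posterior sampling and score-model training, which this sketch does not attempt to reproduce.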
