A Data-Driven Prism: Multi-View Source Separation with Diffusion Model Priors
Sebastian Wagner-Carena, Aizhan Akhmetzhanova, Sydney Erickson
TL;DR
The paper tackles multi-view source separation (MVSS) in the setting where the underlying sources are unknown and the observations are noisy and incomplete. It introduces DDPRISM, a data-driven framework that learns an independent diffusion-model prior $p(\mathbf{x}^\beta)$ for each source and uses an expectation-maximization (EM) loop to sample the joint posterior over sources given the observations and known mixing matrices. Key contributions include a general-purpose MVSS approach that requires no explicit source priors, a joint diffusion posterior sampling scheme based on MMPS approximations, and state-of-the-art performance on contrastive MVSS problems and on real-world galaxy imaging tasks. The approach enables principled uncertainty quantification and posterior sampling in challenging scientific settings, but it is computationally expensive and assumes linear mixing and Gaussian noise.
Abstract
A common challenge in the natural sciences is to disentangle distinct, unknown sources from observations. Examples of this source separation task include deblending galaxies in a crowded field, distinguishing the activity of individual neurons from overlapping signals, and separating seismic events from an ambient background. Traditional analyses often rely on simplified source models that fail to accurately reproduce the data. Recent advances have shown that diffusion models can directly learn complex prior distributions from noisy, incomplete data. In this work, we show that diffusion models can solve the source separation problem without explicit assumptions about the source. Our method relies only on multiple views, or the property that different sets of observations contain different linear transformations of the unknown sources. We show that our method succeeds even when no source is individually observed and the observations are noisy, incomplete, and vary in resolution. The learned diffusion models enable us to sample from the source priors, evaluate the probability of candidate sources, and draw from the joint posterior of the source distribution given an observation. We demonstrate the effectiveness of our method on a range of synthetic problems as well as real-world galaxy observations.
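To make the multi-view setup concrete, the following is a minimal NumPy sketch of the kind of observation model the abstract describes: each view is a known linear transformation of the unknown sources plus Gaussian noise. It only simulates data and is not the paper's method or code. The function name `simulate_view`, the two-view layout (one view containing only a background source through a pixel mask, one containing both sources), the dimensions, and the noise level are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_view(sources, mixing, noise_std, rng):
    """Return one observation: sum_beta A^beta @ x^beta + Gaussian noise.

    `sources` is a list of 1-D source vectors and `mixing` is the list of
    matching (known) mixing matrices for this view. Hypothetical helper.
    """
    clean = sum(A @ x for A, x in zip(mixing, sources))
    return clean + noise_std * rng.standard_normal(clean.shape)


rng = np.random.default_rng(0)
dim = 8

# Two unknown sources (random stand-ins here; in practice they are never seen directly).
x_background = rng.standard_normal(dim)
x_target = rng.standard_normal(dim)

# View 1 (assumed): background only, incomplete because a mask keeps 5 of 8 pixels.
mask = np.eye(dim)[rng.choice(dim, size=5, replace=False)]
y_view1 = simulate_view([x_background], [mask], noise_std=0.1, rng=rng)

# View 2 (assumed): both sources mixed at full resolution.
identity = np.eye(dim)
y_view2 = simulate_view(
    [x_background, x_target], [identity, identity], noise_std=0.1, rng=rng
)

print(y_view1.shape, y_view2.shape)
```

In this toy layout neither source is observed cleanly on its own: the first view is masked and noisy, and the second is a noisy blend. Separation is possible only because the two views mix the sources differently.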
