A New Perspective On Denoising Based On Optimal Transport
Nicolas Garcia Trillos, Bodhisattva Sen
TL;DR
The paper develops an optimal-transport perspective on denoising in latent-variable models where $Z\mid\Theta$ follows a known likelihood and $\Theta\sim G^*$. It proves the existence and uniqueness of an OT-based denoiser $\delta^*$, related to a Monge map by $\delta^*(z)=\nabla\varphi^*(\overline{\theta}(z))$, ensuring $\delta^*(Z)\sim G^*$; it also introduces a soft-penalty version $\delta^*_{\tau}$ that linearly interpolates between the Bayes estimator $\overline{\theta}(Z)$ and $\delta^*(Z)$. A complementary observable-space penalization (FModel) is analyzed via a Kantorovich relaxation, proving existence of solutions and showing that under identifiability $\delta^*$ can be recovered as $\tau\to 0$; a nontrivial link to multimarginal OT motivates potential numerical methods. In exponential-family settings, the framework uses Tweedie's formula to estimate the posterior mean from marginal data, highlighting practical routes for finite-sample construction of $\delta^*$ without explicit knowledge of $G^*$. Overall, the work advances theoretical foundations for OT-based denoising and points to tractable, OT-inspired algorithms for high-dimensional latent-variable inference.
Abstract
In the standard formulation of the denoising problem, one is given a probabilistic model relating a latent variable $\Theta \in \Omega \subset \mathbb{R}^m \; (m\ge 1)$ and an observation $Z \in \mathbb{R}^d$ according to $Z \mid \Theta \sim p(\cdot\mid \Theta)$ and $\Theta \sim G^*$, and the goal is to construct a map that recovers the latent variable from the observation. The posterior mean, a natural candidate for estimating $\Theta$ from $Z$, attains the minimum Bayes risk (under the squared error loss) but at the expense of over-shrinking the observation $Z$, and in general may fail to capture the geometric features of the prior distribution $G^*$ (e.g., low dimensionality, discreteness, sparsity). To rectify these drawbacks, we take a new perspective on this denoising problem that is inspired by optimal transport (OT) theory and use it to study a different, OT-based, denoiser at the population level. We rigorously prove that, under general assumptions on the model, this OT-based denoiser is mathematically well-defined and unique, and is closely connected to the solution of a Monge OT problem. We then prove that, under appropriate identifiability assumptions on the model, the OT-based denoiser can be recovered solely from the marginal distribution of $Z$ and the posterior mean of the model, after solving a linear relaxation problem over a suitable space of couplings that is reminiscent of standard multimarginal OT problems. In particular, thanks to Tweedie's formula, when the likelihood model $\{ p(\cdot \mid \theta) \}_{\theta \in \Omega}$ is an exponential family of distributions, the OT-based denoiser can be recovered solely from the marginal distribution of $Z$. In general, our family of OT-like relaxations is of interest in its own right and, for the denoising problem, suggests alternative numerical methods inspired by the rich literature on computational OT.
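
The contrast between the posterior mean and an OT-based denoiser can be illustrated with a minimal one-dimensional sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-point prior on $\{-1,+1\}$, the Gaussian likelihood $Z \mid \Theta \sim N(\Theta, \sigma^2)$ with $\sigma = 1$, and the helper names `log_marginal`, `posterior_mean_tweedie`, `posterior_mean_bayes`, and `ot_denoise_1d`. For this Gaussian model, Tweedie's formula reads $E[\Theta \mid Z=z] = z + \sigma^2 \frac{d}{dz}\log f(z)$, where $f$ is the marginal density of $Z$; the sketch approximates the derivative by central differences. In one dimension the quadratic-cost OT map is the monotone rearrangement, so the sketch sorts posterior means against samples of $G^*$. It assumes such samples are available and does not implement the relaxation that the abstract describes for recovering the denoiser without $G^*$.

```python
import numpy as np

# Hypothetical setup (not from the paper): two-point prior G* on {-1, +1},
# Gaussian likelihood Z | Theta ~ N(Theta, sigma^2).
atoms = np.array([-1.0, 1.0])
weights = np.array([0.5, 0.5])
sigma = 1.0


def log_marginal(z):
    """Log marginal density of Z: log sum_k w_k N(z; atom_k, sigma^2)."""
    z = np.asarray(z, dtype=float)
    log_terms = (
        np.log(weights)
        - 0.5 * ((z[..., None] - atoms) / sigma) ** 2
        - 0.5 * np.log(2 * np.pi * sigma**2)
    )
    # Log-sum-exp over the atoms for numerical stability.
    m = log_terms.max(axis=-1, keepdims=True)
    return (m + np.log(np.exp(log_terms - m).sum(axis=-1, keepdims=True)))[..., 0]


def posterior_mean_tweedie(z, eps=1e-4):
    """Tweedie's formula, using only the marginal density of Z.

    The score d/dz log f(z) is approximated by central differences.
    """
    z = np.asarray(z, dtype=float)
    score = (log_marginal(z + eps) - log_marginal(z - eps)) / (2 * eps)
    return z + sigma**2 * score


def posterior_mean_bayes(z):
    """Posterior mean computed directly from Bayes' rule (uses the prior)."""
    z = np.asarray(z, dtype=float)
    log_post = np.log(weights) - 0.5 * ((z[..., None] - atoms) / sigma) ** 2
    post = np.exp(log_post - log_post.max(axis=-1, keepdims=True))
    post /= post.sum(axis=-1, keepdims=True)
    return post @ atoms


def ot_denoise_1d(post_means, prior_samples):
    """Empirical monotone rearrangement (1D quadratic-cost OT map).

    The i-th smallest posterior mean is sent to the i-th smallest prior sample.
    """
    ranks = np.argsort(np.argsort(post_means, kind="stable"), kind="stable")
    return np.sort(prior_samples)[ranks]


rng = np.random.default_rng(0)
n = 5000
theta = rng.choice(atoms, size=n, p=weights)          # latent variables
z = theta + sigma * rng.standard_normal(n)            # noisy observations
prior_samples = rng.choice(atoms, size=n, p=weights)  # fresh draws from G*

theta_bar = posterior_mean_tweedie(z)          # posterior mean from the marginal
delta = ot_denoise_1d(theta_bar, prior_samples)  # OT-style denoised values

print("Var(Theta):", theta.var())
print("Var(posterior mean):", theta_bar.var())
print("Distinct OT-denoised values:", np.unique(delta))
```

In this toy setting the posterior mean takes values throughout $(-1, 1)$ and has smaller variance than $\Theta$, while the rearranged values lie exactly on the atoms of the assumed prior, which is the kind of geometric feature (here, discreteness) that the abstract says the posterior mean can miss.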
