Unfolding with a Wasserstein Loss

Katy Craig, Benjamin Faktor, Benjamin Nachman

Abstract

Data unfolding -- the removal of noise or artifacts from measurements -- is a fundamental task across the experimental sciences. Of particular interest are applications in physics, where the dominant approach is Richardson-Lucy (RL) deconvolution. The classical RL approach aims to find denoised data that, once passed through the noise model, is as close as possible to the measured data in terms of Kullback-Leibler (KL) divergence. This requires that the support of the measured data overlaps with the output of the noise model, a hypothesis typically enforced by binning, which introduces numerical error. As a counterpoint, the present work studies an alternative formulation using a Wasserstein loss. We establish sharp conditions for existence and uniqueness of optimizers, answering open questions of Li, et al., regarding necessary conditions for existence and uniqueness in the case of transport map noise models. We then develop a provably convergent generalized Sinkhorn algorithm to compute approximate optimizers. Our algorithm requires only empirical observations of the noise model and measured data and scales with the size of the data, rather than the ambient dimension. Numerical experiments on one- and two-dimensional problems inspired by jet mass unfolding in particle physics demonstrate that the optimal transport approach offers robust, accurate performance compared to classical RL deconvolution, particularly when binning artifacts are significant.
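
For readers unfamiliar with the baseline, the classical binned RL iteration mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `richardson_lucy`, the response-matrix layout (`response[i, j]` is the probability that an event in true bin `j` is measured in bin `i`), and the uniform starting guess are all assumptions made for this example.

```python
import numpy as np

def richardson_lucy(response, measured, n_iter=50, prior=None):
    """Binned Richardson-Lucy (EM) unfolding sketch.

    response[i, j]: assumed probability of measuring bin i given true bin j.
    measured[i]:    observed counts in measured bin i.
    Each column of `response` is assumed to have a positive sum.
    """
    response = np.asarray(response, dtype=float)
    measured = np.asarray(measured, dtype=float)
    n_true = response.shape[1]
    if prior is None:
        # Hypothetical default: spread the measured counts uniformly.
        estimate = np.full(n_true, measured.sum() / n_true)
    else:
        estimate = np.asarray(prior, dtype=float).copy()
    efficiency = response.sum(axis=0)  # column sums of the response matrix
    for _ in range(n_iter):
        folded = response @ estimate  # push the current estimate through the noise model
        # Ratio of measured to folded counts; bins with zero folded mass contribute zero.
        ratio = np.divide(measured, folded,
                          out=np.zeros_like(measured), where=folded > 0)
        # Multiplicative update, which decreases KL(measured || folded).
        estimate = estimate * (response.T @ ratio) / efficiency
    return estimate
```

The `where=folded > 0` guard is where the support requirement shows up in practice: a measured bin that the folded estimate cannot reach has an undefined ratio, which is why the measured data and the noise-model output must overlap after binning.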

Paper Structure (21 sections, 16 theorems, 62 equations, 8 figures, 1 table)

Key Result

Theorem 1

If CTYas and CRCas hold, a minimizer of the minimization problem exists.

Figures (8)

  • Figure 1: Summary of main results on existence and uniqueness of optimizers for unfolding with a Wasserstein loss. Left: sharp conditions for existence of minimizers of (\ref{minimization problem}). Right: sharp conditions for uniqueness of minimizers of (\ref{minimization problem}).
  • Figure 1: Behavior of $W_2(\nu_\sigma, \nu)$ along iterations for OT and RL unfolding methods. Left: a single discretization of the continuum one-dimensional unfolding problem. Right: mean and standard deviation of behavior across 40 independent discretizations of the same continuum unfolding problem. While OT requires more iterations to converge, it achieves higher accuracy.
  • Figure 2: Summary of existence and uniqueness results for unfolding with a Wasserstein loss, in the special case of transport map noise. Left: sharp conditions for existence of minimizers of (\ref{Li et al denoising problem}). Right: conditions for uniqueness of minimizers of (\ref{Li et al denoising problem}).
  • Figure 2: Comparison of measured data $\nu$ to $\nu_\sigma$, where $\sigma$ is from the final iteration of OT unfolding (left panel) vs. the final iteration of RL unfolding with 28 bins (right panel). In agreement with Figure \ref{M1manyseeds}, we observe that OT unfolding achieves better accuracy of the approximation $\nu \approx \nu_\sigma$ than RL unfolding on this task.
  • Figure 3: Behavior of $W_2(\nu_\sigma, \nu)$ along iterations for OT and RL unfolding methods, averaged across 40 independent discretizations of the same one-dimensional continuum unfolding problem. While OT requires more iterations to converge, it achieves higher accuracy. As $M$ increases, the accuracy of RL approaches that of OT, and all methods require more iterations to converge.
  • ...and 3 more figures
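
The captions above report accuracy as $W_2(\nu_\sigma, \nu)$. As a rough illustration of that metric only, the sketch below computes the 2-Wasserstein distance between two one-dimensional empirical measures with the same number of equally weighted samples, where the optimal coupling simply matches sorted samples. The function name `w2_empirical_1d` and the equal-size, equal-weight setting are assumptions for this example; the paper's own evaluation may use weighted or higher-dimensional measures.

```python
import numpy as np

def w2_empirical_1d(x, y):
    """W_2 between two 1D empirical measures with equal sample counts.

    Assumes uniform weights; in one dimension the optimal transport plan
    pairs the k-th smallest point of x with the k-th smallest point of y.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch requires equal sample counts")
    return float(np.sqrt(np.mean((x - y) ** 2)))
```

For example, shifting every sample by a constant `c` gives a distance of `|c|`, and the value does not depend on the order in which samples are listed.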

Theorems & Definitions (33)

  • Theorem 1
  • Theorem 2
  • Remark 1
  • Corollary 3
  • Corollary 4
  • Proposition 1
  • Proof 1
  • Proposition 2
  • Proof 2
  • Proof 3: Proof of Theorem \ref{thm:existence}
  • ...and 23 more