Diffusion Models for Inverse Problems in the Exponential Family

Alessandro Micheli, Mélodie Monod, Samir Bhatt

TL;DR

This work extends diffusion models to inverse problems whose observations follow a distribution from the exponential family, such as a Poisson or a Binomial distribution. By leveraging the conjugacy properties of exponential family distributions, it introduces the evidence trick, a method that provides a tractable approximation to the likelihood score.

Abstract

Diffusion models have emerged as powerful tools for solving inverse problems, yet prior work has primarily focused on observations with Gaussian measurement noise, restricting their use in real-world scenarios. This limitation persists due to the intractability of the likelihood score, which until now has only been approximated in the simpler case of Gaussian likelihoods. In this work, we extend diffusion models to handle inverse problems where the observations follow a distribution from the exponential family, such as a Poisson or a Binomial distribution. By leveraging the conjugacy properties of exponential family distributions, we introduce the evidence trick, a method that provides a tractable approximation to the likelihood score. In our experiments, we demonstrate that our methodology effectively performs Bayesian inference on spatially inhomogeneous Poisson processes with intensities as intricate as ImageNet images. Furthermore, we demonstrate the real-world impact of our methodology by showing that it performs competitively with the current state-of-the-art in predicting malaria prevalence estimates in Sub-Saharan Africa.
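To make the role of conjugacy concrete, here is a minimal sketch for a single Poisson count with a Gamma prior on its rate. It is not the paper's evidence trick or its inference network; it only illustrates the textbook fact that, under a conjugate prior, the evidence $p(y)$ is available in closed form and equals prior times likelihood divided by posterior at any parameter value. The function names and the hyperparameters `alpha` (shape) and `beta` (rate) are assumptions chosen for illustration.

```python
import math


def gamma_logpdf(theta, alpha, beta):
    """Log density of Gamma(shape=alpha, rate=beta) at theta > 0."""
    return (alpha * math.log(beta) - math.lgamma(alpha)
            + (alpha - 1.0) * math.log(theta) - beta * theta)


def poisson_logpmf(y, theta):
    """Log pmf of Poisson(theta) at a non-negative integer count y."""
    return y * math.log(theta) - theta - math.lgamma(y + 1.0)


def posterior_hyperparams(y, alpha, beta):
    """Conjugate update for one Poisson count: Gamma(alpha + y, beta + 1)."""
    return alpha + y, beta + 1.0


def log_evidence(y, alpha, beta):
    """Closed-form log p(y) with theta integrated out (a negative binomial pmf)."""
    return (math.lgamma(alpha + y) - math.lgamma(alpha) - math.lgamma(y + 1.0)
            + alpha * math.log(beta / (beta + 1.0)) - y * math.log(beta + 1.0))


def log_evidence_via_bayes(y, alpha, beta, theta):
    """Same quantity from Bayes' rule: log prior + log likelihood - log posterior."""
    a_post, b_post = posterior_hyperparams(y, alpha, beta)
    return (gamma_logpdf(theta, alpha, beta) + poisson_logpmf(y, theta)
            - gamma_logpdf(theta, a_post, b_post))


if __name__ == "__main__":
    # The two routes agree for any theta > 0, because the theta terms cancel.
    for theta in (0.2, 1.0, 7.5):
        print(theta, log_evidence(4, 2.5, 1.5),
              log_evidence_via_bayes(4, 2.5, 1.5, theta))
```

Because the evidence depends only on the prior hyperparameters and the data, it can be differentiated with respect to those hyperparameters, which is what makes a conjugate family attractive when a gradient of a log-likelihood-type quantity is needed.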

Paper Structure

This paper contains 99 sections, 9 theorems, 133 equations, 17 figures, 1 table.

Key Result

Lemma 3.4

Let $\boldsymbol{\theta} = g^{-1}(\mathbf{x}_0)$. Furthermore, let $q_{\boldsymbol{\theta}\vert \boldsymbol{\zeta}(\mathbf{x}_t)}(\boldsymbol{\theta}\vert \boldsymbol{\zeta}(\mathbf{x}_t))$ be defined as in eq-prior-q-theta and be part of the exponential family with hyperparameters $\boldsymbol{\zeta}$ [...] where [...] and for a function $C(\mathbf{x}_t)$ that does not depend on $\boldsymbol{\zeta}$.

Figures (17)

  • Figure 1: Illustration of the approach using Diffusion Models for Inverse Problems in the Exponential Family. By leveraging the posterior score $\nabla_{\mathbf{x}_t} \log p_{\mathbf{x}_t|\mathbf{y}}(\mathbf{x}_t|\mathbf{y})$, a reverse stochastic differential equation (SDE) can be solved to generate posterior samples of the latent variable $\mathbf{x}_0$ from noise. Posterior samples of the parameter $\boldsymbol{\theta}$ are obtained by applying a deterministic inverse link function. The prior score function, $\nabla_{\mathbf{x}_t} \log p_{\mathbf{x}_t}(\mathbf{x}_t)$, is estimated using a neural network, following established approaches. A novel method is introduced to estimate the likelihood score function, $\nabla_{\mathbf{x}_t} \log p_{\mathbf{y}|\mathbf{x}_t}(\mathbf{y}|\mathbf{x}_t)$, leveraging the evidence trick in combination with amortized variational inference. The figure illustrates the inference of a spatially inhomogeneous Poisson process where the intensity is as intricate as an ImageNet image.
  • Figure 2: Hierarchical Probabilistic Model. The dotted arrow represents a deterministic relationship, while the solid arrow indicates a probabilistic relationship.
  • Figure 3: Score-Based Cox Process Results. (a) (Left) True Cox Process intensity from the ImageNet validation set, transformed using an exponential link function. (Right) Median of the estimated Cox Process intensity posterior distribution using the Score-Based Cox Process method. (b) (Left) True Cox Process intensity from Sentinel-2 Satellite Imagery of Manhattan, New York City. (Right) Median of the estimated Cox Process intensity posterior distribution using the Score-Based Cox Process method.
  • Figure 4: Prevalence of Malaria in Sub-Saharan Africa Results. (a) Empirical PfPR. (b) Median of the estimated PfPR posterior distribution. (c) $25$% quantile of the estimated PfPR posterior distribution. (d) $75$% quantile of the estimated PfPR posterior distribution. The inset plots highlight Nigeria, one of the countries with the highest malaria burden worldwide. The empty entries correspond either to locations outside Sub-Saharan Africa or to locations outside the stable spatial limits of P. falciparum transmission (Bhatt2015-uk).
  • Figure A5: Posterior Density of $\boldsymbol{\theta}$ given observations following a Normal distribution. Estimated posterior median (dot) and 95% credible interval (error bars) by three methods (colors) along with the true value of $\boldsymbol{\theta}$ (cross). The inference was performed given $N=1$ observation following a Normal distribution for which the mean was equal to $\boldsymbol{\theta} = \mathbf{x}_0$ and the standard deviation was fixed to $\sigma =1$.
  • ...and 12 more figures
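
The split into a prior score and a likelihood score described for Figure 1 follows from Bayes' rule applied at diffusion time $t$. Written with log-densities, which is the usual definition of a score, and treating the observation $\mathbf{y}$ as fixed, a short derivation is:

$$
\begin{aligned}
\log p_{\mathbf{x}_t|\mathbf{y}}(\mathbf{x}_t|\mathbf{y}) &= \log p_{\mathbf{x}_t}(\mathbf{x}_t) + \log p_{\mathbf{y}|\mathbf{x}_t}(\mathbf{y}|\mathbf{x}_t) - \log p_{\mathbf{y}}(\mathbf{y}), \\
\nabla_{\mathbf{x}_t} \log p_{\mathbf{x}_t|\mathbf{y}}(\mathbf{x}_t|\mathbf{y}) &= \nabla_{\mathbf{x}_t} \log p_{\mathbf{x}_t}(\mathbf{x}_t) + \nabla_{\mathbf{x}_t} \log p_{\mathbf{y}|\mathbf{x}_t}(\mathbf{y}|\mathbf{x}_t).
\end{aligned}
$$

The term $\log p_{\mathbf{y}}(\mathbf{y})$ drops out because it does not depend on $\mathbf{x}_t$. The first term on the right is the prior score estimated by a neural network; the second is the likelihood score, which is the quantity the evidence trick is introduced to approximate.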

Theorems & Definitions (15)

  • Lemma 3.4: KL Divergence of $p_{\boldsymbol{\theta}\vert \mathbf{x}_t}$ from $q_{\boldsymbol{\theta}\vert \boldsymbol{\zeta}(\mathbf{x}_t)}$
  • Theorem 3.5
  • Remark 3.6: Inference Network
  • Definition 1.1: Exponential family of densities
  • Proposition 1.2: Gradients of log-partition function and expected sufficient statistics
  • Corollary 1.3
  • Definition 1.4: Natural exponential family conjugate prior
  • Proposition 1.5: Exponential Family Form of Independent Univariate Variables
  • Proposition 1.6: Exponential Family Form of Independent Multivariate Variables
  • Proposition 1.7: Natural Exponential Family Conjugate Prior for Independent Univariate Parameters
  • ...and 5 more
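
For orientation, the objects named in Definition 1.1, Proposition 1.2, and Definition 1.4 have a standard textbook form, sketched below. The symbols $\boldsymbol{\eta}$ (natural parameter), $T$ (sufficient statistic), $A$ (log-partition function), $h$ (base measure), and $\boldsymbol{\chi}, \nu$ (prior hyperparameters) are assumptions made for this sketch and may not match the paper's notation.

$$
\begin{aligned}
p(\mathbf{y}\mid\boldsymbol{\eta}) &= h(\mathbf{y}) \exp\!\big(\boldsymbol{\eta}^\top T(\mathbf{y}) - A(\boldsymbol{\eta})\big), \\
\nabla_{\boldsymbol{\eta}} A(\boldsymbol{\eta}) &= \mathbb{E}_{p(\mathbf{y}\mid\boldsymbol{\eta})}\big[T(\mathbf{y})\big], \\
p(\boldsymbol{\eta}\mid\boldsymbol{\chi},\nu) &\propto \exp\!\big(\boldsymbol{\eta}^\top \boldsymbol{\chi} - \nu A(\boldsymbol{\eta})\big), \\
p(\boldsymbol{\eta}\mid\mathbf{y},\boldsymbol{\chi},\nu) &\propto \exp\!\big(\boldsymbol{\eta}^\top (\boldsymbol{\chi} + T(\mathbf{y})) - (\nu + 1) A(\boldsymbol{\eta})\big).
\end{aligned}
$$

The last line is the conjugacy property: observing $\mathbf{y}$ only shifts the hyperparameters, from $(\boldsymbol{\chi}, \nu)$ to $(\boldsymbol{\chi} + T(\mathbf{y}), \nu + 1)$, so the posterior stays in the same family as the prior.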