Probabilistic U-Net with Kendall Shape Spaces for Geometry-Aware Segmentations of Images

Jiyoung Park, Günay Doğan

TL;DR

This paper proposes a probabilistic image segmentation model that can incorporate the geometry of a segmentation, and builds on the Probabilistic U-Net to generate probabilistic segmentations, i.e. multiple likely segmentations for an input image.

Abstract

One of the fundamental problems in computer vision is image segmentation, the task of detecting distinct regions or objects in given images. Deep Neural Networks (DNNs) have been shown to be very effective in segmenting challenging images, producing convincing segmentations. There is a further need for probabilistic DNNs that can propagate the uncertainties from the input images and the models into the computed segmentations; in other words, DNNs that can generate multiple plausible segmentations and their distributions, depending on the input or model uncertainties. While there are existing probabilistic segmentation models, many of them do not take into account the geometry or shape underlying the segmented regions. In this paper, we propose a probabilistic image segmentation model that can incorporate the geometry of a segmentation. Our proposed model builds on the Probabilistic U-Net of Kohl et al. (2018) to generate probabilistic segmentations, i.e., multiple likely segmentations for an input image. Our model also adopts the Kendall Shape Variational Auto-Encoder of Vadgama et al. (2023) to encode a Kendall shape space in the latent variable layers of the prior and posterior networks of the Probabilistic U-Net. Incorporating the shape space in this manner leads to a more robust segmentation with spatially coherent regions, respecting the underlying geometry in the input images.
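To make the notion of a Kendall shape space more concrete, the following is a minimal sketch of the standard construction: a configuration of k landmarks in R^m is centered and scaled to unit Frobenius norm (giving a pre-shape), and two pre-shapes are compared after the optimal rotation in SO(m). The function names `to_preshape` and `procrustes_distance` are hypothetical and chosen for illustration; the paper's Definition 2.1 and its network layers may use different conventions.

```python
import numpy as np

def to_preshape(landmarks):
    """Map a (k, m) array of k landmarks in R^m to the pre-shape sphere."""
    x = np.asarray(landmarks, dtype=float)
    centered = x - x.mean(axis=0, keepdims=True)  # remove translation
    norm = np.linalg.norm(centered)               # Frobenius norm
    if norm == 0.0:
        raise ValueError("all landmarks coincide; the shape is undefined")
    return centered / norm                        # remove scale

def procrustes_distance(a, b):
    """Geodesic distance between two shapes after the optimal rotation in SO(m)."""
    za, zb = to_preshape(a), to_preshape(b)
    # The best rotation maximizes trace(R @ M) with M = za.T @ zb; the maximum
    # is the sum of singular values of M, with the smallest one negated when a
    # reflection would otherwise be required (Kabsch-style correction).
    u, s, vt = np.linalg.svd(za.T @ zb)
    if np.linalg.det(u @ vt) < 0:
        s[-1] = -s[-1]
    return float(np.arccos(np.clip(s.sum(), -1.0, 1.0)))

# Usage: a triangle and a rotated, scaled, translated copy have the same shape.
tri = np.array([[0.0, 0.0], [2.0, 0.0], [0.5, 1.5]])
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
moved = 3.0 * tri @ R.T + np.array([5.0, -2.0])
print(procrustes_distance(tri, moved))
```

The distance printed for the moved copy is zero up to floating-point error, which is the invariance (to translation, scale, and rotation) that a shape-space latent variable is meant to encode.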

Paper Structure

This paper contains 12 sections, 1 theorem, 9 equations, 13 figures.

Key Result

Theorem 2.2

The KL divergence between two von Mises-Fisher distributions $\mathrm{vMF}(\mu_0, \kappa_0)$ and $\mathrm{vMF}(\mu_1, \kappa_1)$ on $\mathbb{S}^{d - 1}$, where $d$ is odd (with $d^{\bullet} = \frac{d-1}{2}$, $d^{\diamond} = d^{\bullet} - 1$), is bounded by the following quantity:
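The sketch below does not reproduce the bound of Theorem 2.2. As a related sanity check, it computes the exact KL divergence in the special case $d = 3$ (an odd dimension, as the theorem assumes), where the vMF normalizing constant has the closed form $\kappa / (4\pi \sinh\kappa)$ and the mean of the distribution is $(\coth\kappa - 1/\kappa)\,\mu$. The function names are hypothetical, and the inputs `mu0`, `mu1` are assumed to be unit vectors.

```python
import numpy as np

def log_norm_const_s2(kappa):
    """log of kappa / (4*pi*sinh(kappa)), written to stay stable for large kappa."""
    # log(sinh(k)) = k + log(1 - exp(-2k)) - log(2)
    return (np.log(kappa) - np.log(2.0 * np.pi) - kappa
            - np.log1p(-np.exp(-2.0 * kappa)))

def mean_resultant_length_s2(kappa):
    """E[mu . x] for x ~ vMF(mu, kappa) on the 2-sphere: coth(kappa) - 1/kappa."""
    return 1.0 / np.tanh(kappa) - 1.0 / kappa

def kl_vmf_s2(mu0, kappa0, mu1, kappa1):
    """Exact KL(vMF(mu0, kappa0) || vMF(mu1, kappa1)) on S^2 (d = 3 only)."""
    mu0 = np.asarray(mu0, dtype=float)
    mu1 = np.asarray(mu1, dtype=float)
    a0 = mean_resultant_length_s2(kappa0)
    # E_0[log p0 - log p1] = log C0 - log C1 + kappa0*E[mu0.x] - kappa1*E[mu1.x]
    return (log_norm_const_s2(kappa0) - log_norm_const_s2(kappa1)
            + a0 * (kappa0 - kappa1 * float(mu0 @ mu1)))

# Usage: identical distributions give zero; orthogonal means give a positive value.
north = np.array([0.0, 0.0, 1.0])
east = np.array([1.0, 0.0, 0.0])
print(kl_vmf_s2(north, 2.0, north, 2.0))
print(kl_vmf_s2(north, 2.0, east, 2.0))
```

A closed form like this is only available in low or odd dimensions without Bessel-function evaluations, which is one reason a tractable expression or bound for the KL term matters when vMF distributions appear in a variational objective.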

Figures (13)

  • Figure 1: Architecture of the Kendall Shape Probabilistic U-Net, consisting of a U-Net, a Prior Network, and a Posterior Network. The U-Net is the same as the one in the Probabilistic U-Net. The Prior Network takes an input image and returns its orientation and the parameters of a vMF distribution. The Posterior Network works the same way, but also takes a ground-truth segmentation as input. Grey boxes are vanilla CNN layers and yellow boxes are $SO(m)$-equivariant steerable CNN layers.
  • Figure 2: Top left: input image from the LIDC data. Top right: ground-truth segmentation. Second row: segmentation samples from the original Probabilistic U-Net. Third row: segmentation samples from the Kendall Shape Probabilistic U-Net.
  • Figure 3: Top left: input image 0 from the LIDC data. Top right: ground-truth segmentation. Rows 2-4: segmentation samples from the original Probabilistic U-Net. Rows 5-7: segmentation samples from the Kendall Shape Probabilistic U-Net. Each row shares the same seed.
  • Figure 4: Top left: input image 1 from the LIDC data. Top right: ground-truth segmentation. Rows 2-4: segmentation samples from the original Probabilistic U-Net. Rows 5-7: segmentation samples from the Kendall Shape Probabilistic U-Net. Each row shares the same seed.
  • Figure 5: Top left: input image 2 from the LIDC data. Top right: ground-truth segmentation. Rows 2-4: segmentation samples from the original Probabilistic U-Net. Rows 5-7: segmentation samples from the Kendall Shape Probabilistic U-Net. Each row shares the same seed.
  • ...and 8 more figures
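
The caption of Figure 1 describes the Prior and Posterior Networks as returning the parameters of a vMF distribution, and the segmentation samples in the other figures are drawn with fixed seeds. As an illustration of what drawing such latent samples can look like, here is a minimal sketch of seeded vMF sampling on the 2-sphere using the closed-form inverse CDF that exists for this case. The function name `sample_vmf_s2` is hypothetical, and the restriction to $\mathbb{S}^2$ is an assumption made for simplicity; the latent dimension and sampler used in the paper are not stated here.

```python
import numpy as np

def sample_vmf_s2(mu, kappa, n, rng):
    """Draw n samples from vMF(mu, kappa) on the unit sphere in R^3."""
    mu = np.asarray(mu, dtype=float)
    mu = mu / np.linalg.norm(mu)
    # t = mu . x has density proportional to exp(kappa * t) on [-1, 1];
    # invert its CDF in a form that avoids overflow for large kappa.
    u = rng.random(n)
    t = 1.0 + np.log(u + (1.0 - u) * np.exp(-2.0 * kappa)) / kappa
    # Uniform direction in the plane orthogonal to mu.
    phi = 2.0 * np.pi * rng.random(n)
    r = np.sqrt(np.clip(1.0 - t**2, 0.0, None))
    helper = np.array([1.0, 0.0, 0.0]) if abs(mu[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    e1 = np.cross(mu, helper)
    e1 /= np.linalg.norm(e1)
    e2 = np.cross(mu, e1)
    return (t[:, None] * mu
            + (r * np.cos(phi))[:, None] * e1
            + (r * np.sin(phi))[:, None] * e2)

# Usage: the same seed reproduces the same latent samples.
mu = np.array([0.0, 1.0, 1.0])
a = sample_vmf_s2(mu, 4.0, 5, np.random.default_rng(0))
b = sample_vmf_s2(mu, 4.0, 5, np.random.default_rng(0))
print(np.array_equal(a, b))
```

Passing the random generator explicitly keeps the sampler deterministic for a given seed, which is what allows two models to be compared on matched samples.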

Theorems & Definitions (6)

  • Definition 2.1: Kendall shape space
  • Theorem 2.2: KL divergence between von Mises-Fisher distributions
  • Remark 2.3
  • Remark 2.4
  • Definition 2.5: Steerable feature
  • Remark 3.1