A Hierarchical Probabilistic U-Net for Modeling Multi-Scale Ambiguities

Simon A. A. Kohl; Bernardino Romera-Paredes; Klaus H. Maier-Hein; Danilo Jimenez Rezende; S. M. Ali Eslami; Pushmeet Kohli; Andrew Zisserman; Olaf Ronneberger

TL;DR

The paper tackles ambiguous, multi-scale interpretations in segmentation by introducing the Hierarchical Probabilistic U-Net (HPU-Net), a segmentation model that combines a conditional variational auto-encoder with a multi-scale latent hierarchy injected into the decoder. This structure allows sampling of diverse, high-fidelity segmentations that capture both global and local variations, and it extends to complex structured outputs such as instance segmentation. The authors report improved distribution fidelity and reconstruction on LIDC-IDRI, SNEMI3D, and Cityscapes, along with extrapolation capabilities and coherent multi-object segmentations. The work highlights the potential for uncertainty-aware, interactive segmentation in medical and natural images and suggests broader applicability to spatio-temporal prediction tasks.

Abstract

Medical imaging only indirectly measures the molecular identity of the tissue within each voxel, which often produces only ambiguous image evidence for target measures of interest, like semantic segmentation. This diversity and the variations of plausible interpretations are often specific to given image regions and may thus manifest on various scales, spanning all the way from the pixel to the image level. In order to learn a flexible distribution that can account for multiple scales of variations, we propose the Hierarchical Probabilistic U-Net, a segmentation network with a conditional variational auto-encoder (cVAE) that uses a hierarchical latent space decomposition. We show that this model formulation enables sampling and reconstruction of segmentations with high fidelity, i.e. with finely resolved detail, while providing the flexibility to learn complex structured distributions across scales. We demonstrate these abilities on the task of segmenting ambiguous medical scans as well as on instance segmentation of neurobiological and natural images. Our model automatically separates independent factors across scales, an inductive bias that we deem beneficial in structured output prediction tasks beyond segmentation.
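
The idea of a hierarchical latent space sampled across scales can be illustrated with a toy sketch. This is not the authors' architecture: the function names (`upsample2x`, `sample_hierarchy`), the fixed standard deviation, and the channel-mean "prior" are all assumptions standing in for learned convolutional layers. It only shows the coarse-to-fine pattern, in which each scale draws a spatial latent conditioned on image features plus the upsampled latents from all coarser scales.

```python
import numpy as np

def upsample2x(a):
    """Nearest-neighbour upsampling of an (H, W, C) array by a factor of 2."""
    return a.repeat(2, axis=0).repeat(2, axis=1)

def sample_hierarchy(features, rng, sigma=0.1):
    """Draw one spatial latent map per scale, coarsest first.

    features: list of (H_i, W_i, C) arrays, coarsest first; each scale is
              assumed to have twice the resolution of the previous one.
    Returns a list of (H_i, W_i, 1) latent maps.
    """
    latents = []
    carried = None  # latents from coarser scales, upsampled to the current scale
    for f in features:
        # Condition on the features and on every coarser latent drawn so far.
        inp = f if carried is None else np.concatenate([f, carried], axis=-1)
        # Placeholder for a learned prior: mean = channel average, fixed std.
        mu = inp.mean(axis=-1, keepdims=True)
        z = mu + sigma * rng.standard_normal(mu.shape)
        latents.append(z)
        stacked = z if carried is None else np.concatenate([z, carried], axis=-1)
        carried = upsample2x(stacked)
    return latents

# Usage: three scales (4x4, 8x8, 16x16) of hypothetical 3-channel features.
rng = np.random.default_rng(0)
features = [np.zeros((4 * 2**i, 4 * 2**i, 3)) for i in range(3)]
latents = sample_hierarchy(features, rng)
shapes = [z.shape for z in latents]
```

Because the coarse latents are sampled first and fed into every finer scale, a change in a coarse latent can alter the whole output, while a fine latent only perturbs local detail. That is the intuition behind modeling global and local variations separately.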
