Hough-CNN: Deep Learning for Segmentation of Deep Brain Regions in MRI and Ultrasound
Fausto Milletari, Seyed-Ahmad Ahmadi, Christine Kroll, Annika Plate, Verena Rozanski, Juliana Maiostre, Johannes Levin, Olaf Dietrich, Birgit Ertl-Wagner, Kai Bötzel, Nassir Navab
TL;DR
This work introduces Hough-CNN, a segmentation framework that combines CNN-based voxel classification with a generalized Hough voting scheme to localize and segment multiple deep brain structures in MRI and transcranial ultrasound. By pairing intermediate CNN features with a database of segmentation patches, the method achieves robust, registration-free, multi-region segmentation across 2D, 2.5D, and 3D inputs, often with far fewer training patches than voxel-wise approaches require. A study across six architectures and two modalities shows that Hough-CNN outperforms standard voxel-wise segmentation, with 3D data improving MRI performance and deeper networks providing gains in ultrasound. The approach offers practical, scalable segmentation suitable for clinical workflows, with potential for transfer learning and broader analyses of disease parameters.
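The voting idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `hough_vote`, the brute-force nearest-neighbour search, the choice of `k`, and the inverse-distance vote weight are all assumptions made for illustration. Each voxel classified as foreground looks up its closest descriptors in a training database and casts votes for the anatomy centroid using the displacements stored with them.

```python
import numpy as np

def hough_vote(features, positions, db_features, db_offsets, volume_shape, k=3):
    """Accumulate centroid votes from foreground voxels (illustrative sketch).

    features:    (N, D) descriptors of voxels classified as foreground
    positions:   (N, 3) integer coordinates of those voxels
    db_features: (M, D) descriptors stored at training time
    db_offsets:  (M, 3) integer displacements from training voxel to centroid
    """
    votes = np.zeros(volume_shape, dtype=np.float64)
    upper = np.array(volume_shape)
    # Squared Euclidean distance between every query and database descriptor.
    d2 = ((features[:, None, :] - db_features[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k closest database entries for each query voxel.
    nearest = np.argsort(d2, axis=1)[:, :k]
    for i in range(features.shape[0]):
        for j in nearest[i]:
            target = positions[i] + db_offsets[j]
            # Discard votes that land outside the volume.
            if np.all(target >= 0) and np.all(target < upper):
                # Assumed weighting: closer descriptor matches vote more strongly.
                votes[tuple(target)] += 1.0 / (1.0 + d2[i, j])
    # The strongest peak in the vote map is taken as the centroid estimate.
    centroid = np.unravel_index(np.argmax(votes), volume_shape)
    return votes, centroid
```

In a full pipeline, the segmentation patches associated with the votes that agree with the peak would then be projected back into the volume to form the contour; that step is omitted here.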
Abstract
In this work we propose a novel approach to perform segmentation by leveraging the abstraction capabilities of convolutional neural networks (CNNs). Our method is based on Hough voting, a strategy that allows for fully automatic localisation and segmentation of the anatomies of interest. This approach not only uses the CNN classification outcomes, but also implements voting by exploiting the features produced by the deepest portion of the network. We show that this learning-based segmentation method is robust, multi-region, flexible and can be easily adapted to different modalities. To show the capabilities and the behaviour of CNNs when they are applied to medical image analysis, we perform a systematic study of the performance of six different network architectures, conceived according to state-of-the-art criteria, in various situations. We evaluate the impact of both different amounts of training data and different data dimensionality (2D, 2.5D and 3D) on the final results. We show results on both MRI and transcranial US volumes depicting, respectively, 26 regions of the basal ganglia and the midbrain.
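To make the three data dimensionalities concrete, the sketch below extracts a 2D, 2.5D, or 3D patch around one voxel. It rests on assumptions not stated in the text above: 2.5D is read as three orthogonal planes through the voxel (a common convention), the patch size is odd, and the centre lies far enough from the volume border that no padding is needed. The function name `extract_patch` is hypothetical.

```python
import numpy as np

def extract_patch(volume, center, size, mode):
    """Return a patch of odd side length `size` centred on `center` (z, y, x)."""
    z, y, x = center
    half = size // 2

    def span(c):
        # Symmetric window of `size` voxels around coordinate c.
        return slice(c - half, c + half + 1)

    if mode == "2D":
        # Single axial slice: shape (size, size).
        return volume[z, span(y), span(x)]
    if mode == "2.5D":
        # Three orthogonal planes stacked as channels: shape (3, size, size).
        return np.stack([
            volume[z, span(y), span(x)],
            volume[span(z), y, span(x)],
            volume[span(z), span(y), x],
        ])
    if mode == "3D":
        # Full cubic neighbourhood: shape (size, size, size).
        return volume[span(z), span(y), span(x)]
    raise ValueError(f"unknown mode: {mode}")
```

For a patch of side 5, the three modes hand the network 25, 75, and 125 intensity values per voxel, which is one way to see why the 3D setting is the most expensive in both memory and training data.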
