Table of Contents
Fetching ...

Deep Gaussian mixture model for unsupervised image segmentation

Matthias Schwab, Agnes Mayr, Markus Haltmeier

TL;DR

This work tackles unsupervised image segmentation by integrating Gaussian mixture modeling with deep learning. The authors replace the EM E-step with gradient-based updates of a CNN (a U-Net) to predict per-pixel class responsibilities, yielding two methods: deepG (for GMM) and deepSVG (for SVGMM) that embed spatial structure via a deep image prior. They demonstrate that these CNN-based updates produce spatially coherent segmentations and, when trained on multiple images with a data-derived mean regularization, achieve state-of-the-art Dice scores in multi-sequence cardiac MRI tissue segmentation. The approach offers fast per-image inference after training and is flexible to incorporate additional regularization or semi-supervised extensions. Overall, the DeepGMM framework improves unsupervised segmentation in medically relevant imaging tasks and shows strong generalization potential across images with rich, multi-modal data.

Abstract

The recent emergence of deep learning has led to a great deal of work on designing supervised deep semantic segmentation algorithms. As in many tasks sufficient pixel-level labels are very difficult to obtain, we propose a method which combines a Gaussian mixture model (GMM) with unsupervised deep learning techniques. In the standard GMM the pixel values with each sub-region are modelled by a Gaussian distribution. In order to identify the different regions, the parameter vector that minimizes the negative log-likelihood (NLL) function regarding the GMM has to be approximated. For this task, usually iterative optimization methods such as the expectation-maximization (EM) algorithm are used. In this paper, we propose to estimate these parameters directly from the image using a convolutional neural network (CNN). We thus change the iterative procedure in the EM algorithm replacing the expectation-step by a gradient-step with regard to the networks parameters. This means that the network is trained to minimize the NLL function of the GMM which comes with at least two advantages. As once trained, the network is able to predict label probabilities very quickly compared with time consuming iterative optimization methods. Secondly, due to the deep image prior our method is able to partially overcome one of the main disadvantages of GMM, which is not taking into account correlation between neighboring pixels, as it assumes independence between them. We demonstrate the advantages of our method in various experiments on the example of myocardial infarct segmentation on multi-sequence MRI images.

Deep Gaussian mixture model for unsupervised image segmentation

TL;DR

This work tackles unsupervised image segmentation by integrating Gaussian mixture modeling with deep learning. The authors replace the EM E-step with gradient-based updates of a CNN (a U-Net) to predict per-pixel class responsibilities, yielding two methods: deepG (for GMM) and deepSVG (for SVGMM) that embed spatial structure via a deep image prior. They demonstrate that these CNN-based updates produce spatially coherent segmentations and, when trained on multiple images with a data-derived mean regularization, achieve state-of-the-art Dice scores in multi-sequence cardiac MRI tissue segmentation. The approach offers fast per-image inference after training and is flexible to incorporate additional regularization or semi-supervised extensions. Overall, the DeepGMM framework improves unsupervised segmentation in medically relevant imaging tasks and shows strong generalization potential across images with rich, multi-modal data.

Abstract

The recent emergence of deep learning has led to a great deal of work on designing supervised deep semantic segmentation algorithms. As in many tasks sufficient pixel-level labels are very difficult to obtain, we propose a method which combines a Gaussian mixture model (GMM) with unsupervised deep learning techniques. In the standard GMM the pixel values with each sub-region are modelled by a Gaussian distribution. In order to identify the different regions, the parameter vector that minimizes the negative log-likelihood (NLL) function regarding the GMM has to be approximated. For this task, usually iterative optimization methods such as the expectation-maximization (EM) algorithm are used. In this paper, we propose to estimate these parameters directly from the image using a convolutional neural network (CNN). We thus change the iterative procedure in the EM algorithm replacing the expectation-step by a gradient-step with regard to the networks parameters. This means that the network is trained to minimize the NLL function of the GMM which comes with at least two advantages. As once trained, the network is able to predict label probabilities very quickly compared with time consuming iterative optimization methods. Secondly, due to the deep image prior our method is able to partially overcome one of the main disadvantages of GMM, which is not taking into account correlation between neighboring pixels, as it assumes independence between them. We demonstrate the advantages of our method in various experiments on the example of myocardial infarct segmentation on multi-sequence MRI images.
Paper Structure (11 sections, 16 equations, 3 figures, 2 tables, 4 algorithms)

This paper contains 11 sections, 16 equations, 3 figures, 2 tables, 4 algorithms.

Figures (3)

  • Figure 1: Visualisation of myocardial pathology segmentation by combining three sequences of CMR images acquired from the same patient. The bSSFP sequence is used to identify myocardial borders. Edema is marked in the T2-weighted CMR image and scar is quantified in the LGE image.
  • Figure 2: Example of a three-sequence CMR image with according ground truth segmentation (top) and segmentation masks produced by the different methods (bottom). The CNN-based methods (deepG and deepSVG) exhibit slightly higher values for the NLLs, but their segmentations are smoother resulting in better Dice scores.
  • Figure 3: Example of an image in the test dataset. Including mean estimation $\boldsymbol{\mu}^{\text{data}}$ into the methods improved segmentation performance.