CTS: A Consistency-Based Medical Image Segmentation Model

Kejia Zhang, Lan Zhang, Haiwei Pan, Baolong Yu

TL;DR

CTS addresses slow sampling in diffusion-based medical image segmentation by adopting a consistency-model framework that needs only a single sampling step at inference. The authors design a joint loss that combines a consistency training term with a multi-scale feature supervision mechanism, integrated via channel attention in a UNet backbone, so the model can be trained end to end. Empirical results on MRI brain tumor and ultrasound thyroid/liver tumor data show that CTS surpasses several baselines, with a single sampling time of around 1.9s and faster convergence; advanced variants further improve accuracy. The work reduces inference time while improving segmentation quality, suggesting that consistency models are a viable alternative for medical image segmentation.
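
To make the "feature supervision via channel attention" idea concrete, here is a minimal NumPy sketch of one plausible fusion step. It assumes a squeeze-and-excitation style gate; the paper's actual module may differ. The names `channel_attention` and `fuse_supervision`, the weights `w1` and `w2`, and all shapes are hypothetical and chosen only for illustration.

```python
import numpy as np

def channel_attention(features, w1, w2):
    """SE-style channel gate (an assumption, not the paper's confirmed design).

    features: array of shape (C, H, W)
    w1: (C_hidden, C) squeeze weights, w2: (C, C_hidden) excite weights
    Returns one weight in (0, 1) per channel.
    """
    squeezed = features.mean(axis=(1, 2))        # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)      # ReLU bottleneck -> (C_hidden,)
    return 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid -> (C,)

def fuse_supervision(backbone_feat, supervision_feat, w1, w2):
    """Overlay a supervision feature map onto a backbone feature map.

    The supervision features are re-weighted per channel and added to the
    backbone features at the same scale.
    """
    gate = channel_attention(supervision_feat, w1, w2)
    return backbone_feat + gate[:, None, None] * supervision_feat

# Toy example at a single scale, with random (untrained) weights.
rng = np.random.default_rng(0)
C, H, W = 8, 16, 16
w1 = rng.normal(size=(C // 2, C))
w2 = rng.normal(size=(C, C // 2))
backbone = rng.normal(size=(C, H, W))
supervision = rng.normal(size=(C, H, W))

gate = channel_attention(supervision, w1, w2)
fused = fuse_supervision(backbone, supervision, w1, w2)
print(fused.shape)
```

In a multi-scale setting, the same fusion would be repeated at each UNet resolution with its own weights; how CTS weights the per-scale terms in the joint loss is not specified in this summary.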

Abstract

In medical image segmentation tasks, diffusion models have shown significant potential. However, mainstream diffusion models require many sampling steps, which makes prediction slow. Recently, consistency models, a standalone family of generative networks, have addressed this issue. Compared with diffusion models, consistency models can reduce sampling to a single step, achieving similar generative quality while significantly speeding up training and prediction. However, they are not directly suited to image segmentation tasks, and their application in the medical imaging field has not yet been explored. Therefore, this paper applies the consistency model to medical image segmentation tasks, designing multi-scale feature signal supervision modes and loss function guidance to achieve model convergence. Experiments verify that the CTS model obtains better medical image segmentation results with a single sampling step during the test phase.
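
The single-step sampling described above can be sketched as follows. This is a toy illustration that assumes the standard consistency-model parameterization, f(x, t) = c_skip(t) x + c_out(t) F(x, t), with the commonly used constants sigma_data = 0.5, eps = 0.002 and T = 80; whether CTS uses these exact values is an assumption. `placeholder_network` is a hypothetical stand-in for the trained, image-conditioned UNet.

```python
import numpy as np

# Constants from the usual consistency-model setup (assumed here, not
# confirmed for CTS).
SIGMA_DATA, EPS, T_MAX = 0.5, 0.002, 80.0

def c_skip(t):
    # Equals 1 at t = EPS, so the input passes through unchanged there.
    return SIGMA_DATA**2 / ((t - EPS) ** 2 + SIGMA_DATA**2)

def c_out(t):
    # Equals 0 at t = EPS, so the network output is ignored there.
    return SIGMA_DATA * (t - EPS) / np.sqrt(SIGMA_DATA**2 + t**2)

def placeholder_network(x_t, t, image):
    # Hypothetical stand-in for the trained UNet conditioned on the image.
    return np.tanh(image)

def consistency_fn(x_t, t, image):
    """Map a noisy mask x_t at noise level t directly to a clean estimate."""
    return c_skip(t) * x_t + c_out(t) * placeholder_network(x_t, t, image)

def sample_mask_one_step(image, rng):
    """One network evaluation: draw noise at T_MAX and map it to a mask."""
    x_T = T_MAX * rng.standard_normal(image.shape)
    return consistency_fn(x_T, T_MAX, image)

rng = np.random.default_rng(0)
image = rng.standard_normal((1, 32, 32))
mask_estimate = sample_mask_one_step(image, rng)
print(mask_estimate.shape)
```

A diffusion sampler would instead call the network once per denoising step, typically tens to hundreds of times per image, which is the cost the single-step formulation avoids.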

Paper Structure (5 sections, 4 figures, 2 tables, 1 algorithm)

This paper contains 5 sections, 4 figures, 2 tables, 1 algorithm.

Figures (4)

  • Figure 1: Overall flowchart of the CTS model. (a) Input of the multi-scale feature supervision signals. (b) Overlay of the feature supervision signals through the channel attention mechanism.
  • Figure 2: Result Visualization
  • Figure 3: Models saved at different training stages, their training loss, and the corresponding results on the test set.
  • Figure 4: Multi-scale feature signals accelerate the convergence of the model