A Study on the Performance of U-Net Modifications in Retroperitoneal Tumor Segmentation

Moein Heidari, Ehsan Khodapanah Aghdam, Alexander Manzella, Daniel Hsu, Rebecca Scalabrino, Wenjin Chen, David J. Foran, Ilker Hacihaliloglu

TL;DR

The paper tackles the challenge of segmenting retroperitoneal tumors in CT scans, where irregular tumor shapes and proximity to critical structures complicate manual delineation. It evaluates a range of U-Net variants that incorporate Vision Transformers, Mamba state-space models, and xLSTM blocks, and it introduces ViLU-Net, a U-Net-style architecture that uses ViL blocks to model long-range dependencies. The study contributes a new in-house retroperitoneal tumor CT dataset, a comprehensive benchmark against state-of-the-art methods, and a ViLU-Net implementation that delivers competitive or superior accuracy with lower computational burden, with the code released as open source. The findings suggest that efficient long-range representations can improve segmentation quality while remaining practical for clinical deployment, potentially accelerating tumor volume estimation and treatment planning.

Abstract

The retroperitoneum hosts a variety of tumors, including rare benign and malignant types, which pose diagnostic and treatment challenges due to their infrequency and proximity to vital structures. Estimating tumor volume is difficult because the tumors have irregular shapes, and manual segmentation is time-consuming. Automatic segmentation using U-Net and its variants, incorporating Vision Transformer (ViT) elements, has shown promising results but carries high computational demands. To address this, architectures like the Mamba State Space Model (SSM) and Extended Long Short-Term Memory (xLSTM) offer efficient solutions by handling long-range dependencies with lower resource consumption. This study evaluates U-Net enhancements, including CNN, ViT, Mamba, and xLSTM, on a new in-house CT dataset and a public organ segmentation dataset. The proposed ViLU-Net model integrates ViL blocks for improved segmentation. Results highlight xLSTM's efficiency in the U-Net framework. The code is publicly accessible on GitHub.
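
To make the volume-estimation motivation concrete, here is a minimal sketch of how a tumor volume could be derived from a predicted binary segmentation mask: count the foreground voxels and multiply by the physical size of one voxel. The function name `mask_volume_ml`, the toy mask, and the voxel spacing are illustrative assumptions and do not come from the paper or its code.

```python
import numpy as np


def mask_volume_ml(mask: np.ndarray, spacing_mm: tuple[float, float, float]) -> float:
    """Return the foreground volume of a binary 3D mask in millilitres.

    `mask` is a 3D array where non-zero voxels are tumor (hypothetical input);
    `spacing_mm` is the voxel size along each axis in millimetres.
    """
    voxel_mm3 = float(np.prod(spacing_mm))      # volume of one voxel in mm^3
    n_voxels = int(np.count_nonzero(mask))      # number of tumor voxels
    return n_voxels * voxel_mm3 / 1000.0        # 1 mL = 1000 mm^3


# Toy example (assumed values): a 4 x 20 x 20 voxel block in a small volume,
# with 5 mm slice thickness and 1 mm x 1 mm in-plane resolution.
mask = np.zeros((10, 64, 64), dtype=np.uint8)
mask[2:6, 20:40, 20:40] = 1
volume = mask_volume_ml(mask, spacing_mm=(5.0, 1.0, 1.0))
print(f"Estimated volume: {volume:.1f} mL")
```

In practice the voxel spacing would be read from the CT image header rather than hard-coded, and the accuracy of the estimate depends directly on the quality of the segmentation mask.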

Paper Structure

This paper contains 12 sections, 2 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: (a) Schematic representation of the proposed method, ViLU-Net, (b) the ViL block, (c) the convolutional stem, and (d) the Up Sampler and Down Sampler blocks, where IN denotes the Instance Normalization operation.
  • Figure 2: Visualized examples of abdominal organ segmentation in CT. ViLU-Net excels at differentiating intricate soft tissues within the abdominal region.
  • Figure 3: Visual comparisons of different methods on our in-house dataset.
  • Figure 4: Sample visualization of our in-house dataset from three different views, along with the corresponding segmentation map.