Primitive Geometry Segment Pre-training for 3D Medical Image Segmentation

Ryu Tadokoro, Ryosuke Yamada, Kodai Nakashima, Ryo Nakamura, Hirokatsu Kataoka

TL;DR

The paper tackles data scarcity and privacy concerns in 3D medical image segmentation by introducing PrimGeoSeg, a pre-training method that uses synthetic primitive geometric objects to learn 3D semantic features without real data. It constructs pre-training data through a two-step process: generating primitive objects from independent xy-plane shapes and a z-axis similarity rule, then assembling multiple objects with controlled overlap into a 3D volume to form training pairs for segmentation. Across BTCV, MSD, and BraTS, PrimGeoSeg improves downstream Dice scores over training from scratch and is competitive with or superior to state-of-the-art SSL when the amount of pre-training data is equal, demonstrating strong data efficiency and privacy benefits. Ablations show that volumetric shapes, higher class diversity, instance augmentation, overlap, and scalable data generation all contribute to the observed gains, underscoring the method's potential for broader 3D medical imaging tasks.
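The two-step construction described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' generator: the superellipse xy-plane shapes, the three z-axis scaling rules in `Z_RULES`, the intensity range, and the `max_overlap` threshold are all hypothetical choices made for this example.

```python
import numpy as np

# Hypothetical z-axis rules: how the xy-plane shape is scaled slice by slice.
Z_RULES = {
    "prism": lambda t: np.ones_like(t),
    "bicone": lambda t: 1.0 - np.abs(t),
    "ellipsoid": lambda t: np.sqrt(1.0 - t**2),
}

def make_primitive(size, exponent, z_rule):
    """Step 1: build one primitive object as a boolean mask indexed [z, y, x].

    The xy-plane shape is a superellipse |x|^p + |y|^p <= s^p (an assumption);
    the z rule sets the scale s of each slice.
    """
    t = np.linspace(-1.0, 1.0, size)
    scale = Z_RULES[z_rule](t)                      # one scale per z slice
    y, x = np.meshgrid(t, t, indexing="ij")
    base = np.abs(x) ** exponent + np.abs(y) ** exponent
    return base[None, :, :] <= scale[:, None, None] ** exponent

def assemble(volume_size, n_objects, max_overlap, rng, max_tries=100):
    """Step 2: place several primitives in one volume, limiting overlap.

    Returns an (image, label) training pair; label 0 is background.
    """
    image = np.zeros((volume_size,) * 3, dtype=np.float32)
    label = np.zeros((volume_size,) * 3, dtype=np.int64)
    # A class is one (xy-plane shape, z rule) combination (assumed definition).
    classes = [(p, r) for p in (1.0, 2.0, 4.0) for r in Z_RULES]
    placed = 0
    for _ in range(max_tries):
        if placed == n_objects:
            break
        cls = int(rng.integers(len(classes)))
        size = int(rng.integers(volume_size // 4, volume_size // 2))
        mask = make_primitive(size, *classes[cls])
        oz, oy, ox = rng.integers(0, volume_size - size + 1, size=3)
        region = (slice(oz, oz + size), slice(oy, oy + size), slice(ox, ox + size))
        # Fraction of the new object that would cover already-labelled voxels.
        overlap = (mask & (label[region] > 0)).sum() / max(mask.sum(), 1)
        if overlap > max_overlap:
            continue                                # reject this placement
        label[region][mask] = cls + 1               # writes through the view
        image[region][mask] = rng.uniform(0.3, 1.0)
        placed += 1
    return image, label

rng = np.random.default_rng(0)
image, label = assemble(volume_size=32, n_objects=4, max_overlap=0.2, rng=rng)
```

Because both the volume and its label map are produced by the same procedure, arbitrarily many annotated pairs can be generated without any real scans, which is the property the pre-training relies on.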

Abstract

The construction of 3D medical image datasets presents several issues, including requiring significant financial costs in data collection and specialized expertise for annotation, as well as strict privacy concerns for patient confidentiality compared to natural image datasets. Therefore, it has become a pressing issue in 3D medical image segmentation to enable data-efficient learning with limited 3D medical data and supervision. A promising approach is pre-training, but improving its performance in 3D medical image segmentation is difficult due to the small size of existing 3D medical image datasets. We thus present the Primitive Geometry Segment Pre-training (PrimGeoSeg) method to enable the learning of 3D semantic features by pre-training segmentation tasks using only primitive geometric objects for 3D medical image segmentation. PrimGeoSeg performs more accurate and efficient 3D medical image segmentation without manual data collection and annotation. Further, experimental results show that PrimGeoSeg on SwinUNETR improves performance over learning from scratch on BTCV, MSD (Task06), and BraTS datasets by 3.7%, 4.4%, and 0.3%, respectively. Remarkably, the performance was equal to or better than state-of-the-art self-supervised learning despite the equal number of pre-training data. From experimental results, we conclude that effective pre-training can be achieved by looking at primitive geometric objects only. Code and dataset are available at https://github.com/SUPER-TADORY/PrimGeoSeg.
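The downstream improvements are reported as Dice scores. For reference, the standard per-class Dice coefficient, 2|A ∩ B| / (|A| + |B|), can be computed as in the minimal sketch below; the function name and the choice to skip classes absent from both volumes are assumptions of this example, not details taken from the paper's evaluation code.

```python
import numpy as np

def dice_per_class(pred, target, num_classes):
    """Return {class_id: Dice} for foreground classes 1..num_classes."""
    scores = {}
    for c in range(1, num_classes + 1):
        p, t = pred == c, target == c
        denom = p.sum() + t.sum()
        if denom == 0:
            continue  # class absent from both prediction and ground truth
        scores[c] = 2.0 * (p & t).sum() / denom
    return scores

# Tiny toy label maps; real inputs would be 3D integer volumes.
pred = np.array([[1, 1, 0], [2, 2, 0]])
target = np.array([[1, 0, 0], [2, 2, 2]])
scores = dice_per_class(pred, target, num_classes=2)
```

A mean over the per-class values gives a single summary score per case.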

Paper Structure (11 sections, 3 equations, 6 figures, 2 tables)

Figures (6)

  • Figure 1: The overview of PrimGeoSeg.
  • Figure 2: The generation process of assembled objects as pre-training data for PrimGeoSeg. An assembled object is generated by randomly arranging multiple primitive objects, each produced from the individual $xy$-plane and $z$-axis rules.
  • Figure 3: The details of the arrangement of primitive objects.
  • Figure 4: Experiment (a).
  • Figure 5: Qualitative results on BTCV. Red dashes indicate misidentified areas and blue dashes indicate more accurately identified areas.
  • ...and 1 more figure