Med3D: Transfer Learning for 3D Medical Image Analysis

Sihong Chen, Kai Ma, Yefeng Zheng

TL;DR

The paper tackles the limited scale of 3D medical imaging data by assembling the 3DSeg-8 dataset and training Med3D, a heterogeneous 3D backbone with multi-branch decoders, to learn universal features across domains. The learned encoder is transferred to downstream tasks (lung segmentation, nodule classification, and LiTS liver segmentation), where it converges faster and reaches higher accuracy than models pretrained on natural videos or trained from scratch. Key contributions include the multi-domain 3D pretraining framework, the eight-decoder Med3D architecture, and strong transfer performance on diverse tasks and the LiTS benchmark. The work offers a practical pathway to improve 3D medical AI through more effective pretraining and broader accessibility via released models and code.

Abstract

The performance of deep learning is significantly affected by the volume of training data. Models pre-trained on massive datasets such as ImageNet have become a powerful tool for speeding up training convergence and improving accuracy. Similarly, models based on large datasets are important for the development of deep learning in 3D medical imaging. However, it is extremely challenging to build a sufficiently large dataset because of the difficulty of data acquisition and annotation in 3D medical imaging. We aggregate datasets from several medical challenges to build the 3DSeg-8 dataset, which covers diverse modalities, target organs, and pathologies. To extract general three-dimensional (3D) medical features, we design a heterogeneous 3D network called Med3D to co-train on the multi-domain 3DSeg-8 dataset and produce a series of pre-trained models. We transfer the Med3D pre-trained models to lung segmentation on the LIDC dataset, pulmonary nodule classification on the LIDC dataset, and liver segmentation on the LiTS challenge. Experiments show that Med3D accelerates the training convergence of target 3D medical tasks by a factor of 2 compared with a model pre-trained on the Kinetics dataset and by a factor of 10 compared with training from scratch, and improves accuracy by 3% to 20%. Transferring our Med3D model to the state-of-the-art DenseASPP segmentation network, we achieve, with a single model, a 94.6% Dice coefficient, which approaches the results of the top-ranked algorithms on the LiTS challenge.
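
The abstract reports segmentation quality as a Dice coefficient. For readers unfamiliar with the metric, the following is a minimal sketch of the standard definition, Dice = 2|A ∩ B| / (|A| + |B|), applied to binary 3D masks. It is not taken from the paper's code: the function names `flatten` and `dice_coefficient`, the nested-list volume representation, and the convention of returning 1.0 when both masks are empty are all assumptions made for illustration.

```python
def flatten(volume):
    """Yield the voxels of a nested list/tuple volume in order (hypothetical helper)."""
    for item in volume:
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield item


def dice_coefficient(pred, target):
    """Dice overlap between two binary masks of identical shape.

    Both masks are nested lists of 0/1 voxels. Returns a value in [0, 1];
    two empty masks are treated as a perfect match (an assumed convention).
    """
    p = list(flatten(pred))
    t = list(flatten(target))
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of voxels")

    # Voxels marked foreground in both masks.
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    # Total foreground voxels across the two masks.
    total = sum(1 for a in p if a) + sum(1 for b in t if b)

    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy 2x2x2 volumes: each mask has 3 foreground voxels, 2 of them shared.
prediction = [[[1, 1], [0, 0]], [[1, 0], [0, 0]]]
ground_truth = [[[1, 1], [0, 0]], [[0, 0], [0, 1]]]
score = dice_coefficient(prediction, ground_truth)  # 2 * 2 / (3 + 3)
```

On this toy example the score is 4/6, roughly 0.667. A reported value of 94.6% corresponds to a score of 0.946 on the same scale, presumably aggregated over the evaluation cases by the benchmark's own protocol.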

Paper Structure

This paper contains 12 sections, 3 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: Visualization of the segmentation results of our approach vs. the comparison ones after the same training epochs.
  • Figure 2: Framework of the proposed method.
  • Figure 3: Framework of the liver segmentation.
  • Figure 4: Med3D trained on randomly sampled 10%, 20%, 40%, 80%, and 100% subsets of the training data.
  • Figure 5: Training curve for lung segmentation.
  • ...and 1 more figure