MultiTalent: A Multi-Dataset Approach to Medical Image Segmentation

Constantin Ulrich, Fabian Isensee, Tassilo Wald, Maximilian Zenk, Michael Baumgartner, Klaus H. Maier-Hein

TL;DR

MultiTalent tackles the fragmentation of publicly available medical imaging data by training a single foundation segmentation model across 13 partially labeled abdominal CT datasets with conflicting annotations. It introduces decoupled per-class heads with Sigmoid outputs and a dataset-adaptive loss to preserve diverse label definitions while allowing overlapping structures. The approach delivers consistent improvements over single-dataset baselines and prior multi-dataset methods, accelerates training and inference, and provides strong transfer-learning benefits, even with substantially fewer annotations. Overall, MultiTalent enables holistic, multi-dataset pre-training and practical, scalable deployment of robust segmentation models for clinical imaging.
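One way to picture the combination of per-class sigmoid outputs and a dataset-adaptive loss is a binary cross-entropy that is evaluated only on the classes annotated in a sample's source dataset. The sketch below is a minimal illustration under that assumption, not the authors' implementation: the function name `masked_bce`, the three example classes, and the per-voxel formulation are hypothetical, and the actual loss may contain further terms.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def masked_bce(logits, targets, annotated):
    """Mean binary cross-entropy over annotated classes only.

    logits, targets: one value per class for a single voxel.
    annotated: True where the class is labeled in the voxel's source dataset.
    """
    total, count = 0.0, 0
    for z, y, keep in zip(logits, targets, annotated):
        if not keep:
            continue  # class not annotated in this dataset: contributes no loss
        p = sigmoid(z)  # independent probability per class, so classes may overlap
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
        count += 1
    return total / count if count else 0.0

# Hypothetical voxel with three classes: liver, aorta, heart.
logits = [2.0, -1.0, 3.0]
targets = [1, 0, 0]
annotated = [True, True, False]  # assumption: the heart is unlabeled in this dataset
loss = masked_bce(logits, targets, annotated)
```

Because each class has its own sigmoid output rather than sharing a softmax, a high heart score does not suppress the aorta score, and masking the heart term means an unlabeled structure is never penalized as background.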

Abstract

The medical imaging community generates a wealth of datasets, many of which are openly accessible and annotated for specific diseases and tasks such as multi-organ or lesion segmentation. Current practices continue to limit model training and supervised pre-training to one or a few similar datasets, neglecting the synergistic potential of other available annotated data. We propose MultiTalent, a method that leverages multiple CT datasets with diverse and conflicting class definitions to train a single model for comprehensive structure segmentation. Our results demonstrate systematically improved segmentation performance compared to previous related approaches and to single-dataset training with state-of-the-art methods, especially for lesion segmentation and other challenging structures. We show that MultiTalent also represents a powerful foundation model that offers superior pre-training for various segmentation tasks compared to commonly used supervised or unsupervised pre-training baselines. Our findings offer a new direction for the medical imaging community to effectively utilize the wealth of available data for improved segmentation performance. The code and model weights will be published here: [tba]

Paper Structure (12 sections, 2 equations, 3 figures, 4 tables)

Figures (3)

  • Figure 1: (a) Usually only a few classes are annotated in publicly available datasets. (b) Different ground-truth label properties can generate contradicting class predictions. For example, the heart annotation of dataset 11 differs from the heart annotation of dataset 10, which causes the aorta of dataset 11 to overlap with the heart of dataset 10. In contrast to dataset 11, in dataset 7 the aorta is also annotated in the lower abdomen. (c) Instead of training one network for each dataset, we introduce a method to train one network with all datasets, while retaining dataset-specific annotation protocols.
  • Figure 2: Dice scores for all datasets, all classes, and classes of special interest. Note that each point within a boxplot corresponds to a different task. Difficult classes are those for which the default nnU-Net has a Dice score below 75. The same color indicates the same architecture, and the pattern indicates training with multiple datasets using MultiTalent. The mean Dice scores are annotated in the figure.
  • Figure 3: The circle area corresponds to each dataset size and the color indicates the number of annotated classes. MultiTalent was trained on 1460 images with about 3600 annotations, whereas the TotalSegmentator dataset consists of 1204 images with about $10^5$ annotations.