OmniRad: A Radiological Foundation Model for Multi-Task Medical Image Analysis

Luca Zedda, Andrea Loddo, Cecilia Di Ruberto

TL;DR

OmniRad addresses the need for unified, transferable visual representations in radiology. It pretrains a single radiological encoder with self-supervision on heterogeneous data and reuses it across classification and segmentation tasks, with exploratory tests for image captioning. The approach pairs a radiomics-informed, stable representation with lightweight task-specific adapters and a lightweight segmentation decoder to preserve efficiency. Empirically, OmniRad achieves state-of-the-art or competitive performance on the MedMNIST, MedSegBench, and ROCOv2 benchmarks, with consistent gains especially on anatomically diverse and multi-modal datasets, and shows a qualitatively favorable latent-space structure. The work suggests a practical path toward a unified radiological foundation model that supports multi-task pipelines in real-world clinical settings, reducing task-specific retraining while maintaining robust performance.

Abstract

Radiological analysis increasingly benefits from pretrained visual representations that can support heterogeneous downstream tasks across imaging modalities. In this work, we introduce OmniRad, a self-supervised radiological foundation model pretrained on 1.2 million medical images, designed with radiology-inspired principles emphasizing representation reuse and cross-task transferability. We evaluate the pretrained encoder under multiple downstream adaptation regimes, including lightweight task-specific adapters with a frozen backbone as well as full end-to-end fine-tuning for classification, allowing us to assess both representation quality and task-specific performance. OmniRad is evaluated on a broad suite of public benchmarks spanning classification and segmentation across multiple modalities. On the MedMNISTv2 collection, OmniRad improves classification F1 by up to 2.05% over competing foundation models. For dense prediction, OmniRad attains mean Dice score improvements across six MedSegBench datasets when using frozen representations. Qualitative analyses and latent-space visualizations suggest improved feature clustering and modality-related separation.
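
For readers unfamiliar with the segmentation metric named in the abstract, the Dice score between a predicted mask A and a ground-truth mask B is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch in plain Python, not the authors' evaluation code: the function names `dice_score` and `mean_dice`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are all assumptions made for illustration.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")

    # |A ∩ B|: pixels that are foreground in both masks.
    intersection = sum(p * t for p, t in zip(flat_pred, flat_target))
    # |A| + |B|: total foreground pixels across the two masks.
    total = sum(flat_pred) + sum(flat_target)

    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


def mean_dice(pairs):
    """Average Dice over an iterable of (pred, target) mask pairs."""
    scores = [dice_score(pred, target) for pred, target in pairs]
    return sum(scores) / len(scores)


# Toy example: 2 overlapping foreground pixels, 3 foreground pixels in each mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(dice_score(pred, target))  # 2 * 2 / (3 + 3)
```

A mean Dice reported over several datasets, as in the abstract, would average such per-image or per-dataset scores; the exact aggregation used in the paper is not stated in this summary.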

Paper Structure (22 sections, 5 equations, 4 figures, 12 tables)

Figures (4)

  • Figure 1: Schematic of the proposed OmniRad model. The framework supports multiple tasks by leveraging a shared, unified radiological image foundation model.
  • Figure 2: Overview of the proposed OmniRad framework for radiological analysis tasks. The OmniRad Image Encoder is shown in blue, the classification branch in purple, the segmentation modules in red, and an exploratory captioning branch in yellow. Feature map dimensionalities are highlighted in pink to illustrate the intermediate representations.
  • Figure 3: Segmentation visualization. Colors indicate prediction quality: green = correct prediction, blue = overprediction, red = missed prediction.
  • Figure 4: UMAP visualization of the latent representations learned by OmniRad, RadioDINO, and DINOv3. OmniRad exhibits reduced batch effects and a more structured embedding space.
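
The color coding described for Figure 3 can be reproduced from a predicted mask and a ground-truth mask by classifying each pixel as a true positive, false positive, or false negative. The sketch below is illustrative only and is not taken from the paper: the function name `error_map`, the RGB values, and the choice to leave true-negative (background) pixels black are assumptions.

```python
# Assumed RGB values for the three categories named in the Figure 3 caption.
GREEN = (0, 255, 0)      # correct prediction (foreground in both masks)
BLUE = (0, 0, 255)       # overprediction (predicted foreground, truly background)
RED = (255, 0, 0)        # missed prediction (truly foreground, not predicted)
BACKGROUND = (0, 0, 0)   # assumption: true negatives are left black


def error_map(pred, target):
    """Map binary masks (nested lists of 0/1) to a per-pixel RGB error image."""
    image = []
    for pred_row, target_row in zip(pred, target):
        row = []
        for p, t in zip(pred_row, target_row):
            if p and t:
                row.append(GREEN)
            elif p and not t:
                row.append(BLUE)
            elif t and not p:
                row.append(RED)
            else:
                row.append(BACKGROUND)
        image.append(row)
    return image


# Toy 2x2 example covering all four cases.
pred_mask = [[1, 1],
             [0, 0]]
true_mask = [[1, 0],
             [1, 0]]
overlay = error_map(pred_mask, true_mask)
```

In practice such a map would typically be alpha-blended over the original scan; that step is omitted here to keep the example dependency-free.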