RedDino: A foundation model for red blood cell analysis

Luca Zedda, Andrea Loddo, Cecilia Di Ruberto, Carsten Marr

TL;DR

The paper addresses the lack of foundation models for RBC analysis by introducing RedDino, a self-supervised, RBC-tailored adaptation of DINOv2 trained on a curated dataset of 1.25 million RBC images from diverse sources and acquisition modalities. It reports state-of-the-art performance in RBC shape classification and strong generalization across data sources and imaging modalities, achieved through patch-based training, removal of the KoLeo regularizer, and Sinkhorn-Knopp centering. Ablations and cross-source evaluations show RedDino's robustness to batch effects and its efficiency-accuracy trade-offs across backbone sizes. The authors provide open-source code and pretrained models to enable broader adoption in computational hematology and clinical diagnostics.
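To make the Sinkhorn-Knopp centering mentioned above more concrete, the following is a minimal pure-Python sketch of the generic Sinkhorn-Knopp normalization as it is commonly used in self-supervised training (alternately balancing prototype and sample mass). It is not RedDino's code: the function name, the `temperature` and `n_iters` defaults, and the list-of-lists input layout are all assumptions made for illustration.

```python
import math

def sinkhorn_knopp(scores, temperature=0.05, n_iters=3):
    """Turn raw prototype scores into balanced soft assignments.

    scores: list of B rows (one per sample), each holding K prototype scores.
    Returns a B x K list of lists in which every row sums to 1.
    Hypothetical helper for illustration; defaults are assumptions.
    """
    n_samples = len(scores)
    n_protos = len(scores[0])
    # Subtract the global maximum before exponentiating for numerical stability.
    max_score = max(max(row) for row in scores)
    q = [[math.exp((s - max_score) / temperature) for s in row] for row in scores]
    total = sum(sum(row) for row in q)
    q = [[v / total for v in row] for row in q]
    for _ in range(n_iters):
        # Give every prototype (column) the same total mass, 1 / K.
        col_sums = [sum(q[i][k] for i in range(n_samples)) for k in range(n_protos)]
        q = [[q[i][k] / (col_sums[k] * n_protos) for k in range(n_protos)]
             for i in range(n_samples)]
        # Give every sample (row) the same total mass, 1 / B.
        row_sums = [sum(row) for row in q]
        q = [[v / (row_sums[i] * n_samples) for v in row] for i, row in enumerate(q)]
    # Rescale so that each sample's assignment is a probability distribution.
    return [[v * n_samples for v in row] for row in q]
```

Unlike a plain softmax, the alternating normalization pushes the batch toward using all prototypes equally, which is the usual motivation for this kind of centering. With very low temperatures and widely spread scores, a column can underflow to zero, so a production implementation would need additional safeguards.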

Abstract

Red blood cells (RBCs) are essential to human health, and their precise morphological analysis is important for diagnosing hematological disorders. Despite the promise of foundation models in medical diagnostics, comprehensive AI solutions for RBC analysis remain scarce. We present RedDino, a self-supervised foundation model designed for RBC image analysis. RedDino uses an RBC-specific adaptation of the DINOv2 self-supervised learning framework and is trained on a curated dataset of 1.25 million RBC images from diverse acquisition modalities and sources. Extensive evaluations show that RedDino outperforms existing state-of-the-art models on RBC shape classification. Through assessments including linear probing and nearest neighbor classification, we confirm its strong feature representations and generalization ability. Our main contributions are: (1) a foundation model tailored for RBC analysis, (2) ablation studies exploring DINOv2 configurations for RBC modeling, and (3) a detailed evaluation of generalization performance. RedDino addresses key challenges in computational hematology by capturing nuanced morphological features, advancing the development of reliable diagnostic tools. The source code and pretrained models for RedDino are available at https://github.com/Snarci/RedDino, and the pretrained models can be downloaded from our Hugging Face collection at https://huggingface.co/collections/Snarcy/reddino-689a13e29241d2e5690202fc
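The nearest neighbor evaluation mentioned in the abstract can be sketched as follows: embeddings from a frozen encoder are compared by cosine similarity, and a query is assigned the majority label of its k closest training embeddings. This is a minimal illustration, not the authors' evaluation code; the function names, the choice of cosine similarity, and the toy two-dimensional embeddings with a "discocyte" label are assumptions.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_predict(train_embeddings, train_labels, query, k=1):
    """Predict a label for `query` by majority vote among its k most similar training embeddings."""
    ranked = sorted(
        zip(train_embeddings, train_labels),
        key=lambda pair: cosine_similarity(pair[0], query),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy embeddings standing in for frozen-encoder features (hypothetical values).
train_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
train_labels = ["discocyte", "discocyte", "echinocyte", "echinocyte"]
prediction = knn_predict(train_embeddings, train_labels, [0.8, 0.2], k=3)
```

Because the encoder stays frozen, accuracy in this setting depends only on how well the embeddings separate the classes, which is why it is a common probe of representation quality alongside linear probing.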

Paper Structure

This paper contains 9 sections, 4 figures, 2 tables.

Figures (4)

  • Figure 1: The RedDino training set comprises 56712 original images. We extracted over 3 million single RBC images and more than 1.2 million patches.
  • Figure 2: By removing the KoLeo regularizer and applying the Sinkhorn-Knopp algorithm, RedDino outperforms the baseline models on the weighted F1 score in the linear probing evaluation. The evaluation uses source 1 of the Elsafty dataset as the training set and source 2 as the test set.
  • Figure 3: Abnormal RBCs distinguished by RedDino: the highlighted regions in (a) and (c) correspond to distinct colors in the PCA visualizations (b) and (d), showing that these cells are separated in the embedding space. Specifically, (a) contains malaria-infected RBCs, while (c) includes echinocytes.
  • Figure 4: Different classes show distinct clusters in the UMAP projection of the feature embeddings from the Elsafty dataset source 1. On the left, we show the subject distribution across the UMAP space, while on the right, we show the class distribution.
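
The weighted F1 score reported in Figure 2 averages per-class F1 scores, weighting each class by its share of the true labels. A minimal pure-Python sketch of the standard definition is given below; the function name and the integer class labels are illustrative assumptions, and this is not the authors' evaluation code.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for cls, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n_true - tp
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        f1 = 2 * tp / denom if denom else 0.0
        score += (n_true / total) * f1
    return score

# Three classes with supports 3, 2, and 1; one class-0 sample is misclassified as class 1.
example_score = weighted_f1([0, 0, 0, 1, 1, 2], [0, 0, 1, 1, 1, 2])
```

Weighting by support keeps frequent classes from being drowned out by rare ones, which matters when class frequencies differ between the training source and the test source.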