Uni-Hema: Unified Model for Digital Hematopathology
Abdul Rehman, Iqra Rasool, Ayisha Imran, Mohsen Ali, Waqas Sultani
TL;DR
Uni-Hema tackles the absence of a single model capable of multi-task, multi-modal analysis in digital hematopathology. It integrates detection, classification, segmentation, morphology prediction, and visual–textual reasoning within a unified architecture built on Hema-Former, and is trained on 46 public datasets (≈700K images, ≈21K QA pairs). The model achieves results competitive with or superior to task-specific state-of-the-art models across multiple tasks and provides interpretable single-cell morphology insights, including visual question answering (VQA) and masked language modeling (MLM) capabilities on hematology data. This unified approach enables scalable, cross-disease digital hematopathology with potential clinical impact, and the authors plan to release the code publicly.
Abstract
Digital hematopathology requires cell-level analysis across diverse disease categories, including malignant disorders (e.g., leukemia), infectious conditions (e.g., malaria), and non-malignant red blood cell disorders (e.g., sickle cell disease). Existing approaches, whether single-task, vision-language, WSI-optimized, or single-cell hematology models, share a key limitation: they cannot provide unified, multi-task, multi-modal reasoning across the complexities of digital hematopathology. To overcome this limitation, we propose Uni-Hema, a multi-task, unified model for digital hematopathology that integrates detection, classification, segmentation, morphology prediction, and reasoning across multiple diseases. Uni-Hema leverages 46 publicly available datasets, encompassing over 700K images and 21K question-answer pairs, and is built upon Hema-Former, a multimodal module that bridges visual and textual representations at the hierarchy level for tasks at different granularities (detection, classification, segmentation, morphology prediction, masked language modeling, and visual question answering). Extensive experiments demonstrate that Uni-Hema achieves performance comparable or superior to that of models trained on a single task and a single dataset across diverse hematological tasks, while providing interpretable, morphologically relevant insights at the single-cell level. Our framework establishes a new standard for multi-task and multi-modal digital hematopathology. The code will be made publicly available.
