UniCell: Universal Cell Nucleus Classification via Prompt Learning
Junjia Huang, Haofeng Li, Xiang Wan, Guanbin Li
TL;DR
UniCell tackles cross-dataset nucleus classification by learning a single universal model that detects and classifies nuclei across diverse pathology datasets with inconsistent annotations. It combines a DETR-based end-to-end architecture with a Dynamic Prompt Module (DPM) that injects dataset- and category-level semantics through dataset prompts and a Category Memory Bank, enabling knowledge sharing across $D$ datasets and $C$ categories. A Contrastive DeNoising Query mechanism accelerates training by generating queries from noisy centroids; inference relies on learnable content queries instead, and per-dataset prediction heads handle the differing label sets. Results on four benchmarks show state-of-the-art performance in both detection and classification, and ablations support the DPM, a local attention depth of $L=3$, and the Feature-Enhancing strategy for feature refinement. By training one model on all datasets, the approach reduces data fragmentation and offers a scalable path toward practical cross-domain histopathology analysis.
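To make the idea of per-dataset prediction heads concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes shared query features are routed to one linear head per dataset, each sized to that dataset's label set. The dataset names, category names, `PerDatasetHeads`, and the random weights are all hypothetical placeholders.

```python
import numpy as np

# Hypothetical label sets; the actual benchmarks and categories are not listed here.
LABEL_SETS = {
    "dataset_a": ["epithelial", "lymphocyte", "stromal"],
    "dataset_b": ["tumor", "inflammatory", "connective", "dead"],
}


class PerDatasetHeads:
    """One linear classification head per dataset over shared query features."""

    def __init__(self, feature_dim, label_sets, seed=0):
        rng = np.random.default_rng(seed)
        self.label_sets = label_sets
        # Each head maps feature_dim -> number of categories in its own dataset.
        self.weights = {
            name: rng.normal(scale=0.02, size=(feature_dim, len(labels)))
            for name, labels in label_sets.items()
        }

    def logits(self, query_features, dataset_name):
        # query_features: (num_queries, feature_dim), shared by all datasets.
        return query_features @ self.weights[dataset_name]

    def predict(self, query_features, dataset_name):
        # Pick the highest-scoring category within the chosen dataset's label set.
        class_ids = self.logits(query_features, dataset_name).argmax(axis=-1)
        names = self.label_sets[dataset_name]
        return [names[i] for i in class_ids]


heads = PerDatasetHeads(feature_dim=8, label_sets=LABEL_SETS)
features = np.random.default_rng(1).normal(size=(5, 8))
predictions_a = heads.predict(features, "dataset_a")
predictions_b = heads.predict(features, "dataset_b")
```

The same query features yield predictions in different label spaces depending on which head is selected, which is how a single backbone can serve datasets whose annotations do not agree.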
Abstract
The recognition of multi-class cell nuclei can significantly facilitate histopathological diagnosis. Numerous pathological datasets are currently available, but their annotations are inconsistent. Most existing methods must be trained separately on each dataset to predict its labels and do not exploit knowledge shared across datasets, which limits recognition quality. In this paper, we propose a universal cell nucleus classification framework (UniCell), which employs a novel prompt learning mechanism to uniformly predict the corresponding categories of pathological images from different dataset domains. In particular, our framework adopts an end-to-end architecture for nuclei detection and classification, and uses flexible prediction heads to adapt to various datasets. Moreover, we develop a Dynamic Prompt Module (DPM) that exploits the properties of multiple datasets to enhance features. The DPM first integrates the embeddings of datasets and semantic categories, and then employs the integrated prompts to refine image representations, efficiently harvesting the knowledge shared among related cell types and data sources. Experimental results demonstrate that the proposed method achieves state-of-the-art results on four nucleus detection and classification benchmarks. Code and models are available at https://github.com/lhaof/UniCell
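The two DPM steps described above (integrate dataset and category embeddings, then refine image representations with the integrated prompts) can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions, not the released code: additive fusion, single-head scaled dot-product cross-attention, and a residual update are assumed here, and `DynamicPromptSketch` and its methods are hypothetical names.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


class DynamicPromptSketch:
    """Toy prompt module: fuse dataset and category embeddings, then refine tokens."""

    def __init__(self, num_datasets, num_categories, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.dim = dim
        # Learnable in a real model; random placeholders here.
        self.dataset_embed = rng.normal(scale=0.02, size=(num_datasets, dim))
        self.category_bank = rng.normal(scale=0.02, size=(num_categories, dim))

    def build_prompts(self, dataset_id, category_ids):
        # Step 1 (assumed additive fusion): one prompt per category of this dataset.
        return self.category_bank[category_ids] + self.dataset_embed[dataset_id]

    def attention(self, image_tokens, prompts):
        # Each image token attends over the prompts: (num_tokens, num_prompts).
        scores = image_tokens @ prompts.T / np.sqrt(self.dim)
        return softmax(scores, axis=-1)

    def refine(self, image_tokens, dataset_id, category_ids):
        # Step 2 (assumed cross-attention with residual): enhance image features.
        prompts = self.build_prompts(dataset_id, category_ids)
        weights = self.attention(image_tokens, prompts)
        return image_tokens + weights @ prompts


dpm = DynamicPromptSketch(num_datasets=4, num_categories=10, dim=16)
tokens = np.random.default_rng(1).normal(size=(6, 16))
refined = dpm.refine(tokens, dataset_id=1, category_ids=[0, 2, 5])
```

Because the category embeddings live in one shared bank, related cell types annotated in different datasets can index the same rows, while the dataset embedding shifts the prompts toward each source's conventions.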
