Rank-Aware Agglomeration of Foundation Models for Immunohistochemistry Image Cell Counting
Zuqi Huang, Mengxin Tian, Huan Liu, Wentao Li, Baobao Liang, Jie Wu, Fang Yan, Zhaoqing Tang, Zhongyu Li
TL;DR
CountIHC tackles end-to-end multi-class cell counting in IHC images by distilling knowledge from multiple foundation models through a Rank-Aware Teacher Selecting (RATS) strategy, producing a compact student that matches or exceeds the strongest teacher's performance at lower computational cost. The method couples rank-based, unsupervised teacher selection with a vision-language anchored fine-tuning stage that uses structured text prompts to encode category and quantity, along with a Spatial Exclusivity Loss that enforces inter-class spatial separation. Extensive experiments across 12 biomarkers and 5 tissue types show that CountIHC achieves state-of-the-art or competitive results, agrees closely with pathologists' assessments, and scales to H&E data. Together, these contributions enable robust, efficient, and clinically relevant IHC cell counting and biomarker assessment, with potential for broader pathology applications.
Abstract
Accurate cell counting in immunohistochemistry (IHC) images is critical for quantifying protein expression and aiding cancer diagnosis. However, the task remains challenging due to chromogen overlap, variable biomarker staining, and diverse cellular morphologies. Regression-based counting methods handle overlapping cells better than detection-based ones, yet they rarely support end-to-end multi-class counting. Moreover, the potential of foundation models remains largely underexplored in this paradigm. To address these limitations, we propose a rank-aware agglomeration framework that selectively distills knowledge from multiple strong foundation models, leveraging their complementary representations to handle IHC heterogeneity and obtain a compact yet effective student model, CountIHC. Unlike prior task-agnostic agglomeration strategies that either treat all teachers equally or rely on feature similarity, we design a Rank-Aware Teacher Selecting (RATS) strategy that models global-to-local patch rankings to assess each teacher's inherent counting capacity and enable sample-wise teacher selection. For multi-class cell counting, we introduce a fine-tuning stage that reformulates the task as vision-language alignment. Discrete semantic anchors derived from structured text prompts encode both category and quantity information, guiding the regression of class-specific density maps and improving counting for overlapping cells. Extensive experiments demonstrate that CountIHC surpasses state-of-the-art methods across 12 IHC biomarkers and 5 tissue types, while exhibiting high agreement with pathologists' assessments. Its effectiveness on H&E-stained data further confirms the scalability of the proposed method.
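To make the idea of global-to-local patch ranking concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a nested crop can never contain more cells than the region enclosing it, so a teacher with good counting capacity should produce count proxies that are non-increasing from the full patch to its inner crops. The function names (`nested_crops`, `rank_consistency`, `select_teacher`), the centered-crop scheme, and the representation of each teacher as a callable returning a scalar count proxy are all hypothetical choices made for illustration.

```python
import numpy as np


def nested_crops(image, levels=3):
    """Return centered crops, largest first; each crop halves the previous side lengths."""
    crops = [image]
    for _ in range(levels - 1):
        h, w = crops[-1].shape[:2]
        top, left = h // 4, w // 4
        crops.append(crops[-1][top:top + h // 2, left:left + w // 2])
    return crops


def rank_consistency(counts):
    """Fraction of (outer, inner) crop pairs whose counts respect outer >= inner."""
    pairs = [(i, j) for i in range(len(counts)) for j in range(i + 1, len(counts))]
    ok = sum(bool(counts[i] >= counts[j]) for i, j in pairs)
    return ok / len(pairs)


def select_teacher(image, teachers, levels=3):
    """Pick, for one sample, the teacher whose count proxies best respect the ranking.

    `teachers` maps a name to a hypothetical callable crop -> scalar count proxy.
    No labels are used, so the selection is unsupervised. Requires levels >= 2.
    """
    crops = nested_crops(image, levels)
    scores = {
        name: rank_consistency([predict(crop) for crop in crops])
        for name, predict in teachers.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    image = np.ones((64, 64))
    # Toy stand-ins for teachers: one respects the ranking, one violates it.
    teachers = {
        "consistent": lambda crop: float(crop.sum()),
        "inverted": lambda crop: -float(crop.sum()),
    }
    print(select_teacher(image, teachers))
```

Because the score is computed per image, different samples can be routed to different teachers, which is the sample-wise selection behavior described above. How the actual RATS strategy scores rankings and uses them during distillation is not specified in this abstract.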
