Rank-Aware Agglomeration of Foundation Models for Immunohistochemistry Image Cell Counting

Zuqi Huang, Mengxin Tian, Huan Liu, Wentao Li, Baobao Liang, Jie Wu, Fang Yan, Zhaoqing Tang, Zhongyu Li

TL;DR

CountIHC tackles end-to-end multi-class cell counting in IHC images by distilling knowledge from multiple foundation models through a novel Rank-Aware Teacher Selecting (RATS) strategy, producing a compact student that matches or exceeds the strongest teacher's performance with reduced compute. The method couples rank-based, unsupervised teacher selection with a vision–language anchored fine-tuning stage that uses structured text prompts to encode category and quantity, along with a Spatial Exclusivity Loss to enforce inter-class spatial separation. Extensive experiments across 12 biomarkers and 5 tissue types show CountIHC achieving state-of-the-art or competitive results, with strong agreement with pathologists and demonstrated scalability to H&E data. Together, these contributions enable robust, efficient, and clinically relevant IHC cell counting and biomarker assessment, with potential for broader pathology applications.

Abstract

Accurate cell counting in immunohistochemistry (IHC) images is critical for quantifying protein expression and aiding cancer diagnosis. However, the task remains challenging due to chromogen overlap, variable biomarker staining, and diverse cellular morphologies. Regression-based counting methods handle overlapping cells better than detection-based ones, yet rarely support end-to-end multi-class counting. Moreover, the potential of foundation models remains largely underexplored in this paradigm. To address these limitations, we propose a rank-aware agglomeration framework that selectively distills knowledge from multiple strong foundation models, leveraging their complementary representations to handle IHC heterogeneity and obtain a compact yet effective student model, CountIHC. Unlike prior task-agnostic agglomeration strategies that either treat all teachers equally or rely on feature similarity, we design a Rank-Aware Teacher Selecting (RATS) strategy that models global-to-local patch rankings to assess each teacher's inherent counting capacity and enable sample-wise teacher selection. For multi-class cell counting, we introduce a fine-tuning stage that reformulates the task as vision-language alignment. Discrete semantic anchors derived from structured text prompts encode both category and quantity information, guiding the regression of class-specific density maps and improving counting for overlapping cells. Extensive experiments demonstrate that CountIHC surpasses state-of-the-art methods across 12 IHC biomarkers and 5 tissue types, while exhibiting high agreement with pathologists' assessments. Its effectiveness on H&E-stained data further confirms the scalability of the proposed method.
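
The sample-wise teacher selection described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a global-to-local patch group consists of nested crops ordered from largest to smallest, so a smaller crop should never be predicted to contain more cells than the crop that encloses it, and it assumes a simple hinge form for the count-ranking loss. The function names, the `margin` parameter, and the example counts are all hypothetical.

```python
from typing import Dict, List


def count_ranking_loss(counts: List[float], margin: float = 0.0) -> float:
    """Hinge penalty over adjacent patches in a global-to-local group.

    `counts` holds one model's predicted counts, ordered from the largest
    (global) crop to the smallest (local) crop. A pair is penalised when the
    inner crop is predicted to hold more cells than the outer crop.
    This hinge form is an assumption, not the paper's exact loss.
    """
    return sum(
        max(0.0, inner - outer + margin)
        for outer, inner in zip(counts, counts[1:])
    )


def select_teacher(predicted: Dict[str, List[float]], margin: float = 0.0) -> str:
    """Return the candidate model with the lowest ranking loss for this group."""
    losses = {name: count_ranking_loss(c, margin) for name, c in predicted.items()}
    return min(losses, key=losses.get)


# Hypothetical predictions from two candidate foundation models on one group.
group_predictions = {
    "fm_a": [120.0, 80.0, 95.0, 20.0],  # violates the ordering once (80 -> 95)
    "fm_b": [110.0, 70.0, 40.0, 15.0],  # ordering is fully consistent
}
best_teacher = select_teacher(group_predictions)
```

Because the criterion only checks the ordering of predicted counts, it needs no ground-truth annotations, which matches the unsupervised setting the method describes.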

Paper Structure

This paper contains 21 sections, 16 equations, 8 figures, 6 tables.

Figures (8)

  • Figure 1: Overview of the proposed rank-aware agglomeration framework and anchor-guided fine-tuning for multi-class cell counting.
  • Figure 2: Illustration of the Rank-Aware Teacher Selecting (RATS) strategy. In an unsupervised setting, given a global-to-local cropped patch group, the output of each candidate foundation model (FM-$i$) is aligned with discrete count candidates to yield a predicted count $\hat{C}_{i,j}$ for patch $p_j$. A count-ranking loss serves as the rank-aware criterion, and the model that performs best by this criterion is selected as the group-specific teacher.
  • Figure 3: Qualitative comparisons of negative/positive tumor cell counting predictions on IHC images. Ground-truth counts (GT) are shown alongside the predicted counts (Pred) from different methods.
  • Figure 4: Agreement between CountIHC and five experienced pathologists (P1–P5) on the Ki67-Camera and Ki67-WSI datasets. Panels a and b show the results on the Ki67-Camera dataset: panel a presents pairwise quadratic weighted kappa (QWK) values between the machine and the pathologists and among the pathologists, and panel b shows confusion matrices of grade distributions between the machine and each pathologist. Panels c and d show the corresponding analyses on the Ki67-WSI dataset.
  • Figure 5: Visualization of approximate cell centroids derived from local maxima of the predicted class-specific density maps on Ki67-Camera and Ki67-WSI. The derived points are overlaid as markers on the tissue images, with blue indicating negative tumor cells and red indicating positive tumor cells.
  • ...and 3 more figures
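
The quadratic weighted kappa (QWK) reported in Figure 4 is a standard agreement statistic for ordinal grades. The sketch below shows how it is commonly computed from two raters' grade lists. It is an illustration only, not the evaluation code used in the paper, and the `num_grades` parameter and example grades are hypothetical since the grading scheme is not given here.

```python
from typing import Sequence


def quadratic_weighted_kappa(
    rater_a: Sequence[int], rater_b: Sequence[int], num_grades: int
) -> float:
    """QWK for two raters whose grades are integers in [0, num_grades - 1]."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters need the same non-zero number of grades")
    if num_grades < 2:
        raise ValueError("at least two grades are required")

    # Observed confusion matrix: rows are rater A, columns are rater B.
    observed = [[0] * num_grades for _ in range(num_grades)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1

    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(num_grades)) for j in range(num_grades)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_grades):
        for j in range(num_grades):
            # Quadratic disagreement weight: 0 on the diagonal, 1 at the extremes.
            weight = (i - j) ** 2 / (num_grades - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count if the two raters graded independently.
            weighted_expected += weight * hist_a[i] * hist_b[j] / n

    if weighted_expected == 0.0:
        # Both raters assigned one identical grade to every sample.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical grades from the model and one pathologist on four samples.
model_grades = [0, 1, 2, 1]
pathologist_grades = [0, 1, 2, 1]
kappa = quadratic_weighted_kappa(model_grades, pathologist_grades, num_grades=3)
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement.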