From Linear Probing to Joint-Weighted Token Hierarchy: A Foundation Model Bridging Global and Cellular Representations in Biomarker Detection

Jingsong Liu, Han Li, Nassir Navab, Peter J. Schüffler

TL;DR

Pathology foundation models often rely on global patch embeddings, overlooking cell-level morphology crucial for biomarker detection. JWTH introduces a Joint-Weighted Token Hierarchy that fuses global context with cell-level tokens through cell-centric post-tuning and Gram-anchored pretraining, augmented by attention pooling to integrate multi-scale information. The approach is validated across four biomarkers and eight cohorts, achieving up to 8.3% gains in balanced accuracy and an average 1.2% improvement over prior PFMs, indicating improved interpretability and robustness. The results demonstrate strong cross-center generalization and establish a scalable path toward reliable AI-based biomarker detection in digital pathology.

Abstract

AI-based biomarkers can infer molecular features directly from hematoxylin & eosin (H&E) slides, yet most pathology foundation models (PFMs) rely on global patch-level embeddings and overlook cell-level morphology. We present JWTH (Joint-Weighted Token Hierarchy), a PFM that integrates large-scale self-supervised pretraining with cell-centric post-tuning and attention pooling to fuse local and global tokens. Across four tasks involving four biomarkers and eight cohorts, JWTH achieves up to 8.3% higher balanced accuracy and a 1.2% average improvement over prior PFMs, advancing interpretable and robust AI-based biomarker detection in digital pathology.
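
To make the idea of attention pooling over local and global tokens concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single pooling query and uses the raw tokens as keys and values, with no learned projections; the function names, the toy vectors, and the head count are all hypothetical. Each head attends over its own slice of the embedding and the per-head results are concatenated.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(cls_token, local_tokens, query, num_heads=2):
    """Pool the class token and local tokens into one vector.

    Hypothetical simplification: keys and values are the raw tokens
    (no learned projections), and `query` stands in for a learned
    pooling query. Each head attends over its own slice of the
    embedding dimensions.
    """
    tokens = [cls_token] + local_tokens
    dim = len(cls_token)
    if dim % num_heads != 0:
        raise ValueError("embedding dim must be divisible by num_heads")
    head_dim = dim // num_heads
    pooled = []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        # Scaled dot-product score of the query against every token.
        scores = [
            sum(q * t for q, t in zip(query[lo:hi], tok[lo:hi])) / math.sqrt(head_dim)
            for tok in tokens
        ]
        weights = softmax(scores)
        # Weighted sum of the token slices for this head.
        for d in range(lo, hi):
            pooled.append(sum(w * tok[d] for w, tok in zip(weights, tokens)))
    return pooled

# Toy inputs (made up for illustration): one class token, two local tokens.
cls_token = [1.0, 0.0, 0.5, 0.5]
local_tokens = [[0.0, 1.0, 0.2, 0.8], [0.5, 0.5, 0.9, 0.1]]
query = [0.0, 0.0, 0.0, 0.0]  # a zero query gives uniform attention weights
pooled = attention_pool(cls_token, local_tokens, query)
```

With a zero query every token receives the same weight, so the pooled vector is simply the mean of the three tokens. A non-zero query shifts weight toward the tokens it aligns with, which is how such a pooling layer could emphasize cell-level tokens over the class token or the reverse.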

Paper Structure

This paper contains 21 sections, 6 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: (a) Existing pathology foundation model (PFM) pipelines typically rely on linear probing over the global class token, discarding fine-grained local cues from patch-level embeddings and thus losing critical cellular information. (b) Our proposed JWTH (Joint-Weighted Token Hierarchy) integrates both class and local tokens through a multi-head attention mechanism, enabling the model to jointly reason over global tissue context and cell-level morphology for more accurate biomarker prediction.
  • Figure 2: Performance of different models on patch-level classification tasks. The metric is balanced accuracy (BACC).
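
The captions above report balanced accuracy (BACC) without defining it. Under the standard definition, balanced accuracy is the mean of per-class recall, which keeps a majority class from dominating the score. The sketch below illustrates that definition on made-up labels; the function name and the data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    # Recall for each class, then an unweighted average across classes.
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)

# Toy, imbalanced labels (illustrative only): eight positives, two negatives.
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]

bacc = balanced_accuracy(y_true, y_pred)
plain_acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

On this toy data plain accuracy is 0.9, while balanced accuracy is 0.75, because recall on the minority class is only 0.5. This is why the metric suits biomarker tasks where one class is much rarer than the other.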