Beyond the Failures: Rethinking Foundation Models in Pathology

Hamid R. Tizhoosh

TL;DR

Problem: foundation models underperform in histopathology due to misalignment with tissue complexity and clinical requirements. Approach: synthesize empirical weaknesses and theoretical limits to diagnose core causes, including dense-embedding constraints and catastrophic inheritance, and advocate domain-specific, multi-scale design. Contributions: a synthesis of empirical evidence across robustness, generalization, and efficiency, plus a proposed path forward toward tissue-aware representations and new evaluation standards. Significance: calls for redefining foundation concepts in medical AI to deliver clinically reliable, interpretable, and scalable pathology tools.

Abstract

Despite their successes in vision and language, foundation models have stumbled in pathology, showing low accuracy, instability, and heavy computational demands. These shortcomings stem not from tuning problems but from deeper conceptual mismatches: dense embeddings cannot represent the combinatorial richness of tissue, and current architectures inherit flaws in self-supervision, patch design, and noise-fragile pretraining. Biological complexity and limited domain innovation further widen the gap. The evidence is clear: pathology requires models explicitly designed for biological images rather than adaptations of large-scale natural-image methods whose assumptions do not hold for tissue.

Paper Structure

This paper contains 9 sections, 3 figures.

Figures (3)

  • Figure 1: AI models can recognize dogs and even distinguish among breeds—tasks that children can perform with ease. In contrast, recognizing complex tissue patterns in pathology requires an adult with more than a decade of specialized education and training.
  • Figure 2: Self-supervised learning often rests on the implicit assumption that each image represents a single, coherent object—an assumption that fails in histopathology, where multiple heterogeneous tissue structures coexist within the same field of view.
  • Figure 3: The field of view in light microscopy is traditionally quite large—approximately 2000 × 1500 pixels—and becomes vastly larger in whole-slide images (WSIs). In contrast, most AI models operate on small image patches, typically around 224 × 224 pixels.
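
To make the scale gap in Figure 3 concrete, the following is a minimal sketch of the arithmetic, assuming non-overlapping square patches and the approximate sizes quoted in the caption (a 2000 × 1500 pixel field of view and 224 × 224 pixel patches). The function names are illustrative only and do not come from the paper.

```python
import math


def patches_to_tile(width, height, patch=224):
    """Count the non-overlapping patch x patch tiles needed to cover a width x height image."""
    return math.ceil(width / patch) * math.ceil(height / patch)


def patch_area_fraction(width, height, patch=224):
    """Fraction of the image area that a single patch covers."""
    return (patch * patch) / (width * height)


# Approximate microscope field of view from the Figure 3 caption (assumed exact here).
fov_tiles = patches_to_tile(2000, 1500)
fov_fraction = patch_area_fraction(2000, 1500)

print(fov_tiles)               # 63
print(f"{fov_fraction:.1%}")   # 1.7%
```

Under these assumptions, a single patch sees under 2% of what a pathologist sees in one field of view, and the gap grows further for whole-slide images.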