AI-based Anomaly Detection for Clinical-Grade Histopathological Diagnostics

Jonas Dippel, Niklas Prenißl, Julius Hense, Philipp Liznerski, Tobias Winterhoff, Simon Schallenberg, Marius Kloft, Oliver Buchstab, David Horst, Maximilian Alber, Lukas Ruff, Klaus-Robert Müller, Frederick Klauschen

TL;DR

This study tackles the challenge of long-tail disease distribution in clinical histopathology by introducing AI-based anomaly detection (AD) that requires training only on common findings. Using large real-world GI biopsy datasets, the authors compare self-supervised AD and outlier exposure (OE) approaches, achieving high slide- and patch-level AUROCs and robust generalization across hospitals and scanners without explicit examples of rare diseases. Heatmap-based localization provides interpretable cues to pathologists, and the approach demonstrates potential to flag and triage anomalous slides, reduce missed diagnoses, and enable safer, more automated histopathological workflows. The work highlights practical requirements for deployment, including diverse OE data and stain normalization, and points to future multi-scale and semi-supervised extensions to further improve performance on subtle or context-dependent anomalies.

Abstract

While previous studies have demonstrated the potential of AI to diagnose diseases in imaging data, clinical implementation is still lagging behind. This is partly because AI models require training with large numbers of examples, which are only available for common diseases. In clinical reality, however, only a few diseases are common, whereas the majority of diseases are less frequent (long-tail distribution). Current AI models overlook or misclassify these diseases. We propose a deep anomaly detection approach that only requires training data from common diseases to also detect all less frequent diseases. We collected two large real-world datasets of gastrointestinal biopsies, which are prototypical of the problem. Herein, the ten most common findings account for approximately 90% of cases, whereas the remaining 10% contain 56 disease entities, including many cancers. 17 million histological images from 5,423 cases were used for training and evaluation. Without any specific training for the diseases, our best-performing model reliably detected a broad spectrum of infrequent ("anomalous") pathologies with 95.0% (stomach) and 91.0% (colon) AUROC and generalized across scanners and hospitals. By design, the proposed anomaly detection can be expected to detect any pathological alteration in the diagnostic tail of gastrointestinal biopsies, including rare primary or metastatic cancers. This study establishes the first effective clinical application of AI-based anomaly detection in histopathology that can flag anomalous cases, facilitate case prioritization, reduce missed diagnoses and enhance the general safety of AI models, thereby driving AI adoption and automation in routine diagnostics and beyond.

Paper Structure (38 sections, 12 figures, 16 tables)

Figures (12)

  • Figure 1: Diseases in GI biopsies. Bar plot showing the frequency distribution of diagnoses in colon and stomach biopsies in the Charité cohort. Frequent findings are highlighted in green and represent the common or "normal" cases (90/91% of all cases for stomach/colon). The distribution has a long tail of infrequent/rare diagnoses or "anomalies" (red), which the AI-based AD approach aims to detect (NFS = not further specified, MINEN = mixed neuroendocrine-nonneuroendocrine neoplasm).
  • Figure 2: Clinical use case and overview of anomaly detection approach. (CLINICAL USE) Our AD approach can support pathologists in the routine diagnostic workflow by detecting abnormal cases and highlighting corresponding abnormal tissue regions. It may enable automated workflows in the future. (TRAINING) The OE model is trained to separate patches of frequently found ("normal") tissue patterns in colon and stomach biopsies from diverse patches of other tissue types (e.g., small intestine, lung, liver, prostate, breast, etc.). Through this exposure, the model learns the specific features of frequent colon and stomach findings, enclosing them in a compact decision boundary, thereby enabling the detection of anomalies as data points falling outside of this boundary. The model trained in this way with "auxiliary patches" is subsequently able to generalize to relevant anomalous findings in colon and stomach tissue. (INFERENCE) During inference, we compute the anomaly score for each patch of a slide and aggregate the scores. Our approach provides the pathologist with a slide anomaly score and a heatmap of anomalous regions.
  • Figure 3: Distribution of slide anomaly scores. Slide anomaly scores for individual slides after patch aggregation, grouped by diagnoses within diagnostic groups, for the validation dataset of the Charité cohort with the OE model. Results are from one split of the 5-fold cross validation. Results for stomach biopsies are shown at the top and for colon biopsies at the bottom (NFS = not further specified, MINEN = mixed neuroendocrine-nonneuroendocrine neoplasm).
  • Figure 4: Heatmap visualization of AD results. (a-c) Morphologically distinct correlates of adenocarcinoma (a), marginal zone lymphoma (b), and undifferentiated sarcoma (c) in stomach tissue are highlighted as abnormal. Common tissue artifacts in (a) (tissue folds showing as dark vertical lines) do not considerably influence the heatmap. (d-e) Benign changes of gastric foveolar adenoma (d) and changes consistent with a stomach ulcer (e) are highlighted as abnormal. (f) High-grade epithelial dysplasia of colon tissue is differentiated from adjacent low-grade dysplasia and highlighted as abnormal, showcasing the AD's ability to differentiate between subtle histological variations. (g) A tissue fragment of a neuroendocrine tumor of the colon is highlighted as abnormal. (h) Colon tissue with inflammation is highlighted as abnormal.
  • Figure 5: Stomach tissue with metastatic infiltrates of a melanoma. A complete tissue cut with pathologists' annotations is shown on the left. The corresponding anomaly heatmap is shown on the right.
  • ...and 7 more figures
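
The inference step described in the Figure 2 caption (score each patch, aggregate the scores per slide, show a heatmap) and the AUROC metric reported in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The source does not state how patch scores are aggregated, so the top-fraction mean used here is an assumption, as are the function names, the `top_fraction` parameter, and the toy inputs.

```python
import numpy as np


def slide_score(patch_scores, top_fraction=0.1):
    """Aggregate patch anomaly scores into one slide score.

    Assumption: the slide score is the mean of the highest-scoring
    fraction of patches. The paper's actual rule may differ.
    """
    scores = np.sort(np.asarray(patch_scores, dtype=float))[::-1]
    if scores.size == 0:
        raise ValueError("a slide needs at least one patch score")
    k = max(1, int(np.ceil(top_fraction * scores.size)))
    return float(scores[:k].mean())


def heatmap(patch_scores, coords, grid_shape):
    """Place patch scores on a (rows, cols) grid.

    Grid cells without a patch (e.g. background) stay NaN.
    """
    grid = np.full(grid_shape, np.nan)
    for score, (row, col) in zip(patch_scores, coords):
        grid[row, col] = score
    return grid


def auroc(labels, scores):
    """AUROC as the probability that an anomalous slide scores higher
    than a normal slide, with ties counted as one half."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]  # all anomalous-vs-normal pairs
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())


# Toy example: one slide with four patches on a 2x2 grid.
patches = [0.1, 0.2, 0.9, 0.8]
positions = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(slide_score(patches, top_fraction=0.5))
print(heatmap(patches, positions, (2, 2)))

# Toy example: four slides, two of them anomalous (label 1).
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

A top-fraction mean is one plausible choice because a small anomalous region should still raise the slide score even when most patches look normal. A plain mean would dilute it, and a maximum would be sensitive to a single noisy patch.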