AI-based Anomaly Detection for Clinical-Grade Histopathological Diagnostics
Jonas Dippel, Niklas Prenißl, Julius Hense, Philipp Liznerski, Tobias Winterhoff, Simon Schallenberg, Marius Kloft, Oliver Buchstab, David Horst, Maximilian Alber, Lukas Ruff, Klaus-Robert Müller, Frederick Klauschen
TL;DR
This study tackles the challenge of long-tail disease distribution in clinical histopathology by introducing AI-based anomaly detection (AD) that requires training only on common findings. Using large real-world gastrointestinal (GI) biopsy datasets, the authors compare self-supervised AD and outlier exposure (OE) approaches, achieving high slide- and patch-level AUROCs and robust generalization across hospitals and scanners without explicit examples of rare diseases. Heatmap-based localization provides interpretable cues to pathologists, and the approach shows potential to flag and triage anomalous slides, reduce missed diagnoses, and enable safer, more automated histopathological workflows. The work highlights practical requirements for deployment, including diverse OE data and stain normalization, and points to future multi-scale and semi-supervised extensions to further improve performance on subtle or context-dependent anomalies.
Abstract
While previous studies have demonstrated the potential of AI to diagnose diseases in imaging data, clinical implementation is still lagging behind. This is partly because AI models require training with large numbers of examples, which are only available for common diseases. In clinical reality, however, only a few diseases are common, whereas the majority of diseases are less frequent (long-tail distribution). Current AI models overlook or misclassify these diseases. We propose a deep anomaly detection approach that requires training data only from common diseases yet also detects all less frequent diseases. We collected two large real-world datasets of gastrointestinal biopsies, which are prototypical of the problem. In these datasets, the ten most common findings account for approximately 90% of cases, whereas the remaining 10% comprise 56 disease entities, including many cancers. In total, 17 million histological images from 5,423 cases were used for training and evaluation. Without any training specific to these diseases, our best-performing model reliably detected a broad spectrum of infrequent ("anomalous") pathologies with 95.0% (stomach) and 91.0% (colon) AUROC and generalized across scanners and hospitals. By design, the proposed anomaly detection can be expected to detect any pathological alteration in the diagnostic tail of gastrointestinal biopsies, including rare primary or metastatic cancers. This study establishes the first effective clinical application of AI-based anomaly detection in histopathology that can flag anomalous cases, facilitate case prioritization, reduce missed diagnoses and enhance the general safety of AI models, thereby driving AI adoption and automation in routine diagnostics and beyond.
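For readers less familiar with the AUROC figures quoted above, the following minimal sketch illustrates how such a score can be computed from anomaly scores. It is an illustration under stated assumptions, not the authors' pipeline: the function names `slide_score` and `auroc`, the toy scores, and the max-pooling of patch scores into a slide score are all hypothetical choices made here for clarity. AUROC is computed in its pairwise form, as the probability that a randomly chosen anomalous slide receives a higher score than a randomly chosen normal one.

```python
from typing import Dict, List, Sequence


def slide_score(patch_scores: Sequence[float]) -> float:
    """Aggregate patch-level anomaly scores into one slide-level score.

    Assumption: max pooling. The abstract does not state how patch
    scores are aggregated, so this is only one plausible choice.
    """
    if not patch_scores:
        raise ValueError("a slide needs at least one patch score")
    return max(patch_scores)


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the fraction of (anomalous, normal) pairs ranked correctly.

    labels: 1 = anomalous, 0 = normal. Ties count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal item")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy data (invented for illustration): patch scores per slide.
patches: Dict[str, List[float]] = {
    "normal_a": [0.10, 0.20, 0.15],
    "normal_b": [0.30, 0.60, 0.20],
    "anomalous_c": [0.10, 0.90],
    "anomalous_d": [0.50, 0.40],
}
labels = [0, 0, 1, 1]
slide_scores = [slide_score(p) for p in patches.values()]
toy_auroc = auroc(slide_scores, labels)
print(f"slide-level AUROC on toy data: {toy_auroc:.2f}")
```

On this toy data, three of the four anomalous-versus-normal slide pairs are ranked correctly, giving an AUROC of 0.75. An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.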
