Normal and Abnormal Pathology Knowledge-Augmented Vision-Language Model for Anomaly Detection in Pathology Images

Jinsol Song, Jiamu Wang, Anh Tien Nguyen, Keunho Byeon, Sangjeong Ahn, Sung Hak Lee, Jin Tae Kwak

TL;DR

This work tackles anomaly detection in pathology images under data scarcity and variability by introducing Ano-NAViLa, a lightweight vision-language framework that combines two term pools (normal and abnormal) with a frozen VLM and a trainable MLP. It generates text-augmented image embeddings and uses a contrastive objective to cluster normal- and abnormal-associated representations, producing patch- and WSI-level anomaly scores as the sum of centroid-based deviations, $A_{score} = D^{N}(\mathbf{h}^{N}) + D^{A}(\mathbf{h}^{A})$. Evaluations on GastricLN and Camelyon16 show state-of-the-art anomaly detection and localization, with strong generalization across organs and institutions, and interpretable image-text associations validated by pathologists. The approach achieves high accuracy with few trainable parameters and offers clinically meaningful textual explanations, supporting potential translation to real-world workflows. Future work includes automated term-pool construction, broader external validation, and VLM optimization for further efficiency.

Abstract

Anomaly detection in computational pathology aims to identify rare anomalies in settings where disease-related data are often limited or missing. Existing anomaly detection methods, primarily designed for industrial settings, face limitations in pathology due to computational constraints, diverse tissue structures, and lack of interpretability. To address these challenges, we propose Ano-NAViLa, a Normal and Abnormal pathology knowledge-augmented Vision-Language model for Anomaly detection in pathology images. Ano-NAViLa is built on a pre-trained vision-language model with a lightweight trainable MLP. By incorporating both normal and abnormal pathology knowledge, Ano-NAViLa enhances accuracy and robustness to variability in pathology images and provides interpretability through image-text associations. Evaluated on two lymph node datasets from different organs, Ano-NAViLa achieves state-of-the-art performance in anomaly detection and localization, outperforming competing models.

Paper Structure

This paper contains 13 sections, 5 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Generation of normal and abnormal text-augmented image embeddings. Two embeddings are generated for each image, representing its relationship with the normal and abnormal term pools, respectively. Only the MLP is trainable throughout the entire process.
  • Figure 2: Training and inference procedures. The MLP is trained using only normal images to separately cluster embeddings that are representative of normal image–normal term and normal image–abnormal term relations. At inference, the distances from the two cluster centroids are computed, and their sum serves as the anomaly score, with larger values indicative of anomalies.
  • Figure 3: Visualization of anomaly localization in WSIs.
  • Figure 4: Exemplary WSIs from Camelyon16 and their matching pathology terms. Normal and abnormal terms are shown in green and red, respectively.
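
The inference step described in the Figure 2 caption (distances from two cluster centroids, summed into an anomaly score) can be illustrated with a small sketch. This is not the authors' implementation: it assumes Euclidean distance for $D^{N}$ and $D^{A}$, takes each centroid as the mean of the corresponding embeddings over normal training images, and uses random vectors in place of real text-augmented embeddings. The function names `fit_centroids` and `anomaly_score` are hypothetical.

```python
import numpy as np


def fit_centroids(h_n_train, h_a_train):
    """Return one centroid per embedding type, computed from normal images only.

    Assumption: each centroid is the plain mean of the training embeddings.
    """
    return h_n_train.mean(axis=0), h_a_train.mean(axis=0)


def anomaly_score(h_n, h_a, c_n, c_a):
    """A_score = D^N(h^N) + D^A(h^A), with D assumed to be Euclidean distance."""
    d_n = np.linalg.norm(h_n - c_n, axis=-1)  # deviation of normal-term embedding
    d_a = np.linalg.norm(h_a - c_a, axis=-1)  # deviation of abnormal-term embedding
    return d_n + d_a


# Stand-in embeddings for 200 normal training patches (dimension 8 is arbitrary).
rng = np.random.default_rng(0)
train_h_n = rng.normal(0.0, 0.1, size=(200, 8))
train_h_a = rng.normal(1.0, 0.1, size=(200, 8))
c_n, c_a = fit_centroids(train_h_n, train_h_a)

# Two test patches: one drawn from the training cluster, one shifted far away.
test_h_n = np.stack([train_h_n[0], train_h_n[0] + 2.0])
test_h_a = np.stack([train_h_a[0], train_h_a[0] + 2.0])
scores = anomaly_score(test_h_n, test_h_a, c_n, c_a)
```

In this toy setup the shifted patch receives the larger score, matching the caption's statement that larger values indicate anomalies. How patch scores are aggregated to the WSI level is not specified in this section, so the sketch stops at patch-level scores.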