NCDD: Nearest Centroid Distance Deficit for Out-Of-Distribution Detection in Gastrointestinal Vision

Sandesh Pokhrel, Sanjay Bhandari, Sharib Ali, Tryphon Lambrou, Anh Nguyen, Yash Raj Shrestha, Angus Watson, Danail Stoyanov, Prashnna Gyawali, Binod Bhattarai

TL;DR

This work tackles the reliability issue of deep learning in gastrointestinal vision by treating abnormality detection as an out-of-distribution (OOD) problem. It introduces Nearest Centroid Distance Deficit (NCDD), a post-hoc score that leverages both the nearest-centroid distance and the dispersion to non-nearest centroids in the feature space to differentiate in-distribution (ID) normal anatomical landmarks from OOD abnormalities. Centroids are computed per ID class in the learned feature space, and the final OOD score combines D_μm and D_μn with data-dependent weighting, enabling effective OOD detection across multiple backbones on two GI benchmarks (Kvasirv2 and GastroVision). Experimental results show NCDD outperforms state-of-the-art OOD methods (e.g., MSP, ODIN, Energy, Entropy, MaxLogit, KNN-OOD) in AUC and FPR95 across architectures, demonstrating its potential as a simple, post-hoc, and clinically relevant tool with possible clinician-in-the-loop deployment. The work highlights the practical impact of centroid-based feature-space analysis for reliable GI endoscopy AI applications.
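The two metrics named above can be illustrated with a minimal, dependency-free sketch. It assumes that a higher score means "more in-distribution" and that ID samples are the positive class; both conventions, and the function names `auroc` and `fpr_at_95_tpr`, are choices made for this illustration and are not taken from the paper's code.

```python
import math

def auroc(id_scores, ood_scores):
    """Probability that a random ID sample outscores a random OOD sample.

    Ties count as half a win. Assumes higher score = more ID-like.
    """
    wins = sum((i > o) + 0.5 * (i == o) for i in id_scores for o in ood_scores)
    return wins / (len(id_scores) * len(ood_scores))

def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of OOD samples accepted as ID when at least 95% of ID samples are kept."""
    ranked = sorted(id_scores, reverse=True)
    keep = math.ceil(0.95 * len(ranked))   # number of ID samples that must pass
    threshold = ranked[keep - 1]           # lowest score among the kept ID samples
    return sum(o >= threshold for o in ood_scores) / len(ood_scores)

# Toy scores, invented purely for illustration.
id_scores = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0]
ood_scores = [1.0, 2.0, 3.0, 4.0, 6.0]

toy_auroc = auroc(id_scores, ood_scores)
toy_fpr95 = fpr_at_95_tpr(id_scores, ood_scores)
```

A higher AUC and a lower FPR95 both indicate better separation between ID and OOD samples.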

Abstract

The integration of deep learning tools in gastrointestinal vision holds the potential for significant advancements in diagnosis, treatment, and overall patient care. A major challenge, however, is these tools' tendency to make overconfident predictions, even when encountering unseen or newly emerging disease patterns, which undermines their reliability. We address this critical issue of reliability by framing it as an out-of-distribution (OOD) detection problem, where previously unseen and emerging diseases are identified as OOD examples. However, gastrointestinal images pose a unique challenge due to the overlapping feature representations between in-distribution (ID) and OOD examples. Existing approaches often overlook this characteristic, as they are primarily developed for natural image datasets, where feature distinctions are more apparent. Despite the overlap, we hypothesize that the features of an in-distribution example will cluster closer to the centroid of its ground-truth class, resulting in a shorter distance to the nearest centroid. In contrast, OOD examples tend to lie at a roughly equal distance from all class centroids. Based on this observation, we propose a novel nearest-centroid distance deficit (NCDD) score in the feature space for gastrointestinal OOD detection. Evaluations across multiple deep learning architectures and two publicly available benchmarks, Kvasirv2 and GastroVision, demonstrate the effectiveness of our approach compared to several state-of-the-art methods. The code and implementation details are publicly available at: https://github.com/bhattarailab/NCDD
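The hypothesis in the abstract can be made concrete with a small sketch. It is not the paper's NCDD formula: the score below is a plain, unweighted difference between the mean distance to the non-nearest centroids and the distance to the nearest centroid, chosen here only to illustrate the idea. The function names, the 2-D toy features, and the class labels are likewise assumptions made for the example.

```python
import math
from collections import defaultdict

def class_centroids(features, labels):
    """Return {label: mean feature vector} computed from ID training features."""
    grouped = defaultdict(list)
    for vec, lab in zip(features, labels):
        grouped[lab].append(vec)
    return {lab: [sum(col) / len(vecs) for col in zip(*vecs)]
            for lab, vecs in grouped.items()}

def distance_deficit(x, centroids):
    """Hypothetical score: mean distance to non-nearest centroids minus nearest distance.

    Needs at least two classes. Large values suggest an ID-like sample;
    values near zero suggest a sample roughly equidistant from all centroids.
    """
    dists = sorted(math.dist(x, c) for c in centroids.values())
    nearest, others = dists[0], dists[1:]
    return sum(others) / len(others) - nearest

# Toy 2-D "features" for three ID classes (invented for illustration).
features = [[3.0, 0.0], [5.0, 0.0], [-3.0, 0.0], [-5.0, 0.0], [0.0, 3.0], [0.0, 5.0]]
labels = ["z-line", "z-line", "cecum", "cecum", "pylorus", "pylorus"]
centroids = class_centroids(features, labels)

id_score = distance_deficit([4.0, 0.5], centroids)   # close to one centroid
ood_score = distance_deficit([0.0, 0.0], centroids)  # equidistant from all three
```

In this toy setup the sample near a class centroid gets a clearly positive deficit, while the equidistant sample scores zero, so thresholding the score separates the two. The released repository linked above is the authoritative source for the actual score and its weighting.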

Paper Structure

This paper contains 20 sections, 7 equations, 6 figures, 7 tables, 1 algorithm.

Figures (6)

  • Figure 1: Landscape of clinical procedures in gastrointestinal vision. Orange: Unassisted, a doctor has to assess all patients' data tediously and redundantly. Blue: Artificial intelligence can help classify known or previously seen diseases, but it makes misleading assumptions and often overconfident predictions when it faces real-world examples it has never seen. Green: A combination of human intervention and an OOD-enabled AI method improves efficacy in the current scenario, with a specialist intervening to handle any unseen or unknown instances that the AI model is uncertain about classifying.
  • Figure 2: The Kvasirv2 dataset [kvasir], formulated for OOD detection of abnormalities. Three classes, Z-line, Cecum, and Pylorus, are healthy cases showing normal anatomical landmarks in the dataset, while the remaining classes are abnormalities, either pathological conditions or images seen during the treatment procedure.
  • Figure 3: a) Standard classification pipeline for landmark classification on the Kvasirv2 [kvasir] in-distribution dataset. In this pipeline, any input image is always assigned to one of the three classes, regardless of how unrelated it is to what the model was trained on. In essence, the model doesn't know whether it is capable of making inferences on an image or not. b) Overview of our proposed OOD detection method: based on the feature representation distances for any given image, we can tell whether the image is something the model has knowledge of or not.
  • Figure 4: t-SNE plot of the feature space representation of the ViT model for the Kvasir dataset: in-distribution data, Z-line (pink), Cecum (brown), and Pylorus (magenta), align with their respective centroids, while OOD data, i.e. Ulcerative Colitis (orange), is scattered and at a distance from all in-distribution centroids. The model pushes the ID centroids as far apart as possible during training, while the OOD data unseen at training time are more scattered in feature space. In essence, the nearest-centroid distance for an ID sample is significantly smaller than its distance from non-nearest centroids, whereas for OOD data the two are more or less similar.
  • Figure 5: Qualitative comparison of our method and other SOTA OOD methods for Kvasirv2 on the ViT model: OOD examples over-confidently predicted by the corresponding method as healthy ID data are shown in red frames, while images correctly identified as OOD (abnormality) are shown in green.
  • ...and 1 more figures