From Labels to Priors in Capsule Endoscopy: A Prior Guided Approach for Improving Generalization with Few Labels
Anuja Vats, Ahmed Mohammed, Marius Pedersen
TL;DR
The paper addresses the poor generalization of WCE pathology classification under limited labels by introducing a domain-prior guided unsupervised pretraining framework. It combines Prior Guided Contrast (PGCon) and Within Instance Negative (WINCon) to bias representations toward pathology-related features using a redness-based prior and locality priors, paired with a memory-bank-based InfoNCE objective. Across zero-shot, linear, and full fine-tuning evaluations on multiple WCE datasets, PGCon and WINCon demonstrate strong cross-dataset generalization and competitive performance relative to ImageNet supervised pretraining, often surpassing prior unsupervised methods. The results indicate that simple domain priors can yield pathology-aware representations with reduced labeling burden, and the work provides benchmarks and pretrained weights to facilitate broader adoption in WCE diagnostics.
Abstract
The lack of generalizability of deep learning approaches for the automated diagnosis of pathologies in Wireless Capsule Endoscopy (WCE) has prevented any significant advantages from trickling down to real clinical practices. As a result, disease management using WCE continues to depend on exhaustive manual investigations by medical experts. This explains its limited use despite several advantages. Prior works have considered using higher quality and quantity of labels as a way of tackling the lack of generalization, however this is hardly scalable considering pathology diversity not to mention that labeling large datasets encumbers the medical staff additionally. We propose using freely available domain knowledge as priors to learn more robust and generalizable representations. We experimentally show that domain priors can benefit representations by acting in proxy of labels, thereby significantly reducing the labeling requirement while still enabling fully unsupervised yet pathology-aware learning. We use the contrastive objective along with prior-guided views during pretraining, where the view choices inspire sensitivity to pathological information. Extensive experiments on three datasets show that our method performs better than (or closes gap with) the state-of-the-art in the domain, establishing a new benchmark in pathology classification and cross-dataset generalization, as well as scaling to unseen pathology categories.
