Detecting Outliers with Foreign Patch Interpolation
Jeremy Tan, Benjamin Hou, James Batten, Huaqi Qiu, Bernhard Kainz
TL;DR
This work tackles the challenge of detecting subtle medical outliers with self-supervised learning. It introduces Foreign Patch Interpolation (FPI), in which a patch from one image is convexly combined with the patch at the same location in another image to create synthetic irregularities. A wide residual encoder-decoder is trained to predict the patch location and the interpolation factor $\alpha$, which yields both pixel-level localization and an anomaly score. The method ranks highly on the 2020 MICCAI MOOD challenge and performs strongly on the DeepLesion dataset, outperforming several unsupervised baselines while remaining robust to varied anatomy and alignment. The results suggest that focusing on local, interpolated foreign patterns enables effective detection of subtle abnormalities without requiring abnormal training data, offering a practical tool to assist radiologists with automated screening and triage.
Abstract
In medical imaging, outliers can contain hypo/hyper-intensities, minor deformations, or completely altered anatomy. To detect these irregularities, it is helpful to learn the features present in both normal and abnormal images. However, this is difficult because of the wide range of possible abnormalities and the many ways in which normal anatomy varies naturally. As such, we leverage the natural variations in normal anatomy to create a range of synthetic abnormalities. Specifically, the same patch region is extracted from two independent samples and replaced with an interpolation between the two patches. The interpolation factor, patch size, and patch location are randomly sampled from uniform distributions. A wide residual encoder-decoder is trained to give a pixel-wise prediction of the patch and its interpolation factor. This encourages the network to learn what features to expect normally and to identify where foreign patterns have been introduced. The estimate of the interpolation factor lends itself nicely to the derivation of an outlier score. Meanwhile, the pixel-wise output allows for pixel- and subject-level predictions using the same model.
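The patch-synthesis step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the 2D single-channel input, the patch-size bounds (`min_frac`, `max_frac`), and the blending convention (`alpha` weights the foreign image) are all assumptions made for illustration. The `subject_score` helper, which averages a pixel-wise map, is likewise only one plausible way to obtain a subject-level score and is not taken from the text.

```python
import numpy as np

def foreign_patch_interpolation(img_a, img_b, rng, min_frac=0.1, max_frac=0.4):
    """Blend one patch region of img_b into img_a (hypothetical helper).

    Returns the synthetic image, a pixel-wise label map that holds alpha
    inside the patch and 0 elsewhere, and the sampled alpha.
    """
    assert img_a.shape == img_b.shape and img_a.ndim == 2
    h, w = img_a.shape

    # Patch size, sampled uniformly (the fractional bounds are assumptions).
    ph = int(rng.integers(max(1, int(min_frac * h)), int(max_frac * h) + 1))
    pw = int(rng.integers(max(1, int(min_frac * w)), int(max_frac * w) + 1))

    # Patch location, sampled uniformly so that the patch fits in the image.
    top = int(rng.integers(0, h - ph + 1))
    left = int(rng.integers(0, w - pw + 1))

    # Interpolation factor, sampled uniformly from [0, 1).
    alpha = float(rng.uniform(0.0, 1.0))

    region = (slice(top, top + ph), slice(left, left + pw))
    out = img_a.astype(np.float64).copy()
    # Convex combination of the same region taken from both samples.
    out[region] = (1.0 - alpha) * img_a[region] + alpha * img_b[region]

    # Pixel-wise training target: alpha inside the patch, 0 outside.
    label = np.zeros((h, w), dtype=np.float64)
    label[region] = alpha
    return out, label, alpha

def subject_score(pixel_map):
    """Assumed subject-level score: the mean of a pixel-wise map."""
    return float(np.mean(pixel_map))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.normal(size=(64, 64))   # stand-in for a normal image
    b = rng.normal(size=(64, 64))   # stand-in for a second, independent sample
    x, y, alpha = foreign_patch_interpolation(a, b, rng)
    print(x.shape, y.shape, 0.0 <= alpha < 1.0)
```

In this sketch, a network trained on pairs `(x, y)` would regress the label map `y` pixel by pixel, so its output marks where a foreign patch was introduced and how strongly it was blended in.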
