Mitigating Spurious Correlations in Patch-wise Tumor Classification on High-Resolution Multimodal Images
Ihab Asaad, Maha Shadaydeh, Joachim Denzler
TL;DR
This work investigates spurious correlations in patch-wise binary tumor classification on high-resolution multimodal images. It identifies tissue size as a spurious cue that correlates with patch labels and discretizes this attribute into a binary spurious feature. The authors apply GERNE, a gradient-extrapolation debiasing method, to maximize worst-group accuracy (WGA) and report a WGA improvement of about 7% over empirical risk minimization (ERM) for two tissue-size thresholds, improving performance on minority cases such as tumor patches with small tissue regions. The findings highlight the importance of spurious-correlation-aware learning in patch-based analysis and suggest that debiasing strategies can substantially improve robustness in practical diagnostic tasks. The approach may also apply to other high-resolution domains where patch-wise decisions are common, such as remote sensing and materials inspection.
Abstract
Patch-wise multi-label classification provides an efficient alternative to full pixel-wise segmentation on high-resolution images, particularly when the objective is to determine the presence or absence of target objects within a patch rather than their precise spatial extent. This formulation substantially reduces annotation cost, simplifies training, and allows flexible patch sizing aligned with the desired level of decision granularity. In this work, we focus on a special case, patch-wise binary classification, applied to the detection of a single class of interest (tumor) on high-resolution multimodal nonlinear microscopy images. We show that, although this simplified formulation enables efficient model development, it can introduce spurious correlations between patch composition and labels: tumor patches tend to contain larger tissue regions, whereas non-tumor patches often consist mostly of background with small tissue areas. We further quantify the bias in model predictions caused by this spurious correlation and propose a debiasing strategy to mitigate its effect. Specifically, we apply GERNE, a debiasing method that can be adapted to maximize worst-group accuracy (WGA). Our results show a WGA improvement of approximately 7% compared to ERM for two different thresholds used to binarize the spurious feature. This improves model performance on critical minority cases, such as tumor patches with small tissue regions and non-tumor patches with large tissue regions, and underscores the importance of spurious-correlation-aware learning in patch-wise classification problems.
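To make the evaluation metric concrete, the following is a minimal sketch of how worst-group accuracy could be computed once tissue size has been binarized. It is an illustration under stated assumptions, not the authors' code: the function names, the use of a per-patch tissue fraction as the size measure, the 0.5 threshold, and the toy arrays are all hypothetical. Groups are formed from each (label, spurious feature) pair, and WGA is the lowest accuracy among the non-empty groups.

```python
import numpy as np

def binarize_tissue_size(tissue_fraction, threshold):
    """Map a per-patch tissue fraction to a binary spurious feature.

    Returns 1 for "large tissue" (fraction above threshold), 0 otherwise.
    The threshold is a hypothetical choice for illustration only.
    """
    return (np.asarray(tissue_fraction) > threshold).astype(int)

def worst_group_accuracy(y_true, y_pred, spurious):
    """Compute accuracy per (label, spurious) group and return the minimum."""
    y_true, y_pred, spurious = map(np.asarray, (y_true, y_pred, spurious))
    group_acc = {}
    for y in (0, 1):
        for s in (0, 1):
            mask = (y_true == y) & (spurious == s)
            if mask.any():  # skip groups with no samples
                group_acc[(y, s)] = float((y_pred[mask] == y_true[mask]).mean())
    return min(group_acc.values()), group_acc

# Toy data (made up): a classifier that predicts "tumor" whenever tissue is large.
tissue_fraction = [0.9, 0.8, 0.1, 0.7, 0.2, 0.1, 0.15, 0.85]
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 1]

spurious = binarize_tissue_size(tissue_fraction, threshold=0.5)
wga, group_acc = worst_group_accuracy(y_true, y_pred, spurious)
overall = float((np.asarray(y_true) == np.asarray(y_pred)).mean())

print("overall accuracy:", overall)
print("per-group accuracy:", group_acc)
print("worst-group accuracy:", wga)
```

In this toy case the overall accuracy looks acceptable while the two minority groups (tumor with small tissue, non-tumor with large tissue) are entirely misclassified, which is the failure mode that average accuracy hides and WGA exposes.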
