BackSplit: The Importance of Sub-dividing the Background in Biomedical Lesion Segmentation
Rachit Saluja, Asli Cihangir, Ruining Deng, Johannes C. Paetzold, Fengbei Liu, Mert R. Sabuncu
TL;DR
BackSplit tackles the persistent challenge of small-lesion segmentation by treating the background as a set of semantically meaningful auxiliary classes rather than a single undifferentiated class. The authors provide an information-theoretic foundation showing that multiclass supervision increases the expected Fisher Information for the target lesion, yielding more efficient and stable maximum-likelihood estimates (MLEs) than traditional binary training. Empirically, BackSplit delivers consistent gains across five diverse datasets, three architectures, and a variety of auxiliary-label sources, including automatically generated organ masks and interactive/semi-automatic cues, with minimal parameter overhead. The approach is architecture-agnostic, scalable, and robust to label noise, suggesting broad applicability in clinical settings for improved lesion delineation and reduced false positives. The work also outlines practical guidance for selecting auxiliary structures and demonstrates performance improvements under fine-tuning and partial supervision scenarios.
Abstract
Segmenting small lesions in medical images remains notoriously difficult. Most prior work tackles this challenge either by designing better architectures, loss functions, or data augmentation schemes, or by collecting more labeled data. We take a different view, arguing that part of the problem lies in how the background is modeled. Conventional lesion segmentation collapses all non-lesion pixels into a single "background" class, ignoring the rich anatomical context in which lesions appear. In reality, the background is highly heterogeneous: it is composed of tissues, organs, and other structures that can now be labeled manually or inferred automatically using existing segmentation models. In this paper, we argue that training with fine-grained labels that sub-divide the background class, which we call BackSplit, is a simple yet powerful paradigm that can offer a significant performance boost without increasing inference costs. From an information-theoretic standpoint, we prove that BackSplit increases the expected Fisher Information relative to conventional binary training, leading to tighter asymptotic bounds and more stable optimization. With extensive experiments across multiple datasets and architectures, we empirically show that BackSplit consistently boosts small-lesion segmentation performance, even when auxiliary labels are generated automatically using pretrained segmentation models. Additionally, we show that auxiliary labels derived from interactive segmentation frameworks have the same beneficial effect, underscoring BackSplit's robustness, simplicity, and broad applicability.
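The label construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a hypothetical encoding in which the auxiliary label map uses 0 for unlabeled background and 1..K for auxiliary structures (for example, organ masks from a pretrained segmenter), and in which the lesion always overrides any auxiliary label at the same pixel. The function names `backsplit_labels` and `lesion_from_multiclass` are illustrative and do not come from the paper.

```python
import numpy as np

# Assumed convention (not from the paper): class 0 is residual background,
# class 1 is the lesion, classes 2..K+1 are the auxiliary structures.
LESION = 1


def backsplit_labels(lesion_mask, aux_labels):
    """Merge a binary lesion mask with an auxiliary label map.

    lesion_mask: array of 0/1 values, where 1 marks lesion pixels.
    aux_labels:  integer array of the same shape, where 0 means
                 unlabeled background and 1..K are auxiliary structures.
    Returns an integer map with the sub-divided background classes.
    """
    lesion_mask = np.asarray(lesion_mask).astype(bool)
    aux_labels = np.asarray(aux_labels)
    if lesion_mask.shape != aux_labels.shape:
        raise ValueError("lesion_mask and aux_labels must share a shape")

    # Shift auxiliary ids up by one so that id 1 stays reserved for the lesion.
    target = np.where(aux_labels > 0, aux_labels + 1, 0).astype(np.int64)
    # The lesion takes priority wherever it overlaps an auxiliary structure.
    target[lesion_mask] = LESION
    return target


def lesion_from_multiclass(pred_labels):
    """Collapse a multiclass prediction back to a binary lesion mask."""
    return (np.asarray(pred_labels) == LESION).astype(np.uint8)


lesion = np.array([[0, 1, 0],
                   [0, 0, 0]])
organs = np.array([[1, 1, 0],
                   [2, 2, 0]])
target = backsplit_labels(lesion, organs)
binary = lesion_from_multiclass(target)
```

In this sketch, a network trained on `target` would have K+2 output channels instead of 2, and only the lesion channel is read out at evaluation time. That matches the abstract's claim that the extra supervision does not increase inference cost, although the exact readout rule used here is an assumption.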
