CALF: A Conditionally Adaptive Loss Function to Mitigate Class-Imbalanced Segmentation
Bashir Alam, Masa Cirkovic, Mete Harun Akcay, Md Kaf Shahrier, Sebastien Lafond, Hergys Rexha, Kurt Benke, Sepinoud Azimi, Janan Arslan
TL;DR
The paper tackles class imbalance and annotation bias in medical image segmentation by introducing CALF, a conditionally adaptive, data-driven loss. CALF uses two distribution statistics of foreground sizes, skewness $S$ and kurtosis $K$, to select one option from a set that includes Fisher, Logit, Arcsine, Log10, Natural Log, and BCE-Dice, forming a dynamic loss $L_{CALF}$. CALF is integrated into a preprocessing and dataset-filtering pipeline and evaluated on four brain-tumor MRI datasets (UPENN-GBM, UCSF-PDGM, BraTS, LGG-1p19qDeletion) across U-Net, DeepLabV3, and FPN architectures. Results indicate that CALF provides robust improvements in imbalanced segmentation scenarios, particularly for small ROIs, while maintaining competitive performance and reducing the effects of annotation bias. The approach offers practical value for real-world medical imaging, and the code is available for adoption and extension to other modalities and tasks.
Abstract
Imbalanced datasets pose a considerable challenge in training deep learning (DL) models for medical diagnostics, particularly for segmentation tasks. Imbalance may be associated with annotation quality, limited annotated datasets, rare cases, or small-scale regions of interest (ROIs). These conditions adversely affect model training and performance, leading to segmentation boundaries that deviate from the true ROIs. Traditional loss functions, such as Binary Cross Entropy, replicate annotation biases and limit model generalization. We propose a novel, statistically driven, conditionally adaptive loss function (CALF) tailored to accommodate the conditions of imbalanced datasets in DL training. It takes a data-driven approach: it estimates imbalance severity using the statistical measures of skewness and kurtosis, then applies an appropriate transformation to balance the training dataset while preserving data heterogeneity. The approach combines preprocessing, dataset filtering, and dynamic loss selection. We benchmark our method against conventional loss functions using qualitative and quantitative evaluations. Experiments using large-scale open-source datasets (i.e., UPENN-GBM, UCSF, LGG, and BraTS) validate our approach, demonstrating substantial segmentation improvements. Code availability: https://anonymous.4open.science/r/MICCAI-Submission-43F9/.
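To make the selection step concrete, the following is a minimal sketch of how skewness and kurtosis of foreground sizes could drive the choice of transformation. It is an illustration under stated assumptions, not the authors' implementation: the function names (`foreground_sizes`, `skewness_and_kurtosis`, `select_transform`), the cut-off values, and the mapping from statistics to transformation names are all hypothetical, since the abstract does not specify them. Foreground size is assumed here to mean the number of nonzero pixels in a binary mask, and kurtosis is computed as excess kurtosis.

```python
import numpy as np


def foreground_sizes(masks):
    """Count foreground (nonzero) pixels in each binary mask.

    Assumption: "foreground size" means the nonzero pixel count per mask.
    """
    return np.array([int(np.count_nonzero(m)) for m in masks])


def skewness_and_kurtosis(sizes):
    """Return (S, K): population skewness and excess kurtosis of the sizes."""
    x = np.asarray(sizes, dtype=float)
    std = x.std()  # population standard deviation (ddof=0)
    if std == 0:
        return 0.0, 0.0  # degenerate case: every mask has the same size
    z = (x - x.mean()) / std
    return float(np.mean(z ** 3)), float(np.mean(z ** 4) - 3.0)


def select_transform(S, K, skew_cut=1.0, kurt_cut=1.0):
    """Pick a transformation name from S and K.

    Hypothetical rule: the cut-offs and the mapping below are placeholders
    chosen for illustration; they are not taken from the paper.
    """
    if abs(S) < skew_cut and abs(K) < kurt_cut:
        return "bce_dice"  # roughly symmetric, light tails: no transformation
    if S >= skew_cut:
        return "log10" if K >= kurt_cut else "ln"  # right-skewed sizes
    if S <= -skew_cut:
        return "arcsine"  # left-skewed sizes
    return "fisher" if K >= kurt_cut else "logit"  # symmetric, unusual tails


# Toy example: four tiny ROIs and one large ROI (strongly right-skewed).
S, K = skewness_and_kurtosis([1, 1, 1, 1, 100])
choice = select_transform(S, K)
```

In this toy example the single large ROI dominates the distribution, so the skewness is positive and the placeholder rule picks a logarithmic option. In practice the statistics would be computed once over the training masks, and the selected name would determine which transformed loss is used during training.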
