Externally Validated Multi-Task Learning via Consistency Regularization Using Differentiable BI-RADS Features for Breast Ultrasound Tumor Segmentation
Jingru Zhang, Saed Moradi, Ashirbani Saha
TL;DR
Multi-task learning for breast ultrasound tumor segmentation and malignancy classification suffers from destructive task interference and limited external generalization. To address both, we introduce a consistency-regularization framework built on differentiable BI-RADS-inspired morphological features. The model computes four BI-RADS-like features from soft segmentation masks, combines them with learnable weights into a composite malignancy prior $\phi$, and penalizes $\text{MSE}(\hat{p}, \phi)$, the disagreement between $\phi$ and the predicted malignancy $\hat{p}$. Trained on BrEaST and externally validated on BUSI, UDIAT, and BUS-UCLM without fine-tuning, the approach yields significant relative Dice improvements over a multi-task baseline (+18% on BUSI, +37% on UDIAT, +41% on BUS-UCLM; all $p<0.001$) and achieves state-of-the-art segmentation on UDIAT with a Dice coefficient of 0.81. The results show that BI-RADS-informed morphology priors can serve as domain-robust regularizers, enabling beneficial task synergy while preserving strong classification performance and generalization across centers.
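The consistency term described above can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' implementation: the four descriptors in `placeholder_features` are hypothetical stand-ins (this section does not name the paper's features), the softmax-normalized weighting in `composite_prior` is an assumed form of the "learnable weights", and all function names are invented for the example. It uses plain Python for readability; in an autodiff framework the same arithmetic would let gradients flow from the loss back to the soft mask.

```python
import math

EPS = 1e-6  # small constant to avoid division by zero


def soft_area(mask):
    """Sum of soft-mask probabilities (a smooth stand-in for pixel count)."""
    return sum(sum(row) for row in mask)


def soft_perimeter(mask):
    """Total variation of the mask: sum of absolute neighbor differences."""
    h, w = len(mask), len(mask[0])
    dy = sum(abs(mask[i + 1][j] - mask[i][j]) for i in range(h - 1) for j in range(w))
    dx = sum(abs(mask[i][j + 1] - mask[i][j]) for i in range(h) for j in range(w - 1))
    return dx + dy


def placeholder_features(mask):
    """Four hypothetical shape descriptors in [0, 1]; NOT the paper's features.

    Assumes mask values lie in [0, 1].
    """
    h, w = len(mask), len(mask[0])
    area = soft_area(mask)
    perim = soft_perimeter(mask)
    # 1) Irregularity: one minus an isoperimetric compactness ratio, clipped.
    irregularity = 1.0 - min(1.0, 4.0 * math.pi * area / (perim ** 2 + EPS))
    # 2) Orientation proxy: share of spatial variance along the vertical axis.
    cy = sum(mask[i][j] * i for i in range(h) for j in range(w)) / (area + EPS)
    cx = sum(mask[i][j] * j for i in range(h) for j in range(w)) / (area + EPS)
    var_y = sum(mask[i][j] * (i - cy) ** 2 for i in range(h) for j in range(w)) / (area + EPS)
    var_x = sum(mask[i][j] * (j - cx) ** 2 for i in range(h) for j in range(w)) / (area + EPS)
    tallness = var_y / (var_x + var_y + EPS)
    # 3) Margin blur: mean of 4*m*(1-m), zero for a perfectly crisp mask.
    blur = sum(4.0 * m * (1.0 - m) for row in mask for m in row) / (h * w)
    # 4) Relative size: soft area as a fraction of the image.
    size = area / (h * w)
    return [irregularity, tallness, blur, size]


def composite_prior(features, raw_weights):
    """Convex combination of features; softmax weighting is an assumption."""
    m = max(raw_weights)
    exps = [math.exp(w - m) for w in raw_weights]
    total = sum(exps)
    return sum(e / total * f for e, f in zip(exps, features))


def consistency_loss(p_hat, phi):
    """Mean squared error between predicted malignancy and the prior."""
    return sum((p - q) ** 2 for p, q in zip(p_hat, phi)) / len(p_hat)


# Toy example: a crisp 4x4 square inside an 8x8 image.
mask = [[1.0 if 2 <= i <= 5 and 2 <= j <= 5 else 0.0 for j in range(8)] for i in range(8)]
features = placeholder_features(mask)
phi = composite_prior(features, [0.0, 0.0, 0.0, 0.0])  # equal weights
loss = consistency_loss([0.9], [phi])
```

Because the weights are softmax-normalized in this sketch, $\phi$ stays in $[0, 1]$ whenever the features do, which keeps it on the same scale as a predicted probability $\hat{p}$. Whether the paper constrains its weights this way is not stated here.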
Abstract
Multi-task learning can suffer from destructive task interference, in which jointly trained models underperform single-task baselines and generalize poorly. To improve the generalization of breast ultrasound tumor segmentation under multi-task learning, we propose a novel consistency regularization approach that mitigates destructive interference between segmentation and classification. The regularizer is built from differentiable BI-RADS-inspired morphological features. We validated the approach by training all models on the BrEaST dataset (Poland) and evaluating them on three external datasets: UDIAT (Spain), BUSI (Egypt), and BUS-UCLM (Spain). On the segmentation task, the proposed multi-task approach generalizes significantly better (p<0.001) than the multi-task baseline, with Dice coefficients of 0.81 vs. 0.59 on UDIAT, 0.66 vs. 0.56 on BUSI, and 0.69 vs. 0.49 on BUS-UCLM. It also achieves state-of-the-art segmentation performance under rigorous external validation on the UDIAT dataset.
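The percentage gains quoted in the TL;DR appear to be relative improvements derived from the Dice coefficients above, not absolute differences. The short check below reproduces them from the reported values; the function name `relative_gain_percent` is introduced only for this illustration.

```python
# Reported Dice coefficients: (proposed, baseline) per external dataset.
dice = {
    "UDIAT": (0.81, 0.59),
    "BUSI": (0.66, 0.56),
    "BUS-UCLM": (0.69, 0.49),
}


def relative_gain_percent(proposed, baseline):
    """Relative improvement of the proposed model over the baseline, in percent."""
    return 100.0 * (proposed - baseline) / baseline


# Round to the nearest whole percent, as in the TL;DR.
gains = {name: round(relative_gain_percent(p, b)) for name, (p, b) in dice.items()}
print(gains)  # {'UDIAT': 37, 'BUSI': 18, 'BUS-UCLM': 41}
```

The corresponding absolute gains are 0.22, 0.10, and 0.20 Dice points, so the relative figures should not be read as percentage-point differences.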
