Gradient-based Class Weighting for Unsupervised Domain Adaptation in Dense Prediction Visual Tasks
Roberto Alcover-Couso, Marcos Escudero-Viñolo, Juan C. SanMiguel, Jesus Bescós
TL;DR
The paper tackles the problem of severe class imbalance in unsupervised domain adaptation for dense prediction tasks, where domain shifts between synthetic sources and real targets skew learning toward frequent classes. It introduces Gradient-based class weighting (GBW), which dynamically computes per-class weights from gradient magnitudes of per-class losses and solves a constrained quadratic program to obtain nonnegative weights that sum to the number of classes; weights can be updated at every training step and integrated with pseudo-label weighting. GBW yields consistent recall gains for underrepresented classes and improves overall performance across semantic and panoptic segmentation, using both CNN and transformer architectures and multiple UDA strategies (adversarial, self-training, entropy minimization) without requiring target priors. The approach also complements data-level imbalance techniques, providing a practical building block to narrow the gap between UDA and supervised performance in dense vision tasks.
Abstract
In unsupervised domain adaptation (UDA), where models are trained on source data (e.g., synthetic) and adapted to target data (e.g., real-world) without target annotations, significant class imbalance remains an open issue. Despite considerable progress in bridging the domain gap, existing methods often suffer performance degradation on highly imbalanced dense prediction visual tasks such as semantic and panoptic segmentation. The degradation is especially pronounced because the source and target domains lack equivalent class priors, which renders class-imbalance techniques used in other areas (e.g., image classification) ineffective in UDA scenarios. This paper proposes a class-imbalance mitigation strategy that incorporates class weights into the UDA learning losses, with the novelty of estimating these weights dynamically from the loss gradient, which defines Gradient-based class weighting (GBW). GBW naturally increases the contribution of classes whose learning is hindered by highly represented classes, and it adapts automatically and quickly to the training outcome of each iteration, avoiding the explicit curricular learning patterns common in loss-weighting strategies. Extensive experimentation validates the effectiveness of GBW across architectures (convolutional and transformer), UDA strategies (adversarial, self-training and entropy minimization), tasks (semantic and panoptic segmentation), and datasets (GTA and Synthia). An analysis of the source of this advantage shows that GBW consistently increases the recall of low-represented classes.
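As an illustration of the constraint set mentioned in the TL;DR (nonnegative class weights that sum to the number of classes), the following is a minimal sketch rather than the authors' method. The summary does not state the objective of the quadratic program, so this sketch assumes a stand-in objective: the Euclidean projection of a raw per-class score vector onto the constraint set, which is itself a small quadratic program. The function names `project_to_scaled_simplex` and `weighted_loss`, and the idea of feeding in raw scores derived from per-class gradient magnitudes, are hypothetical.

```python
import numpy as np


def project_to_scaled_simplex(scores, total):
    """Solve min_w ||w - scores||^2  s.t.  w >= 0, sum(w) = total.

    Stand-in objective (an assumption): the paper's actual quadratic
    program is not given in the summary. Uses the standard sort-based
    simplex projection.
    """
    v = np.asarray(scores, dtype=float)
    u = np.sort(v)[::-1]                      # scores in descending order
    css = np.cumsum(u) - total                # shifted cumulative sums
    ks = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / ks > 0)[0][-1] # last index kept positive
    tau = css[rho] / (rho + 1)                # common shift
    return np.maximum(v - tau, 0.0)


def weighted_loss(per_class_losses, weights):
    """Combine per-class losses with class weights (mean over classes)."""
    losses = np.asarray(per_class_losses, dtype=float)
    return float(np.sum(weights * losses) / losses.size)


# Hypothetical raw scores for 3 classes (e.g. derived from gradient magnitudes).
raw_scores = [0.2, 1.5, 4.0]
weights = project_to_scaled_simplex(raw_scores, total=len(raw_scores))
print(weights, weights.sum())
```

Because the weights sum to the number of classes, uniform weights (all ones) recover the unweighted mean loss, so the weighting redistributes emphasis among classes without changing the overall loss scale. Under this reading, the weights would be recomputed at each training step from the current per-class gradient magnitudes.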
