Leveraging Contrastive Learning for Semantic Segmentation with Consistent Labels Across Varying Appearances
Javier Montalvo, Roberto Alcover-Couso, Pablo Carballeira, Álvaro García-Martín, Juan C. SanMiguel, Marcos Escudero-Viñolo
TL;DR
The paper addresses two obstacles in semantic segmentation, domain shift and the cost of annotation, by introducing CARLA-4AGT, a synthetic dataset with pixel-perfect ground truth shared across multiple weather appearances, and a feature-alignment framework that enforces consistency across those appearances. It shows that aligning features across multiple backbone layers and appearances yields significant improvements in both unsupervised domain adaptation (UDA) and domain generalization, outperforming synthetic datasets such as Synthia and GTA on real-world targets. The approach remains compatible with diverse UDA methods and transfers robustly to adverse-weather domains, which makes it relevant for practical deployment in autonomous systems. Overall, it proposes a new paradigm for synthetic data generation and domain adaptation in segmentation, emphasizing data efficiency, multi-appearance robustness, and cross-domain generalization.
Abstract
This paper introduces a novel synthetic dataset that captures urban scenes under a variety of weather conditions, providing pixel-perfect, ground-truth-aligned images to facilitate effective feature alignment across domains. Additionally, we propose a method for domain adaptation and generalization that takes advantage of the multiple versions of each scene, enforcing feature consistency across different weather scenarios. Our experimental results demonstrate the impact of our dataset in improving performance across several alignment metrics, addressing key challenges in domain adaptation and generalization for segmentation tasks. This research also explores critical aspects of synthetic data generation, such as optimizing the balance between the volume and variability of generated images to enhance segmentation performance. Ultimately, this work sets forth a new paradigm for synthetic data generation and domain adaptation.
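To make the idea of enforcing feature consistency across weather scenarios concrete, the following is a minimal sketch, not the authors' implementation. It assumes an InfoNCE-style contrastive loss, in which features taken at the same pixel location in two renderings of one scene form a positive pair and features at other locations act as negatives. The function names, the temperature value, and the plain averaging over layers are all hypothetical choices made for illustration; the paper's actual loss, sampling strategy, and layer weighting may differ.

```python
import numpy as np


def cross_appearance_info_nce(feats_a, feats_b, temperature=0.1, eps=1e-8):
    """InfoNCE-style loss between two appearances of the same scene.

    feats_a, feats_b: (N, D) arrays holding features sampled at the same
    N pixel locations in two renderings (e.g. clear and rainy). Row i of
    feats_a and row i of feats_b are treated as a positive pair.
    Hypothetical helper, assumed for illustration only.
    """
    # L2-normalise so the dot product is a cosine similarity.
    a = feats_a / (np.linalg.norm(feats_a, axis=1, keepdims=True) + eps)
    b = feats_b / (np.linalg.norm(feats_b, axis=1, keepdims=True) + eps)

    logits = a @ b.T / temperature               # (N, N) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Positives sit on the diagonal (same pixel, different appearance).
    return float(-np.mean(np.diag(log_prob)))


def multi_layer_alignment_loss(layers_a, layers_b, temperature=0.1):
    """Average the loss over features from several backbone layers.

    layers_a, layers_b: lists of (N, D_l) arrays, one per layer.
    Equal layer weighting is an assumption of this sketch.
    """
    losses = [
        cross_appearance_info_nce(fa, fb, temperature)
        for fa, fb in zip(layers_a, layers_b)
    ]
    return float(np.mean(losses))
```

Because the renderings share pixel-perfect ground truth, the positive pairs need no extra matching step: index i in one appearance corresponds to index i in the other. A practical version would probably also use the class labels, for instance to avoid treating same-class pixels as negatives; that refinement is omitted here.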
