Divide, Ensemble and Conquer: The Last Mile on Unsupervised Domain Adaptation for Semantic Segmentation
Tao Lian, Jose L. Gómez, Antonio M. López
TL;DR
DEC tackles the last mile of unsupervised domain adaptation (UDA) for semantic segmentation by exploiting synthetic multi-source data through a divide-and-conquer approach: it trains category-specific models on grouped classes and fuses their outputs with an ensemble model trained entirely on synthetic data. It is compatible with existing UDA methods and achieves state-of-the-art results on Cityscapes, BDD100K, and Mapillary Vistas, narrowing the gap to supervised learning (SL). The method relies on a division strategy that groups classes into four categories, stacks the corresponding source-category masks into a pseudo-image for ensemble training, and uses a fusion model updated by exponential moving average (EMA) to produce the final segmentation. Overall, DEC offers a flexible and efficient route toward parity with SL in real-world semantic segmentation while remaining compatible with current UDA pipelines.
Abstract
The last mile of unsupervised domain adaptation (UDA) for semantic segmentation is the challenge of closing the synthetic-to-real (syn-to-real) domain gap. Recent UDA methods have progressed significantly, yet they often rely on strategies customized for synthetic single-source datasets (e.g., GTA5), which limits their generalization to multi-source datasets. Conversely, synthetic multi-source datasets hold promise for advancing the last mile of UDA but remain underutilized in current research. Thus, we propose DEC, a flexible UDA framework for multi-source datasets. Following a divide-and-conquer strategy, DEC simplifies the task by categorizing semantic classes, training a model for each category, and fusing their outputs with an ensemble model trained exclusively on synthetic datasets to obtain the final segmentation mask. DEC can be integrated with existing UDA methods, achieving state-of-the-art performance on Cityscapes, BDD100K, and Mapillary Vistas and significantly narrowing the syn-to-real domain gap.
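The divide-and-fuse data flow described above can be illustrated with a toy sketch. It is not the authors' implementation: the class-to-category grouping, the `OTHER` label, and the functions `divide`, `stack_masks`, and `naive_fuse` are hypothetical names introduced here, and the learned ensemble model is replaced by a trivial voting rule so that the example stays self-contained.

```python
# Toy sketch of the divide / stack / fuse data flow. All names are hypothetical.
# Assumption: this grouping into four categories is illustrative only; the
# actual groups used by DEC are not listed in this section.
CATEGORIES = {
    "background": ["road", "sidewalk", "building", "sky"],
    "vehicle": ["car", "bus", "truck"],
    "human": ["person", "rider"],
    "traffic": ["traffic light", "traffic sign", "pole"],
}
OTHER = "other"  # label a category model assigns to classes outside its group


def divide(label_map, category):
    """Relabel a full mask for one category: keep in-group classes, map the rest to OTHER."""
    keep = set(CATEGORIES[category])
    return [[c if c in keep else OTHER for c in row] for row in label_map]


def stack_masks(category_masks):
    """Stack per-category masks channel-wise into a pseudo-image.

    Each pixel becomes a tuple with one entry per category, in CATEGORIES order.
    """
    names = list(CATEGORIES)
    height = len(category_masks[names[0]])
    width = len(category_masks[names[0]][0])
    return [
        [tuple(category_masks[n][y][x] for n in names) for x in range(width)]
        for y in range(height)
    ]


def naive_fuse(pseudo_image):
    """Stand-in for the trained ensemble: take the first non-OTHER vote per pixel."""
    return [
        [next((v for v in pixel if v != OTHER), OTHER) for pixel in row]
        for row in pseudo_image
    ]


# Usage on a 2x2 synthetic label map whose per-category masks come from ground truth.
full = [["road", "car"], ["person", "pole"]]
masks = {name: divide(full, name) for name in CATEGORIES}
pseudo = stack_masks(masks)
fused = naive_fuse(pseudo)
```

Because the per-category masks here are derived from the ground truth, the naive rule recovers the original label map exactly. With real category models the masks can disagree or overlap, which is the case a trained ensemble is meant to resolve.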
