Energy-based Domain-Adaptive Segmentation with Depth Guidance
Jinjing Zhu, Zhedong Hu, Tae-Kyun Kim, Lin Wang
TL;DR
This work tackles domain shift in semantic segmentation when depth can serve as guidance but depth labels are unavailable in the target domain. It introduces SMART, an energy-based framework with two modules. The energy-based feature fusion module (EB2F) learns task-adaptive semantic and depth features by measuring and reducing their discrepancy with Hopfield energy. The energy-based reliable fusion assessment module (RFA) checks whether fusion can be trusted by comparing fusion-enabled and fusion-free predictions. Using per-pixel energy scores and KL-based distillation, SMART fuses depth guidance robustly and outperforms prior methods on GTA5-to-Cityscapes and SYNTHIA-to-Cityscapes. The results demonstrate the potential of energy-based learning for depth-guided domain adaptation and highlight the method’s plug-and-play applicability to multi-task learning in robotics contexts.
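The reliability check described above can be illustrated with a minimal sketch. It assumes the widely used free-energy score, E(x) = -T · logsumexp(logits / T), computed per pixel, and it assumes that fusion counts as reliable at a pixel when the fusion-enabled prediction has lower energy than the fusion-free one. The function names, the selection rule, and the direction of the KL distillation (fused prediction as teacher) are illustrative assumptions, not details confirmed by the summary.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def energy_score(logits, temperature=1.0):
    """Free energy of one pixel's class logits; lower means more confident."""
    return -temperature * logsumexp([z / temperature for z in logits])


def softmax(logits):
    """Class probabilities for one pixel."""
    lse = logsumexp(logits)
    return [math.exp(z - lse) for z in logits]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def reliable_fusion_loss(fused_logits, plain_logits):
    """Average KL over pixels where fusion is judged reliable (assumed rule).

    Both arguments are lists of per-pixel logit vectors: one from the
    fusion-enabled branch, one from the fusion-free branch.
    """
    total, count = 0.0, 0
    for fused, plain in zip(fused_logits, plain_logits):
        # Assumption: lower energy for the fused prediction means fusion helped.
        if energy_score(fused) < energy_score(plain):
            total += kl_divergence(softmax(fused), softmax(plain))
            count += 1
    return total / count if count else 0.0
```

In this sketch, pixels where the fused prediction is less confident than the fusion-free one contribute nothing to the loss, so unreliable depth guidance is simply ignored there. A real implementation would operate on batched tensors rather than Python lists.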
Abstract
Recent work leverages self-supervised depth estimation as guidance in unsupervised domain adaptation (UDA) for semantic segmentation. Prior work, however, overlooks both the discrepancy between semantic and depth features and the reliability of feature fusion, leading to suboptimal segmentation performance. To address these issues, we propose a novel UDA framework called SMART (croSs doMain semAntic segmentation based on eneRgy esTimation) that utilizes Energy-Based Models (EBMs) to obtain task-adaptive features and achieve reliable feature fusion for semantic segmentation with self-supervised depth estimates. Our framework incorporates two novel components: the energy-based feature fusion (EB2F) module and the energy-based reliable fusion assessment (RFA) module. The EB2F module produces task-adaptive semantic and depth features by explicitly measuring and reducing their discrepancy using Hopfield energy, which yields better feature fusion. The RFA module evaluates the reliability of the feature fusion using an energy score to improve the effectiveness of depth guidance. Extensive experiments on two benchmarks, GTA5-to-Cityscapes and SYNTHIA-to-Cityscapes, demonstrate that our method achieves significant performance gains over prior works, validating the effectiveness of our energy-based learning approach.
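To make the idea of a Hopfield-energy discrepancy concrete, the following is a minimal sketch. The abstract does not state the energy function, so the code assumes the continuous (modern) Hopfield energy, E(ξ) = -β⁻¹ log Σᵢ exp(β xᵢᵀξ) + ½ ξᵀξ + β⁻¹ log N + ½ M², where M is the largest stored-pattern norm. Treating the features of one task as the stored patterns and a feature of the other task as the query is an illustrative assumption, as are the function names.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def hopfield_energy(query, patterns, beta=1.0):
    """Continuous Hopfield energy of `query` against stored `patterns`.

    Lower energy means the query lies closer to the stored patterns, so it
    can serve as a discrepancy measure between two feature sets (assumed use).
    """
    scores = [beta * dot(p, query) for p in patterns]
    m = max(scores)
    # Stable log-sum-exp, scaled by 1 / beta.
    lse = (m + math.log(sum(math.exp(s - m) for s in scores))) / beta
    max_norm_sq = max(dot(p, p) for p in patterns)
    return (-lse + 0.5 * dot(query, query)
            + math.log(len(patterns)) / beta + 0.5 * max_norm_sq)


# Hypothetical toy features: two stored "depth" vectors and two "semantic" queries.
depth_patterns = [[1.0, 0.0], [0.0, 1.0]]
aligned = hopfield_energy([1.0, 0.0], depth_patterns)
misaligned = hopfield_energy([-1.0, 0.0], depth_patterns)
```

Here the query that matches a stored pattern receives lower energy than the one pointing away from both patterns. Minimizing such an energy during training would pull the two feature sets together, which is one way to read "measuring and reducing their discrepancy".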
