Online Continual Domain Adaptation for Semantic Image Segmentation Using Internal Representations
Serban Stan, Mohammad Rostami
TL;DR
This work tackles long-term domain shifts in semantic segmentation under a source-free constraint by learning a shared embedding via an internal surrogate distribution. After source-domain pretraining, it builds a $K$-class, $T$-component Gaussian mixture model (GMM) to approximate the source latent distribution and uses the Sliced Wasserstein Distance to align target embeddings to this surrogate, while fine-tuning the classifier on GMM samples. Theoretical bounds extend joint-UDA guarantees to include the intermediate distribution, and experiments on SYNTHIA/Cityscapes and GTA5/Cityscapes show that MAS$^3$ achieves competitive performance against state-of-the-art methods, particularly when source data access is restricted. The approach offers a practical, privacy-conscious pathway for continual domain adaptation in real-world segmentation systems, with robustness to key hyperparameters and clear avenues for future work on partial-domain settings.
Abstract
Semantic segmentation models trained on annotated data fail to generalize well when the input data distribution changes over extended time periods, so they require re-training to maintain performance. Classic unsupervised domain adaptation (UDA) addresses a similar problem, in which the target domain has no annotated data points, by transferring knowledge from a source domain with annotated data. We develop an online UDA algorithm for semantic segmentation of images that improves model generalization on unannotated domains in scenarios where source data access is restricted during adaptation. We perform model adaptation by minimizing the distributional distance between the source latent features and the target features in a shared embedding space. Our solution promotes a shared domain-agnostic latent feature space between the two domains, which allows for classifier generalization on the target dataset. To alleviate the need for access to source samples during adaptation, we approximate the source latent feature distribution via an appropriate surrogate distribution, in this case a Gaussian mixture model (GMM). We evaluate our approach on well-established semantic segmentation datasets and demonstrate that it compares favorably against state-of-the-art (SOTA) UDA semantic segmentation methods.
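The alignment idea can be illustrated with a minimal NumPy sketch: draw samples from a GMM surrogate of the source latent features, then measure how far a batch of target features is from them with a Monte Carlo estimate of the sliced Wasserstein distance. This is not the authors' implementation. The function names `sample_gmm` and `sliced_wasserstein_distance`, the diagonal-covariance GMM, the number of projections, and the requirement that both batches have the same size are all assumptions made for illustration.

```python
import numpy as np


def sample_gmm(means, stds, weights, n, seed=0):
    """Draw n samples from a diagonal-covariance GMM (illustrative surrogate).

    means, stds: arrays of shape (num_components, dim); weights sums to 1.
    """
    rng = np.random.default_rng(seed)
    # Pick a mixture component for every sample, then add scaled Gaussian noise.
    comp = rng.choice(len(weights), size=n, p=weights)
    return means[comp] + stds[comp] * rng.normal(size=(n, means.shape[1]))


def sliced_wasserstein_distance(x, y, num_projections=64, seed=0):
    """Monte Carlo estimate of the squared sliced 2-Wasserstein distance.

    Assumes x and y are (n, dim) arrays with the same n, so that sorted
    1D projections can be matched element by element.
    """
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equally sized sample batches")
    rng = np.random.default_rng(seed)
    # Random unit vectors defining the 1D slices.
    directions = rng.normal(size=(num_projections, x.shape[1]))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Project both batches onto every direction: shape (n, num_projections).
    x_proj = x @ directions.T
    y_proj = y @ directions.T
    # In 1D, optimal transport between empirical samples pairs sorted values.
    x_sorted = np.sort(x_proj, axis=0)
    y_sorted = np.sort(y_proj, axis=0)
    return float(np.mean((x_sorted - y_sorted) ** 2))
```

In an adaptation loop of the kind the abstract describes, a quantity like this would be computed between encoder outputs on target images and GMM samples, and minimized with respect to the encoder parameters. That step needs a differentiable framework rather than plain NumPy, which the sketch leaves out.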
