SCMix: Stochastic Compound Mixing for Open Compound Domain Adaptation in Semantic Segmentation

Kai Yao, Zhaorui Tan, Zixian Su, Xi Yang, Jie Sun, Kaizhu Huang

TL;DR

This work addresses OCDA in semantic segmentation by deriving a learning bound that accounts for joint target subdomain discrepancies and identifying the limitations of divide-and-conquer approaches. It introduces Stochastic Compound Mixing (SCMix), a grid-based, multi-target mixing augmentation that blends one source image with multiple target images and applies class-level mixing, optimized with a dual loss on source and pseudo-labeled target data. The approach is theoretically grounded via group-theoretic arguments showing SCMix generalizes single-target mixing and yields tighter error bounds, and empirically validated on GTA5/SYNTHIA to C-Driving benchmarks with open-set generalization, achieving state-of-the-art results and strong improvements across architectures. The work demonstrates that exploiting intra-target variance through compound target mixing improves generalization to unseen domains, and highlights the potential of transformer backbones to further boost OCDA performance.
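The mixing step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The summary only states that SCMix is grid-based, blends one source image with multiple target images, and applies class-level mixing. Everything else is an assumption made for illustration: the function names (`compound_target`, `class_mix`, `scmix`), the uniform random choice of a target per grid cell, the use of half of the source classes for pasting, and the order of the two steps.

```python
import numpy as np


def compound_target(targets, pseudo_labels, grid, rng):
    """Tile randomly chosen target images into a grid (assumed scheme).

    targets: (N, H, W, C) images, pseudo_labels: (N, H, W) label maps.
    Each grid cell copies the matching region from one random target.
    """
    rows, cols = grid
    n, h, w, _ = targets.shape
    # Integer cell edges that cover the full image even if H, W are not divisible.
    y_edges = np.linspace(0, h, rows + 1).astype(int)
    x_edges = np.linspace(0, w, cols + 1).astype(int)
    img = np.empty(targets.shape[1:], dtype=targets.dtype)
    lab = np.empty((h, w), dtype=pseudo_labels.dtype)
    for r in range(rows):
        for c in range(cols):
            k = rng.integers(n)  # which target image fills this cell
            ys = slice(y_edges[r], y_edges[r + 1])
            xs = slice(x_edges[c], x_edges[c + 1])
            img[ys, xs] = targets[k, ys, xs]
            lab[ys, xs] = pseudo_labels[k, ys, xs]
    return img, lab


def class_mix(src_img, src_lab, tgt_img, tgt_lab, rng):
    """Paste the pixels of a random half of the source classes onto the target."""
    classes = np.unique(src_lab)
    chosen = rng.choice(classes, size=max(1, len(classes) // 2), replace=False)
    mask = np.isin(src_lab, chosen)  # True where source pixels are kept
    mixed_img = np.where(mask[..., None], src_img, tgt_img)
    mixed_lab = np.where(mask, src_lab, tgt_lab)
    return mixed_img, mixed_lab, mask


def scmix(src_img, src_lab, targets, pseudo_labels, grid, rng):
    """Hypothetical end-to-end mix: compound target first, then class-level paste."""
    tgt_img, tgt_lab = compound_target(targets, pseudo_labels, grid, rng)
    return class_mix(src_img, src_lab, tgt_img, tgt_lab, rng)
```

In a training loop, the mixed image and its mixed label map would feed the target-side term of the dual loss, while the unmixed source image and its ground-truth labels would feed the source-side term. How the paper weights the two terms is not stated in this summary.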

Abstract

Open compound domain adaptation (OCDA) aims to transfer knowledge from a labeled source domain to a mix of unlabeled homogeneous compound target domains while generalizing to open unseen domains. Existing OCDA methods address the intra-domain gaps with a divide-and-conquer strategy, which splits the problem into several individual, parallel domain adaptation (DA) tasks. Such approaches often involve multiple sub-networks or stages, which may constrain the model's performance. In this work, starting from general DA theory, we establish a generalization bound for the OCDA setting. Building on this bound, we argue that conventional OCDA approaches may substantially underestimate the inherent variance inside the compound target domains for model generalization. We then present Stochastic Compound Mixing (SCMix), an augmentation strategy whose primary objective is to mitigate the divergence between the source and mixed target distributions. We provide theoretical analysis to substantiate the superiority of SCMix and prove that previous methods are sub-groups of our method. Extensive experiments show that our method attains a lower empirical risk on OCDA semantic segmentation tasks, supporting our theory. Combined with a transformer architecture, SCMix achieves a notable performance boost over SoTA results.

Paper Structure

This paper contains 19 sections, 8 theorems, 29 equations, 4 figures, 7 tables, 1 algorithm.

Key Result

Theorem 1

(OCDA Learning Bound) Let $R^\mathcal{S}$, $R^\mathcal{T}$ be the generalization errors on the source domain $\mathcal{D}^\mathcal{S}$ and the target domain $\mathcal{D}^\mathcal{T}$, respectively. $\mathcal{D}^\mathcal{T}$ contains $N$ seen subdomains, such that $\{\mathcal{D}^\mathcal{T}\}_1^N=\{\dots$ […] where $1 \leq i \leq j \leq N$, and $\mathcal{J}_{i,j}=\mathcal{D}^\mathcal{T}_i \otimes \dots$ […]

Figures (4)

  • Figure 1: (a) The proposed Stochastic Compound Mixing (SCMix). (b) Existing works adapt to each target domain iteratively. (c) Our approach focuses on mixing compound domains to enhance the model's adaptation and generalization performance.
  • Figure 2: Examples augmented using SCMix: an image from the source domain is mixed with multiple images from the compound target domain.
  • Figure 3: T-SNE embedding of backbone features by adapting DACS and SCMix on the target and unseen domains.
  • Figure 4: Visualization of comparative predictions on GTA $\rightarrow$ C-Driving.

Theorems & Definitions (12)

  • Theorem 1
  • Proposition 1
  • Proof 1
  • Proposition 2
  • Proof 2
  • Proposition 3
  • Proof 3
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • ...and 2 more