ProtoGMM: Multi-prototype Gaussian-Mixture-based Domain Adaptation Model for Semantic Segmentation

Nazanin Moradinasab; Laura S. Shankman; Rebecca A. Deaton; Gary K. Owens; Donald E. Brown

TL;DR

This paper addresses domain shift in semantic segmentation by moving beyond discriminative-only self-training to a generative approach that models the source feature distribution with a multi-prototype Gaussian mixture. ProtoGMM models the class-conditional source feature distribution $p(f_s \mid c)$ with a GMM whose components serve as prototypes. These prototypes guide a multi-prototype contrastive loss that increases intra-class similarity and inter-class separation while aligning the source and target domains. The framework combines a Sinkhorn EM-based GMM branch with a discriminative classifier, updates priors and target prototypes via an exponential moving average (EMA), and computes pseudo-labels and alignments from posterior probabilities and prototype similarities. On GTA5→Cityscapes, Synthia→Cityscapes, and a cell-type adaptation dataset, the reported results show consistent improvements over state-of-the-art methods, supporting the approach's ability to capture within-class variation and to mitigate pseudo-label noise and source bias. Overall, ProtoGMM improves dense semantic predictions under domain shift by fusing generative and discriminative learning.
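The posterior-based pseudo-labeling and EMA update mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the diagonal covariances, the dictionary layout, the function names (`class_posterior`, `ema_update`), the momentum value, and the toy numbers are all assumptions made for illustration, and the Sinkhorn EM fitting step is omitted entirely.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def log_class_likelihood(x, components):
    """log p(x | c) for one class; components = [(weight, mean, var), ...]."""
    logs = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in components]
    top = max(logs)
    # log-sum-exp over the mixture components for numerical stability
    return top + math.log(sum(math.exp(l - top) for l in logs))

def class_posterior(x, gmm, priors):
    """Bayes' rule: p(c | x) is proportional to p(c) * p(x | c)."""
    scores = {
        c: math.log(priors[c]) + log_class_likelihood(x, comps)
        for c, comps in gmm.items()
    }
    top = max(scores.values())
    unnorm = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(unnorm.values())
    return {c: u / total for c, u in unnorm.items()}

def ema_update(old, new, momentum=0.9):
    """Exponential moving average: keep `momentum` of the old estimate."""
    return {c: momentum * old[c] + (1.0 - momentum) * new[c] for c in old}

# Toy example (hypothetical values): two components for "road", one for "car".
gmm = {
    "road": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 0.0], [1.0, 1.0])],
    "car": [(1.0, [5.0, 5.0], [1.0, 1.0])],
}
priors = {"road": 0.5, "car": 0.5}

posterior = class_posterior([0.2, 0.1], gmm, priors)
pseudo_label = max(posterior, key=posterior.get)  # class with highest p(c | x)
new_priors = ema_update(priors, {"road": 1.0, "car": 0.0}, momentum=0.9)
```

Here each class keeps several Gaussian components, so a feature can be assigned to a class if it lies near any one of that class's modes, which is the point of a multi-prototype model over a single averaged prototype.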

Abstract

Domain adaptive semantic segmentation aims to generate accurate and dense predictions for an unlabeled target domain by leveraging a supervised model trained on a labeled source domain. The prevalent self-training approach retrains the dense discriminative classifier of $p(\text{class} \mid \text{pixel feature})$ using pseudo-labels from the target domain. While many methods focus on mitigating noisy pseudo-labels, they often overlook the underlying data distribution $p(\text{pixel feature} \mid \text{class})$ in both the source and target domains. To address this limitation, we propose the multi-prototype Gaussian-Mixture-based (ProtoGMM) model, which incorporates a Gaussian mixture model (GMM) into contrastive losses to perform guided contrastive learning. In the literature, contrastive losses are commonly implemented with memory banks, which can lead to class biases due to underrepresented classes. Furthermore, memory banks often have fixed capacities, potentially restricting the model's ability to capture diverse representations of the target/source domains. An alternative is to use global class prototypes (i.e., averaged features per category). However, global prototypes rest on the assumption of a unimodal distribution per class and disregard within-class variation. ProtoGMM addresses these challenges by estimating the underlying multi-prototype source distribution with a GMM fitted on the feature space of the source samples. The components of the GMM act as representative prototypes. To increase intra-class semantic similarity, decrease inter-class similarity, and align the source and target domains, we employ multi-prototype contrastive learning between the source distribution and target samples. Experiments show the effectiveness of our method on unsupervised domain adaptation (UDA) benchmarks.
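To make the idea of contrasting a sample against several prototypes per class concrete, the following is a hedged sketch of one possible multi-prototype contrastive loss. The abstract does not give the exact formulation, so the InfoNCE-style form, cosine similarity, the temperature value, and the function name `multi_prototype_contrastive_loss` are assumptions rather than the paper's definition. Feature and prototype vectors are assumed to be nonzero.

```python
import math

def cosine(u, v):
    """Cosine similarity between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def multi_prototype_contrastive_loss(feature, label, prototypes, temperature=0.1):
    """InfoNCE-style loss with several prototypes per class (assumed form).

    prototypes maps each class to a list of prototype vectors, for example
    the means of that class's GMM components. All prototypes of `label`
    are positives; prototypes of every other class are negatives.
    """
    logits = {
        c: [cosine(feature, p) / temperature for p in protos]
        for c, protos in prototypes.items()
    }
    all_logits = [l for class_logits in logits.values() for l in class_logits]
    top = max(all_logits)  # subtract the max before exponentiating for stability
    positive = sum(math.exp(l - top) for l in logits[label])
    total = sum(math.exp(l - top) for l in all_logits)
    return -math.log(positive / total)

# Toy example (hypothetical values): two "road" prototypes, one "car" prototype.
prototypes = {"road": [[1.0, 0.0], [0.9, 0.1]], "car": [[0.0, 1.0]]}
feature = [1.0, 0.05]

loss_if_road = multi_prototype_contrastive_loss(feature, "road", prototypes)
loss_if_car = multi_prototype_contrastive_loss(feature, "car", prototypes)
```

Because the feature lies close to both "road" prototypes, the loss is small when its label is "road" and large when its label is "car". With a single global prototype per class, the positive term would collapse to one averaged vector and lose the within-class modes.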

Paper Structure (17 sections, 14 equations, 3 figures, 4 tables, 1 algorithm)

Introduction
Background
Methodology
Problem formulation
ProtoGMM model
Multiprototype source domain distribution
Source domain multi-prototype CL
Prior distribution update
Update target bank
Target domain prototypes
Aligning source and target domain distribution
Experiments
Datasets
Implementation Details
Comparison with existing UDA methods
...and 2 more sections

Figures (3)

Figure 1: Diagram of Proposed Approach
Figure 2: Blue-colored nuclei accompanied by: a) the red Lineage tracing marker, b) the purple LGALS3 marker
Figure 3: Qualitative analysis on GTA $\rightarrow$ Cityscapes (first row) and Synthia $\rightarrow$ Cityscapes (second row).

Theorems & Definitions (1)

proof
