
Do We Need Perfect Data? Leveraging Noise for Domain Generalized Segmentation

Taeyeong Kim, SeungJoon Lee, Jung Uk Kim, MyeongAh Cho

TL;DR

This work tackles domain generalization in semantic segmentation by addressing the boundary-level misalignment that arises when training on diffusion-based synthetic data. The proposed FLEX-Seg framework combines Granular Adaptive Prototypes for multi-scale boundary representation, Uncertainty Boundary Emphasis to dynamically weight uncertain regions, and Hardness-Aware Sampling to focus training on challenging examples. In experiments on five real-world datasets, FLEX-Seg consistently improves over state-of-the-art methods, with mIoU gains of 2.44% on ACDC and 2.63% on Dark Zurich, two adverse-condition benchmarks. These results indicate that leveraging imperfect synthetic data, rather than discarding or correcting it, can yield robust domain generalization. The approach offers a practical path to better generalization in dense prediction tasks and may extend to other boundary-sensitive problems.

Abstract

Domain generalization in semantic segmentation faces challenges from domain shifts, particularly under adverse conditions. While diffusion-based data generation methods show promise, they introduce inherent misalignment between generated images and semantic masks. This paper presents FLEX-Seg (FLexible Edge eXploitation for Segmentation), a framework that transforms this limitation into an opportunity for robust learning. FLEX-Seg comprises three key components: (1) Granular Adaptive Prototypes that capture boundary characteristics across multiple scales, (2) Uncertainty Boundary Emphasis that dynamically adjusts learning emphasis based on prediction entropy, and (3) Hardness-Aware Sampling that progressively focuses on challenging examples. By leveraging inherent misalignment rather than enforcing strict alignment, FLEX-Seg learns robust representations while capturing rich stylistic variations. Experiments across five real-world datasets demonstrate consistent improvements over state-of-the-art methods, with mIoU gains of 2.44% on ACDC and 2.63% on Dark Zurich. Our findings validate that adaptive strategies for handling imperfect synthetic data lead to superior domain generalization. Code is available at https://github.com/VisualScienceLab-KHU/FLEX-Seg.
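
To make the idea of adjusting learning emphasis by prediction entropy concrete, here is a minimal sketch. It is not the FLEX-Seg implementation: the weighting form (1 + alpha * normalized entropy), the parameter alpha, and all function names are assumptions chosen for illustration. Each pixel is represented by its softmax probability vector, and pixels with more uncertain predictions receive a larger weight in the cross-entropy loss.

```python
import math


def pixel_entropy(probs):
    """Shannon entropy (in nats) of one pixel's class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_weights(prob_map, alpha=1.0):
    """Per-pixel weights of the form 1 + alpha * normalized entropy.

    prob_map: list of per-pixel probability vectors, each with >= 2 classes.
    alpha: hypothetical emphasis strength (an assumption, not from the paper).
    """
    num_classes = len(prob_map[0])
    max_entropy = math.log(num_classes)  # entropy of a uniform prediction
    return [1.0 + alpha * pixel_entropy(p) / max_entropy for p in prob_map]


def weighted_cross_entropy(prob_map, labels, alpha=1.0):
    """Cross-entropy averaged with entropy-based weights."""
    weights = entropy_weights(prob_map, alpha)
    losses = [-math.log(max(p[y], 1e-12)) for p, y in zip(prob_map, labels)]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)


if __name__ == "__main__":
    # One confident pixel and one maximally uncertain pixel, two classes.
    probs = [[1.0, 0.0], [0.5, 0.5]]
    print(entropy_weights(probs))
    print(weighted_cross_entropy(probs, labels=[0, 1]))
```

In this sketch a confident pixel keeps a weight of 1, while a pixel with a uniform prediction gets a weight of 1 + alpha. In a real training loop the weights would normally be computed from detached predictions so that the model cannot lower the loss simply by changing its own uncertainty; that detail is omitted here.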

Paper Structure

This paper contains 32 sections, 13 equations, 18 figures, 5 tables.

Figures (18)

  • Figure 1: Error rate analysis of boundary vs. interior regions when training with diffusion-generated synthetic data. Under both (a) normal and (b) adverse conditions, boundary regions consistently exhibit higher error rates than interior regions, with the disparity becoming more pronounced in adverse conditions. FLEX-Seg effectively reduces errors in both regions across all scenarios, demonstrating that our boundary-focused approach is crucial for robust domain generalization.
  • Figure 2: Column 1: Generated image and its corresponding ground-truth mask. Column 2: Model predictions on the generated image (a) using different approaches. Best viewed zoomed in.
  • Figure 3: Overview of our FLEX-Seg framework. The framework integrates three key components: Granular Adaptive Prototypes for learning domain-invariant boundary representations, Uncertainty Boundary Emphasis for dynamically emphasizing challenging regions, and Hardness-Aware Sampling for efficient training on difficult examples. These components work synergistically to improve boundary precision across diverse domains.
  • Figure 4: Illustration of GAP. Boundary features are extracted at multiple granularities, assigned to corresponding class-granularity prototypes, and refined through contrastive learning to achieve domain-invariant representations.
  • Figure 5: Qualitative comparison of segmentation results on target domains. From left to right: input images, predictions by HRDA trained with DGInStyle (Jia et al., 2024), predictions by HRDA trained with our FLEX-Seg, and ground truth.
  • ...and 13 more figures
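
The boundary versus interior comparison described for Figure 1 can be illustrated with a small sketch. The boundary definition used here (a pixel is a boundary pixel if any 4-connected neighbor has a different ground-truth label) and the function names are assumptions for illustration; the paper may define boundary regions differently, for example with a wider band around class edges.

```python
def boundary_mask(labels):
    """Mark pixels whose 4-connected neighborhood contains another label."""
    h, w = len(labels), len(labels[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < h and 0 <= nx < w
                if inside and labels[ny][nx] != labels[y][x]:
                    mask[y][x] = True
                    break
    return mask


def region_error_rates(pred, labels):
    """Return the pixel error rate separately for boundary and interior."""
    mask = boundary_mask(labels)
    counts = {"boundary": [0, 0], "interior": [0, 0]}  # [errors, total]
    for y, row in enumerate(labels):
        for x, gt in enumerate(row):
            key = "boundary" if mask[y][x] else "interior"
            counts[key][1] += 1
            counts[key][0] += int(pred[y][x] != gt)
    # Regions with no pixels report an error rate of 0.0 by convention.
    return {k: (e / n if n else 0.0) for k, (e, n) in counts.items()}


if __name__ == "__main__":
    labels = [[0, 0, 1, 1],
              [0, 0, 1, 1]]
    pred = [[0, 1, 1, 1],
            [0, 0, 1, 1]]
    print(region_error_rates(pred, labels))
```

In the toy example the single wrong pixel sits next to the class edge, so all of the error is attributed to the boundary region and none to the interior.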