Seeking Consistent Flat Minima for Better Domain Generalization via Refining Loss Landscapes

Aodi Li; Liansheng Zhuang; Xiao Long; Minghong Yao; Shafei Wang

TL;DR

The paper tackles domain generalization by addressing the inconsistency of loss landscapes across domains. It introduces Self-Feedback Training (SFT), a framework that alternates between two phases: a feedback phase, which measures how inconsistent the loss landscapes of different domains are, and a refinement phase, in which a landscape refiner uses soft labels to make those landscapes more consistent. A projection cross-entropy (PCE) loss and PAC-Bayesian-inspired theory underpin the approach, showing that consistent flat minima learned on training domains can transfer to unseen domains. Empirically, SFT outperforms sharpness-aware baselines and other DG methods on DomainBed benchmarks with both CNN and ViT backbones, improving out-of-domain generalization.

Abstract

Domain generalization aims to learn a model from multiple training domains and generalize it to unseen test domains. Recent theory has shown that seeking deep models whose parameters lie in flat minima of the loss landscape can significantly reduce the out-of-domain generalization error. However, existing methods often neglect the consistency of loss landscapes across domains, so the resulting models do not lie in the optimal flat minima of all domains simultaneously, which limits their generalization ability. To address this issue, this paper proposes an iterative Self-Feedback Training (SFT) framework that seeks consistent flat minima shared across domains by progressively refining loss landscapes during training. It alternates between two steps: generating a feedback signal by measuring the inconsistency of loss landscapes in different domains, and using this feedback signal to refine the loss landscapes for greater consistency. Benefiting from the consistency of the flat minima within these refined loss landscapes, SFT achieves better out-of-domain generalization. Extensive experiments on DomainBed demonstrate the superior performance of SFT compared to state-of-the-art sharpness-aware methods and other prevalent DG baselines. On average across five DG benchmarks, SFT surpasses sharpness-aware minimization by 2.6% with ResNet-50 and by 1.5% with ViT-B/16. The code will be available soon.
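To make the notion of landscape inconsistency concrete, the following is a minimal toy sketch, not the paper's measure, which the abstract does not specify. It assumes a one-dimensional parameter, made-up quadratic per-domain losses, a SAM-style sharpness proxy (worst-case loss increase within a radius `rho`), and a hypothetical inconsistency score defined as the variance of that sharpness across domains. All function names and constants here are illustrative assumptions.

```python
from statistics import pvariance


def sharpness(loss, w, rho=0.1, steps=21):
    """SAM-style proxy: worst-case loss increase within radius rho of w."""
    base = loss(w)
    # Evenly spaced perturbations covering [-rho, rho].
    offsets = [-rho + 2 * rho * i / (steps - 1) for i in range(steps)]
    return max(loss(w + e) - base for e in offsets)


def landscape_inconsistency(domain_losses, w, rho=0.1):
    """Hypothetical feedback signal: variance of per-domain sharpness at w."""
    return pvariance([sharpness(loss, w, rho) for loss in domain_losses])


def make_quadratic(curvature, center):
    """Toy per-domain loss (an assumption made purely for illustration)."""
    return lambda w: 0.5 * curvature * (w - center) ** 2


# Two domains with identical curvature around the shared minimum w = 0.
consistent_domains = [make_quadratic(1.0, 0.0), make_quadratic(1.0, 0.0)]
# Same minimum, but one domain is much sharper there than the other.
inconsistent_domains = [make_quadratic(1.0, 0.0), make_quadratic(20.0, 0.0)]

score_consistent = landscape_inconsistency(consistent_domains, w=0.0)
score_inconsistent = landscape_inconsistency(inconsistent_domains, w=0.0)
```

In this toy setting, a minimum that is equally flat in every domain receives a zero score, while a minimum that is flat in one domain and sharp in another receives a positive one. A refinement step in the spirit of SFT would then aim to drive such a score down, although how the paper does this is beyond what the abstract states.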
