ERM++: An Improved Baseline for Domain Generalization
Piotr Teterwak, Kuniaki Saito, Theodoros Tsiligkaridis, Kate Saenko, Bryan A. Plummer
TL;DR
ERM++ revisits the empirical risk minimization (ERM) baseline for multi-source domain generalization (DG), arguing that careful choices in the training procedure can surpass more complex DG methods. The method groups its improvements into three areas: Training Data Utilization (Auto-LR and Full Data retraining), Initialization (strong pre-trained weights), and Regularization (weight-space regularization such as MPA, WS, UBN, and Attention Tuning). Across five DomainBed datasets and both ResNet-50 and ViT-B/16 backbones, ERM++ yields sizable gains over prior ERM baselines (over 5% with ResNet-50 and over 15% with ViT-B/16) and outperforms prior state-of-the-art methods. The approach is easy to implement and to integrate into existing pipelines, including DomainBed, and offers a strong, simple baseline for future DG research. Code is publicly available.
Abstract
Domain Generalization (DG) aims to develop classifiers that can generalize to new, unseen data distributions, a critical capability when collecting new domain-specific data is impractical. A common DG baseline minimizes the empirical risk on the source domains. Recent studies have shown that this approach, known as Empirical Risk Minimization (ERM), can outperform most of the more complex DG methods when properly tuned. However, these studies have primarily focused on a narrow set of hyperparameters, neglecting other factors that can enhance robustness and prevent overfitting and catastrophic forgetting, properties which are critical for strong DG performance. In our investigation of training data utilization (i.e., training duration and the choice of validation splits), initialization, and additional regularizers, we find that tuning these previously overlooked factors significantly improves model generalization across diverse datasets without adding much complexity. We call this improved yet simple baseline ERM++. Despite its ease of implementation, ERM++ improves DG performance by over 5\% compared to prior ERM baselines on a standard benchmark of 5 datasets with a ResNet-50 and by over 15\% with a ViT-B/16. It also outperforms all state-of-the-art methods on DomainBed datasets with both architectures. Importantly, ERM++ is easy to integrate into existing frameworks like DomainBed, making it a practical and powerful tool for researchers and practitioners. Overall, ERM++ challenges the need for more complex DG methods by providing a stronger, more reliable baseline that maintains simplicity and ease of use. Code is available at \url{https://github.com/piotr-teterwak/erm_plusplus}.
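
For readers who want the ERM baseline referred to above in symbols, one common way to write it is sketched here. The notation is an assumption introduced for illustration and is not taken from the abstract: $S$ source domains, with domain $d$ contributing $n_d$ labeled examples $(x_i^{d}, y_i^{d})$, a classifier $f_\theta$ with parameters $\theta$, and a per-example loss $\ell$ such as cross-entropy. The sketch also assumes that examples from all source domains are pooled with equal weight per example; implementations that sample equal-sized mini-batches from each domain instead weight each domain equally.

\[
\hat{\theta}_{\mathrm{ERM}} \;=\; \arg\min_{\theta} \; \frac{1}{\sum_{d=1}^{S} n_d} \sum_{d=1}^{S} \sum_{i=1}^{n_d} \ell\!\left(f_\theta\!\left(x_i^{d}\right),\, y_i^{d}\right)
\]

Under this reading, the factors the abstract lists (training data utilization, initialization, and regularization) change how this objective is optimized and from where, not the objective itself.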
